
AI's focus on English puts many countries at a disadvantage

David Meyer
2025-02-13

A new EU project aims to solve the problem for 32 languages


Image credit: Jakub Porzycki, NurPhoto/Getty Images

Chinese translation by 中慧言-王芳

An ambitious new AI project has begun to take shape in Europe, with the aim of developing open-source AI models that support the region’s 24 official languages and more—while also complying as much as possible with its thicket of digital legislation.

The OpenEuroLLM project, which commenced work at the start of the month, has a budget of just €37.4 million ($38.6 million): a pittance compared with the sums being invested in other AI-related projects like the $100 billion first tranche of the U.S.’s Stargate AI infrastructure project. Although participating companies such as Germany’s Aleph Alpha and Finland’s Silo AI are also contributing their researchers’ time to an equivalent value, the bulk of the funding comes from the European Commission.

EU-funded projects don’t tend to move fast, and this one has a three-year road map in a sector that’s currently undergoing significant evolution each month. But organizers and participants tell Fortune that it could be possible to deliver an intermediate model within a year—and the effort will be worth it.

Speaking in tongues

“Most model development efforts that have worldwide visibility focus on the English language,” said Yasser Jadidi, chief research officer at Aleph Alpha. “It’s a consequence of most of the internet text data that is available and accessible being in English, and it puts other languages at a disadvantage.”

For people in places like Sweden or Turkey (the OpenEuroLLM project is also targeting the tongues of eight countries that have applied for EU membership, so that the project encompasses a total of 32 languages) the lack of AI models that understand the intricacies of their languages can be a serious problem. For a start, it makes it harder for local companies and public authorities to adopt the technology and start providing new services.

“It’s first and foremost a commercial question,” said Peter Sarlin, the CEO of Silo AI, Europe’s largest private AI lab, which was acquired by AMD last year and is participating in OpenEuroLLM. “Are there models that are performant in that specific low-resource language, be it Albanian or Finnish or Swedish or some other, that allows companies within that region to eventually build services on top?”

The issue also has consequences for evaluating the accuracy and safety of AI models in the local context, Jadidi said. Indeed, Aleph Alpha’s role in the project is chiefly to provide AI-model evaluation benchmarks that aren’t simply machine-translated from English, as most are.

The OpenEuroLLM project may have relatively meager funding, but it isn’t starting from scratch.

Most of its participants have already been involved in a separate scheme called High Performance Language Technologies (HPLT), which started two years ago with a budget of just €6 million. The original proposal was for HPLT to deliver AI models, but then OpenAI’s ChatGPT changed the AI landscape and the organizers pivoted to creating a high-quality dataset that can be used to train multilingual models. The HPLT dataset is currently being “cleaned” of errors, and it will form the basis of OpenEuroLLM’s work.

OpenEuroLLM will create a base model trained on a dataset of all the European languages. Once that’s done, yet another EU-funded project, called LLMs4EU, will fine-tune it for various applications. Apart from cash, the EU is also providing computational resources to all these schemes.

Sticking to the rules

Europe is not the easiest place for AI companies to do business. Quite apart from the AI Act that is gradually coming into force, placing all sorts of reporting responsibilities on model providers and their customers, there’s also copyright and competition law to consider—and the General Data Protection Regulation (GDPR), which places strict limits on the personal data that AI companies can use.

These laws have had real effects on AI’s European progress, with Meta delaying the rollout of Meta AI because of GDPR limits, and Apple also delaying the deployment of Apple Intelligence because of unspecified antitrust issues. (Apple Intelligence will come to EU iPhones in limited form in April, while Meta has started offering some Meta AI features to European wearers of its smart glasses.)

As far as OpenEuroLLM’s organizers are concerned, these laws are manageable. “We believe we can live with all of them,” said Jan Hajič of Charles University in Czechia, who is co-leading the project with Sarlin.

Hajič said the participants had already dealt with the copyright and most privacy issues when developing the HPLT dataset. “The GDPR could be a problem, but that’s something we are trying to get around with pseudonymizing the data, meaning that if we encounter people’s names it gets deleted,” he said, while acknowledging that the necessary automation in this process may not have a 100% success rate.

“Our goal is to do things in such a way that they will not clash with the European regulation in any way,” Hajič said, adding that this could be a draw for companies wanting to target EU markets. For high-risk use cases that will require a lot of reporting to the EU authorities under the AI Act, the open-source approach will be essential for the transparency it allows, he argued.

The OpenEuroLLM project has 20 participants including companies, research institutions, and high-performance computing clusters like Finland’s Lumi. This setup could be seen as a liability with the potential for diverging priorities, but Aleph Alpha’s Jadidi argued that open-source projects often include a wide array of participants without being dragged down.

“We have all the opportunity to ensure that a high amount of contributors is not a hindrance but an opportunity,” he said.
