Apple's Artificial Intelligence Expert Talks Reinforcement Learning
By Jonathan Vanian (Fortune China; Chinese translation by Yan Kuangzheng)
On Tuesday, Apple’s director of AI research, Ruslan Salakhutdinov, discussed some of the limitations of AI. However, during his talk at an MIT Technology Review conference, he steered clear of how his secretive company incorporates AI into products like Siri.

Salakhutdinov, who joined Apple in October, said he is particularly interested in a type of AI known as reinforcement learning, which researchers use to teach computers to repeatedly try different actions to figure out the best possible result. Google, for example, used reinforcement learning to help its computers find the best possible cooling and operating configurations in its data centers, making them more energy efficient.

Researchers at Carnegie Mellon, where Salakhutdinov is also an associate professor, recently used reinforcement learning to train computers to play the 1990s-era video game Doom, Salakhutdinov explained. The computers learned to shoot aliens quickly and accurately, and discovered that ducking helps them avoid enemy fire. However, these expert Doom-playing systems are not very good at remembering things like maze layouts, which keeps them from planning and building strategies, he said.

Part of Salakhutdinov’s research involves creating AI-powered software that memorizes the layouts of virtual mazes in Doom, along with points of reference, in order to locate specific towers. During the game, the software first spots either a red or a green torch; the torch's color corresponds to the color of the tower it needs to locate.

Eventually, the software learned to navigate the maze to reach the correct tower. When it found the wrong tower, it backtracked through the maze to find the right one. What was especially noteworthy, he explained, was that the software was able to recall the color of the torch each time it spotted a tower.
However, Salakhutdinov said this type of AI software takes “a long time to train” and requires enormous amounts of computing power, which makes it difficult to build at large scale. “Right now it’s very brittle,” Salakhutdinov said.

Another area Salakhutdinov wants to explore is teaching AI software to learn more quickly from “few examples and few experiences.” Although he did not mention it, his idea would benefit Apple in its race to create better products in less time.

Some AI experts and analysts believe Apple's AI technologies are inferior to those of competitors like Google or Microsoft because of the company's stricter user privacy rules, which limit the amount of data it can use to train its computers. If Apple could train its computers with less data, it could perhaps satisfy its privacy requirements while still improving its software as quickly as rivals.