
Fortune Interview with Baidu's Chief Scientist: Computers Won't Rule the World

Derrick Harris | February 28, 2016
From voice control to image search to driverless cars, artificial intelligence has been getting a lot of attention lately, and it deserves it. But perception and reality often diverge. In this interview with Fortune, Baidu chief scientist Andrew Ng explains why AI has become so hot, how companies are using it to make money, and why fears of an AI apocalypse are unrealistic.

Baidu chief scientist Andrew Ng leads the company's artificial intelligence research.


Artificial intelligence is seemingly everywhere today, doing everything from powering voice controls and image search on our smartphones to supplying the brains for autonomous cars and robots. And while research on the topic from Google, Facebook, and Microsoft gets most of the attention in the United States, Chinese search giant Baidu is also a major player in the space.

Its efforts are led by Andrew Ng, the company’s chief scientist who previously taught machine learning at Stanford, co-founded Coursera and helped create the Google Brain project. In 2012, Google Brain kicked off mainstream interest in a field of AI known as deep learning by showing how computers can teach themselves to recognize cats in YouTube videos. In this interview, Ng, who works out of Baidu’s artificial intelligence lab in Sunnyvale, Calif., explains why artificial intelligence is so hot right now, how companies are using it to make real money, and why concerns over an impending AI apocalypse are probably overblown.

The following was edited for length and clarity:

Fortune: How would you define artificial intelligence, at least as it applies to commercially viable approaches?

Andrew Ng: What we’re seeing in the last several years is that computers are getting much better at soaking up data to make predictions. These include predicting what ad a user is most likely to click on, recognizing people in pictures, predicting what’s the web page most relevant to your search query — and hundreds of other examples like that. Many of these are making digital experiences much better for users and, in some cases, increasing the bottom line for companies.

Is it fair to say then that mainstream AI is more about recognizing patterns in data than it is about building computers that think like humans?

Despite all the hype, I think they are much further off than some people think. Almost all the economic value created by AI today is with one type of technology, called supervised learning. What that means is learning to predict outcomes or classify things based on other example inputs the system has already seen, such as “Given a picture, find the people in it.” Or, “Given a web page, predict whether the user is going to click on this web page.” Or, “Given an email, determine if this is a piece of spam or not.”

Speech recognition is another example of this, where the input is an audio clip and the output is a transcription of what was said.
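Every supervised-learning task Ng lists has the same shape: labeled (input, output) pairs. As a minimal sketch of his spam example, the pure-Python perceptron below learns per-word weights from a handful of labeled emails; the vocabulary, training messages, and labels are invented for illustration, not taken from any real spam system.

```python
def featurize(text, vocab):
    """Turn an email into a bag-of-words count vector over a fixed vocabulary."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def train_perceptron(examples, vocab, epochs=10):
    """Learn per-word weights from labeled (text, is_spam) pairs."""
    weights = [0.0] * len(vocab)
    bias = 0.0
    for _ in range(epochs):
        for text, label in examples:          # label: 1 = spam, 0 = not spam
            x = featurize(text, vocab)
            score = bias + sum(w * xi for w, xi in zip(weights, x))
            pred = 1 if score > 0 else 0
            if pred != label:                  # mistake-driven update
                step = 1 if label == 1 else -1
                weights = [w + step * xi for w, xi in zip(weights, x)]
                bias += step
    return weights, bias

def predict(text, vocab, weights, bias):
    x = featurize(text, vocab)
    return 1 if bias + sum(w * xi for w, xi in zip(weights, x)) > 0 else 0

vocab = ["free", "winner", "meeting", "report", "prize"]
train = [
    ("free prize winner claim now", 1),
    ("you are a winner free offer", 1),
    ("quarterly report attached before meeting", 0),
    ("meeting moved to friday see report", 0),
]
weights, bias = train_perceptron(train, vocab)
print(predict("free prize inside", vocab, weights, bias))         # 1 (spam)
print(predict("report for monday meeting", vocab, weights, bias)) # 0 (not spam)
```

The system never gets a rule for what spam is; it only gets example X-Y pairs, which is exactly the narrow-but-valuable form of learning Ng describes.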

Speech recognition has been in the news lately because of some new Apple Siri features. What are the next steps to make assistant-type apps more useful?

The vision we’re pursuing is trying to make talking to a computer feel as natural as talking to a person. That’s a distant goal, we won’t get there anytime soon, but as we reach that point, more users will use it. Today, speech is largely used by technology enthusiasts. Most people around the world are not using speech for their interactions with computers.

Talking to a machine still feels very different from talking to a person: you’re allowed to say certain things, you can’t interrupt a computer. Sometimes it takes a little longer to respond. Sometimes you say something and it’s very confused. Here’s one specific example: If I’m talking to a computer and I say, “Please call Carol at 555-1000 … no, wait, 1005,” can a computer interpret that correctly and take the right action?

What happened in the past few years to take us from seemingly little consumer-facing AI to today, where things like speech recognition and algorithms that can understand photos seem commonplace?

A lot of the progress in machine learning—and this is an unpopular opinion in academia—is driven by an increase in both computing power and data. An analogy is to building a space rocket: You need a huge rocket engine, and you need a lot of fuel. If you have a smaller rocket engine and a lot of fuel, your rocket’s probably not going to get off the ground. If you have a huge rocket engine but a tiny amount of fuel, you probably won’t make it to orbit.

It’s only with a huge engine and a lot of fuel that you can go interesting places. The analogy is that the rocket engines are the large computers—in Baidu’s case, supercomputers we can now build—and the rocket fuel is the huge amounts of data we now have.

Over the last decade, the rise of data, or the rise of rocket fuel, got a little ahead of our ability to build rocket engines to absorb that fuel. But now our ability to scale up our rocket engines has been catching up, and in some cases surpassing our ability to supply the rocket fuel. You have to work hard to scale them at the same time.

It seems like every time deep learning is applied to a task, it produces the best-ever results for that task. Could we apply this to, say, corporate sales data and identify meaningful insights faster than with traditional enterprise software or popular “big data” tools?

The big limitation of deep learning is that almost all the value it’s creating is in these input-output mappings. If you have corporate data where X is maybe a user account in Amazon, and Y is “Did they buy something?” and you have a lot of data with X-Y pairs, then you could do it. But in terms of going through data and discovering things by itself, that type of algorithm is very much in its infancy, I would say.
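Ng's Amazon example can be made concrete. The sketch below uses made-up account fields and records to show the one thing supervised learning strictly requires from enterprise data: a way to turn each record into an X-Y pair. Once the pairs exist, even a trivial predictor (here, nearest neighbour) can be applied.

```python
# Hypothetical account records; "bought" is the Y Ng describes
# ("Did they buy something?"), the other fields become the X.
accounts = [
    {"visits": 12, "cart_items": 3, "days_since_signup": 40,  "bought": 1},
    {"visits": 1,  "cart_items": 0, "days_since_signup": 2,   "bought": 0},
    {"visits": 7,  "cart_items": 1, "days_since_signup": 15,  "bought": 1},
    {"visits": 2,  "cart_items": 0, "days_since_signup": 300, "bought": 0},
]

def to_xy(record):
    """Split one account record into (feature vector X, label Y)."""
    x = [record["visits"], record["cart_items"], record["days_since_signup"]]
    return x, record["bought"]

pairs = [to_xy(r) for r in accounts]

def predict_buy(query, pairs):
    """1-nearest-neighbour prediction: copy the label of the closest known X."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = min(pairs, key=lambda p: dist(p[0], query))
    return nearest[1]

print(predict_buy([10, 2, 30], pairs))  # resembles the heavy users, predicts 1
```

The mapping is the whole trick; without labeled Y values, none of this works, which is the limitation Ng is pointing at.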

This is also one of the reasons why the AI evil killer robots and super-intelligence hype is overblown. This X-Y type of mapping is such a narrow form of learning. Humans learn in so many more ways than this. The technical term for this is supervised learning, and I think we just haven’t figured out the right ideas yet for the other types of learning.

Unsupervised learning is one of those other types, where you just look at data and discover stuff about the world. Humans seem to be amazing at that. Computers have incredibly rudimentary algorithms that try to do some of that, but are clearly nowhere near what any human brain can do.
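As a minimal illustration of "just look at data and discover stuff," the sketch below runs plain k-means, one of the rudimentary unsupervised algorithms Ng alludes to, on made-up 2-D points. No labels are given, yet the algorithm still recovers the two blobs.

```python
def kmeans(points, k=2, iters=20):
    """Plain k-means on 2-D points; returns final centroids and assignments."""
    centroids = points[:k]                      # naive init: first k points
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cl in enumerate(clusters):
            if cl:
                centroids[i] = (sum(p[0] for p in cl) / len(cl),
                                sum(p[1] for p in cl) / len(cl))
    labels = []
    for p in points:
        d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
        labels.append(d.index(min(d)))
    return centroids, labels

points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),   # one blob near the origin
          (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]   # another blob near (5, 5)
centroids, labels = kmeans(points)
print(labels)  # first three points share one label, last three the other
```

This is "rudimentary" in exactly Ng's sense: it finds cluster structure, but nothing like the open-ended discovery a human does when exploring the world.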

Ng explained what X-Y pairs are during an interview conducted over Skype.


Translators: Liu Jinlong / Wang Hao

Reviewer: Ren Wenke

Google and Facebook get a lot of attention in the United States, but tell us about some of what Baidu is working on that’s powered by AI.

One of the things that Baidu did well early on was to create an internal platform for deep learning. What that did was enable engineers all across the company, including people who were not AI researchers, to leverage deep learning in all sorts of creative ways—applications that an AI researcher like me never would have thought of. There’s a very long tail of all sorts of creative products—beyond our core web search, image search and advertising businesses—that are powered by deep learning.

A couple of examples: Our computer security products use deep learning to recognize threats. I wouldn’t have thought of doing that, and I wouldn’t have known how to do it. We use deep learning to try to predict a day in advance when a hard disk is going to fail, and this increases the reliability and reduces the cost of our data centers.

Baidu has also created a Google Glass-like technology, a digital assistant and, I believe, even an intelligent bike. Is there a market for these products, or are they just interesting experiments right now?

I consider them research explorations for now. But based on the feedback we’ve gotten from the community, there is definitely demand for things like smart bikes and wearable cameras.

Actually in China, we demoed a new product called Dulife, which uses computer vision and natural language processing to try to tell blind people what is in front of them. It turns out, for example, that there are several different bills in China that are the same size, and blind people have to tell by touch how they’re different. But after a bill has been in circulation for some time, the tactile portions get worn down and it becomes difficult to tell the bill’s denomination. That’s one use case where simple computer vision can tell you if this is a ¥20 bill or a ¥50 bill. That was something that blind individuals actively requested.

Is the Chinese market where Baidu primarily operates different from the United States or elsewhere with regard to these types of mobile or wearable devices?

The Chinese market is very different. One of the things that I believe is that the biggest, hottest tech trend in China right now is O2O, or online-to-offline.

This concept of O2O refers to using your mobile device to connect you to physical services around you, whether it’s a car wash, food delivery, finding a local discounted movie right around the corner, having someone do your manicure, or hiring a chef who’ll show up in your house to cook for you. There are all these things in the United States, as well, but I think China’s high population density has made it efficient for O2O to rise very quickly.

Also, a lot of users’ first computational device in China is a smartphone. When your first computational device is a cell phone, then you start learning the most efficient ways to use a cell phone right away without having to transition from a laptop.

When will we stop talking about AI as a novel thing, and just start taking it for granted like so many other technologies?

I feel like on the Gartner Hype Cycle, we might be just cresting. The extreme fears about AI super-intelligence were maybe that peak, and we’re just getting past it right now. It’s hard to tell, and I might be wrong, but I’d love to get to a future where there’s less hype about AI and we’re more focused on making progress.

Even in the field of computer vision, where there has been remarkable progress in training computers to recognize things such as objects, faces and handwriting, I think the technology has got a bit ahead of our ideas for how to integrate it into products.

So we’re not at the point where we can be unimpressed when our apps or devices recognize us or the stuff around us?

I think it will be a while for computer vision, because there just aren’t that many computer vision products yet. But I’ll share one that monetizes well.

It turns out that on Baidu’s advertising system, if you show a user a piece of text, that works OK. But with our deep learning technology, it can help an advertiser select a small image to show the user along with the text. So rather than having you read a bunch of text about a holiday in Bali, maybe it can show a picture of Bali, and you understand it in a fraction of a second. This helps users figure out much more quickly what an ad is about, and has a marked impact on our ability to connect users and advertisers.
