The Limits of Big Data
Chinese translation by Ren Wenke.
Where Arbesman's unit of analysis is the fact, Silver focuses on "predictive validity." He has the good grace and self-awareness to accept human frailty as a design constraint. "But I'm of the view we can never achieve perfect objectivity, rationality or accuracy in our beliefs," Silver writes.

"Instead we can strive to be less subjective, less irrational and less wrong. [emphasis in original] Making predictions based on our beliefs is the best (and perhaps only) way to test ourselves. If objectivity is the concern for a greater truth beyond our personal circumstances, and prediction is the best way to examine how closely aligned our personal perceptions are with that greater truth, the most objective among us are those who make the most accurate predictions."

I wonder, however, if Silver is fully aware of the cumulative effect his mix of cautionary tales and shocking failures might have on readers who take his reporting to heart and mind. He provides example after example of flawed and biased human beings building flawed and biased models and using them in flawed and biased ways. He provides a superb riff on "overfitting" in statistical models. By trying to get their models to fit the data a little too well, Silver explains, statisticians all too frequently end up making them far less accurate and reliable for prediction.

To the extent that Silver's stories present a fair sampling of today's predictive modelers, this book predicts neither a happy nor brave new world of statistics-driven success. In this world, average performance may prove to be quite a few standard deviations away from world class.

Silver cites Philip Tetlock's classic study of expertise, which shows that "experts" in a disconcerting number of disciplines are disproportionately worse than chance at predicting likely outcomes. Experts also tend to be disproportionately overconfident about the quality of their predictions. In short, expertise frequently yields the worst of both worlds: wrong answers stewed in arrogance. This is not a recipe for success.

Between IBM's Jeopardy-winning Watson, Google's search algorithms and Amazon's recommendation engines, there's no doubt that data-driven computational systems can enjoy remarkable success, particularly when they focus on real-life testing rather than abstract theory. "Companies that really 'get' Big Data, like Google, aren't spending a lot of time in model land," Silver writes. "They're running hundreds of thousands of experiments every year and testing their ideas on real customers."

The ironic takeaway from both these fine books, however, is that the more data and facts one has, and the more predictions matter, the more important human judgment becomes. The co-evolution of human beings, datasets, and algorithms will ultimately determine whether Big Data creates new wealth or destroys old value.

A research fellow at MIT Sloan School's Center for Digital Business, Michael Schrage is a former Fortune columnist and the author of Who Do You Want Your Customers To Become?