08.15 | Economist Reading | Science & Technology Column: Artificial intelligence: Bit-lit
The Economist is a British weekly newspaper, published every Friday in eight editions and distributed worldwide. Its editorial offices are in London, and it was founded in September 1843.
It is a general news and commentary publication, with major sections covering business, countries and regions, economics and finance, and science and technology. Its writing is compact and rigorous, precise in its use of language, and displays a restrained wit that often teases its subjects with puns.
The Economist's value for English exams is self-evident: its articles regularly appear in the reading-comprehension sections of IELTS, TOEFL, SAT, GRE, GMAT, China's postgraduate English entrance exam, CET-4/6, MTI and CATTI.
Today 羚羊君 (public account: aa-acad) shares the first article from the Science and technology section of the August 8th 2020 issue of The Economist: Artificial intelligence: Bit-lit.
The article introduces GPT-3, software developed by the AI laboratory OpenAI. By statistically digesting enormous amounts of text, the software learns a "language model" and uses it to write human-like prose directly. But GPT-3 still has weaknesses: it sometimes regurgitates memorised text rather than generating fresh prose, and it mindlessly reproduces prejudices picked up from its training data.
01
Artificial intelligence: Bit-lit
人工智慧
The SEC said, "Musk,/your tweets are a blight./They really could cost you your job,/if you don’t stop/all this tweeting at night."/…Then Musk cried, "Why?/The tweets I wrote are not mean,/I don’t use all-caps/and I’m sure that my tweets are clean."/"But your tweets can move markets/and that’s why we’re sore./You may be a genius/and a billionaire,/but that doesn’t give you the right to be a bore!"
THE PRECEDING lines—describing Tesla and SpaceX founder Elon Musk’s run-ins with the Securities and Exchange Commission, an American financial regulator—are not the product of some aspiring 21st-century Dr Seuss. They come from a poem written by a computer running a piece of software called Generative Pre-Trained Transformer 3. GPT-3, as it is more commonly known, was developed by OpenAI, an artificial-intelligence (AI) laboratory based in San Francisco, and which Mr Musk helped found. It represents the latest advance in one of the most studied areas of AI: giving computers the ability to generate sophisticated, human-like text.
02
The software is built on the idea of a "language model". This aims to represent a language statistically, mapping the probability with which words follow other words—for instance, how often "red" is followed by "rose". The same sort of analysis can be performed on sentences, or even entire paragraphs. Such a model can then be given a prompt—"a poem about red roses in the style of Sylvia Plath", say—and it will dig through its set of statistical relationships to come up with some text that matches the description.
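The "language model" idea described above can be sketched in a few lines: count how often each word follows another in a corpus, then sample continuations from those counts. The toy bigram model below is an illustrative simplification of the statistical principle, not how GPT-3 itself works (GPT-3 is a neural network), and the corpus is invented.

```python
import random
from collections import Counter, defaultdict

# Tiny invented corpus; any larger body of text would work the same way.
corpus = "the red rose is red and the red rose is sweet".split()

# Map each word to a Counter of the words observed to follow it.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def prob(prev, nxt):
    """P(nxt | prev) estimated from raw bigram counts.
    In this corpus, P("rose" | "red") is 2/3."""
    total = sum(following[prev].values())
    return following[prev][nxt] / total if total else 0.0

def generate(start, length=5, seed=0):
    """Sample a short continuation of `start` from the bigram statistics."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length):
        options = following[words[-1]]
        if not options:  # dead end: the last word never had a successor
            break
        choices, weights = zip(*options.items())
        words.append(rng.choices(choices, weights=weights)[0])
    return " ".join(words)
```

Given a prompt word, `generate` walks the probability table one step at a time; scaling the same idea up to sentences and paragraphs, over billions of pages of text, is what the article's "big job" refers to.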
03
Actually building such a language model, though, is a big job. This is where AI—or machine learning, a particular subfield of AI—comes in. By trawling through enormous volumes of written text, and learning by trial and error from millions of attempts at text prediction, a computer can crunch through the laborious task of mapping out those statistical relationships.
The more text to which an algorithm can be exposed, and the more complex you can make the algorithm, the better it performs. And what sets GPT-3 apart is its unprecedented scale. The model that underpins GPT-3 boasts 175bn parameters, each of which can be individually tweaked—an order of magnitude larger than any of its predecessors. It was trained on the biggest set of text ever amassed, a mixture of books, Wikipedia and Common Crawl, a set of billions of pages of text scraped from every corner of the internet.
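To put the scale comparison in numbers: a quick back-of-the-envelope calculation, using the 175bn figure from the article alongside two widely reported sizes of earlier large language models (GPT-2 at 1.5bn parameters and Microsoft's Turing-NLG at 17bn), which are assumptions here for comparison.

```python
import math

# Parameter counts: GPT-3's figure is from the article; the other two
# are widely reported sizes of earlier large language models.
params = {
    "GPT-2 (2019)": 1_500_000_000,
    "Turing-NLG (2020)": 17_000_000_000,
    "GPT-3 (2020)": 175_000_000_000,
}

gpt3 = params["GPT-3 (2020)"]
for name, n in params.items():
    print(f"{name}: {n / 1e9:.1f}bn parameters, GPT-3/{name} = {gpt3 / n:.1f}x")

# "An order of magnitude larger than any of its predecessors":
# 175bn / 17bn is about 10x, i.e. roughly one order of magnitude.
print(round(math.log10(gpt3 / params["Turing-NLG (2020)"]), 2))
```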
04
The results can be impressive. In mid-July OpenAI gave an early version of the software to selected individuals, to allow them to explore what it could do. Arram Sabeti, an artist, demonstrated GPT-3’s ability to write short stories, including a hard-boiled detective story starring Harry Potter ("Harry Potter, in ratty tweed suit, unpressed shirt and unshined shoes, sits behind the desk looking haggard, rumpled and embittered…"), comedy sketches, and even poetry (including the poem with which this article opens, titled "Elon Musk by Dr Seuss"). Elliot Turner, an AI researcher and entrepreneur, demonstrated how the model could be used to translate rude messages into politer ones, something that might be useful in many of the more bad-tempered corners of the internet. Human readers struggled to distinguish between news articles written by the machine and those written by people (see chart).
05
Given that OpenAI wants eventually to sell GPT-3, these results are promising. But the program is not perfect. Sometimes it seems to regurgitate snippets of memorised text rather than generating fresh text from scratch. More fundamentally, statistical word-matching is not a substitute for a coherent understanding of the world. GPT-3 often generates grammatically correct text that is nonetheless unmoored from reality, claiming, for instance, that "it takes two rainbows to jump from Hawaii to 17". "It doesn’t have any internal model of the world—or any world—and so it can’t do reasoning that requires such a model," says Melanie Mitchell, a computer scientist at the Santa Fe Institute.
06
Getting the model to answer questions is a good way to dispel the smoke and mirrors and lay bare its lack of understanding. Michael Nielsen, a researcher with a background in both AI and quantum computing, posted a conversation with GPT-3 in which the program confidently asserted the answer to an important open question to do with the potential power of quantum computers. When Dr Nielsen pressed it to explain its apparent breakthrough, things got worse. With no real understanding of what it was being asked to do, GPT-3 retreated into generic evasiveness, repeating four times the stock phrase "I’m sorry, but I don’t have time to explain the underlying reason why not."
07
There are also things that GPT-3 has learned from the internet that OpenAI must wish it had not. Prompts such as "black", "Jew", "woman" and "gay" often generate racism, anti-Semitism, misogyny and homophobia. That, too, is down to GPT-3’s statistical approach, and its fundamental lack of understanding. Having been trained partly on text scraped from the internet, it has noted that words like "woman" are often associated with misogynistic writing, and will mindlessly reproduce that correlation when asked.
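The mechanism described above, in which a purely statistical model absorbs whatever associations its training text contains, can be illustrated with simple co-occurrence counts. The corpus, word lists and group names below are invented for demonstration only.

```python
# Invented toy corpus in which one (fictional) group happens to
# co-occur disproportionately with negative language.
docs = [
    "group_a people are terrible and dishonest",
    "group_a members are awful",
    "group_b people are friendly",
    "group_a citizens are dishonest",
    "group_b neighbours are kind and helpful",
]
NEGATIVE = {"terrible", "awful", "dishonest"}

def negative_rate(target):
    """Fraction of documents mentioning `target` that also contain a
    negative word -- the kind of raw association a statistical model
    will happily learn and then reproduce when prompted."""
    hits = [d for d in docs if target in d.split()]
    neg = [d for d in hits if NEGATIVE & set(d.split())]
    return len(neg) / len(hits)
```

On this skewed corpus, `negative_rate("group_a")` comes out at 1.0 and `negative_rate("group_b")` at 0.0: the model has no notion that the correlation is unfair, only that it is frequent.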
08
This problem is a hot topic in AI research. Facial-recognition systems, for instance, notoriously do better with white faces than black ones, since white faces are more common in their training sets. AI researchers are trying to tackle the problem. Last year IBM released a set of training images that contained a more diverse mix of faces. OpenAI itself was founded to examine ways to mitigate the risk posed by AI systems, which makes GPT-3’s lapses all the more noteworthy. GPT-2, its predecessor, was released in 2019 with a filter that tried to disguise the problem of regurgitated bigotry by limiting the model’s ability to talk about sensitive subjects.
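A filter that limits a model's ability to talk about sensitive subjects can be as crude as a keyword blocklist gating the generator. The sketch below is a hypothetical illustration of that approach, not OpenAI's actual filter; the blocklist, refusal message and stand-in model are all invented.

```python
# Hypothetical blocklist; a real deployment would use something far more
# sophisticated (trained classifiers, human review), but the principle
# of gating generation on the prompt is the same.
BLOCKLIST = {"violence", "weapons"}
REFUSAL = "[content filtered]"

def guarded_generate(prompt, model):
    """Refuse to generate at all when the prompt touches a blocked topic;
    otherwise pass the prompt through to the underlying model."""
    if BLOCKLIST & set(prompt.lower().split()):
        return REFUSAL
    return model(prompt)

def echo(prompt):
    """Stand-in for a text generator, used only to demonstrate the gate."""
    return f"generated text about {prompt}"
```

As the article notes, this kind of gate disguises the problem rather than fixing it: the underlying statistics are unchanged, and anything not on the list slips through.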
09
Here, at least, little progress seems to have been made. GPT-3 was released without a filter, though it seemed just as ready to reproduce unpleasant prejudices as its predecessor (OpenAI added a filter to the newer model after that fact became obvious). It is unclear exactly how much quality control OpenAI applied to GPT-3’s training data, but the huge quantity of text involved would have made any attempt daunting.
It will only get harder in future. Language has overtaken vision as the branch of AI with the biggest appetite for data and computing power, and the returns to scale show no signs of slowing. GPT-3 may well be dethroned by an even more monstrously complex and data-hungry model before long. As the real Dr Seuss once said: "The more that you read, the more things you will know." That lesson, it seems, applies to machines as well as toddlers.
Typical table of contents of The Economist:
The world this week: a quick rundown of the week's events
Leaders: editorials commenting on the week's big stories
Briefing: an in-depth discussion of one topical subject
Letters: readers' responses to earlier articles
Sections: reports on the week's events across the continents and in China, the United States and Britain
Business: business news
Finance and economics: financial and economic news
Science and technology: science and technology news
Books and arts: book reviews and discussion of cultural trends
Economic and financial indicators: business and financial indices

Regular columns:
Buttonwood: finance
Schumpeter: business
Bartleby: work and management
Bagehot: Britain
Charlemagne: Europe
Lexington: the United States
Banyan: Asia