
Is an “AI winter” coming? Past downturns offer lessons

Fortune China (财富中文网) | 2025-09-07 20:05:25

Image credit: Photo illustration by Getty Images

As summer fades into fall, many in the tech world are worried about winter. Late last month, a Bloomberg columnist asked “is the AI winter finally upon us?” British newspaper The Telegraph was more definitive. “The next AI winter is coming,” it declared. Meanwhile, social media platform X was filled with chatter about a possible AI winter.

An “AI winter” is what folks in artificial intelligence call a period in which enthusiasm for the idea of machines that can learn and think like people wanes—and investment for AI products, companies, and research dries up. There’s a reason this phrase comes so naturally to the lips of AI pundits: We’ve already lived through several AI winters over the 70-year history of artificial intelligence as a research field. If we’re about to enter another one, as some suspect, it’ll be at least the fourth.

The most recent talk of a looming winter has been triggered by growing concerns among investors that AI technology may not live up to the hype surrounding it—and that the valuations of many AI-related companies are far too high. In a worst-case scenario, this AI winter could be accompanied by the popping of an AI-inflated stock market bubble, with reverberations across the entire economy. While there have been AI hype cycles before, they’ve never involved anything close to the multiple hundreds of billions of dollars that investors have sunk into the generative AI boom. And so if there is another AI winter, it could involve polar vortex levels of pain.

The markets have been spooked recently by comments from OpenAI CEO Sam Altman, who told reporters he thought some venture-backed AI startups were grossly overvalued (although not OpenAI, of course, which is one of the most highly-valued venture-backed startups of all time). Hot on the heels of Altman’s remarks came a study from MIT that concluded that 95% of AI pilot projects fail.

A look at past AI winters, and what caused them, may give us some indication of whether that chill in the air is just a passing breeze or the first hints of an impending Ice Age. Sometimes those AI winters have been brought on by academic research highlighting the limitations of particular AI techniques. Sometimes they have been caused by frustrations getting AI tech to work well in real world applications. Sometimes both factors have been at play. But what previous AI winters all had in common was disillusionment among those footing the bill after promising new advances failed to deliver on the ensuing hype.

The first AI hype cycle

The U.S. and allied governments lavishly funded artificial intelligence research throughout the early days of the Cold War. Then, as now, Washington saw the technology as potentially conferring a strategic and military advantage, and much of the funding for AI research came from the Pentagon.

During this period, there were two competing approaches to AI. One was based on hard-coding logical rules for categorizing inputs into symbols and then for manipulating those symbols to arrive at outputs. This was the method that yielded the first great leaps forward in computers that could play checkers and chess, and also led to the world’s first chatbots.

The rival AI method was based on something called a perceptron, which was the forerunner of today’s neural networks, a kind of AI loosely built on a caricature of how the brain works. Rather than starting with rules and logic, a perceptron learned a rule for accomplishing some task from data. The U.S. Office of Naval Research funded much of the early work on perceptrons, which were pioneered by Cornell University neuroscientist and psychologist Frank Rosenblatt. Both the Navy and the CIA tested perceptrons to see if they could classify things like the silhouettes of enemy ships or potential targets in aerial reconnaissance photos.
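For readers curious what “learning a rule from data” meant in practice, here is a minimal sketch of a Rosenblatt-style perceptron update, written in Python. The two-feature toy dataset and the training loop are invented for illustration; they are not Rosenblatt’s original code or data.

```python
# A minimal single-layer perceptron, reconstructed for illustration.
# The tiny two-feature dataset below is an invented stand-in for the
# kind of binary task (e.g., "ship silhouette" vs. "not a ship") the
# Navy and CIA tested perceptrons on.
import numpy as np

X = np.array([[0.9, 0.8], [0.8, 0.6], [0.1, 0.2], [0.2, 0.1]])  # inputs
y = np.array([1, 1, -1, -1])                                     # labels

w = np.zeros(2)  # one weight per input "neuron"
b = 0.0          # bias term

for epoch in range(10):
    for xi, target in zip(X, y):
        prediction = 1 if (w @ xi + b) > 0 else -1
        if prediction != target:
            # Rosenblatt-style update: nudge the weights toward the
            # misclassified example. No hand-coded rules; the decision
            # boundary is learned from the data.
            w += target * xi
            b += target

print("weights:", w, "bias:", b)
print("predictions:", [1 if (w @ xi + b) > 0 else -1 for xi in X])
```

On this linearly separable toy data the loop settles after a few passes; the point is simply that the decision rule comes from the examples rather than from a programmer.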

Both competing camps made hyperbolic claims that their technology would soon deliver computers that equaled or exceeded human intelligence. Rosenblatt told The New York Times in 1958 that his perceptrons would soon be able to recognize individuals and call out their names, that it was “only one more step of development” before they could instantly translate languages, and that eventually the AI systems would self-replicate and become conscious. Meanwhile, Marvin Minsky, cofounder of MIT’s AI Lab and a leading figure in the symbolic AI camp, told Life magazine in 1970 that “in three to eight years we will have a machine with the general intelligence of an average human being.”

That’s the first prerequisite for an AI winter: hype. And there are clear parallels today in statements made by a number of prominent AI figures. Back in January, OpenAI CEO Sam Altman wrote on his personal blog that “we are now confident we know how to build [human-level artificial general intelligence] as we have traditionally understood it” and that OpenAI was turning increasingly towards building super-human “superintelligence.” He wrote that this year “we may see the first AI agents ‘join the workforce’ and materially change the output of companies.” Dario Amodei, the cofounder and CEO of Anthropic, has said that human-level AI could arrive in 2026. Meanwhile, Demis Hassabis, the cofounder and CEO of Google DeepMind, has said that AI matching humans across all cognitive domains would arrive in the next “five to 10 years.”

Government loses faith

But what precipitates an AI winter is some definitive evidence this hype cannot be met. For the first AI winter, that evidence came in a succession of blows. In 1966, a committee commissioned by the National Research Council issued a damning report on the state of natural language processing and machine translation. It concluded that computer-based translation was more expensive, slower and less accurate than human translation. The research council, which had provided $20 million towards this early kind of language AI (at least $200 million in today’s dollars), cut off all funding.

Then, in 1969, Minsky was responsible for a second punch. That year, he and Seymour Papert, a fellow AI researcher, published a book-length takedown of perceptrons. In the book, Minsky and Papert proved mathematically that a single layer perceptron, like the kind Rosenblatt had shown off to great fanfare in 1958, could only ever make accurate binary classifications—in other words, it could identify if something were black or white, or a circle or a square. But it could not categorize things into more than two buckets.

It turned out there was a big problem with Minsky’s and Papert’s critique. While most interpreted the book as definitive proof that neural network-based AI would never come close to human-level intelligence, their proofs applied only to a simple perceptron that had just a single layer: an input layer consisting of several neurons that took in data, all linked to a single output neuron. They had ignored, likely deliberately, that some researchers in the 1960s had already begun experimenting with multilayer perceptrons, which had a middle “hidden” layer of neurons that sat between the input neurons and output neuron. True forerunners of today’s “deep learning,” these multilayer perceptrons could, in fact, classify data into more than two categories. But at the time, training such a multilayer neural network was fiendishly difficult. And it didn’t matter. The damage was done. After the publication of Minsky’s and Papert’s book, U.S. government funding for neural network-based approaches to AI largely ended.

Minsky’s and Papert’s attack didn’t just persuade Pentagon funding bodies. It also convinced many computer scientists that neural networks were a dead end. Some neural network researchers came to blame Minsky for setting back the field by decades. In 2006, Terry Sejnowski, a researcher who helped revive interest in neural networks, stood up at a conference and confronted Minsky, asking him if he were the devil. Minsky ignored the question and began detailing what he saw as the failings of neural networks. Sejnowski persisted in asking Minsky again if he were the devil. Eventually an angry Minsky shouted back: “Yes, I am!”

But Minsky’s symbolic AI soon faced a funding drought too. Also in 1969, Congress forced the Defense Advanced Research Projects Agency (DARPA), which had been a major funder of both AI approaches, to change its approach to issuing grants. The agency was told to fund research that had clear, applied military applications, instead of more blue-sky research. And while some symbolic AI research fit this rubric, a lot of it did not.

The final punch came in 1973, when the U.K. parliament commissioned Cambridge University mathematician James Lighthill to investigate the state of AI research in Britain. His conclusion was that AI had failed to show any promise of fulfilling its grand claims of equaling human intelligence and that many of its favored algorithms, while they might work for toy problems, could never deal with the real world’s complexity. Based on Lighthill’s conclusions, the U.K. government curtailed all funding for AI research.

Lighthill had only looked at U.K. AI efforts, but DARPA and other U.S. funders of AI research took note of its conclusions, which reinforced their own growing skepticism of AI. By 1974, U.S. funding for AI projects was a fraction of what it had been in the 1960s. Winter had set in—and it would last until the early 1980s.

Today, too, there are parallels with this first AI winter when it comes to studies suggesting AI isn’t meeting expectations. Two recent research papers from researchers at Apple and Arizona State University have cast doubt on whether cutting-edge AI models, which are supposed to use a “chain of thought” to reason about how to answer a prompt, are actually engaging in reasoning at all. Both papers conclude that rather than learning to apply generalizable logical rules and problem-solving techniques to new problems—which is what humans would consider reasoning—the models simply try to match a problem to one seen in their training data. These studies could turn out to be the equivalent of Minsky’s and Papert’s attack on perceptrons.

Meanwhile, there are also a growing number of studies on the real-world impact of today’s AI models that parallel the Lighthill and NRC reports. For instance, there’s that MIT study which concluded 95% of AI pilots are failing to boost corporate revenues. There’s a recent study from researchers at Salesforce that concluded most of today’s large language models (LLMs) cannot accurately perform customer relationship management (CRM) tasks—a particularly ironic conclusion since Salesforce itself has been pushing AI agents to automate CRM processes. Anthropic research showed that its Claude model could not successfully run a vending machine business—a relatively simple business compared to many of those that tech boosters say are poised to be “utterly transformed” by AI agents. There’s also a study from the AI research group METR that showed software developers using an AI coding assistant were actually 19% slower at completing tasks than they were without it.

But there are some key differences. Most significantly, today’s AI boom is not dependent on public funding. Although government entities, including the U.S. military, are becoming important customers for AI companies, the money fueling the current boom is almost entirely private. Venture capitalists have invested at least $250 billion into AI startups since ChatGPT debuted in November 2022. And that doesn’t include the vast amount being spent by large, publicly-traded tech companies like Microsoft, Alphabet, Amazon, and Meta on their own AI efforts. An estimated $350 billion is being spent to build out AI data centers this year alone, with even more expected next year.

What’s more, unlike in that first AI winter, when AI systems were mostly just research experiments, today AI is being widely deployed across businesses. AI has also become a massive consumer technology—ChatGPT alone is thought to have 700 million weekly users—which was never the case previously. While today’s AI still seems to lack some key aspects of human intelligence, it is a lot better than systems that existed previously and it is hard to argue that people are not finding the technology useful for a good number of tasks.

Winter No. 2: Business loses patience

That first AI winter thawed in the early 1980s thanks largely to increases in computing power and some improved algorithmic techniques. This time, much of the hype in AI was around “expert systems”. These were computer programs that were designed to encode the knowledge of human experts in a particular domain into a set of logical rules which the software would then apply to accomplish some specific task.

Nevertheless, business was enthusiastic, believing expert systems would lead to a productivity boom. At the height of this AI hype cycle, nearly two-thirds of the Fortune 500 said they had deployed expert systems. By 1985, U.S. corporations were collectively spending more than $1 billion on expert systems, and an entire industry, much of it backed by venture capital, sprouted up around the technology. Much of that industry was focused on building specialized computers, called LISP machines, that were optimized to run expert systems, many of which were coded in the programming language LISP. What’s more, starting in 1983, DARPA returned to funding AI research through the new Strategic Computing Initiative, eventually offering over $100 million to more than 90 different AI projects at universities throughout the U.S.

Although expert systems drew on many of the methods symbolic AI researchers pioneered, many academic computer scientists were wary that inflated expectations would once again precipitate a boom-and-bust cycle that would hurt the field. Among them were Minsky and fellow AI researcher Roger Schank, who coined the term “AI winter” during an AI conference in 1984. The pair chose the neologism to echo the term “nuclear winter”—the devastating and bleak period without sunlight that would likely follow a major nuclear war.

Three things then happened to bring about the next winter. In 1987, a new kind of computer workstation debuted from Sun Microsystems. These workstations, as well as increasingly powerful desktop computers from IBM and Apple, obviated the need for specialized LISP machines. Within a year, the market for LISP machines evaporated. Many venture capitalists lost their shirts—and became wary of ever backing AI-related startups again. That same year, New York University computer scientist Jack Schwartz became head of DARPA’s computing research. He was no fan of AI in general or expert systems in particular, and slashed funding for both.

Meanwhile, businesses gradually discovered that expert systems were difficult and expensive to build and maintain. They were also “brittle”—while they could handle highly routinized tasks well, when they encountered slightly unusual cases, they struggled to apply the logical rules they had been given. In such cases, they often produced bizarre and inaccurate outputs, or simply broke down completely. Delineating rules that would apply to every edge case proved an impossible task. As a result, by the early 1990s, companies were starting to abandon expert systems. Unlike in the first AI boom, where scientists and government funders came to question the technology, this second winter was driven much more by business frustration.
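To see what those hand-coded rules, and their brittleness, looked like in practice, here is a toy sketch in Python rather than the LISP of the era. The troubleshooting domain and every rule below are invented purely for illustration; production systems of the period encoded thousands of such rules.

```python
# A toy "expert system": domain knowledge hand-coded as if-then rules.
# The troubleshooting domain and every rule here are invented for
# illustration only.

def diagnose(facts: set) -> str:
    # Each rule pairs a set of required observations with a conclusion,
    # the kind of hand-coded logic described above.
    rules = [
        ({"engine_cranks", "no_spark"}, "check the ignition system"),
        ({"engine_cranks", "no_fuel"}, "check the fuel pump"),
        ({"no_crank", "lights_dim"}, "check the battery"),
    ]
    for conditions, conclusion in rules:
        if conditions <= facts:  # all required facts are present
            return conclusion
    # Anything the rule authors did not anticipate falls through,
    # which is the "brittleness" problem in miniature.
    return "no rule applies"

print(diagnose({"engine_cranks", "no_spark"}))  # -> check the ignition system
print(diagnose({"strange_noise"}))              # -> no rule applies
```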

Again, there are some clear echoes in what’s happening with AI today. For instance, hundreds of billions of dollars are being invested in AI data centers being constructed by Microsoft, Alphabet, Amazon’s AWS, Elon Musk’s X.ai, and Meta. OpenAI is working on its $500 billion Project Stargate data center plan with SoftBank, Oracle, and other investors. Nvidia has become the world’s most valuable company with a $4.3 trillion market cap largely by catering to this demand for AI chips for data centers. One of the big suppositions behind the data center boom is that the most cutting-edge AI models will be at least as large as, if not larger than, the leading models that exist today. Training and running models of this size requires extremely large data centers.

But, at the same time, a number of startups have found clever ways to create much smaller models that mimic many of the capabilities of the giant models. These smaller models require far fewer computing resources—and in some cases don’t even require the kinds of specialized AI chips that Nvidia makes. Some might be small enough to run on a smartphone. If this trend continues, it is possible that those massive data centers won’t be required—just as it turned out LISP machines weren’t necessary. That could mean that hundreds of billions of dollars in AI infrastructure investment winds up stranded.

Today’s AI systems are in many ways more capable—and flexible—than the expert systems of the 1980s. But businesses are still finding them complicated and expensive to deploy, and their return on investment too often elusive. While more general-purpose and less brittle than the expert systems were, today’s AI models remain unreliable, especially when it comes to addressing unusual cases that might not have been well represented in their training data. They are prone to hallucinations, confidently spewing inaccurate information, and they sometimes make mistakes no human ever would. This means companies and governments cannot use AI to automate mission-critical processes. Whether this means companies will lose patience with generative AI and large language models, just as they did with expert systems, remains to be seen. But it could happen.

Winter No. 3: The rise and fall (and rise) of neural networks

The 1980s also saw renewed interest in the other AI method, neural networks, due in part to the work of David Rumelhart, Geoffrey Hinton and Ronald Williams, who in 1986 figured out a way to overcome a key challenge that had bedeviled multilayered perceptrons since the 1960s. Their innovation was something called backpropagation, or backprop for short, which was a method for correcting the outputs of the middle, hidden layer of neurons during each training pass so that the network as a whole could learn efficiently.
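For the technically curious, the sketch below illustrates the idea in modern terms: a tiny two-layer network written with NumPy and trained by backpropagation on the XOR function as a toy task. The layer sizes, learning rate, and task are chosen here only for demonstration; this is not Rumelhart, Hinton, and Williams’s original formulation.

```python
# A tiny two-layer neural network trained with backpropagation, written
# in NumPy as a modern illustration. The XOR toy task, layer sizes, and
# learning rate are chosen only for demonstration.
import numpy as np

rng = np.random.default_rng(42)

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # inputs
y = np.array([[0.], [1.], [1.], [0.]])                   # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A "hidden" layer of 8 neurons sits between the input and output layers.
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)

lr = 1.0
for step in range(10_000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)    # hidden-layer activations
    out = sigmoid(h @ W2 + b2)  # network output

    # Backward pass: the output error is propagated back to correct the
    # hidden layer's weights, the step that backprop made practical.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    W2 -= lr * (h.T @ d_out)
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h)
    b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 3))  # should approach [0, 1, 1, 0]
```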

Backprop, along with more powerful computers, helped spur a renaissance in neural networks. Soon researchers were building multilayered neural networks that could decipher handwritten letters on envelopes and checks, learn the relationships between people in a family tree, recognize typed characters and read them aloud through a voice synthesizer, and even steer an early self-driving car, keeping it between the lanes of a highway.

This led to a short-lived boom in neural networks in the late 1980s. But neural networks had some big drawbacks too. Training them required a lot of data, and for many tasks, the amount of data required just didn’t exist. They also were extremely slow to train and sometimes slow to run on the computer hardware that existed at the time.

This meant that there were many things neural networks could still not do. Businesses did not rush to adopt neural networks as they had expert systems because their uses seemed highly circumscribed. Meanwhile, there were other statistical machine learning techniques that used less data and required less computing power that seemed to be making rapid progress. Once again, many AI researchers and engineers wrote off neural networks. Another decade-long AI winter set in.

Two things thawed this third winter: the internet created vast amounts of digital data and made accessing it relatively easy. This helped break the data bottleneck that had held neural networks back in the 1980s. Then, starting in 2004, researchers at the University of Maryland and then Microsoft began experimenting with using a new kind of computer chip that had been invented for video games, called a graphics processing unit, to train and run neural networks. GPUs could perform many of the same operations in parallel, which is what neural networks required. Soon, Geoffrey Hinton and his graduate students began demonstrating that neural networks, trained on large datasets and run on GPUs, could do things—like classify images into a thousand different categories—that would have been impossible in the late 1980s. The modern “deep learning” revolution was taking off.

That boom has largely continued through today. At first, neural networks were largely trained to do one particular task well—to play Go, or to recognize faces. But the AI summer deepened in 2017, when researchers at Google designed a particular kind of neural network called a Transformer that was good at figuring out language sequences. It was given another boost in 2019 when OpenAI figured out that Transformers trained on large amounts of text could not only write text well, but master many other language tasks, from translation to summarization. Three years later, an updated version of OpenAI’s transformer-based neural network, GPT-3.5, would be used to power the viral chatbot ChatGPT.

Now, three years after ChatGPT’s debut, the hype around AI has never been greater. There are certainly a few autumnal signs, a falling leaf carried on the breeze here and there, if past AI winters are any guide. But only time will tell if it is the prelude to another Arctic bomb that will freeze AI investment for a generation, or merely a momentary cold-snap before the sun appears again.
