
GPT-5's "thinking on demand" sparked a backlash, but it may be the future of AI

Fortune China | 2025-08-19 04:39:51

Image source: Chris Jung/NurPhoto via Getty Images

OpenAI's GPT-5 announcement last week was meant to be a triumph, proof that the company was still the undisputed leader in AI. It didn't work out that way. Over the weekend, a groundswell of customer pushback turned the rollout into more than a PR firestorm: it became a product and trust crisis. Users lamented the loss of their favorite models, which had doubled as therapists, friends, and romantic partners, while developers complained of degraded performance. Industry critic Gary Marcus predictably called GPT-5 "overdue, overhyped, and underwhelming."

Many argued that the culprit was hiding in plain sight: a new real-time model "router" that automatically decides which of GPT-5's several variants to spin up for each job. Many users assumed GPT-5 was a single model trained from scratch; in reality it is a network of models stitched together, some weaker and cheaper, others stronger and more expensive. Experts say that approach could be the future of AI as large language models advance and grow more resource-intensive. But in GPT-5's debut, OpenAI demonstrated some of the approach's inherent challenges, and learned some hard lessons about how user expectations are evolving in the AI era.
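OpenAI has not disclosed how its router works internally, but the general pattern is straightforward to sketch. Below is a minimal, hypothetical illustration in Python; the variant names, prices, and keyword heuristic are all invented for the example and are not OpenAI's implementation. The idea is that a cheap difficulty estimate decides whether a request goes to a fast, inexpensive variant or a slow, expensive reasoning variant.

```python
# Hypothetical sketch of a real-time model router, not OpenAI's actual design.
# A crude heuristic (in practice, a small learned classifier) scores each
# request, and the router dispatches it to the cheapest variant that can
# plausibly handle it.

from dataclasses import dataclass

@dataclass
class Variant:
    name: str
    cost_per_1k_tokens: float  # illustrative numbers only

VARIANTS = [
    Variant("fast-mini", 0.05),       # weaker, cheaper
    Variant("standard", 0.25),
    Variant("deep-reasoning", 2.00),  # stronger, slower, more expensive
]

def reasoning_score(prompt: str) -> float:
    """Crude stand-in for a learned difficulty classifier."""
    hard_markers = ("prove", "step by step", "debug", "optimize", "why")
    return sum(marker in prompt.lower() for marker in hard_markers) / len(hard_markers)

def route(prompt: str) -> Variant:
    score = reasoning_score(prompt)
    if score >= 0.4:
        return VARIANTS[2]   # escalate to the reasoning model
    if len(prompt) > 500 or score >= 0.2:
        return VARIANTS[1]
    return VARIANTS[0]       # default to the cheap path

print(route("What year did the Berlin Wall fall?").name)             # fast-mini
print(route("Prove step by step that sqrt(2) is irrational").name)   # deep-reasoning
```

The economics follow directly: if most traffic takes the cheap path, the provider pays for the expensive model only when the request seems to need it.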

For all the benefits promised by model routing, many GPT-5 users bristled at what they perceived as a loss of control. Some even suggested OpenAI might be deliberately pulling the wool over their eyes.

In response to the uproar, OpenAI moved quickly to bring back the earlier flagship model, GPT-4o, for pro users. It also said it had fixed buggy routing and raised usage limits, and promised continual updates to rebuild user trust and stability.

Anand Chowdhary, cofounder of the AI sales platform FirstQuadrant, summed up the situation bluntly: "When routing hits, it feels like magic. When it whiffs, it feels broken."

The promise and inconsistency of model routing

Jiaxuan You, an assistant professor of computer science at the University of Illinois Urbana-Champaign, told Fortune that his lab has studied both the promise and the inconsistency of model routing. In GPT-5's case, he said, he believes (though he cannot confirm) that the router sometimes sends parts of the same query to different models: a cheaper, faster model might give one answer while a slower, reasoning-focused model gives another, and when the system stitches those responses together, subtle contradictions slip through.

The idea of model routing is intuitive, he explained, but "making it really work is very nontrivial." Perfecting a router, he added, can be as hard as building an Amazon-grade recommendation system, which takes years and many domain experts to refine. "GPT-5 is supposed to be built with maybe orders of magnitude more resources," he said, adding that even when the router picks a smaller model, it should not produce inconsistent answers.

Still, You believes routing is here to stay. "The community also believes model routing is promising," he said, pointing to both technical and economic reasons. Technically, single-model performance appears to be hitting a plateau. He cited the widely accepted scaling laws, which hold that models improve as data and compute grow. "But we all know that the model wouldn't get infinitely better," he said. "Over the past year, we have all witnessed that the capacity of a single model is actually saturating."
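The scaling laws You cites can be made concrete. The worked example below uses the loss fit published in the Chinchilla paper (Hoffmann et al., 2022) purely as an illustration; OpenAI has not published comparable constants for GPT-5. Predicted loss falls as parameter count N and training tokens D grow, but each tenfold scale-up buys less than the last, and loss never drops below the irreducible floor E, which is one way to read "saturating."

```python
# Chinchilla-style scaling law (Hoffmann et al., 2022):
#   L(N, D) = E + A / N**alpha + B / D**beta
# E is an irreducible loss floor; returns diminish as N and D grow.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28  # published Chinchilla fit

def predicted_loss(n_params: float, n_tokens: float) -> float:
    return E + A / n_params**alpha + B / n_tokens**beta

for n, d in [(70e9, 1.4e12), (700e9, 14e12), (7e12, 140e12)]:
    print(f"N={n:.0e}, D={d:.0e} -> loss {predicted_loss(n, d):.3f}")
# Output: roughly 1.937, 1.814, 1.752. Each 10x scale-up in both
# parameters and data buys a smaller loss reduction than the last.
```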

Economically, routing lets AI providers keep using older models rather than discarding them when a new one launches. Queries about current events require frequent updates, but static facts remain accurate for years. Directing certain queries to older models avoids wasting the enormous time, compute, and money already spent training them.

Hard physical limits matter, too. GPU memory has become a bottleneck for training ever-larger models, and chip technology is approaching the maximum memory that can be packed onto a single die. In practice, You explained, these physical limits mean the next model cannot simply be 10 times bigger.

An older idea that is now being hyped

William Falcon, founder and CEO of the AI platform Lightning AI, points out that the idea of using an ensemble of models is not new; it has been around since roughly 2018. And since OpenAI's models are a black box, we cannot rule out that GPT-4 also used a model routing system.
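Ensembling in Falcon's sense predates LLM routers: the classic pattern queries several models and combines their answers, rather than choosing one model up front. A minimal sketch follows; the three "models" are placeholder functions invented for illustration.

```python
# Classic model ensembling (circa-2018 style): query every model and
# combine the outputs, here by majority vote. A router differs in that
# it picks ONE model up front, paying for a single inference instead of N.
from collections import Counter

def model_a(q): return "Paris"
def model_b(q): return "Paris"
def model_c(q): return "Lyon"   # a dissenting, weaker model

def ensemble_answer(question, models):
    votes = Counter(m(question) for m in models)
    answer, count = votes.most_common(1)[0]
    return answer, count / len(models)  # answer plus agreement ratio

print(ensemble_answer("Capital of France?", [model_a, model_b, model_c]))
# ('Paris', 0.666...) -> low agreement can also flag shaky answers
```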

"I think maybe they're being more explicit about it now," he said. Either way, the GPT-5 launch was heavily hyped, model routing system included. The blog post introducing the model called it the "smartest, fastest, and most useful model yet, with thinking built in." In the official ChatGPT blog post, OpenAI confirmed that GPT-5 runs on a system of models coordinated by a behind-the-scenes router that switches to deeper reasoning when needed. The GPT-5 system card went further, listing multiple variants (gpt-5-main, gpt-5-main-mini for speed, gpt-5-thinking, gpt-5-thinking-mini, plus a thinking pro version) and explaining how the unified system automatically routes between them.

In a press pre-briefing, OpenAI CEO Sam Altman touted the router as the answer to what had become a hard-to-decipher list of models to choose from. He called the previous model-picker interface a "very confusing mess."

But Falcon argued the core problem was that GPT-5 simply didn't feel like a leap. "GPT-1 to 2 to 3 to 4: each time was a massive jump. Four to five was not noticeably better. That's what people are upset about."

Will multiple models add up to AGI?

The debate over model routing led some to call out the ongoing hype around artificial general intelligence, or AGI, being developed soon. OpenAI officially defines AGI as "highly autonomous systems that outperform humans at most economically valuable work," though Altman notably said last week that it is "not a super useful term."

"What about the promised AGI?" wrote Aiden Chaoyang He, an AI researcher and cofounder of TensorOpera, on X, criticizing the rollout. "Even a powerful company like OpenAI lacks the ability to train a super-large model, forcing them to resort to the Real-time Model Router."

Robert Nishihara, cofounder of the AI production platform Anyscale, says scaling is still progressing in AI, but one all-powerful model remains elusive. "It's hard to build one model that is the best at everything," he said. That is why GPT-5 currently runs on a network of models linked by a router rather than a single monolith.

OpenAI has said it hopes to unify these into a single model in the future, but Nishihara points out that hybrid systems have real advantages: you can upgrade one piece at a time without disrupting the rest, and you get most of the benefits without the cost and complexity of retraining an entire giant model. As a result, he thinks routing will stick around.
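The piecemeal-upgrade advantage Nishihara describes is a standard software-engineering property of routed systems: each model sits behind a stable interface, so one tier can be swapped without touching the others. A schematic sketch, with all names invented for illustration:

```python
# Sketch of why hybrid systems are easy to upgrade piecemeal: each slot
# in the registry can be swapped independently, while the router's
# contract with callers stays unchanged. All names here are invented.
registry = {
    "facts":     "legacy-2023-model",   # static knowledge ages well
    "news":      "fresh-2025-model",    # current events need recency
    "reasoning": "deep-thinking-model",
}

def answer(query: str, category: str) -> str:
    return f"[{registry[category]}] would handle: {query!r}"

print(answer("Who wrote Hamlet?", "facts"))

# Upgrading only the reasoning tier touches one entry; the cheap paths
# (and the compute already sunk into training them) keep serving traffic.
registry["reasoning"] = "deep-thinking-model-v2"
print(answer("Plan a 10-step proof.", "reasoning"))
```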

Aiden Chaoyang He agrees. In theory the scaling laws still hold, in that more data and compute make models better, but in practice he believes development will "spiral" between two approaches: routing specialized models together, then trying to consolidate them into one. The deciding factors will be engineering costs, compute and energy limits, and business pressures.

The hyped-up AGI narrative may need to adjust, too. "If anyone does anything that's close to AGI, I don't know if it'll literally be one set of weights doing it," Falcon said, referring to the "brains" behind LLMs. "If it's a collection of models that feels like AGI, that's fine. No one's a purist here." (*)

Translator: Liu Jinlong

Reviewer: Wang Hao

