
Stanford professor switched to paper exams two years ago, saying students strongly demanded it

Fortune China 2025-09-10 00:38:49


Translator: Liu Jinlong

Reviewer: Wang Hao


Stanford University computer science professor Jure Leskovec is no stranger to rapid technological change. A machine-learning researcher for nearly three decades and well into his second decade of teaching, he’s also the co-founder of Kumo, a startup with $37 million in funding raised to date.

But two years ago, as the latest wave of artificial intelligence began reshaping education, Leskovec told Fortune he was rocked by the explosion of his field into the mainstream. He said Stanford has such a prestigious computer science program he feels as if he “sees the future as it’s being born, or even before the future is born,” but the public release of GPT-3 was jarring.

“We had a big, I don’t know, existential crisis among students a few years back when it kind of wasn’t clear what our role is in this world,” Leskovec said.

He said it seemed like breakthroughs in AI would be exponential to the point where “it will just do research for us, so what do we do?” He said he spent a lot of time talking with students at the PhD level about how to organize themselves, even about what their role in the world would be going forward. It was “existential” and “surprising,” he said. Then, he received another surprise: a student-led request for a change in testing.

“It came out of the group,” he said, especially the teaching assistants, the previous generation of computer science undergraduates. Their idea was simple: “We do a paper exam.”

AI as catalyst for change

Leskovec, a prominent researcher at Stanford whose expertise lies in graph-structured data and AI applications in biology, recounted the pivot with a mixture of surprise and thoughtfulness. Historically, his classes had relied on open-book, take-home exams, where students could leverage textbooks and the internet. They couldn’t use other people’s code and solutions, but the rest was fair game. As large language models like OpenAI’s GPT-3 and GPT-4 exploded onto the scene, students and teaching assistants alike began questioning whether assessments ought to be handled differently.

Now it’s a lot more work for him and his TAs, he said; these exams take “much longer” to grade. But they all agreed it was the best way to actually test student knowledge. The age of AI for Leskovec, an AI veteran, has surprised him by putting a higher workload back on himself and other humans. Besides there being “fewer trees in the world” from all the paper he’s printing out, he said AI has just created “additional work.” His 400-person classes feel like an audience at a “rock concert,” but he insisted he’s not turning to AI for help synthesizing and analyzing all the exams.

“No, no, no, we hand grade,” he insisted.

A student-driven solution

Leskovec’s solution sits squarely in the middle of a raging debate about how AI is changing higher education, as reports of rampant cheating have led many colleges to ban the use of AI outright. Other professors are turning back to the paper exam, reviving the famous blue books of many ’90s kids’ memories of high school. One New York University professor even suggested getting “medieval,” embracing ancient forms of testing such as oral and written examination. In the case of Leskovec, the AI professor’s solution for the AI age is likewise to turn away from AI for testing.

When asked if he was worried about students cheating with AI, Leskovec posed another question: “Are you worried about students cheating with calculators? It’s like if you allow a calculator in your math exam, and you will have a different exam if you say calculators are disallowed.” Likening AI to a calculator, he said AI is an amazingly powerful tool that “kind of just emerged and surprised us all,” but it’s also “very imperfect … we need to learn how to use this tool, and we need to be able to both test the humans being able to use the tool and humans being able to think by themselves.”

What is an AI skill and what is a human skill?

Leskovec is wrestling with a question that touches everyone in the workforce: What is a human skill, what is an AI skill, and where do they merge? MIT professor David Autor and Google SVP James Manyika argued in The Atlantic that tools like a calculator or AI generally fall into two buckets: automation and collaboration. Think dishwasher, on the one hand, or word processor, on the other. The collaboration tool “requires human engagement,” and the issue with AI is that it “does not go neatly into either [bucket].”

The jobs market is sending a message on AI implementation that equates to something like a response from the Magic 8 Ball: “Reply hazy. Try again later.” The federal jobs report has revealed anemic growth since the spring, most recently disappointing expectations with a print of just 22,000 jobs in August. Most economists attribute the lack of hiring to uncertainty about President Donald Trump’s tariff regime, which multiple courts have ruled illegal and appears to be heading to the Supreme Court. But AI implementation is not going smoothly at the corporate level, with an MIT study (not connected to Autor) finding 95% of generative AI pilots are failing, followed shortly after by a Stanford study finding the beginning of a collapse in hiring at the entry level, especially in jobs exposed to automation by AI.

For another perspective, the freelance marketplace Upwork just launched its inaugural monthly hiring report, revealing what non-full-time jobs are being rewarded by the market. The answer is “AI skills” are super in-demand and, even if companies aren’t hiring full-time employees, they are piling into highly paid and highly skilled freelance labor.

Despite a softer overall labor market, Upwork finds companies are “strategically leveraging flexible talent to address temporary gaps in the workforce,” with large businesses driving a 31% growth in what Upwork calls high-value work (contracts greater than $1,000) on the platform. Smaller and medium-sized businesses are piling into “AI skills,” with demand for AI and machine learning leaping by 40%. But Upwork also sees growing demand for the kind of skills that fall in between: a human who is good at collaborating with AI.

Upwork says AI is “amplifying human talent” by creating demand for expertise in higher-value work, most visible across the creative and design, writing, and translation categories. One of the top skills hired for in August was fact-checking, given “the need for human verification of AI outputs.”

Kelly Monahan, managing director of the Upwork Research Institute, said “humans are coming right back in the loop” of working with AI.

“We’re actually seeing the human skills coming into premium,” she said, adding she thinks people are realizing AI hallucinates too much of the time to completely replace human involvement. “I think what people are seeing, now that they’re using AI-generated content, is that they need fact-checking.”

Extending this line of thinking, Monahan said the evolving landscape of “AI skills” shows what she calls “domain expertise” is growing increasingly valuable. Legal is a category that grew in August, she said, highlighting legal expertise is required to fact-check AI-generated legal writing. If you don’t have advanced skills in a particular domain, “it’s easy to be fooled” by AI-generated content, and businesses are hiring to protect against that.

Leskovec agreed when asked about the skills gap that appears to be facing entry-level workers trying to get hired, on the one hand, and companies struggling to effectively implement AI, on the other.

“I think we almost need to re-skill the workforce. Human expertise matters much more than it ever did [before].” He added the entry-level issue is “the crux of the problem,” because how are young workers supposed to get the domain expertise required to effectively collaborate with AI?

“I think it goes back to teaching, reskilling, rethinking our curricula,” Leskovec said, adding colleges have a role to play, but organizations do, as well. He asked a rhetorical question: How are they supposed to have senior skilled workers if they’re not taking in young workers and taking the time to train them?

When asked by Fortune to survey the landscape and assess where we are right now in using AI, as students, professors, and workers, Leskovec said we are “very early in this.” He said he thinks we’re in the “coming-up-with-solutions phase” — solutions like a hand-graded exam and a professor finding new ways to fact-check his students’ knowledge.
