Meta contractors say they have seen Facebook users share private information with AI chatbots

Meta Platforms CEO Mark Zuckerberg in September 2024. Photo: David Paul Morris—Bloomberg/Getty Images

人们喜欢与AI对话，有些人甚至过度沉迷这种交流方式。据Meta聘请的合同工观察，用户会向Meta的AI过度分享个人隐私信息，包括真实姓名、电话号码和电子邮箱地址。这些合同工的工作内容正是审核人机对话内容，以优化人工智能系统。

Business Insider网站采访了四名通过Alignerr和Scale AI旗下Outlier平台受雇于Meta的合同工。这两家平台招募人工审核员协助训练AI。这些合同工指出，与其他硅谷公司的同类项目相比，“Meta项目中出现未经删减的个人数据的频率更高”。据他们透露，许多用户会在Facebook和Instagram等Meta旗下的众多平台上分享高度私密的细节。用户会像与朋友甚至恋人聊天一样同Meta的AI对话，发送自拍照乃至“露骨照片”。

需要说明的是，人类与AI聊天机器人过度亲密的现象早有记录，而Meta为提升交互质量聘请人工评估AI助手表现的做法亦非新鲜事。早在2019年，《卫报》就披露苹果（Apple）合同工经常接触Siri用户的极端敏感信息，尽管当时该公司“缺乏处理敏感录音的具体程序”。此外，彭博社曾报道过亚马逊（Amazon）全球数千名员工和合同工会手动审听Alexa用户录音片段并转录文字。Vice和Motherboard亦曾曝光微软（Microsoft）雇用的合同工录制和审听语音内容，这意味着合同工经常会通过意外激活Xbox游戏主机而听到孩子们所发出的声音。

但Meta的情况截然不同，尤其考虑到该公司在过去十年中依赖第三方合同工的做法及公司在数据治理方面的屡次失误。

Meta劣迹斑斑的用户隐私保护史

2018年，《纽约时报》和《卫报》联合披露：由共和党对冲基金亿万富豪罗伯特·默瑟资助的政治咨询公司剑桥分析（Cambridge Analytica），在未经用户同意的情况下利用Facebook获取数千万用户的数据。该公司借此建立美国选民档案，通过个性化政治广告助力特朗普在2016年当选总统。数据泄露源于某性格测试应用，该程序不仅收集参与者数据，还收集其好友的信息。此事导致Facebook遭美国联邦贸易委员会（Federal Trade Commission，FTC）罚款50亿美元，是美国历史上金额最高的隐私和解案之一。

剑桥分析的丑闻暴露出Facebook开发者平台存在更广泛的缺陷：开放海量数据接口却缺乏有效监管。根据举报人弗朗西丝·豪根在2021年公布的内部文件，Meta管理层常将业务增长和用户参与度置于隐私安全之上。

Meta在使用合同工方面亦受到了审查：2019年，彭博社报道Facebook付费让合同工转录用户音频聊天，而转录员完全不清楚这些录音的获取途径。（当时，Facebook表示录音仅来自选择启用转录服务的用户，并称已“暂停”该做法。）

Facebook多年来一直致力于重塑形象：2021年10月，该公司更名为Meta，宣称此举象征面向“元宇宙”的前瞻性战略转型，而非对虚假信息、隐私及平台安全等众多争议的回应。但该公司处理数据方面的黑历史始终挥之不去。尽管当前使用人工审核员优化大语言模型（LLMs）已是行业惯例，但Meta使用合同工的最新爆料以及他们所获取的信息，令这家全球最大社交网络的母公司的数据管理机制再度遭到质疑。

Meta发言人在给《财富》杂志的声明中表示，公司制定了“约束全体员工和合同工访问个人数据的严格政策”。

该发言人表示：“虽然我们与合同工合作提升训练数据质量，但会刻意限制其可获取的个人信息范围，并设置流程和安全防护措施指导其处理可能接触到的敏感数据。”

该发言人补充道：“对于那些专注于AI个性化的项目……合同工根据我们公开的隐私政策及AI条款，可在工作中访问特定个人信息。无论何种项目，任何未经授权的数据共享或滥用行为均违反公司政策，我们将采取相应处置措施。”（*）

译者：刘进龙

审校：汪皓

People love talking to AI—some, a bit too much. And according to contract workers for Meta, who review people’s interactions with the company’s chatbots to improve their artificial intelligence, people are a bit too willing to share personal, private information, including their real names, phone numbers, and email addresses, with Meta’s AI.

Business Insider spoke with four contract workers whom Meta hires through Alignerr and Scale AI–owned Outlier, two platforms that enlist human reviewers to help train AI, and the contractors noted that “unredacted personal data was more common for the Meta projects they worked on” compared with similar projects for other clients in Silicon Valley. And according to those contractors, many users on Meta’s various platforms such as Facebook and Instagram were sharing highly personal details. Users would talk to Meta’s AI as if they were speaking with friends, or even romantic partners, sending selfies and even “explicit photos.”

To be clear, people getting too close to their AI chatbots is well-documented, and Meta’s practice—using human contractors to assess the quality of AI-powered assistants for the sake of improving future interactions—is hardly new. Back in 2019, the Guardian reported how Apple contractors regularly heard extremely sensitive information from Siri users even though the company had “no specific procedures to deal with sensitive recordings” at the time. Similarly, Bloomberg reported how Amazon had thousands of employees and contractors around the world manually reviewing and transcribing clips from Alexa users. Vice and Motherboard also reported on Microsoft’s hired contractors recording and reviewing voice content, even though that meant contractors would often hear children’s voices via accidental activation on their Xbox consoles.

But Meta is a different story, particularly given its track record over the past decade when it comes to reliance on third-party contractors and the company’s lapses in data governance.

Meta’s checkered record on user privacy

In 2018, the New York Times and the Guardian reported on how Cambridge Analytica, a political consultancy group funded by Republican hedge-fund billionaire Robert Mercer, exploited Facebook to harvest data from tens of millions of users without their consent, and used that data to profile U.S. voters and target them with personalized political ads to help elect President Donald Trump in 2016. The breach stemmed from a personality quiz app that collected data—not just from participants, but also from their friends. It led to Facebook getting hit with a $5 billion fine from the Federal Trade Commission (FTC), one of the largest privacy settlements in U.S. history.

The Cambridge Analytica scandal exposed broader issues with Facebook’s developer platform, which had allowed for vast data access, but had limited oversight. According to internal documents released by Frances Haugen, a whistleblower, in 2021, Meta’s leadership often prioritized growth and engagement over privacy and safety concerns.

Meta has also faced scrutiny over its use of contractors: In 2019, Bloomberg reported how Facebook paid contractors to transcribe users’ audio chats without knowing how they were obtained in the first place. (Facebook, at the time, said the recordings only came from users who had opted into the transcription services, adding it had also “paused” that practice.)

Facebook has spent years trying to rehabilitate its image: It rebranded to Meta in October 2021, framing the name change as a forward-looking shift in focus to “the metaverse” rather than as a response to controversies surrounding misinformation, privacy, and platform safety. But Meta’s legacy in handling data casts a long shadow. And while using human reviewers to improve large language models (LLMs) is common industry practice at this point, the latest report about Meta’s use of contractors, and the information contractors say they’re able to see, does raise fresh questions around how data is handled by the parent company of the world’s most popular social networks.

In a statement to Fortune, a Meta spokesperson said the company has “strict policies that govern personal data access for all employees and contractors.”

“While we work with contractors to help improve training data quality, we intentionally limit what personal information they see, and we have processes and guardrails in place instructing them how to handle any such information they may encounter,” the spokesperson said.

“For projects focused on AI personalization … contractors are permitted in the course of their work to access certain personal information in accordance with our publicly available privacy policies and AI terms. Regardless of the project, any unauthorized sharing or misuse of personal information is a violation of our data policies, and we will take appropriate action,” they added.