OpenAI回应安全质疑，计划升级ChatGPT防护措施

This article has been updated with comment from lead counsel in the Raine family’s wrongful death lawsuit against OpenAI.

OpenAI said Tuesday it plans to route sensitive conversations to reasoning models like GPT-5 and roll out parental controls within the next month — part of an ongoing response to recent safety incidents involving ChatGPT failing to detect mental distress.

The new guardrails come in the aftermath of the suicide of teenager Adam Raine, who discussed self-harm and plans to end his life with ChatGPT, which even supplied him with information about specific suicide methods. Raine’s parents have filed a wrongful death lawsuit against OpenAI.

In a blog post last week, OpenAI acknowledged shortcomings in its safety systems, including failures to maintain guardrails during extended conversations. Experts attribute these issues to fundamental design elements: the models’ tendency to validate user statements and their next-word prediction algorithms, which cause chatbots to follow conversational threads rather than redirect potentially harmful discussions.

That tendency is displayed in the extreme in the case of Stein-Erik Soelberg, whose murder-suicide was reported on by The Wall Street Journal over the weekend. Soelberg, who had a history of mental illness, used ChatGPT to validate and fuel his paranoia that he was being targeted in a grand conspiracy. His delusions progressed so badly that he ended up killing his mother and himself last month.

OpenAI thinks that at least one solution to conversations that go off the rails could be to automatically reroute sensitive chats to “reasoning” models.

“We recently introduced a real-time router that can choose between efficient chat models and reasoning models based on the conversation context,” OpenAI wrote in a Tuesday blog post. “We’ll soon begin to route some sensitive conversations—like when our system detects signs of acute distress—to a reasoning model, like GPT‑5-thinking, so it can provide more helpful and beneficial responses, regardless of which model a person first selected.”

OpenAI says its GPT-5 thinking and o3 models are built to spend more time thinking for longer and reasoning through context before answering, which means they are “more resistant to adversarial prompts.”

The AI firm also said it would roll out parental controls in the next month, allowing parents to link their account with their teen’s account through an email invitation. In late July, OpenAI rolled out Study Mode in ChatGPT to help students maintain critical thinking capabilities while studying, rather than tapping ChatGPT to write their essays for them. Soon, parents will be able to control how ChatGPT responds to their child with “age-appropriate model behavior rules, which are on by default.”

Parents will also be able to disable features like memory and chat history, which experts say could lead to delusional thinking and other problematic behavior, including dependency and attachment issues, reinforcement of harmful thought patterns, and the illusion of thought-reading. In the case of Adam Raine, ChatGPT supplied methods to commit suicide that reflected knowledge of his hobbies, per The New York Times.

Perhaps the most important parental control that OpenAI intends to roll out is that parents can receive notifications when the system detects their teenager is in a moment of “acute distress.”

TechCrunch has asked OpenAI for more information about how the company is able to flag moments of acute distress in real time, how long it has had “age-appropriate model behavior rules” on by default, and whether it is exploring allowing parents to implement a time limit on teenage use of ChatGPT.

OpenAI has already rolled out in-app reminders during long sessions to encourage breaks for all users, but stops short of cutting people off who might be using ChatGPT to spiral.

The AI firm says these safeguards are part of a “120-day initiative” to preview plans for improvements that OpenAI hopes to launch this year. The company also said it is partnering with experts — including ones with expertise in areas like eating disorders, substance use, and adolescent health — via its Global Physician Network and Expert Council on Well-Being and AI to help “define and measure well-being, set priorities, and design future safeguards.”

TechCrunch has asked OpenAI how many mental health professionals are involved in this initiative, who leads its Expert Council, and what suggestions mental health experts have made in terms of product, research, and policy decisions.

Jay Edelson, lead counsel in the Raine family’s wrongful death lawsuit against OpenAI, said the company’s response to ChatGPT’s ongoing safety risks has been “inadequate.”

“OpenAI doesn’t need an expert panel to determine that ChatGPT 4o is dangerous,” Edelson said in a statement shared with TechCrunch. “They knew that the day they launched the product, and they know it today. Nor should Sam Altman be hiding behind the company’s PR team. Sam should either unequivocally say that he believes ChatGPT is safe or immediately pull it from the market.”

Got a sensitive tip or confidential documents? We’re reporting on the inner workings of the AI industry — from the companies shaping its future to the people impacted by their decisions. Reach out to Rebecca Bellan at rebecca.bellan@techcrunch.com and Maxwell Zeff at maxwell.zeff@techcrunch.com. For secure communication, you can contact us via Signal at @rebeccabellan.491 and @mzeff.88.

Fish AI Reader

FishAI

联系邮箱 441953276@qq.com

相关标签