Newsroom Anthropic September 13
Nuclear Technology Controls and AI Collaboration

Nuclear technology is dual-use: the same principles can be applied to weapons development. As AI models become more capable, there is a risk that they could spread dangerous technical knowledge and threaten national security. Information about nuclear weapons is especially sensitive, making these risks hard for a private company to evaluate alone, so Anthropic partnered with the U.S. Department of Energy's National Nuclear Security Administration (NNSA) to assess its models. Together they have now developed an AI classifier that automatically identifies nuclear-related conversations, reaching 96% accuracy in preliminary testing. The classifier has been deployed in Claude's systems and the approach is being shared with industry bodies to promote similar safeguards. The effort demonstrates the power of public-private partnership and helps make AI models more reliable.

🔬 Nuclear technology is dual-use: its principles can be misused for weapons development, so we must guard against AI models spreading dangerous technical knowledge that could threaten national security.

🤝 Anthropic partnered with the U.S. Department of Energy's National Nuclear Security Administration (NNSA) to assess AI model risks, and together they developed an AI classifier that automatically identifies nuclear-related conversations, reaching 96% accuracy in preliminary testing.

🔐 The classifier has been deployed in Claude's systems and shared with industry bodies to promote similar safeguards, improving the reliability of AI models and demonstrating the power of public-private partnership.

📊 Early deployment data shows the classifier performs well on real Claude conversations; the goal is to make AI models safer and more trustworthy.

🔗 Full details are available on Anthropic's red.anthropic.com blog, which publishes the latest research on what frontier AI models mean for national security.

Nuclear technology is inherently dual-use: the same physics principles that power nuclear reactors can be misused for weapons development. As AI models become more capable, we need to keep a close eye on whether they can provide users with dangerous technical knowledge in ways that could threaten national security.

Information relating to nuclear weapons is particularly sensitive, which makes evaluating these risks challenging for a private company acting alone. That’s why last April we partnered with the U.S. Department of Energy (DOE)’s National Nuclear Security Administration (NNSA) to assess our models for nuclear proliferation risks and continue to work with them on these evaluations.

Now, we’re going beyond assessing risk to build the tools needed to monitor for it. Together with the NNSA and DOE national laboratories, we have co-developed a classifier—an AI system that automatically categorizes content—that distinguishes between concerning and benign nuclear-related conversations with 96% accuracy in preliminary testing.
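To make the idea concrete, here is a minimal sketch of what a content classifier of this kind looks like in code: a model trained on labeled conversation snippets and scored on a held-out set. It assumes scikit-learn, and the example texts, labels, and model choice are hypothetical stand-ins rather than the system Anthropic and NNSA built.

```python
# A minimal sketch of a two-class text classifier, assuming scikit-learn.
# The example texts, labels, and model are hypothetical stand-ins; they do
# not reflect the classifier Anthropic and NNSA actually built.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical labeled conversation snippets: 0 = benign, 1 = concerning.
texts = [
    "How do control rods regulate the fission rate in a power reactor?",
    "Summarize the history of the Nuclear Non-Proliferation Treaty.",
    "What enrichment level makes uranium usable in a weapon?",
    "Give step-by-step assembly details for an implosion device.",
]
labels = [0, 0, 1, 1]

# Hold out part of the (toy) data so we can report an accuracy figure,
# mirroring the idea of preliminary testing on labeled conversations.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.5, random_state=0, stratify=labels
)

# TF-IDF features plus logistic regression: a deliberately simple stand-in
# for whatever model the real system uses.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(X_train, y_train)

print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

In practice, a production classifier would be trained and validated on far larger, expert-curated datasets than this toy example.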

We have already deployed this classifier on Claude traffic as part of our broader system for identifying misuse of our models. Early deployment data suggests the classifier works well with real Claude conversations.
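For illustration only, the sketch below shows one way a classifier's scores could feed a broader misuse-monitoring pipeline: conversations above a risk threshold get routed for review. The threshold, data types, and scoring stub are assumptions for the example, not a description of Claude's production systems.

```python
# A hypothetical sketch of wiring a classifier into a misuse-monitoring
# pipeline: conversations scoring above a threshold are escalated for review.
# Names, the threshold, and the scoring stub are assumptions, not details of
# Claude's production systems.
from dataclasses import dataclass

FLAG_THRESHOLD = 0.9  # hypothetical escalation threshold


@dataclass
class Conversation:
    conversation_id: str
    text: str


def nuclear_risk_score(convo: Conversation) -> float:
    """Stand-in for the deployed classifier; would return P(concerning)."""
    return 0.0  # placeholder: a real system would call the trained model here


def flag_for_review(convos: list[Conversation]) -> list[str]:
    """Return the IDs of conversations that should be escalated for review."""
    return [
        c.conversation_id
        for c in convos
        if nuclear_risk_score(c) >= FLAG_THRESHOLD
    ]
```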

We will share our approach with the Frontier Model Forum, the industry body for frontier AI companies, in hopes that this partnership can serve as a blueprint that any AI developer can use to implement similar safeguards in partnership with NNSA.

Along with the concrete importance of securing frontier AI models against nuclear misuse, this first-of-its-kind effort shows the power of public-private partnerships. These partnerships combine the complementary strengths of industry and government to address risks head-on, making AI models more reliable and trustworthy for all their users.

Full details about our NNSA partnership and the safeguards development can be found on our red.anthropic.com blog, the home for research from Anthropic’s Frontier Red Team (and occasionally other teams at Anthropic) on what frontier AI models mean for national security.

