Anthropic Submits AI Accountability Recommendations

 

Anthropic has submitted recommendations on AI accountability to the US National Telecommunications and Information Administration (NTIA). The recommendations include strengthening AI model evaluations, requiring companies to disclose their evaluation methods and results, establishing industry standards, developing risk-responsive assessments, creating a pre-registration process for large AI training runs, empowering third-party audits, mandating external red teaming, advancing interpretability research, and enabling industry collaboration through antitrust clarity. These measures aim to ensure the safe development and deployment of AI systems.

🔍 Anthropic submitted a set of recommendations on AI accountability to the NTIA, aiming to ensure the safe and responsible development of AI systems.

📊 It recommends strengthening AI model evaluation: increasing funding for evaluation research and requiring companies to disclose their evaluation methods and results in the near term.

🏭 It recommends developing, over the longer term, a set of industry evaluation standards and best practices, with government agencies such as NIST setting standards that companies would comply with.

⚠️ It recommends developing risk-responsive assessments based on model capabilities, with stricter evaluation and oversight for high-risk AI systems.

📈 It recommends establishing a pre-registration process for large AI training runs so that regulators are aware of potential risks, with aggregated registry data kept protected.

🛡️ It recommends empowering third-party audits, with auditors who have technical knowledge, security awareness, and flexibility, in order to protect intellectual property.

🔐 It recommends mandating external red teaming to standardize adversarial testing of AI systems and ensure their safety and robustness.

🧠 It recommends increasing funding for interpretability research and advancing interpretability work on smaller models, so that progress can be made outside frontier labs.

🤝 It recommends that regulators issue guidance on permissible safety coordination within the AI industry, to reduce legal uncertainty and advance shared goals.


This week, Anthropic submitted a response to the National Telecommunications and Information Administration’s (NTIA) Request for Comment on AI Accountability. Today, we want to share our recommendations as they capture some of Anthropic’s core AI policy proposals.

There is currently no robust and comprehensive process for evaluating today’s advanced artificial intelligence (AI) systems, let alone the more capable systems of the future. Our submission presents our perspective on the processes and infrastructure needed to ensure AI accountability. Our recommendations consider the NTIA’s potential role as a coordinating body that sets standards in collaboration with other government agencies like the National Institute of Standards and Technology (NIST).

In our recommendations, we focus on accountability mechanisms suitable for highly capable and general-purpose AI models. Specifically, we recommend:

    Fund research to build better evaluations
      Increase funding for AI model evaluation research. Developing rigorous, standardized evaluations is difficult and time-consuming work that requires significant resources. Increased funding, especially from government agencies, could help drive progress in this critical area.
      Require companies in the near term to disclose evaluation methods and results. Companies deploying AI systems should be mandated to satisfy some disclosure requirements with regard to their evaluations, though these disclosures need not be made public if doing so would compromise intellectual property (IP) or confidential information. This transparency could help researchers and policymakers better understand where existing evaluations may be lacking.
      Develop in the long term a set of industry evaluation standards and best practices. Government agencies like NIST could work to establish standards and benchmarks for evaluating AI models’ capabilities, limitations, and risks that companies would comply with.
    Create risk-responsive assessments based on model capabilities
      Develop standard capabilities evaluations for AI systems. Governments should fund and participate in the development of rigorous capability and safety evaluations targeted at critical risks from advanced AI, such as deception and autonomy. These evaluations can provide an evidence-based foundation for proportionate, risk-responsive regulation.
      Develop a risk threshold through more research and funding into safety evaluations. Once a risk threshold has been established, evaluations against it can be mandated for all models (a purely illustrative sketch of this decision logic appears after this list).
        If a model falls below the risk threshold, existing safety standards are likely sufficient: verify compliance and deploy.
        If a model exceeds the risk threshold and safety assessments and mitigations are insufficient, halt deployment, significantly strengthen oversight, and notify regulators. Determine appropriate safeguards before allowing deployment.
    Establish pre-registration for large AI training runs
      Establish a process for AI developers to report large training runs, ensuring that regulators are aware of potential risks. This involves determining the appropriate recipient, the required information, and appropriate cybersecurity, confidentiality, IP, and privacy safeguards.
      Establish a confidential registry for AI developers conducting large training runs to pre-register model details with their home country’s national government (e.g., model specifications, model type, compute infrastructure, intended training completion date, and safety plans) before training commences. Aggregated registry data should be protected to the highest available standards and specifications. (A sketch of such a record also appears after this list.)
    Empower third party auditors that are…
      Technically literate – at least some auditors will need deep machine learning experience;
      Security-conscious – well-positioned to protect valuable IP, which could pose a national security threat if stolen; and
      Flexible – able to conduct robust but lightweight assessments that catch threats without undermining US competitiveness.
    Mandate external red teaming before model release
      Mandate external red teaming for AI systems, either through a centralized third party (e.g., NIST) or in a decentralized manner (e.g., via researcher API access), to standardize adversarial testing of AI systems. This should be a precondition for developers who are releasing advanced AI systems.
      Establish high-quality external red teaming options before they become a precondition for model release. This is critical as red teaming talent currently resides almost exclusively within private AI labs.
    Advance interpretability research
      Increase funding for interpretability research. Provide government grants and incentives for interpretability work at universities, nonprofits, and companies. This would allow meaningful work to be done on smaller models, enabling progress outside frontier labs.
      Recognize that regulations demanding interpretable models would currently be infeasible to meet, but may be possible in the future pending research advances.
    Enable industry collaboration on AI safety via clarity around antitrust
      Regulators should issue guidance on permissible AI industry safety coordination given current antitrust laws. Clarifying how private companies can work together in the public interest without violating antitrust laws would mitigate legal uncertainty and advance shared goals.
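
The risk-responsive assessment above can be read as a simple decision procedure. The following Python sketch is purely illustrative and is not part of Anthropic’s submission: the EvaluationResult fields, the Decision labels, and the idea of a single numeric risk_threshold are hypothetical simplifications of what a real evaluation regime would involve.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Decision(Enum):
    DEPLOY = auto()            # below threshold: verify compliance and deploy
    HALT_AND_NOTIFY = auto()   # halt, strengthen oversight, notify regulators
    REGULATOR_REVIEW = auto()  # above threshold but mitigations hold; the
                               # submission leaves this case to regulators


@dataclass
class EvaluationResult:
    """Hypothetical outcome of standardized capability and safety evaluations."""
    risk_score: float             # aggregate score from capability evaluations
    mitigations_sufficient: bool  # whether safety assessments and mitigations hold up


def deployment_decision(result: EvaluationResult, risk_threshold: float) -> Decision:
    """Apply the threshold logic described in the recommendation above."""
    if result.risk_score < risk_threshold:
        # Existing safety standards are likely sufficient.
        return Decision.DEPLOY
    if not result.mitigations_sufficient:
        # Halt deployment, strengthen oversight, and notify regulators;
        # appropriate safeguards must be determined before deployment.
        return Decision.HALT_AND_NOTIFY
    return Decision.REGULATOR_REVIEW
```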
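The pre-registration recommendation likewise enumerates the kinds of details a developer would report before a large training run begins. The record sketched below is hypothetical; the class and field names are chosen only to mirror the examples given in the submission (model specifications, model type, compute infrastructure, intended training completion date, and safety plans), and the example values are fictional.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TrainingRunRegistration:
    """Hypothetical pre-registration record for a large AI training run."""
    developer: str                  # organization conducting the training run
    model_type: str                 # e.g., large language model
    model_specifications: str       # architecture, scale, and other key details
    compute_infrastructure: str     # hardware used for training
    intended_completion_date: date  # when training is expected to finish
    safety_plans: str               # planned evaluations and mitigations


# Fictional example of a record submitted to a confidential national registry.
example = TrainingRunRegistration(
    developer="ExampleLab",
    model_type="large language model",
    model_specifications="transformer; parameter count and training compute reported",
    compute_infrastructure="dedicated accelerator cluster",
    intended_completion_date=date(2030, 1, 1),
    safety_plans="pre-deployment capability and safety evaluations; external red teaming",
)
```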

We believe this set of recommendations will bring us meaningfully closer to establishing an effective framework for AI accountability. Doing so will require collaboration between researchers, AI labs, regulators, auditors, and other stakeholders. Anthropic is committed to supporting efforts to enable the safe development and deployment of AI systems. Evaluations, red teaming, standards, interpretability and other safety research, auditing, and strong cybersecurity practices are all promising avenues for mitigating the risks of AI while realizing its benefits.

We believe that AI could have transformative effects in our lifetime and we want to ensure that these effects are positive. The creation of robust AI accountability and auditing mechanisms will be vital to realizing this goal. We are grateful for the chance to respond to this Request For Comment.

You can read our submission in full here.
