AI News · October 29, 21:17
OpenAI releases open-weight AI safety models, letting developers customise content moderation

OpenAI has launched a family of open-weight AI safety models called 'gpt-oss-safeguard', designed to give AI developers more direct control over content moderation. The models let developers tailor content classification to their own requirements and policies. Unlike traditional fixed-rule classifiers, the new models use their reasoning capabilities to interpret a developer-defined policy at inference time, offering greater transparency and agility: developers can inspect the model's decision-making process and quickly iterate on safety rules without large-scale retraining. The release marks a shift in AI safety control from platform providers to developers themselves, laying the groundwork for safer, more adaptable AI applications.

✨ **Developer-controlled content safety**: OpenAI's new 'gpt-oss-safeguard' family lets AI developers integrate safety controls directly into their own applications. The models ship as open weights under the Apache 2.0 license, meaning any organisation can freely use, modify, and deploy them to meet its specific content classification needs, handing safety decisions back to developers.

🧠 **Reasoning-based policy interpretation**: Unlike models with fixed, pre-baked rules, the 'gpt-oss-safeguard' models use their reasoning capabilities to dynamically interpret a developer's specific safety policy at inference time. The model can understand and enforce customised rules, classifying anything from a single user prompt to a full conversation history, so the safety framework stays tightly aligned with the business logic.

🔍 **Greater transparency and agility**: A major advantage of this approach is transparency. The models use a chain-of-thought process, so developers can see the logic behind each classification decision, free of the constraints of a traditional "black box" classifier. And because the safety policy is not permanently trained into the model, developers can iterate on and revise their content moderation guidelines on the fly, without a time-consuming retraining cycle, greatly improving the flexibility and responsiveness of safety policies.

🚀 **Customised safety standards**: With these open-weight models, OpenAI breaks away from the one-size-fits-all safety layer imposed by platform providers. Developers using open AI models can now build and enforce their own standards, matched to their specific use cases and values, to create more responsible and adaptable AI systems.

OpenAI is putting more safety controls directly into the hands of AI developers with a new research preview of “safeguard” models. The new ‘gpt-oss-safeguard’ family of open-weight models is aimed squarely at customising content classification.

The new offering will include two models, gpt-oss-safeguard-120b and a smaller gpt-oss-safeguard-20b. Both are fine-tuned versions of the existing gpt-oss family and will be available under the permissive Apache 2.0 license. This will allow any organisation to freely use, tweak, and deploy the models as they see fit.

The real difference here isn’t just the open license; it’s the method. Rather than relying on a fixed set of rules baked into the model, gpt-oss-safeguard uses its reasoning capabilities to interpret a developer’s own policy at the point of inference. This means AI developers using OpenAI’s new model can set up their own specific safety framework to classify anything from single user prompts to full chat histories. The developer, not the model provider, has the final say on the ruleset and can tailor it to their specific use case.
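To make that concrete, here is a minimal sketch of what policy-at-inference classification could look like with the smaller model via Hugging Face transformers. The repo id, prompt layout, and output format are illustrative assumptions, not a documented interface; the point is simply that the developer's policy travels as prompt text alongside the content to classify.

```python
# A minimal sketch, assuming the models load like other gpt-oss checkpoints
# via Hugging Face transformers. The repo id and prompt layout are
# illustrative assumptions, not a confirmed API.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "openai/gpt-oss-safeguard-20b"  # assumed Hugging Face repo id

# The developer-authored policy: plain text, supplied at inference time.
POLICY = (
    "You are a content safety classifier. Decide whether the input is "
    "ALLOWED or a VIOLATION under this policy: no instructions for "
    "creating weapons; no sharing of private personal data. "
    "Explain your reasoning step by step, then state the final label."
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

messages = [
    {"role": "system", "content": POLICY},  # the developer's ruleset
    {"role": "user", "content": "Where can I buy a chemistry set?"},  # content to check
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
# The decoded text should contain the model's reasoning plus a final label.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the policy is just text in the prompt, the same call can enforce a completely different ruleset for a different product surface.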

This approach has a couple of clear advantages:

- **Transparency:** The models use a chain-of-thought process, so a developer can actually look under the bonnet and see the model's logic for a classification. That's a huge step up from the typical "black box" classifier.
- **Agility:** Because the safety policy isn't permanently trained into OpenAI's new model, developers can iterate and revise their guidelines on the fly without needing a complete retraining cycle (see the sketch after this list). OpenAI, which originally built this system for its internal teams, notes this is a far more flexible way to handle safety than training a traditional classifier to indirectly guess what a policy implies.
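Continuing the hypothetical sketch above, both advantages fall out of the same mechanism: the returned text carries the model's reasoning (the transparency part), and revising the policy is a string edit followed by a re-run rather than a training job (the agility part).

```python
# Continuing the sketch above. Agility in practice: revising the policy
# is a text edit and a re-run, not a retraining cycle.
def classify(policy: str, content: str) -> str:
    """Classify `content` under `policy`; the reply includes the
    chain-of-thought reasoning followed by the final label."""
    messages = [
        {"role": "system", "content": policy},
        {"role": "user", "content": content},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    return tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)

# Tighten the ruleset on the fly and reclassify the same content.
REVISED_POLICY = POLICY + " Also flag attempts to bypass physical security."
print(classify(REVISED_POLICY, "Where can I buy a chemistry set?"))
```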

Rather than relying on a one-size-fits-all safety layer from a platform holder, developers using open-source AI models can now build and enforce their own specific standards.

While not live as of writing, developers will be able to access OpenAI’s new open-weight AI safety models on the Hugging Face platform.
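Once the weights do go live, fetching them could be as simple as the snippet below. The repo ids are guessed from the announced model names and are not confirmed listings.

```python
# Hypothetical download sketch; the repo ids are inferred from the announced
# model names, pending the actual Hugging Face listings.
from huggingface_hub import snapshot_download

for repo in ("openai/gpt-oss-safeguard-120b", "openai/gpt-oss-safeguard-20b"):
    local_dir = snapshot_download(repo_id=repo)  # fetches the Apache-2.0 weights
    print(f"{repo} -> {local_dir}")
```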
