Newsroom Anthropic 09月13日
Claude 2.1模型更新,提升企业级AI应用能力
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

Claude 2.1模型在API上正式发布,将为企业级AI应用带来显著提升。新模型拥有200K token的上下文窗口,大幅减少模型幻觉率,并推出工具使用beta功能,允许与现有系统和API集成。这些改进使Claude 2.1能处理更复杂文档,提高准确性和可靠性,助力企业构建更高效的AI解决方案。

🔍 200K上下文窗口:Claude 2.1将上下文窗口扩展至200K token(约15万字),支持上传完整代码库、财务报表或长篇文学作品,使模型能处理更大体量内容,实现高效摘要、问答、趋势预测和文档对比等功能。

🛡️幻觉率减半:通过优化算法和事实核查机制,Claude 2.1在复杂问题上的错误陈述率降低50%,对法律文件、财务报告等高精度文档的准确率提升30%,显著增强输出可靠性。

🛠️工具使用beta:新增工具集成功能,允许Claude调用开发者定义的函数、API或数据库,实现计算推理、API调用、数据检索等操作,大幅扩展应用场景和交互能力。

⚙️系统提示功能:引入系统提示机制,用户可通过自定义指令指导模型特定角色、行为或响应格式,优化交互效率和结果一致性。

💡API优化:推出开发者工作台(Workbench),提供沙盒式提示测试环境、多项目管理功能和代码片段生成,简化API集成流程,加速模型调优。


Our latest model, Claude 2.1, is now available over API in our Console and is powering our claude.ai chat experience. Claude 2.1 delivers advancements in key capabilities for enterprises—including an industry-leading 200K token context window, significant reductions in rates of model hallucination, system prompts and our new beta feature: tool use. We are also updating our pricing to improve cost efficiency for our customers across models.

200K Context Window

Since our launch earlier this year, Claude has been used by millions of people for a wide range of applications—from translating academic papers to drafting business plans and analyzing complex contracts. In discussions with our users, they’ve asked for larger context windows and more accurate outputs when working with long documents.

In response, we’re doubling the amount of information you can relay to Claude with a limit of 200,000 tokens, translating to roughly 150,000 words, or over 500 pages of material. Our users can now upload technical documentation like entire codebases, financial statements like S-1s, or even long literary works like The Iliad or The Odyssey. By being able to talk to large bodies of content or data, Claude can summarize, perform Q&A, forecast trends, compare and contrast multiple documents, and much more.

Processing a 200K length message is a complex feat and an industry first. While we’re excited to get this powerful new capability into the hands of our users, tasks that would typically require hours of human effort to complete may take Claude a few minutes. We expect the latency to decrease substantially as the technology progresses.

2x Decrease in Hallucination Rates

Claude 2.1 has also made significant gains in honesty, with a 2x decrease in false statements compared to our previous Claude 2.0 model. This enables enterprises to build high-performing AI applications that solve concrete business problems and deploy AI across their operations with greater trust and reliability.

We tested Claude 2.1’s honesty by curating a large set of complex, factual questions that probe known weaknesses in current models. Using a rubric that distinguishes incorrect claims (“The fifth most populous city in Bolivia is Montero”) from admissions of uncertainty (“I’m not sure what the fifth most populous city in Bolivia is”), Claude 2.1 was significantly more likely to demur rather than provide incorrect information.

Claude 2.1 has also made meaningful improvements in comprehension and summarization, particularly for long, complex documents that demand a high degree of accuracy, such as legal documents, financial reports and technical specifications. In our evaluations, Claude 2.1 demonstrated a 30% reduction in incorrect answers and a 3-4x lower rate of mistakenly concluding a document supports a particular claim.


While we are encouraged by these accuracy improvements, enhancing the precision and dependability of outputs for our users remains a top priority for our product and research teams.

API Tool Use

By popular demand, we’ve also added tool use, a new beta feature that allows Claude to integrate with users' existing processes, products, and APIs. This expanded interoperability aims to make Claude more useful across our users’ day-to-day operations.

Claude can now orchestrate across developer-defined functions or APIs, search over web sources, and retrieve information from private knowledge bases. Users can define a set of tools for Claude to use and specify a request. The model will then decide which tool is required to achieve the task and execute an action on their behalf, such as:

    Using a calculator for complex numerical reasoningTranslating natural language requests into structured API callsAnswering questions by searching databases or using a web search APITaking simple actions in software via private APIsConnecting to product datasets to make recommendations and help users complete purchases

Tool use is currently in early development—we are building developer features and prompting guidelines for easier integration into your applications. We encourage users to share feedback on tool use to help shape and improve the product.

Developer Experience

We’ve been working to simplify our developer Console experience for Claude API users while making it easier to test new prompts for faster learning. Our new Workbench product enables developers to iterate on prompts in a playground-style experience and access new model settings to optimize Claude’s behavior. They can create multiple prompts and navigate between them for different projects, and revisions are saved as they go to retain historical context. Developers can also generate code snippets to use their prompts directly in one of our SDKs.

We’re also introducing system prompts, which allow users to provide custom instructions to Claude in order to improve performance. System prompts set helpful context that enhances Claude’s ability to take on specified personalities and roles or structure responses in a more customizable, consistent way aligned with user needs.

Claude 2.1 is available now in our API, and is also powering our chat interface at claude.ai for both the free and Pro tiers. Usage of the 200K token context window is reserved for Claude Pro users, who can now upload larger files than ever before. We can't wait to see the use cases these new features inspire as we work to build the safest and most technically sophisticated AI systems in the industry.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

Claude 2.1 AI模型 上下文窗口 企业级AI 工具集成
相关文章