Anthropic launches Claude Sonnet 4.5, challenging OpenAI's GPT-5

Anthropic launched Claude Sonnet 4.5 on Monday, positioning the artificial intelligence model as "the best coding model in the world" in a direct challenge to OpenAI's recently released GPT-5, as the two AI giants battle for dominance in the lucrative enterprise software development market.

The San Francisco-based startup claims its newest model achieves state-of-the-art performance on critical coding benchmarks, scoring 77.2% on SWE-bench Verified, a rigorous software engineering evaluation, a result the company says puts it ahead of GPT-5. More remarkably, Anthropic says Claude Sonnet 4.5 can maintain focus on complex, multi-step tasks for more than 30 hours, a dramatic leap in AI's ability to handle sustained work.

"Sonnet 4.5 achieves 77.2% on SWE-bench Verified (82% with parallel test-time compute). It is SOTA," an Anthropic spokesperson told VentureBeat, using industry shorthand for "state of the art." The company also highlighted the model's 50% score on Terminal-bench, another coding benchmark where it claims leadership.

The announcement follows mounting pressure from OpenAI's recent advances and pointed criticism from high-profile figures like Elon Musk, who recently posted on X.com that "winning was never in the set of possible outcomes for Anthropic." When asked about Musk's statement, Anthropic declined to comment.

The release arrives just seven weeks after OpenAI's GPT-5 launch in August, underscoring the breakneck pace of competition in artificial intelligence as companies race to capture enterprise customers increasingly relying on AI for software development. The timing is particularly noteworthy as Anthropic grapples with questions about its heavy dependence on just two major customers.

Anthropic dominates coding market despite customer concentration risks

The competition centers on a market that has emerged as AI's first major profitable use case beyond chatbots. Anthropic commands 42% of the code generation market, double OpenAI's 21% share, according to a Menlo Ventures survey of 150 enterprise technical leaders. That dominance has translated into remarkable financial performance, with the company reaching a $5 billion revenue run rate earlier this year.

However, industry analysis reveals that coding applications Cursor and GitHub Copilot drive approximately $1.4 billion of Anthropic's revenue, creating a potentially dangerous customer concentration that could leave the company vulnerable if either relationship falters.
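As rough arithmetic on the figures above, the short Python sketch below computes what share of the run rate the two coding customers would represent. It assumes the $1.4 billion estimate and the $5 billion run rate refer to the same period, which the reporting does not confirm. The function name `concentration_share` is illustrative only.

```python
# Figures quoted in the article, in billions of USD.
# Assumption: both numbers describe the same period.
RUN_RATE_USD_B = 5.0   # reported revenue run rate
TOP_TWO_USD_B = 1.4    # estimated revenue from Cursor and GitHub Copilot


def concentration_share(customer_revenue: float, total_revenue: float) -> float:
    """Return the fraction of total revenue attributable to the given customers."""
    if total_revenue <= 0:
        raise ValueError("total_revenue must be positive")
    return customer_revenue / total_revenue


share = concentration_share(TOP_TWO_USD_B, RUN_RATE_USD_B)
print(f"{share:.0%}")  # prints 28%
```

On those assumptions, the two customers would account for a little over a quarter of run-rate revenue.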

"Our run-rate revenue has grown significantly, even when you exclude these two customers," the Anthropic spokesperson said, pushing back on concerns about customer concentration. The company provided supportive quotes from both Cursor CEO Michael Truell and GitHub Chief Product Officer Mario Rodriguez praising Claude Sonnet 4.5's performance.

The new model achieves significant advances in computer use capabilities, scoring 61.4% on OSWorld, a benchmark that tests AI models on real-world computer tasks. Just four months ago, Claude Sonnet 4 held the lead at 42.2%, demonstrating rapid improvement in AI's ability to interact with software interfaces.

OpenAI's aggressive pricing strategy threatens Anthropic's premium positioning

Anthropic's announcement comes as the company grapples with competitive pressure from GPT-5's aggressive pricing strategy. Early analysis shows Claude Opus 4 costing roughly seven times more per million tokens than GPT-5 for certain tasks, creating immediate pressure on Anthropic's premium positioning.

The pricing disparity signals a fundamental shift in competitive dynamics that could force enterprise procurement teams to reconsider vendor relationships previously built on performance rather than price. Companies managing exponentially growing AI budgets now face comparable capability at a fraction of the cost.

Yet Anthropic is maintaining its pricing strategy with Claude Sonnet 4.5. "Sonnet 4.5's cost remains the same as Sonnet 4," the spokesperson confirmed, keeping prices at $3 per million input tokens and $15 per million output tokens.
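To make the per-token pricing concrete, here is a minimal Python sketch that estimates the cost of a request at the list prices quoted above. The function name `estimate_cost` and the example token counts are hypothetical, and the sketch ignores any discounts, caching, or batch pricing a provider may offer.

```python
# List prices quoted in the article for Claude Sonnet 4.5 (USD per million tokens).
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the quoted list prices."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
cost = estimate_cost(200_000, 50_000)
print(f"${cost:.2f}")  # prints $1.35
```

Because output tokens cost five times as much as input tokens at these prices, generation-heavy workloads such as long coding sessions are dominated by the output side of the bill.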

Claude Sonnet 4.5 delivers 30-hour autonomous work sessions and enhanced security

Beyond performance improvements, Anthropic positions Claude Sonnet 4.5 as its "most aligned frontier model yet," showing significant reductions in concerning behaviors like sycophancy, deception, and power-seeking tendencies. The company has made "considerable progress on defending against prompt injection attacks," a critical security concern for enterprise deployments.

The model is being released under Anthropic's AI Safety Level 3 (ASL-3) protections, which include classifiers designed to detect potentially dangerous inputs and outputs related to chemical, biological, radiological, and nuclear weapons. While these safeguards sometimes flag normal content, Anthropic says it has reduced false positives by a factor of ten since initially describing them.

Perhaps most significantly for developers, Anthropic is releasing the Claude Agent SDK — the same infrastructure that powers its Claude Code product. "We built Claude Code because the tool we needed didn't exist yet," the company said in its announcement. "The Agent SDK gives you the same foundation to build something just as capable for whatever problem you're solving."

International expansion accelerates as $1.5 billion copyright settlement finalizes

The model launch coincides with Anthropic's aggressive international expansion, as the company seeks to diversify beyond its U.S.-concentrated customer base. The startup recently announced plans to triple its international workforce and expand its applied AI team fivefold in 2025, driven by data showing that nearly 80% of Claude usage now comes from outside the United States.

However, the expansion comes amid significant legal costs. Anthropic recently agreed to pay $1.5 billion in a copyright settlement with authors and publishers over allegations the company illegally used their books to train AI models without permission. The settlement, approved by a federal judge last week, requires payments of $3,000 for each publication listed in the case.

Enterprise AI spending doubles as companies prioritize performance over cost

The rapid-fire model releases from both companies reflect the high stakes in enterprise AI adoption. Model API spending has more than doubled to $8.4 billion in just six months, according to Menlo Ventures, as enterprises shift from experimental projects to production deployments.

Customer behavior patterns suggest enterprises consistently prioritize performance over price, upgrading to the newest models within weeks of release regardless of cost. This behavior could work in Anthropic's favor if Claude Sonnet 4.5's performance advantages prove compelling enough to overcome GPT-5's pricing advantage.

However, the dramatic price differential introduced by GPT-5 could overcome typical switching inertia, especially for cost-conscious enterprises facing budget pressures. Industry observers note that model switching costs remain relatively low, even though 66% of enterprises currently upgrade within their existing provider rather than switching vendors.

For enterprises, the intensifying competition means better performance and lower costs as capabilities continuously improve. The rapid pace of model improvements, with new versions launching monthly rather than annually, gives organizations expanding AI capabilities while vendors compete aggressively for their business.

While the corporate rivalry between Anthropic and OpenAI dominates industry headlines, the real economic impact extends far beyond Silicon Valley boardrooms. The development of AI systems capable of sustained coding work for 30 hours represents a fundamental shift in how software gets built, with implications that extend across every industry relying on technology infrastructure.

These advancing capabilities signal broader workplace transformation ahead. As AI systems demonstrate increasing proficiency at complex, sustained intellectual work, the technology industry's competition for coding supremacy foreshadows similar disruptions across fields requiring analytical thinking, problem-solving, and technical expertise.
