Anthropic's new mid-tier model delivers near-flagship agentic performance at less than half the flagship's per-token price.
Anthropic released Claude Sonnet 5 on Tuesday, a mid-tier AI model that matches its flagship Opus 4.8 on knowledge-work benchmarks while costing 60% less per token, intensifying the price war in enterprise AI as the company races toward its initial public offering.
"With Claude Sonnet 5, agents stay on plan, follow our conventions, and ship clean multi-step changes, all at an efficient cost," Sualeh Asif, co-founder of AI-powered code editor Cursor, said.
Sonnet 5 scores 63.2% on SWE-bench Pro, an agentic coding benchmark, up from Sonnet 4.6's 58.1% and within striking distance of Opus 4.8's 69.2%. On GDPval-AA v2, a knowledge-work evaluation, it edges out the flagship model, 1,618 to 1,615. Pricing starts at $2 per million input tokens and $10 per million output tokens through August 31, compared with Opus 4.8's $5 and $25. The model uses an updated tokenizer that can expand input token counts by 1.0 to 1.35 times depending on content, a change Anthropic said is calibrated to be "roughly cost-neutral" during the introductory period.
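To see how the tokenizer change interacts with the headline discount, the arithmetic can be sketched in a few lines of Python. This is an illustration, not Anthropic's own calculation. The list prices and the 1.0 to 1.35 expansion range are the figures quoted above. The function names are hypothetical. The sketch assumes that Opus 4.8 token counts serve as the baseline and that the expansion applies only to input tokens.

```python
# Illustrative arithmetic only. Prices are the per-million-token figures quoted
# in the article. ASSUMPTION: Opus 4.8 token counts are the baseline, and the
# tokenizer expansion affects input tokens only.

SONNET_5_INPUT_USD = 2.00    # per million input tokens, through August 31
SONNET_5_OUTPUT_USD = 10.00  # per million output tokens, through August 31
OPUS_4_8_INPUT_USD = 5.00
OPUS_4_8_OUTPUT_USD = 25.00
EXPANSION_RANGE = (1.0, 1.35)  # tokenizer expansion factor, low and high end


def effective_price(list_price: float, expansion: float) -> float:
    """Price per million baseline tokens once the expansion is applied."""
    return list_price * expansion


def discount(reference_price: float, price: float) -> float:
    """Fractional saving of `price` relative to `reference_price`."""
    return 1.0 - price / reference_price


def input_discount_range() -> tuple[float, float]:
    """Input-side discount vs. Opus 4.8 at the low and high expansion factors."""
    low, high = EXPANSION_RANGE
    return (
        discount(OPUS_4_8_INPUT_USD, effective_price(SONNET_5_INPUT_USD, low)),
        discount(OPUS_4_8_INPUT_USD, effective_price(SONNET_5_INPUT_USD, high)),
    )


if __name__ == "__main__":
    best, worst = input_discount_range()
    print(f"Output discount vs. Opus 4.8: {discount(OPUS_4_8_OUTPUT_USD, SONNET_5_OUTPUT_USD):.0%}")
    print(f"Input discount vs. Opus 4.8: {worst:.0%} to {best:.0%}")
```

Under those assumptions, the input-side saving relative to Opus 4.8 falls between 46% and 60%, depending on how much a given workload's text expands under the new tokenizer, while the output-side saving stays at 60%.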
The launch comes as Anthropic barrels toward an IPO expected later this year, having filed its S-1 confidentially on June 1. The company reported a $47 billion revenue run rate following a $65 billion Series H round in May that valued it at $965 billion. For public-market investors, that makes the Sonnet tier's ability to convert experimental usage into production-scale revenue a critical metric.
Agentic capability becomes the new baseline
Sonnet 5's emphasis on autonomous task execution — planning, tool use, and multi-step workflow completion — reflects a broader shift across the AI industry. OpenAI launched GPT-5.6 Sol in preview last week with similar subagent capabilities, and Google's Gemini 3.5 Flash, released in May, was pitched as an agentic tool requiring minimal human input. The differentiator is no longer which company can build agentic models, but which can deliver them cheaply enough for production deployment at scale.
Early-access partners reported that Sonnet 5 finishes complex tasks where previous models stalled. Daniel Shepard, a senior engineer at Zapier, said the model completed a two-part automation job (updating Salesforce account tiers and sending a launch announcement to enterprise contacts) that "used to stall halfway" with prior versions. On Terminal-Bench 2.1, another coding evaluation, Sonnet 5 scored 80.4% versus Sonnet 4.6's 67.0% and Opus 4.8's 82.7%.
Safety tradeoffs and the IPO narrative
Anthropic said Sonnet 5 shows lower rates of hallucination and sycophancy than its predecessor and is more resistant to prompt-injection attacks. However, on a Firefox 147 exploit-development evaluation created with Mozilla, Sonnet 5 showed a 13.2% partial success rate, up from Sonnet 4.6's 8.8%, though neither model produced a working exploit. On the same evaluation, Opus 4.8 scored 68.8% and the restricted Mythos 5 scored 88.4%. Anthropic launched Sonnet 5 with real-time cyber safeguards enabled by default, mirroring protections on Opus 4.7 and 4.8.
The pricing strategy also serves Anthropic's IPO narrative. The company needs to demonstrate that its cheaper models can drive high-volume, recurring API revenue from thousands of enterprise customers, not just experimental usage from developers. Gil Luria, head of technology research at D.A. Davidson, told CNBC that while Anthropic "appears to have the lead in frontier AI models, much of their current usage is for trials and experimentation and that may not sustain."
Just yesterday, California Governor Gavin Newsom announced a partnership providing Claude to all state agencies at a 50% discount with free workforce training — the kind of durable, institutional adoption that could anchor recurring revenue. Anthropic faces competition from OpenAI, which raised $122 billion in March at an $852 billion valuation and is pursuing its own IPO, as well as Google, Meta, and well-funded Asian AI startups developing similar capabilities.
Anthropic shares are not yet publicly traded. The company's S-1 filing, when made public, will face scrutiny over whether the Sonnet tier — cheaper but high-volume — or the Opus tier — expensive but high-margin — drives the bulk of gross profit. As PitchBook analyst Harrison Rolfes told CNBC, the 2026 IPO window "either becomes the most consequential IPO cycle since the dot-com era or the most expensive lesson in narrative-versus-fundamentals that public markets have ever taught."
This article is for informational purposes only and does not constitute investment advice.