Anthropic launches Claude Sonnet 5 at 60% less than Opus 4.8

Anthropic released Claude Sonnet 5 on Tuesday, a mid-tier model that matches or approaches its flagship Opus 4.8 on key benchmarks while costing 60% less per token at introductory pricing, as agentic capability becomes the new baseline across the foundation model industry.

"It can make plans, use tools like browsers and terminals, and run autonomously at a level that, just a few months ago, required larger and more expensive models," Anthropic said in a blog post.

Sonnet 5 scores 63.2% on SWE-bench Pro for agentic coding, up from Sonnet 4.6's 58.1% and within striking distance of Opus 4.8's 69.2%. On the knowledge-work benchmark GDPval-AA v2, it surpassed the flagship, scoring 1,618 versus Opus 4.8's 1,615. Introductory API pricing is set at $2 per million input tokens and $10 per million output tokens through Aug. 31, after which it rises to $3 and $15. Even at the higher rate, Sonnet 5 remains 40% cheaper than Opus 4.8's $5 and $25.
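The discount arithmetic behind these prices can be checked with a short sketch. The prices are the ones quoted in this article; the dictionary keys and the `discount_vs` helper are illustrative names, not part of any Anthropic API.

```python
# Per-million-token prices in USD, as quoted in the article.
# Key names are hypothetical labels chosen for this sketch.
PRICES = {
    "opus_4_8": {"input": 5.0, "output": 25.0},
    "sonnet_5_intro": {"input": 2.0, "output": 10.0},
    "sonnet_5_standard": {"input": 3.0, "output": 15.0},
}


def discount_vs(baseline: str, model: str) -> dict:
    """Return the percentage saved per token, by direction, relative to baseline."""
    base, other = PRICES[baseline], PRICES[model]
    return {kind: round(100 * (1 - other[kind] / base[kind]), 1) for kind in base}


intro = discount_vs("opus_4_8", "sonnet_5_intro")
standard = discount_vs("opus_4_8", "sonnet_5_standard")
print(intro)     # {'input': 60.0, 'output': 60.0}
print(standard)  # {'input': 40.0, 'output': 40.0}
```

The 60% figure applies only while introductory pricing lasts; once the standard rate takes effect, the gap to Opus 4.8 narrows to 40%.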

The launch comes as Anthropic barrels toward an IPO that will test whether private-market AI valuations can survive public scrutiny. The company reported a revenue run rate above $47 billion after its Series H in May, but gross margins, a figure no outside observer has seen, will determine whether the narrative holds, according to PitchBook analyst Harrison Rolfes.

Agentic reliability closes the gap between pilot and production

Early access partners reported that Sonnet 5 finishes multi-step workflows where previous models stalled. Daniel Shepard, a senior engineer at Zapier, said the model completed a two-part automation job — updating Salesforce account tiers and sending a launch announcement — that "used to stall halfway" with earlier versions. Sualeh Asif, co-founder of Cursor, said that "with Claude Sonnet 5, agents stay on plan, follow our conventions, and ship clean multi-step changes, all at an efficient cost."

These testimonials address the reliability gap that has kept many enterprises from moving agentic AI from pilot programs into production. A model that completes the full workflow changes the economics of automation, particularly at Sonnet 5's price point. Anthropic introduced cost-performance curves showing developers can now adjust effort levels across Sonnet 5 and Opus 4.8 to find the optimal balance of cost and accuracy for specific use cases.

The release mirrors moves by competitors. OpenAI's GPT-5.6 Sol, launched in preview last week, allows users to split work across subagents for longer autonomous tasks. Google's Gemini 3.5 Flash, released in May, was pitched as a shift from conversational chatbot to agentic tool. The pattern suggests that agentic capability is now table stakes at every price tier, with the differentiator shifting to cost efficiency and to reliability without human oversight.

Safety improves but lags behind the most capable models

Sonnet 5 shows lower rates of hallucination and sycophancy than Sonnet 4.6, is better at refusing malicious requests, and is more resistant to prompt injection attacks in agentic contexts, according to Anthropic's internal evaluations. On the company's automated behavioral audit, Sonnet 5 scored lower — meaning safer — overall than its predecessor.

However, it showed somewhat higher rates of misaligned behavior compared with Opus 4.8 and Claude Mythos Preview, Anthropic's tightly restricted cybersecurity model. On a Firefox 147 exploit development evaluation created with Mozilla, neither Sonnet model could develop a working exploit — both scored 0% — though Sonnet 5 showed a slightly higher partial success rate of 13.2% versus Sonnet 4.6's 8.8%. Opus 4.8 scored 68.8% and Mythos 5 scored 88.4%.

Because of these incremental gains, Anthropic launched Sonnet 5 with cyber safeguards enabled by default — real-time systems that detect and block dangerous cybersecurity usage. The safeguards mirror those on Opus 4.7 and 4.8 but are less restrictive than those applied to Fable 5 and Mythos 5.

One technical detail deserves attention: Sonnet 5 uses an updated tokenizer that changes how the model processes text, similar to the change Anthropic introduced with Opus 4.7. The same input can map to roughly 1.0 to 1.35 times as many tokens depending on content type. Anthropic says the introductory pricing is calibrated to make the transition "roughly cost-neutral," but enterprise customers running high-volume workloads will want to benchmark their specific use cases before assuming their bills won't change.
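A rough way to size that exposure is to express the price per million tokens as counted by the previous tokenizer. The sketch below uses the input prices and the 1.0 to 1.35 multiplier range quoted in this article; the function name is hypothetical, and applying a single multiplier to a whole workload is a simplifying assumption, since the real ratio varies by content type.

```python
def effective_price(price_per_mtok: float, token_multiplier: float) -> float:
    """Price per million old-tokenizer tokens once the same text maps to more tokens."""
    return round(price_per_mtok * token_multiplier, 2)


MULTIPLIERS = (1.0, 1.35)                        # range quoted in the article
INPUT_PRICES = {"intro": 2.0, "standard": 3.0}   # USD per million input tokens

ranges = {
    label: tuple(effective_price(price, m) for m in MULTIPLIERS)
    for label, price in INPUT_PRICES.items()
}

for label, (low, high) in ranges.items():
    print(f"{label}: ${low:.2f} to ${high:.2f} per million old-tokenizer input tokens")

# Multiplier at which the introductory rate would cost as much as the
# standard rate does at a multiplier of 1.0.
break_even = INPUT_PRICES["standard"] / INPUT_PRICES["intro"]
print(f"break-even multiplier: {break_even:.2f}")
```

Under these assumptions, the introductory input rate stays below $3 per million old-tokenizer tokens even at the top of the quoted range, while the standard rate can reach an effective $4.05. Measuring the actual multiplier on a sample of production traffic is the only way to replace the assumed range with a real number.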

The IPO narrative and what Sonnet 5 means for investors

Anthropic's financial trajectory has been extraordinary. In February, it raised $30 billion at a $380 billion valuation, when its annualized revenue stood at $14 billion. By late May, it had closed a $65 billion Series H at a $965 billion post-money valuation with a revenue run rate above $47 billion. The company confidentially filed its IPO prospectus with the SEC in early June.

Sonnet 5 serves a dual purpose in this context. For developers, it offers genuine capability improvements at competitive prices. For Anthropic's IPO narrative, it demonstrates the company can deliver a compelling product at a price tier that could drive broad adoption — high-volume, recurring API revenue from thousands of enterprise customers. Gil Luria, head of technology research at D.A. Davidson, told CNBC that while Anthropic "appears to have the lead" in frontier AI models, "much of their current usage is for trials and experimentation and that may not sustain."

The real test for Sonnet 5 is whether it converts experimental usage into production-grade revenue. Enterprise customers experimenting with expensive Opus-class models may find that Sonnet 5 delivers sufficient quality for production workloads at a price point that finance teams can approve at scale. If that conversion happens, it could accelerate the shift from experimentation to deployment that every AI company needs to justify its valuation.

This article is for informational purposes only and does not constitute investment advice.