Claude Sonnet 4.6: Million‑Token Context, Human‑Level Computer Skills, Near‑Opus Performance

Claude Sonnet 4.6, Anthropic’s latest model, introduces a million‑token context window in beta and markedly better coding, computer use and long‑context reasoning. It scores 72.5% on OSWorld, versus 14.9% for Sonnet 3.5, and adds Excel connectors, dynamic search filtering and stronger prompt‑injection resistance, at a price that makes it a strong alternative to Opus for many workloads.

AI Engineering

Claude Sonnet 4.6 launched today as Anthropic’s most powerful Sonnet model to date, with a million‑token context window in beta and upgrades to coding, computer use and long‑context reasoning.

Computer‑Operation Capability Leap

On the OSWorld benchmark, Sonnet 4.6 scores 72.5%, nearly five times the 14.9% that Sonnet 3.5 achieved in October 2024. Early users report near‑human performance on complex spreadsheets and multi‑step web forms, which makes it possible to automate legacy systems that have no API of their own.
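As a quick check of the "nearly five times" figure, the arithmetic can be worked out from the two scores quoted above. The variable names are illustrative only.

```python
# OSWorld scores as quoted in the article (percent of tasks completed).
sonnet_46_score = 72.5
sonnet_35_score = 14.9

# Relative improvement (ratio) and absolute improvement (percentage points).
ratio = sonnet_46_score / sonnet_35_score
gain_points = sonnet_46_score - sonnet_35_score

print(f"{ratio:.2f}x the earlier score, +{gain_points:.1f} percentage points")
```

The ratio comes out a little under 4.9, which is consistent with "nearly five times".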

Practical Value

For Claude in Excel users, the new MCP connector lets the model pull data directly from S&P Global, LSEG, PitchBook and other financial sources without leaving Excel. The web‑search and retrieval tools now support dynamic filtering: the model automatically generates and runs code to pre‑process results before they enter its context, which improves accuracy by 11% and cuts token usage by 24%.
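The idea behind dynamic filtering can be illustrated with a minimal sketch: discard irrelevant results and trim the rest before anything reaches the model's context. This is not Anthropic's implementation. The `SearchResult` type, the `filter_results` function, the keyword rule and the four‑characters‑per‑token estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    # Hypothetical container for one retrieved page.
    url: str
    text: str


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token); not a real tokenizer.
    return max(1, len(text) // 4)


def filter_results(results, keywords, max_chars=200):
    """Keep results that mention any keyword, truncated to max_chars."""
    lowered = [k.lower() for k in keywords]
    kept = []
    for r in results:
        if any(k in r.text.lower() for k in lowered):
            kept.append(SearchResult(r.url, r.text[:max_chars]))
    return kept


# Made-up sample data standing in for raw search output.
results = [
    SearchResult("https://example.com/a", "Quarterly revenue rose on strong cloud demand. " * 10),
    SearchResult("https://example.com/b", "Cookie policy and newsletter signup. " * 10),
    SearchResult("https://example.com/c", "Analysts expect revenue growth to slow next year. " * 10),
]

filtered = filter_results(results, ["revenue"])
tokens_before = sum(estimate_tokens(r.text) for r in results)
tokens_after = sum(estimate_tokens(r.text) for r in filtered)
print(f"kept {len(filtered)} of {len(results)} results; "
      f"~{tokens_before} -> ~{tokens_after} estimated tokens")
```

In the feature described above, the model writes the filtering code itself for each query; the fixed keyword rule here only stands in for that step to show where the token savings would come from.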

Anthropic states the model is more resistant to prompt‑injection attacks, reflecting its security‑first strategy.

In real‑world feedback, developers prefer Sonnet 4.6 to Sonnet 4.5 in 70% of coding tasks and to Opus 4.5 in 59% of cases.

Pricing and Availability

Sonnet 4.6 is now available across all Claude plans, Claude Cowork, Claude Code and the API. That includes the free tier, which gains file creation, connectors, skills and compaction features. Community members note that its strong performance‑to‑price ratio makes it a compelling choice for budget‑conscious developers, while some still prefer Opus 4.6 for the deepest reasoning tasks.

Written by AI Engineering

Focused on cutting‑edge product and technology information and practical experience sharing in the AI field (large models, MLOps/LLMOps, AI application development, AI infrastructure).
