Claude’s 1M‑Token Context Window Launches with No Premium Pricing

Anthropic’s Claude Opus 4.6 and Sonnet 4.6 now offer a one‑million‑token context window at the same per‑token price as short‑context usage. The upgrade brings top‑ranked performance on the MRCR v2 benchmark, a six‑fold jump in media capacity, and fewer AI‑agent memory compressions, with no code changes required on any major cloud platform.


1. No Premium for Long Context: Uniform Pricing

Anthropic has removed all surcharges for ultra‑long context windows. Opus 4.6 costs $5 per million input tokens and $25 per million output tokens; Sonnet 4.6 costs $3 and $15 respectively. Whether a request uses 9,000 tokens or 900,000 tokens, the rate is identical, and this is permanent standard pricing, not a limited‑time promotion.
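
Under uniform pricing, a request’s cost is a simple linear function of token counts. A minimal sketch using the rates quoted above (the model keys and function name are my own, not an official API):

```python
# Per-million-token rates as stated in the article; keys are illustrative labels.
RATES = {
    "opus-4.6":   {"input": 5.00,  "output": 25.00},
    "sonnet-4.6": {"input": 3.00,  "output": 15.00},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request; the rate is flat regardless of context length."""
    rate = RATES[model]
    return (input_tokens / 1_000_000) * rate["input"] + \
           (output_tokens / 1_000_000) * rate["output"]

# A 9,000-token prompt and a 900,000-token prompt are billed at the same per-token rate:
small = estimate_cost("opus-4.6", 9_000, 1_000)    # 0.045 + 0.025 = $0.07
large = estimate_cost("opus-4.6", 900_000, 1_000)  # 4.50  + 0.025 = $4.525
```

The only difference between the two calls is volume: the 100× larger prompt costs 100× more input, never a long‑context multiplier.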

2. Performance Leads: MRCR v2 Benchmark

In the MRCR v2 (multi‑needle retrieval) benchmark, Opus 4.6 scored 78.3 %, the highest among frontier models, by locating eight hidden “needles” within a 1 M‑token haystack. Sonnet 4.5 scored only 18.5 % on the same test, making this more than a four‑fold improvement.
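
The shape of a multi‑needle retrieval test can be illustrated locally: bury several labeled “needle” sentences at random positions in filler text, then score what fraction an answer recovers. This is a toy sketch of the benchmark’s structure, not the real MRCR harness; all names are my own:

```python
import random

def build_haystack(needles: list[str], filler_words: int, seed: int = 0) -> str:
    """Interleave needle sentences at random positions within filler text."""
    rng = random.Random(seed)
    words = ["lorem"] * filler_words
    positions = rng.sample(range(filler_words), len(needles))
    for pos, needle in zip(sorted(positions), needles):
        words[pos] = words[pos] + " " + needle
    return " ".join(words)

def recall(answer: str, needles: list[str]) -> float:
    """Fraction of needles reproduced verbatim -- the quantity the model is scored on."""
    found = sum(1 for n in needles if n in answer)
    return found / len(needles)

# Eight needles, matching the test described in the article.
needles = [f"SECRET-{i}: the code is {i * 7}" for i in range(8)]
haystack = build_haystack(needles, filler_words=10_000)
```

In the real benchmark the haystack is sent as the prompt and `recall` is computed over the model’s answer; a 78.3 % score means roughly six of eight needles recovered on average.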

3. Media Capacity Jumps Six‑Fold

Single‑request support for images and PDFs increased from a maximum of 100 pages per file to 600 pages per file, a six‑fold jump that lets a single request analyze an entire 600‑page contract, financial report, or technical document.
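
In practice a large PDF travels as a base64 document content block alongside the text prompt. A sketch of assembling such a request body (the document‑block layout follows Anthropic’s Messages API; the model name and helper are placeholders of mine):

```python
import base64

def build_pdf_request(pdf_bytes: bytes, question: str, model: str) -> dict:
    """Assemble a Messages API request body carrying one PDF document block."""
    return {
        "model": model,
        "max_tokens": 2048,
        "messages": [{
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        # PDFs must be base64-encoded for JSON transport.
                        "data": base64.standard_b64encode(pdf_bytes).decode("ascii"),
                    },
                },
                {"type": "text", "text": question},
            ],
        }],
    }
```

Where a 600‑page filing previously had to be split into six or more requests, the whole document now fits in one.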

4. Real‑World AI Agent Benefits

Long context is especially valuable for AI Agents. After the upgrade to a 1 M‑token window, users reported a 15 % reduction in context‑compression events, allowing agents to run continuously for hours without forgetting information from the first page.
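
Why a larger window reduces compression events: an agent’s context manager typically tracks token usage and summarizes the oldest part of the transcript when it nears the window limit, and each summarization loses detail. A toy sketch of that mechanism (the class, the 4‑characters‑per‑token heuristic, and the thresholds are my own assumptions, not Anthropic’s implementation):

```python
def rough_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

class AgentContext:
    def __init__(self, window: int, compress_at: float = 0.9):
        self.threshold = int(window * compress_at)
        self.turns: list[str] = []
        self.compressions = 0  # each event is a point where detail is lost

    def append(self, turn: str) -> None:
        self.turns.append(turn)
        if sum(rough_tokens(t) for t in self.turns) > self.threshold:
            # Replace the oldest half of the transcript with a one-line summary --
            # exactly the lossy event the larger window makes rarer.
            half = len(self.turns) // 2
            self.turns = ["[summary of earlier turns]"] + self.turns[half:]
            self.compressions += 1

# The same ~500k-token workload in a 200k window vs. a 1M window:
workload = ["tool call result " + "x" * 4000] * 500
small, big = AgentContext(200_000), AgentContext(1_000_000)
for t in workload:
    small.append(t)
    big.append(t)
```

With the small window the agent compresses repeatedly and forgets early turns; with the 1M window the entire workload fits and nothing is summarized away.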

5. Software Engineering: Load Entire Codebase

Developers can now load an entire code repository within a single context window, enabling search, aggregation of edge‑case logic, and generation of fix suggestions without losing any initialization details that previously vanished during context compression.
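
Loading a repository into one context amounts to concatenating its files into a single tagged prompt and checking that the total fits in the window. A minimal sketch (the `<file>` tagging scheme and the 4‑characters‑per‑token estimate are my own conventions; a real tokenizer should be used for precise budgeting):

```python
from pathlib import Path

CONTEXT_WINDOW = 1_000_000  # tokens

def pack_repo(root: str, extensions: tuple[str, ...] = (".py", ".md")) -> str:
    """Concatenate matching files under `root` into one prompt, tagged by path."""
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in extensions:
            rel = path.relative_to(root)
            parts.append(f'<file path="{rel}">\n{path.read_text(errors="replace")}\n</file>')
    prompt = "\n".join(parts)
    est_tokens = len(prompt) // 4  # rough estimate only
    if est_tokens > CONTEXT_WINDOW:
        raise ValueError(f"repo (~{est_tokens} tokens) exceeds the {CONTEXT_WINDOW}-token window")
    return prompt
```

Because the whole tree is present at once, questions about edge‑case logic or initialization can be answered without the retrieval step dropping the relevant file.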

6. Multi‑Platform Availability

The million‑token context is available on all major platforms with no extra configuration: Claude.com/API, Microsoft Azure Foundry, Google Cloud Vertex AI, and Claude Code (Max/Team/Enterprise plans). No beta request headers are required, and the full window is subject to the same rate limits as standard requests.

7. What It Means

By making long context baseline infrastructure, the upgrade shifts AI usage from pure question‑answering toward collaborative workflows: developers no longer need to trim prompts, perform manual RAG chunking, or worry about agents forgetting earlier steps.

Claude 1M context window launch
MRCR v2 benchmark result
Claude Opus 4.6 performance comparison
Tags: Large Language Model, AI Agent, multimodal, Claude, Anthropic, context window, MRCR benchmark
Written by AI Explorer

Follow along with the blog and advance together in the AI era.