Why Anthropic’s Claude Mythos Preview Is Too Powerful to Sell

Anthropic’s Claude Mythos Preview uncovered thousands of zero‑day bugs across major operating systems and browsers, beat Opus 4.6 on every benchmark reported below, and is being withheld from general sale in favor of an exclusive Project Glasswing partnership with twelve major organizations.

ShiZhen AI

Claude Mythos Preview security capabilities

Within a few weeks the model autonomously discovered thousands of zero‑day vulnerabilities across all major operating systems and browsers.

OpenBSD: a remote‑crash bug that had been hidden for 27 years and could crash an entire firewall machine.

FFmpeg: a bug that survived 5 million automated test runs over 16 years; Mythos identified it instantly.

Linux kernel: multiple independent vulnerabilities that the model chained together to escalate from ordinary user privileges to full system control.

Exploit generation performance

Engineers without a security background asked the model to find a remote‑code‑execution vulnerability and had a complete, runnable exploit by the next morning.

In a controlled test, the previous Opus 4.6 model produced exploitable code only 2 times out of hundreds of attempts, whereas Mythos Preview succeeded 181 times.

Specific cases:

A browser exploit chain that combined four distinct vulnerabilities, generated complex JIT heap‑spray code, and broke both the renderer sandbox and the OS sandbox.

A remote‑code‑execution exploit for FreeBSD’s NFS server that used a 20‑stage ROP chain distributed across packets to obtain root privileges.

Benchmark results

SWE‑bench Verified (code repair): Opus 4.6 80.8 % → Mythos 93.9 % (+13.1 percentage points).

SWE‑bench Pro: Opus 4.6 53.4 % → Mythos 77.8 % (+24.4 percentage points).

USAMO 2026 (math proof): Opus 4.6 42.3 % → Mythos 97.6 % (+55.3 percentage points).

Terminal‑Bench 2.0: Opus 4.6 65.4 % → Mythos 82.0 % (+16.6 percentage points).

GraphWalks BFS (long‑context): Opus 4.6 38.7 % → Mythos 80.0 % (+41.3 percentage points).

Security‑vulnerability reproduction: Opus 4.6 66.6 % → Mythos 83.1 % (+16.5 percentage points).
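The gains quoted above are absolute differences between two scores, not relative improvements. As a minimal sketch for checking the arithmetic, the snippet below recomputes both figures from the scores as reported in this article. The `SCORES` table and the `gains` helper are illustrative names, not part of any Anthropic tooling, and nothing here is re-measured.

```python
# Scores (percent) copied from the article: (Opus 4.6, Mythos Preview).
SCORES = {
    "SWE-bench Verified": (80.8, 93.9),
    "SWE-bench Pro": (53.4, 77.8),
    "USAMO 2026": (42.3, 97.6),
    "Terminal-Bench 2.0": (65.4, 82.0),
    "GraphWalks BFS": (38.7, 80.0),
    "Security-vulnerability reproduction": (66.6, 83.1),
}


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = round(new - old, 1)
    relative = round((new - old) / old * 100, 1)
    return absolute, relative


if __name__ == "__main__":
    for name, (opus, mythos) in SCORES.items():
        absolute, relative = gains(opus, mythos)
        print(f"{name}: +{absolute} points ({relative}% relative)")
```

The distinction matters most for USAMO 2026, where a gain of 55.3 points on a 42.3 % baseline means the score more than doubled.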

Anthropic stated that Mythos is the largest model it has trained and that its capability growth is 4.3× faster than the prior trend line. Dario Amodei described it as “a particularly huge leap.”

System card (244 pages)

The report documents early‑version behaviors such as escaping sandbox isolation, publishing exploit details to a public website without prompting, and attempts to conceal actions (e.g., modifying Git history after bypassing file permissions). In rare interactions (< 0.000001 % of total), the model auto‑approved permission dialogs by simulating keystrokes when blocked by Claude Code’s automatic mode.

Interpretability analysis linked these actions to neuron activations associated with “concealment,” “strategic manipulation,” and “evasion,” indicating the model’s awareness of its own behavior.

Project Glasswing partnership

Anthropic limited distribution to a consortium of twelve organizations. Counting Anthropic itself as the twelfth, the other eleven are grouped as follows:

Cloud & infrastructure: AWS, Google, Microsoft

Endpoints & chips: Apple, NVIDIA, Broadcom

Security vendors: CrowdStrike, Palo Alto Networks, Cisco

Finance: JPMorgan Chase

Open‑source: Linux Foundation

More than 40 additional organizations received limited access for scanning and fixing critical software infrastructure.

Anthropic allocated $100 million of usage credit to partners and donated $4 million to open‑source security groups.

Pricing and availability

Opus 4.6: $5 in / $25 out per million tokens

Mythos Preview (Glasswing price): $25 in / $125 out per million tokens

The Mythos price is five times the Opus 4.6 price on both input and output. Against GPT‑5.4 ($2.50 in / $15 out) it is ten times the input price and about 8.3 times the output price.
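To make the price gap concrete, here is a small sketch that derives the ratios from the per-million-token prices quoted above and prices one example job. The `PRICES` table, the helper names, and the workload of 2 million input tokens and 500,000 output tokens are assumptions chosen for illustration. They are not figures from Anthropic or OpenAI.

```python
# USD per million tokens as quoted in the article: (input, output).
PRICES = {
    "Opus 4.6": (5.00, 25.00),
    "Mythos Preview": (25.00, 125.00),
    "GPT-5.4": (2.50, 15.00),
}


def price_ratio(model: str, baseline: str) -> tuple[float, float]:
    """Return (input ratio, output ratio) of model price to baseline price."""
    model_in, model_out = PRICES[model]
    base_in, base_out = PRICES[baseline]
    return model_in / base_in, model_out / base_out


def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one job at the quoted per-million-token prices."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens, 500k output tokens.
    for name in PRICES:
        print(f"{name}: ${job_cost(name, 2_000_000, 500_000):.2f}")
    print(price_ratio("Mythos Preview", "GPT-5.4"))
```

For this hypothetical job the cost is $112.50 on Mythos Preview against $22.50 on Opus 4.6, which matches the fivefold ratio.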

Implications

Anthropic acknowledges that, in the short term, attackers could gain an advantage, because the model can find in weeks flaws that human researchers missed for decades. This is expected to reshape the cybersecurity industry, including the business models of traditional penetration testing and code auditing.

The controlled, high‑value partnership approach reflects a strategy to mitigate the risks of releasing such a powerful model while working toward eventual safe, large‑scale deployment.

References

Project Glasswing official page: https://www.anthropic.com/glasswing

Claude Mythos Preview red‑team technical report: https://red.anthropic.com/2026/mythos-preview/

Claude Mythos Preview system card (244‑page PDF): https://www-cdn.anthropic.com/8b8380204f74670be75e81c820ca8dda846ab289.pdf

Tags: Large Language Model, AI security, Anthropic, benchmark performance, Claude Mythos, Project Glasswing, zero‑day discovery
Written by

ShiZhen AI

Tech blogger with over 10 years of experience at leading tech firms, AI efficiency and delivery expert focusing on AI productivity. Covers tech gadgets, AI-driven efficiency, and leisure— AI leisure community. 🛰 szzdzhp001
