Claude’s New Fable 5 Model Unleashed: Explosive Performance but Double the Cost
This weekly tech roundup covers Anthropic's flagship Claude Fable 5 and Mythos 5 models, which post record-high benchmark scores at roughly twice the price of Claude Opus 4.8. It also reviews GPT-5.6's internal tests, Meshy's world-first 3D Agent, Kimi Work's local AI assistant, Tencent Cloud's Agent strategy, token-cost cuts for overseas AI teams, Apple's on-device AI breakthrough, and the HRM-Text model that challenges scaling laws.
Anthropic announced on June 10 the release of Claude Fable 5, the first public‑facing Mythos‑level model, alongside Claude Mythos 5, which remains limited to trusted partners via Project Glasswing. Fable 5 delivers record performance across software‑engineering, knowledge‑work, visual‑understanding and scientific‑research benchmarks, scoring 80.3% on SWE‑Bench Pro (vs. Claude Opus 4.8’s 69.2% and GPT‑5.5’s 58.6%) and 29.3% on FrontierCode Diamond (vs. GPT‑5.5’s 5.7%). In a real‑world case, Stripe migrated a 50 million‑line Ruby codebase in one day with Fable 5, a task that previously required two months of engineering effort.
Pricing for both Fable 5 and Mythos 5 is set at $10 per million input tokens and $50 per million output tokens, approximately twice the cost of Claude Opus 4.8. Anthropic offers a 13-day free window (through June 22) for Pro, Max, Team and enterprise subscribers; from June 23 the models will be removed from standard subscriptions and require paid usage.
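To make the quoted rates concrete, the following minimal Python sketch estimates the cost of a single request. The function name `request_cost` and the token counts in the example are illustrative assumptions, not figures published by Anthropic.

```python
# Rates quoted above for Fable 5 and Mythos 5, in USD per million tokens.
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the quoted rates."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 20k output tokens.
    print(f"${request_cost(200_000, 20_000):.2f}")  # prints $3.00
```

Because output tokens cost five times as much as input tokens, workloads that generate long outputs will see most of their bill come from the output side.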
Anthropic also implements multiple safeguards: a built-in classifier automatically downgrades conversations involving high-risk domains (e.g., cybersecurity, biochemistry) to Claude Opus 4.8, and a hidden degradation mechanism reduces answer quality for queries suspected of relating to pre-training or distributed training. AI policy expert Nathan Lambert criticized this undisclosed quality reduction as “wrong”.
OpenAI's next-generation flagship GPT-5.6 is in internal testing, with two checkpoint versions, kepler and kindle, competing for release candidacy. Early tests show kindle's UI generation lagging behind kepler's, suggesting a trade-off between stability and raw capability. A brief appearance of a model named “Levi” raised speculation about its origin.
Meshy unveiled the world’s first 3D Agent, a conversational system that guides users from vague ideas to printable 3D assets. The Agent iteratively proposes concept styles, generates models, checks printability (e.g., detecting 28 holes in a cat model), repairs issues, and exports to slicer software such as Bambu Studio and Creality Print, supporting formats like FBX, OBJ, GLB, STL and 3MF. In game‑asset creation, the Agent maintains style consistency across entire asset sets.
Kimi Work, built on the Kimi K2.6 backbone, offers a local, multi‑Agent workspace for knowledge workers. It clusters up to 300 sub‑Agents for parallel task execution, uses a WebBridge to control browsers (login, click, scrape), and provides a Skill marketplace with pre‑installed data sources (e.g., World Bank, Tonghuashun). The system limits file‑system access and requires permission for privileged actions, addressing security concerns.
Tencent Cloud presented a three‑pillar “Agent” strategy—scene‑connection, engineering‑control, and model‑driving—highlighting products such as MAGIC AI (marketing), Cloud Mall 2.0 (agent‑orchestrated commerce), WAND (media AI), AICC Trusted Cluster (TEE‑based secure inference), Lighthouse (lightweight server), CFS Turbo (high‑performance storage), and Agent Runtime (elastic scheduling with zero‑trust access). The company emphasizes that real‑world AI adoption requires proven scenarios, context and reliable systems.
Overseas AI teams are re-evaluating inference costs: switching from H100 to RTX PRO 6000 (FP4 precision) can yield 1.63× higher throughput per dollar. In a sentiment-assistant case, the per-million-token cost fell from $4.5 to $1.8, a 60% reduction that turns unprofitable projects into profitable ones.
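The figures above can be checked with simple arithmetic. The sketch below, with hypothetical helper names, confirms the 60% reduction. It also shows that a 1.63× gain in throughput per dollar on its own would account for only about a 39% reduction, while the case-study numbers imply a 2.5× gain. The source does not say what accounts for the difference.

```python
def percent_reduction(old_cost: float, new_cost: float) -> float:
    """Percentage by which new_cost is lower than old_cost."""
    return (old_cost - new_cost) / old_cost * 100.0


def cost_after_gain(old_cost: float, throughput_per_dollar_gain: float) -> float:
    """Cost per token if throughput per dollar improves by the given factor."""
    return old_cost / throughput_per_dollar_gain


if __name__ == "__main__":
    old, new = 4.5, 1.8  # USD per million tokens, from the case above

    print(f"Reported reduction: {percent_reduction(old, new):.1f}%")
    print(f"Implied throughput-per-dollar gain: {old / new:.2f}x")

    # What the 1.63x hardware figure alone would give.
    hw_only = cost_after_gain(old, 1.63)
    print(f"Cost from 1.63x alone: ${hw_only:.2f} "
          f"({percent_reduction(old, hw_only):.1f}% reduction)")
```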
Apple's on-device AI race sees a 4B-parameter cognitive model (New Cheng Alpha) matching GPT-5.4 quality on collective-intelligence tasks while running on MacBooks and embodied devices, addressing the “token-cost” bottleneck highlighted at WWDC.
Finally, Sapient Intelligence's HRM-Text model demonstrates a new efficiency path. A 1B-parameter architecture with recursive internal updates (8× per token), MagicNorm normalization and a PrefixLM mask achieves 56.2 on MATH, 84.5 on GSM8K and 81.9 on ARC-Challenge using only $1.5k of compute (16 H100 GPUs for under 2 days) and 400 billion unique tokens. The result challenges traditional scaling laws and has inspired follow-up work such as the GRAM paper from Yoshua Bengio's team.
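For readers unfamiliar with the PrefixLM mask mentioned above, here is a minimal sketch of the general technique in plain Python. It is a generic illustration, not HRM-Text's actual implementation, and the function name `prefix_lm_mask` is an assumption. Tokens in the prefix attend to the whole prefix in both directions, while later tokens attend to the prefix and, causally, to earlier positions.

```python
def prefix_lm_mask(seq_len: int, prefix_len: int) -> list[list[bool]]:
    """Build an attention mask where mask[i][j] is True if position i
    may attend to position j.

    Positions before prefix_len form the prefix and are visible to every
    position; all other positions are visible only causally (j <= i).
    """
    if not 0 <= prefix_len <= seq_len:
        raise ValueError("prefix_len must be between 0 and seq_len")
    return [
        [j < prefix_len or j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Sequence of 4 tokens, the first 2 of which are the prefix.
    for row in prefix_lm_mask(4, 2):
        print("".join("1" if allowed else "0" for allowed in row))
    # 1100
    # 1100
    # 1110
    # 1111
```

With `prefix_len=0` this reduces to an ordinary causal mask, and with `prefix_len=seq_len` it becomes fully bidirectional.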
ZhongAn Tech Team
China's first online insurer. Through tech innovation we make insurance simpler, warmer, and more valuable. Powered by technology, we support 50 billion RMB of policies and serve 600 million users with smart, personalized solutions. ZhongAn's hardcore tech and article shares are here.
