Machine Heart
May 19, 2026 · Artificial Intelligence
100k‑Token Natural‑Language Reasoning Enables a 30B‑A3B Model to Reach Olympiad Gold Level
A 30B‑A3B model, trained with reverse‑perplexity supervised fine‑tuning, two‑stage reinforcement learning, and a multi‑round generate‑verify‑revise inference loop, achieves gold‑medal performance on IMO, USAMO and IPhO contests using over 100 k token natural‑language reasoning without external tools.
30B-A3Bnatural language processingolympiad AI
0 likes · 11 min read
