How 5 Engineers Built an Open‑Source Long‑Horizon Coding Agent in 14 Days that Outperforms Claude Code
A five‑person Xiaomi team built MiMo Code, an open‑source long‑horizon programming agent written in TypeScript, in two weeks. This article covers its three‑dimensional design (compute, memory, evolution), the reported benchmark results that surpass Claude Code, and the installation options.
Not Just Another AI Completion Tool
MiMo Code is positioned as a "long‑horizon programming Agent" built on OpenCode, written in TypeScript, and released under the MIT license. Unlike single‑turn code‑completion tools, it aims to handle complex tasks that require dozens or hundreds of sequential steps while maintaining state and avoiding drift.
Three Dimensions, Three Solutions
Compute: More Accurate Decisions
MiMo Code introduces a Max Mode mechanism that generates five independent candidate solutions for the same problem and uses a judge model to select the best one, consuming 4–5× more tokens but gaining a 10–20% accuracy boost. At the end of a task, a separate verification Agent checks the full dialogue history to confirm that the goal was actually completed, preventing false‑positive finishes. Execution logic is also migrated from natural‑language prompts to executable JavaScript, which allows precise control of branches, synchronization, and exception recovery without relying on the model to parse intent.
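The Max Mode idea described above is a best‑of‑N selection, which can be sketched in a few lines of TypeScript. This is only an illustration under stated assumptions: `Candidate`, `Generate`, `Judge`, and `maxMode` are hypothetical names, not MiMo Code's API, and the model calls are replaced by synchronous toy functions so the sketch runs on its own (real model calls would be asynchronous).

```typescript
// Hypothetical types for illustration; none of these names come from MiMo Code.
type Candidate = { id: number; patch: string };
type Generate = (task: string, seed: number) => Candidate;
type Judge = (task: string, candidate: Candidate) => number; // higher is better

// Best-of-N: generate n candidates independently, score each, keep the top one.
function maxMode(task: string, generate: Generate, judge: Judge, n = 5): Candidate {
  if (n < 1) throw new Error("n must be at least 1");
  // Each generation call sees only the task, never the other candidates.
  const candidates = Array.from({ length: n }, (_, seed) => generate(task, seed));

  let best: Candidate | undefined;
  let bestScore = -Infinity;
  for (const candidate of candidates) {
    const score = judge(task, candidate);
    // On ties, the earliest candidate wins.
    if (best === undefined || score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  if (best === undefined) throw new Error("no candidates were generated");
  return best;
}

// Toy stand-ins for the generator and judge models.
const toyGenerate: Generate = (task, seed) => ({ id: seed, patch: `${task} v${seed}` });
const toyJudge: Judge = (_task, candidate) => (candidate.id === 3 ? 1 : 0.5);

const winner = maxMode("fix bug", toyGenerate, toyJudge);
console.log(winner.id, winner.patch);
```

Five generation calls plus the judging pass make the extra token cost easy to see: the work per task grows roughly with the number of candidates.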
Memory: Finite Window, Unlimited Tasks
Because LLM context windows are limited, MiMo Code employs a Checkpoint + Segmented Cycle architecture. An independent sub‑Agent periodically extracts the current execution state into a checkpoint; when the primary Agent’s window approaches capacity, a new Cycle takes over and restores its context from the checkpoint. Checkpoints are triggered at 20%, 45%, and 70% of window usage to avoid mid‑context “focus loss.” Storage is layered into temporary session memory, cross‑session project memory, user‑level global preferences, and a full SQLite audit log for fault recovery.
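The checkpoint schedule above can be sketched as a small decision function. The 20%, 45%, and 70% thresholds come from the article; everything else is an assumption made for illustration, including the names `nextAction` and `CycleState` and the 90% rollover point (the article only says the window "approaches capacity").

```typescript
// Thresholds stated in the article.
const CHECKPOINT_THRESHOLDS: number[] = [0.2, 0.45, 0.7];
// Assumption: the article gives no number for when a new Cycle takes over.
const ROLLOVER_THRESHOLD = 0.9;

// Hypothetical per-cycle state: which thresholds have already fired.
interface CycleState {
  fired: Set<number>;
}

type Action =
  | { kind: "checkpoint"; threshold: number }
  | { kind: "rollover" }
  | { kind: "continue" };

function nextAction(usedTokens: number, windowTokens: number, state: CycleState): Action {
  if (windowTokens <= 0) throw new Error("windowTokens must be positive");
  const usage = usedTokens / windowTokens;

  // Near capacity: hand over to a fresh Cycle restored from the last checkpoint.
  if (usage >= ROLLOVER_THRESHOLD) return { kind: "rollover" };

  // Otherwise fire the highest threshold crossed that has not fired in this cycle.
  for (const threshold of [...CHECKPOINT_THRESHOLDS].reverse()) {
    if (usage >= threshold && !state.fired.has(threshold)) {
      // Mark this and every lower threshold as fired so they are not replayed.
      for (const t of CHECKPOINT_THRESHOLDS) {
        if (t <= threshold) state.fired.add(t);
      }
      return { kind: "checkpoint", threshold };
    }
  }
  return { kind: "continue" };
}

// Walk one cycle of a 1,000-token toy window.
const state: CycleState = { fired: new Set<number>() };
for (const used of [100, 250, 300, 500, 750, 950]) {
  console.log(used, nextAction(used, 1000, state));
}
```

Tracking fired thresholds per cycle means each checkpoint is written once, and a new Cycle would start with an empty `fired` set.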
Evolution: Getting Smarter Over Time
Project memory is stored as Markdown files that record architectural decisions, user rules, and verified facts, and these files can be edited directly. Two automated processes run on different schedules: Dream (weekly) merges, deduplicates, and compresses records from multiple sessions into project memory; Distill (monthly) extracts frequently occurring work patterns into reusable skills, CLI scripts, and operational flows. In theory, the longer the agent is used, the better it understands the project and the more efficiently it executes.
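To make the Dream step more concrete, here is a minimal sketch of its merge and deduplicate stages only. The record shape, the `dream` function, and the Markdown section titles are all hypothetical; the article does not describe the file layout, and the real process presumably uses a model for compression, which this sketch does not attempt.

```typescript
// Hypothetical record shape; the three kinds mirror the article's
// "architectural decisions, user rules, and verified facts".
type Kind = "decision" | "rule" | "fact";

interface MemoryRecord {
  kind: Kind;
  text: string;
}

const ORDER: Kind[] = ["decision", "rule", "fact"];
const TITLES: Record<Kind, string> = {
  decision: "Architectural decisions",
  rule: "User rules",
  fact: "Verified facts",
};

// Merge records from several sessions, drop near-verbatim duplicates,
// and render the result as editable Markdown.
function dream(sessions: MemoryRecord[][]): string {
  const seen = new Set<string>();
  const byKind: Record<Kind, string[]> = { decision: [], rule: [], fact: [] };

  for (const session of sessions) {
    for (const record of session) {
      const text = record.text.trim();
      // Normalize case and whitespace so trivial variants count as duplicates.
      const key = `${record.kind}:${text.toLowerCase().replace(/\s+/g, " ")}`;
      if (seen.has(key)) continue;
      seen.add(key);
      byKind[record.kind].push(text);
    }
  }

  return ORDER.filter((kind) => byKind[kind].length > 0)
    .map((kind) => `## ${TITLES[kind]}\n` + byKind[kind].map((t) => `- ${t}`).join("\n"))
    .join("\n\n");
}

const merged = dream([
  [{ kind: "rule", text: "Use pnpm" }],
  [
    { kind: "rule", text: "use  pnpm " },
    { kind: "fact", text: "CI runs on Node 20" },
  ],
]);
console.log(merged);
```

The first spelling seen is the one kept, so a record that a user has edited by hand in an earlier session is not overwritten by a later variant.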
Benchmark Results
MiMo Code was evaluated on three mainstream benchmarks:
SWE‑bench Verified: MiMo Code + MiMo‑V2.5‑Pro 82% vs Claude Code + Claude Sonnet 4.6 79%
SWE‑bench Pro: 62% vs 55%
Terminal Bench 2: 73% vs 69%
An A/B blind test involving 576 developers, 474 real repositories, and 1,213 task pairs showed comparable performance for tasks under 200 steps, but for tasks exceeding 200 steps MiMo Code achieved a win rate above 65%, aligning with its “long‑task” positioning.
Installation
MiMo Code can be installed via either of the following methods:
# One‑click install via curl
curl -fsSL https://mimo.xiaomi.com/install | bash
# Or install globally with npm
npm install -g @mimo-ai/cli
First‑time users get a one‑month free trial. MiMo Code supports importing existing Claude Code configurations and allows custom model integration with a context window of up to one million tokens.
