Xiaomi Opens MiMo‑V2.5 and Gives 100 Trillion Free Tokens – A Must‑Grab
Xiaomi has open‑sourced its MiMo‑V2.5 series, including a 1.02 T‑parameter Pro model, and is giving developers up to 100 trillion free tokens for 30 days; the article details the models' token‑efficiency benchmarks, a macOS‑like demo, MIT‑license benefits, and step‑by‑step usage instructions.
MiMo‑V2.5 series
Two flagship models are released:
MiMo‑V2.5‑Pro: 1.02 T total parameters, 420 B active MoE parameters, 1 M‑token context.
MiMo‑V2.5: 310 B total parameters, 15 B active MoE parameters, 1 M‑token context.
Both use a Mixture‑of‑Experts (MoE) sparse architecture that activates only a subset of parameters during inference, reducing compute cost while preserving capability.
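To make the "activates only a subset of parameters" idea concrete, here is a toy sketch of top‑k expert gating, the routing mechanism typical of MoE layers. The expert count and logits are invented for illustration; the real routers in MiMo‑V2.5 are not publicly detailed.

```python
import math

def top_k_gating(logits, k=2):
    """Toy MoE router: softmax over expert logits, keep only the
    top-k experts, renormalize their weights. All other experts stay
    idle, which is why only a fraction of parameters is active per token."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

# 8 hypothetical experts; only 2 are activated for this token
weights = top_k_gating([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
```

With 310 B total but 15 B active parameters, the same principle applies at scale: each token is routed through a small slice of the network.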
Token‑efficiency benchmark (ClawEval Agent task)
MiMo‑V2.5‑Pro achieves the target result with approximately 70 k tokens, while the competing models below all require 120–180 k tokens, meaning MiMo consumes roughly 40–60 % fewer tokens for the same task:
Claude Opus 4.6 – 120–180 k tokens
GPT‑5.4 – 120–180 k tokens
Gemini 3.1 Pro – 120–180 k tokens
The token reduction translates into roughly half the expense for large‑scale Agent workloads.
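The "40–60 % fewer tokens" figure follows directly from the benchmark numbers; a quick back‑of‑envelope check:

```python
def savings(mimo_tokens, competitor_tokens):
    """Fraction of tokens saved relative to a competitor's usage."""
    return 1 - mimo_tokens / competitor_tokens

low = savings(70_000, 120_000)   # vs the low end of the 120-180k range
high = savings(70_000, 180_000)  # vs the high end
```

`low` comes out near 0.42 and `high` near 0.61, matching the roughly 40–60 % range claimed for the ClawEval task.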
Long‑running Agent stability demo
MiMo‑V2.5‑Pro generated a fully functional macOS‑like desktop in 4 hours without human intervention. The generated system includes:
Boot animation, user login, window management
Dock scaling, Spotlight search, dark/light theme, Launchpad
54 native applications, including a working Safari browser
Calculator, calendar, maps, notes, 3‑D function grapher
Technical stack: React 18, TypeScript, Zustand, Tailwind CSS, Vite; 68 components; a window‑management state machine supporting drag, resize, z‑index layering, and macOS‑style traffic‑light (close/minimize/zoom) button logic.
The run also remained stable across 1 000 tool calls with no memory loss or drift.
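The z‑index layering mentioned above is the core bookkeeping any desktop‑style window manager needs. The demo itself is TypeScript; this is just a minimal language‑agnostic sketch (window names and API are invented) of the idea that focusing a window raises it above all others:

```python
class WindowManager:
    """Minimal sketch of z-index layering: each window gets a
    monotonically increasing z value; focusing a window raises it
    to the top of the stack."""

    def __init__(self):
        self.z = {}      # window id -> z-index
        self.next_z = 1

    def open(self, win_id):
        self.z[win_id] = self.next_z
        self.next_z += 1

    def focus(self, win_id):
        # Re-stamp the focused window with the highest z so far
        self.z[win_id] = self.next_z
        self.next_z += 1

    def stacking_order(self):
        # Bottom-to-top order of windows
        return sorted(self.z, key=self.z.get)

wm = WindowManager()
for w in ["finder", "safari", "notes"]:
    wm.open(w)
wm.focus("finder")   # "finder" now sits on top of the stack
```

A real implementation (like the demo's Zustand state machine) layers drag and resize state on top of the same ordering logic.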
SysY compiler benchmark (Peking University)
Completion time: 4.3 hours
Tool invocations: 672
Score: 233/233 (full marks)
Execution remained uninterrupted, without drift or forgetting previous context.
This long‑duration Agent performance is noted as rare among open‑source models.
Comparison with other models
Agent/Code capability: MiMo‑V2.5‑Pro approaches Claude Opus 4.6; Claude Opus 4.7 is identified as the strongest verified model; Kimi K2.6 leads the global code leaderboard.
Token efficiency: MiMo‑V2.5‑Pro is described as significantly ahead of the competitors.
Context length: 1 M tokens (same as Claude Opus 4.7; Kimi does not disclose).
License: MiMo‑V2.5‑Pro is MIT‑licensed (open source); Claude Opus is closed; Kimi uses Apache 2.0.
Domestic availability: MiMo‑V2.5‑Pro is available in China; Claude Opus is not.
Consumer‑grade GPU deployment: supported for MiMo‑V2.5‑Pro and Kimi; not supported for Claude Opus.
API price (input): 7 CNY per million tokens for MiMo‑V2.5‑Pro versus ≈ 34 CNY per million tokens for Claude Opus.
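Combining the input prices above with the ClawEval token counts gives a rough per‑task cost comparison. This is an illustration only: it counts input tokens alone and uses the midpoint of the competitors' 120–180 k range.

```python
def input_cost_cny(tokens, price_per_million):
    """Input-token cost in CNY at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million

mimo = input_cost_cny(70_000, 7)      # MiMo-V2.5-Pro on the ClawEval task
claude = input_cost_cny(150_000, 34)  # Claude Opus, 120-180k midpoint
```

`mimo` works out to about 0.49 CNY per task versus about 5.1 CNY for Claude Opus, which is where the "roughly half the expense" framing understates the combined price‑plus‑efficiency gap.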
Free 100 trillion token plan
The plan provides a total allocation of 100 trillion tokens over 30 days, distributed on an application basis. The highest tier (Max Plan) grants 1.6 billion credits, equivalent to about 659 CNY. Application steps:
Visit 100t.xiaomimimo.com and click “Apply”.
Complete the form with detailed project information.
Wait approximately three business days for evaluation.
Receive an email, log in to the MiMo API platform, and claim the credits.
Credits become available within 24 hours.
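As a back‑of‑envelope check on the headline number: valuing the whole 100 trillion‑token pool at the published input price of 7 CNY per million tokens gives its face value. This is arithmetic on the article's own figures, not an official valuation.

```python
# Implied face value of the give-away at the published input price
total_tokens = 100_000_000_000_000   # 100 trillion
price_per_million_cny = 7
value_cny = total_tokens / 1_000_000 * price_per_million_cny
```

That is 700 million CNY of tokens at list price, spread across applicants over 30 days.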
MIT license implications
The MIT license permits commercial use, fine‑tuning, and redistribution, and does not require source disclosure; its only obligation is retaining the copyright and license notice, which removes the usual commercial restrictions.
Configuration in Claude Code (cc‑switch) and OpenClaw
Claude Code (cc‑switch) configuration:
Open the cc‑switch tool.
Select vendor “Xiaomi MiMo”.
Enter the API key obtained from platform.xiaomimimo.com.
Set model name to mimo-v2.5-pro.
Enable and use the model within Claude Code.
OpenClaw configuration:
Navigate to Settings → Model Configuration.
Add a custom vendor with Base URL https://api.xiaomimimo.com/v1.
Enter the API key.
Set model name to mimo-v2.5-pro.
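After configuring a client, you can also call the endpoint directly. This sketch assumes the API follows the OpenAI chat‑completions wire format (the `/v1` base URL suggests this, but check the official docs); it only builds the request body rather than sending it, since a live call needs a valid key.

```python
import json

# Assumption: OpenAI-compatible chat-completions endpoint
BASE_URL = "https://api.xiaomimimo.com/v1"

def build_chat_request(prompt, model="mimo-v2.5-pro"):
    """Construct the JSON body for a POST to {BASE_URL}/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

body = json.dumps(build_chat_request("Hello, MiMo"))
# To actually send it, POST `body` to f"{BASE_URL}/chat/completions"
# with the header "Authorization: Bearer <your API key>".
```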
Observed usage characteristics
Output quality described as “human‑like” and comparable to Claude Opus 4.6.
Low API latency, at least while the early‑adopter load remains light.
Stable performance with 1 M token context; no memory drift over 1 000 tool calls.
Pricing rule changes (effective with the token plan)
Context billing unified; the previous 256 K/1 M dual‑rate distinction removed.
Pro model multiplier reduced from 4× to 2×.
Standard model multiplier reduced from 2× to 1×.
Night‑time discount (00:00–08:00 Beijing time) of 20 % applied to all models.
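The new rules above can be sketched as a small billing function. The 2×/1× multipliers and the 20 % night discount come straight from the article; how the two interact in actual invoices is an assumption, so treat this as an estimate, not the official accounting.

```python
def billed_tokens(raw_tokens, model="pro", beijing_hour=12):
    """Estimate billed tokens under the unified rules: per-model
    multiplier (Pro 2x, Standard 1x) plus a 20% discount between
    00:00 and 08:00 Beijing time. The single-rate context billing
    means no separate 256K/1M tiers."""
    multiplier = 2 if model == "pro" else 1
    billed = raw_tokens * multiplier
    if 0 <= beijing_hour < 8:
        billed *= 0.8  # night-time discount
    return billed

daytime = billed_tokens(1_000_000, model="pro", beijing_hour=14)
night = billed_tokens(1_000_000, model="pro", beijing_hour=3)
```

A daytime Pro call on 1 M raw tokens bills 2 M; the same call at 03:00 Beijing time bills 1.6 M.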
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Lao Guo's Learning Space
AI learning, discussion, and hands‑on practice with self‑reflection
