MiMo V2.5 API Gets Permanent Price Cut and Token Plan Overhaul – Incentive Program Ends

MiMo announces a permanent price reduction of up to 99% for its V2.5 API, a Token Plan billing change that allows 5‑8× more usage at the same price, a full reset of all Token Plan quotas, and the conclusion of its Hundred‑Trillion Token Creator Incentive Program, all effective May 27, 2026.

Xiaomi Tech

May 26, 2026

After months of running MiMo Orbit and the Hundred‑Trillion Token Creator Incentive Program so that users could try MiMo on real problems, the team is now making a permanent overhaul of its model pricing.

Key announcements

MiMo‑V2.5 series API prices are permanently reduced by up to 99% and no longer vary by context window length.

Token Plan billing optimized, allowing 5‑8× higher usage without additional cost.

The Hundred‑Trillion Token Creator Incentive Program has successfully concluded.

Credit quotas for all active Token Plan users will be fully reset.

Effective time: 00:00 Beijing time on May 27, 2026 (global rollout).

API permanent price cut

The new pricing removes context‑window length tiers and can lower costs by as much as 99% compared with the original rates.
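To make the comparison concrete, here is a minimal sketch of how a context-tiered bill compares with a flat one. Every rate and tier boundary below is a placeholder chosen for illustration, not a published MiMo price, and the assumption that the old scheme billed a whole request at the rate of its context tier is ours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_context_tokens: int   # upper bound of the context-length tier (hypothetical)
    price_per_million: float  # price per 1M input tokens in this tier (hypothetical)


def tiered_cost(input_tokens: int, tiers: list[Tier]) -> float:
    """Bill the whole request at the rate of the tier its length falls into."""
    for tier in sorted(tiers, key=lambda t: t.max_context_tokens):
        if input_tokens <= tier.max_context_tokens:
            return input_tokens / 1_000_000 * tier.price_per_million
    raise ValueError("request exceeds the largest context tier")


def flat_cost(input_tokens: int, price_per_million: float) -> float:
    """Bill every request at one rate, regardless of context length."""
    return input_tokens / 1_000_000 * price_per_million


def reduction_percent(old: float, new: float) -> float:
    """Percentage by which `new` is cheaper than `old`."""
    return (old - new) / old * 100


# Placeholder numbers only: two old tiers and one new flat rate.
OLD_TIERS = [Tier(128_000, 1.00), Tier(1_000_000, 2.00)]
NEW_FLAT_RATE = 0.02

old = tiered_cost(200_000, OLD_TIERS)
new = flat_cost(200_000, NEW_FLAT_RATE)
print(f"old={old:.4f} new={new:.4f} reduction={reduction_percent(old, new):.1f}%")
```

With these placeholder rates, the largest saving falls on long-context requests, since they previously sat in the most expensive tier.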

Token Plan billing optimization

The revised plan follows a "more usage, same price" principle, boosting usable token volume by 5‑8×. For example, in Agent or Code scenarios the token allowance is significantly increased (exact numbers are provided in the original plan details).
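As a quick sanity check on what "more usage, same price" means per token, the arithmetic can be sketched as follows. It assumes only that the subscription fee stays fixed while the usable token volume is multiplied; the function name is ours.

```python
def unit_cost_drop_percent(usage_multiplier: float) -> float:
    """Percent drop in cost per token when the same fee buys
    `usage_multiplier` times as many tokens."""
    if usage_multiplier <= 0:
        raise ValueError("usage_multiplier must be positive")
    return (1 - 1 / usage_multiplier) * 100


for m in (5, 8):
    print(f"{m}x usage -> {unit_cost_drop_percent(m):.1f}% lower cost per token")
# 5x usage -> 80.0% lower cost per token
# 8x usage -> 87.5% lower cost per token
```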

The billing rules have been simplified to be clearer and more intuitive, delivering a "what you see is what you get" experience.

Incentive program wrap‑up

Since its launch on April 28, the Hundred‑Trillion Token Creator Incentive Program has attracted participants worldwide. By May 26, all 100 trillion tokens had been distributed ahead of schedule, bringing the program to a successful close. Apache Software Foundation members retain a long‑term benefit that is unaffected by this closure.

Token Plan quota reset

All subscribers with a valid Token Plan, including incentive program participants and Apache benefit users, will have their credit quotas reset at 00:00 Beijing time on May 27 and will be billed under the new rules from then on. Paid users whose plans have expired will also receive a surprise gift, to be announced within the next week.

Inference technology optimization

The price adjustment is underpinned by continuous inference system improvements from the Xiaomi technology team. Leveraging SGLang HiCache with full Sliding Window Attention support, KV‑cache data movement across GPU memory, CPU memory, and SSD is reduced to roughly 1/7 of its previous volume, while the number of cacheable tokens increases by about 5×, markedly improving cache hit rates and inference efficiency.
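One general reason sliding-window attention (SWA) support matters for a KV cache is that an SWA layer only needs the keys and values of the most recent window of tokens, so there is less to store and move per sequence. The sketch below shows that arithmetic for a hybrid model. The layer counts, window size, and head dimensions are hypothetical, not MiMo's actual configuration, and the result is not meant to reproduce the 1/7 or 5× figures quoted above.

```python
def kv_cache_bytes(
    seq_len: int,
    full_layers: int,
    swa_layers: int,
    window: int,
    kv_heads: int,
    head_dim: int,
    bytes_per_elem: int = 2,  # e.g. fp16/bf16
) -> int:
    """KV-cache size for one sequence in a model that mixes
    full-attention layers with sliding-window layers."""
    # One K vector and one V vector per token, per layer.
    per_token_per_layer = 2 * kv_heads * head_dim * bytes_per_elem
    full = full_layers * seq_len * per_token_per_layer
    # A sliding-window layer keeps at most `window` tokens.
    swa = swa_layers * min(seq_len, window) * per_token_per_layer
    return full + swa


# Hypothetical configuration, for illustration only.
cfg = dict(window=1024, kv_heads=8, head_dim=128)

# Cache that treats every layer as full attention.
naive = kv_cache_bytes(32_768, full_layers=48, swa_layers=0, **cfg)
# Cache that keeps only the window for the 40 sliding-window layers.
swa_aware = kv_cache_bytes(32_768, full_layers=8, swa_layers=40, **cfg)

print(f"naive:     {naive / 2**30:.2f} GiB")
print(f"SWA-aware: {swa_aware / 2**30:.2f} GiB")
print(f"ratio:     {naive / swa_aware:.2f}x")
```

A smaller footprint per sequence means fewer bytes to shuttle between GPU memory, CPU memory, and SSD, and more tokens fitting in the same cache budget. That is the direction of the gains described above, though the exact figures depend on the real model layout.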

Additional gains come from an optimized expert‑parallel scheme and input‑length bucketing strategy, which raise cluster input throughput and lower per‑token service cost without compromising service quality. A detailed blog on these optimizations will be published soon.
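The source does not describe how MiMo's bucketing works, so the following is only a generic sketch of input-length bucketing: requests are grouped by prompt length so that similarly sized inputs are processed together. The bucket edges are hypothetical, and padding waste is used here as one common motivation for the technique, not as a claim about what the MiMo system optimizes.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical inclusive upper bounds for each length bucket.
BUCKET_EDGES = [512, 2048, 8192, 32768]


def bucket_for(length: int, edges: list[int] = BUCKET_EDGES) -> int:
    """Return the smallest bucket edge that is >= length."""
    i = bisect_left(edges, length)
    if i == len(edges):
        raise ValueError(f"input of {length} tokens exceeds the largest bucket")
    return edges[i]


def group_by_bucket(lengths: list[int]) -> dict[int, list[int]]:
    """Group request lengths by the bucket they fall into."""
    buckets: dict[int, list[int]] = defaultdict(list)
    for n in lengths:
        buckets[bucket_for(n)].append(n)
    return dict(buckets)


def padding_waste(batch: list[int]) -> float:
    """Fraction of slots that are padding if the batch is padded
    to the length of its longest request."""
    padded_slots = max(batch) * len(batch)
    return 1 - sum(batch) / padded_slots


requests = [100, 300, 1500, 7000, 400, 30000]
print(f"one mixed batch: {padding_waste(requests):.0%} padding")
for edge, batch in sorted(group_by_bucket(requests).items()):
    print(f"bucket <= {edge}: {batch}, {padding_waste(batch):.0%} padding")
```

In this toy example the single mixed batch is dominated by padding up to the 30,000-token request, while each bucket wastes far less because it holds only requests of similar length.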

Conclusion

The ultimate value of technology lies in its breadth of adoption. By delivering low‑cost, high‑performance model services, MiMo aims to drive widespread, sustained, and scalable inference demand, advancing the entire AI infrastructure ecosystem.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
