China’s AI Models Enter the Token Era with 4.69 Trillion Weekly Tokens

In March 2026, Chinese large‑model AI APIs processed 4.69 trillion tokens per week, overtaking the United States. The drivers are cheap electricity, aggressive technical optimization, and self‑evolving models such as MiniMax M2.7, which together lower AI adoption costs and reshape the global AI landscape.

Architect's Journey

According to OpenRouter, as of March 15, 2026, Chinese large‑model AI APIs reached a weekly volume of 4.69 trillion tokens, surpassing the United States for the second consecutive week and marking the start of a "token era" for China’s AI industry.
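To put the weekly figure in perspective, the short sketch below converts it into average per‑day and per‑second rates. It is plain arithmetic on the 4.69 trillion figure quoted above; the function name is illustrative, and it assumes traffic is spread evenly across the week, which real traffic is not.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def average_rates(weekly_tokens: float) -> dict:
    """Convert a weekly token volume into average per-day and per-second rates."""
    return {
        "per_day": weekly_tokens / 7,
        "per_second": weekly_tokens / SECONDS_PER_WEEK,
    }


# 4.69 trillion tokens per week, the OpenRouter figure cited in the article.
rates = average_rates(4.69e12)
print(f"{rates['per_day']:.3e} tokens/day")
print(f"{rates['per_second']:.3e} tokens/second")
```

On average, that works out to roughly 670 billion tokens per day, or about 7.75 million tokens per second.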

What is the "token era"?

Tokens (词元) are the basic units of measurement for AI model computation; every user interaction or content generation consumes tokens.

A vivid analogy: if tokens are the "electricity" of the AI age, China has become the world’s largest "power generator".

At the China Development Forum on March 24, Liu Lihong disclosed that March’s token consumption exceeded 140 quadrillion, a 1,000‑fold increase over the entire year of 2024 and equivalent to the text of 1 trillion books or 2.5 billion Chinese online novels.

Why is China leading?

1. Extreme cost‑performance advantage

Two core factors drive the low token price: abundant, cheap electricity and continuous breakthroughs in chips, model architecture, and system optimization. For example, the MiniMax M2.5 model costs only a few tenths as much as comparable overseas models while delivering similar performance.

2. Model self‑evolution capability

The newly released MiniMax M2.7 model showcases an "Agent Harness" system that lets the model participate in its own training and optimization. Reported benefits include:

Handles roughly 30‑50% of the workload in certain R&D scenarios.

Achieves about 30% improvement on internal evaluation sets.

Reaches 56.22% correctness on the SWE‑Pro programming benchmark, matching GPT‑5.3‑Codex.

Scores 55.6% on the VIBE‑Pro code‑generation benchmark, nearly equal to Opus 4.6.

3. Broad explosion of application scenarios

Chinese AI models are penetrating many domains:

Office tasks: significant upgrades to Word, Excel, PPT and other complex editing capabilities.

Programming development: end‑to‑end project delivery, log analysis, and bug investigation.

Content creation: high instruction‑following rates in multi‑turn tasks.

Implications

For developers, token‑based pricing turns AI capability into a utility‑like service that can simply be switched on and used, removing concerns about expensive model calls and encouraging deeper product integration.

Enterprises can now bypass heavy upfront investments in compute and model training, opting instead for on‑demand API calls billed per token, which dramatically lowers the barrier to AI adoption.
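As a rough illustration of per‑token billing, the sketch below estimates the cost of an API workload from its input and output token counts. The prices and the workload are hypothetical placeholders, not figures from any provider mentioned in this article. Charging input and output tokens at separate per‑million rates is a common pattern, but the actual structure depends on the provider’s price sheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrice:
    """Hypothetical price sheet, in currency units per one million tokens."""
    input_per_million: float
    output_per_million: float


def estimate_cost(price: TokenPrice, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated charge for the given input and output token counts."""
    return (
        input_tokens / 1_000_000 * price.input_per_million
        + output_tokens / 1_000_000 * price.output_per_million
    )


# Placeholder prices, chosen only to make the arithmetic easy to follow.
example_price = TokenPrice(input_per_million=0.30, output_per_million=1.20)

# Hypothetical workload: 10,000 requests, each with 2,000 input and 500 output tokens.
requests = 10_000
total_cost = estimate_cost(example_price, requests * 2_000, requests * 500)
print(f"Estimated cost: {total_cost:.2f}")
```

Because the cost scales linearly with usage, a team can budget from expected traffic rather than from hardware purchases, which is the shift this section describes.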

Globally, the surge in Chinese token production reshapes AI power dynamics. Whoever controls token supply and circulation gains strategic influence in the AI era, much as control of oil conferred influence in the industrial age.

Future outlook

Continued iterations of domestic models such as MiniMax M2.7 and Alibaba’s Qianwen, together with accelerating AI hardware deployments (e.g., Alibaba’s upcoming AI glasses), are shifting China from a "follower" to a "leader". 2026 may be remembered as the "China AI Year".

Written by

Architect's Journey

E‑commerce, SaaS, AI architect; DDD enthusiast; SKILL enthusiast
