Alibaba Unveils Qwen3‑Max‑Preview: First Trillion‑Parameter LLM and What It Means
Alibaba has introduced Qwen3‑Max‑Preview, a trillion‑parameter LLM that improves multilingual understanding, complex instruction following, and tool use while reducing hallucinations. It posts competitive benchmark scores and supports a 262K‑token context window, though its tiered, token‑based pricing may limit broader adoption.
On September 6, Alibaba released the Qwen3‑Max‑Preview model on the Qwen official website and OpenRouter, describing it as the most powerful language model in the Qwen series.
The model is now accessible through Qwen Chat, Alibaba Cloud API, OpenRouter, and Hugging Face’s AnyCoder tool.
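Since OpenRouter exposes an OpenAI‑compatible chat completions endpoint, a request to the model can be sketched as below. The model identifier `qwen/qwen3-max` is an assumption for illustration and may differ from the actual listing.

```python
import json

# Build an OpenAI-compatible chat completions payload for OpenRouter.
# NOTE: the model id "qwen/qwen3-max" is assumed, not confirmed.
payload = {
    "model": "qwen/qwen3-max",
    "messages": [
        {"role": "user", "content": "Summarize the key terms of this contract clause."}
    ],
}

body = json.dumps(payload)
# POST `body` to https://openrouter.ai/api/v1/chat/completions
# with the header "Authorization: Bearer <OPENROUTER_API_KEY>".
```

The same payload shape works against the Alibaba Cloud API's OpenAI‑compatible mode, with only the base URL and model id changed.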
Compared with the 2.5 series, Qwen3‑Max‑Preview shows significant improvements in Chinese and English comprehension, complex instruction following, and tool invocation, while dramatically reducing knowledge hallucinations, making the model smarter and more reliable.
The model boasts a parameter count of 1 trillion and leads the Arena‑Hard v2 benchmark leaderboard; it also achieved an 80.6 score on the AIME25 benchmark, demonstrating strong logical reasoning and promising new experiences for complex workflows and high‑quality open‑ended dialogues.
Alibaba Cloud adopts a token‑based tiered pricing scheme:
0‑32K tokens: $0.861 per million input tokens, $3.441 per million output tokens.
32K‑128K tokens: $1.434 per million input, $5.735 per million output.
128K‑252K tokens: $2.151 per million input, $8.602 per million output.
This pricing is cost‑effective for small tasks but becomes expensive for long‑running workloads.
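The tiers above can be folded into a rough cost estimator. The rates come from the article; the assumption that a request is priced by which tier its input length falls into is mine, so treat this as a ballpark sketch rather than Alibaba Cloud's actual billing logic.

```python
# Tiered pricing quoted above: (max input tokens, $/M input, $/M output).
TIERS = [
    (32_000, 0.861, 3.441),
    (128_000, 1.434, 5.735),
    (252_000, 2.151, 8.602),
]

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under the quoted tiers.

    Assumes the tier is selected by input length alone.
    """
    for limit, in_rate, out_rate in TIERS:
        if input_tokens <= limit:
            return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    raise ValueError("input exceeds the 252K-token pricing tiers")
```

For example, a 10K‑token prompt with a 1K‑token reply lands in the cheapest tier at about $0.012, while pushing 200K tokens of context into a request costs over $0.43 before any output, which illustrates why long‑running workloads get expensive.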
Closed‑Source May Affect Adoption
Unlike earlier Qwen versions, this model is not open‑source; access is limited to API and partner platforms. This commercial focus could slow broader adoption in the research and open‑source communities.
Key Takeaways
First trillion‑parameter Qwen model, marking Alibaba’s largest and most advanced LLM to date.
Supports a 262K‑token context length with context caching, surpassing many commercial models in long‑document and session handling.
Competitive benchmark performance, outperforming Qwen3‑235B and rivaling Claude Opus 4, Kimi K2, and DeepSeek‑V3.1.
Despite not being marketed as a reasoning model, early results show strong structured reasoning on complex tasks.
Closed‑source, tiered‑pricing model offers affordable small‑task usage but high costs at larger contexts, limiting accessibility.
Conclusion
Qwen3‑Max‑Preview sets a new scale benchmark for commercial LLMs with its trillion‑parameter design, 262K context length, and strong benchmark results, showcasing Alibaba’s technical depth. However, its closed‑source nature and steep tiered pricing raise questions about how widely it will be adopted.
Source: 21CTO (21CTO.com)