AI Algorithm Path
AI Algorithm Path
Sep 14, 2025 · Artificial Intelligence

Qwen3-Next: Achieving Unmatched Training and Inference Cost‑Effectiveness

Alibaba's Qwen team unveils Qwen3-Next, a hybrid expert LLM with 800 B parameters but only 30 B active, delivering training costs under one‑tenth of comparable dense models and more than ten‑fold inference throughput for long contexts, while matching or surpassing larger models on benchmark tasks.

AILLMMulti‑Token Prediction
0 likes · 9 min read
Qwen3-Next: Achieving Unmatched Training and Inference Cost‑Effectiveness
Baobao Algorithm Notes
Baobao Algorithm Notes
Sep 10, 2025 · Artificial Intelligence

Qwen3-Next Unveiled: Sparse MoE, Hybrid Attention & Multi‑Token Prediction

A recent Hugging Face pull request reveals Alibaba’s upcoming Qwen3‑Next series, highlighting its extreme‑context, parameter‑efficient design that combines a 1:50 high‑sparsity MoE, a hybrid attention architecture mixing gated attention with Gated DeltaNet, and a Multi‑Token Prediction technique, promising ten‑fold throughput gains for 32K‑plus token contexts.

AI ArchitectureHybrid attentionMulti‑Token Prediction
0 likes · 8 min read
Qwen3-Next Unveiled: Sparse MoE, Hybrid Attention & Multi‑Token Prediction