Qwen3‑30B‑A3B‑Instruct‑2507: New Instruction Model with Boosted General and Multilingual Skills
The Qwen3‑30B‑A3B‑Instruct‑2507 model, an updated non‑thinking version of Qwen3‑30B‑A3B, delivers significant gains in instruction following, reasoning, and multilingual knowledge coverage, along with native 256K context length support, and its performance is benchmarked against leading LLMs across a wide range of tasks.
Highlights
General capabilities see a marked improvement in instruction following, logical reasoning, text understanding, mathematics, science, programming, and tool use.
Long‑tail multilingual knowledge coverage is substantially enhanced.
Better alignment with user preferences on subjective and open‑ended tasks, producing more helpful and higher‑quality responses.
Enhanced 256K token context length support.
Model Overview
Qwen3‑30B‑A3B‑Instruct‑2507 is a causal language model trained in two stages (pre‑training and post‑training). It has 30.5 B total parameters, 3.3 B activated parameters per token, 29.9 B non‑embedding parameters, 48 layers, a GQA attention configuration (32 query heads, 4 key‑value heads), 128 experts with 8 activated per token, and native support for a 262,144‑token context window.
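A quick sanity check on the MoE figures: per the model card, roughly 3.3 B of the 30.5 B parameters are activated per token. Note that this active share is larger than the bare 8‑of‑128 expert ratio, because attention layers, embeddings, and router weights are dense and always active. A minimal sketch of the arithmetic:

```python
# Rough MoE arithmetic from the figures above (parameter counts in billions).
total_params = 30.5       # total parameters
active_params = 3.3       # parameters activated per token
experts, active_experts = 128, 8

expert_fraction = active_experts / experts      # share of experts routed per token
active_fraction = active_params / total_params  # observed active parameter share

# The observed active share exceeds the pure expert ratio because the
# non-expert (dense) weights participate in every forward pass.
assert active_fraction > expert_fraction
print(f"expert ratio: {expert_fraction:.4f}, active share: {active_fraction:.4f}")
```

This is why an "A3B" MoE model can have inference cost closer to a ~3 B dense model than to its 30 B total size.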
Important note: The model only supports the non‑thinking mode; the <think></think> block will never be generated, and the enable_thinking=False flag is no longer required.
Performance Overview
The model was evaluated on a broad suite of benchmarks, including MMLU‑Pro, MMLU‑Redux, GPQA, SuperGPQA, AIME25, HMMT25, ZebraLogic, LiveBench, LiveCodeBench, MultiPL‑E, Aider‑Polyglot, IFEval, Arena‑Hard, Creative Writing, WritingBench, the agentic TAU benchmarks, and the multilingual MultiIF test. Across most metrics, Qwen3‑30B‑A3B‑Instruct‑2507 matches or exceeds the original Qwen3‑30B‑A3B and often outperforms other leading models such as DeepSeek‑V3, GPT‑4o, and Gemini‑2.5‑Flash.
Quick‑Start Guide
The model is available on Hugging Face under the repository Qwen/Qwen3-30B-A3B-Instruct-2507. Use the latest transformers library (≥ 4.51.0) to load the tokenizer and model. Example code:
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "Qwen/Qwen3-30B-A3B-Instruct-2507"
# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto",
)

For environments with transformers < 4.51.0, a compatibility error may occur when loading the model; upgrading the library resolves the issue.
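Once the model and tokenizer are loaded, a reply is produced by formatting the conversation with the tokenizer's chat template and calling generate. The following is a minimal sketch, not the official quick‑start code; the prompt, the 512‑token cap, and the helper name generate_reply are illustrative:

```python
# Sketch of a single-turn generation helper, assuming `model` and `tokenizer`
# were loaded as in the snippet above.
def generate_reply(model, tokenizer, prompt, max_new_tokens=512):
    messages = [{"role": "user", "content": prompt}]
    # Render the conversation with the model's built-in chat template and
    # append the assistant turn marker so the model starts its reply.
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)[0]
    # Drop the prompt tokens; keep only the newly generated reply.
    reply_ids = output_ids[len(inputs.input_ids[0]):]
    return tokenizer.decode(reply_ids, skip_special_tokens=True)
```

Since this is a non‑thinking model, the decoded reply contains the answer directly, with no <think></think> block to strip.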
Baobao Algorithm Notes
Author of the BaiMian large model, offering technology and industry insights.