Qwen3‑30B‑A3B‑Instruct‑2507: New Instruction Model with Boosted General and Multilingual Skills

The Qwen3-30B-A3B-Instruct-2507 model, an updated non-thinking version of Qwen3-30B-A3B, delivers significant gains in instruction following, logical reasoning, and multilingual knowledge coverage, extends native context support to 256K tokens, and is benchmarked against leading LLMs across a wide range of tasks.


Highlights

General capabilities see a marked improvement in instruction following, logical reasoning, text understanding, mathematics, science, programming, and tool use.

Long‑tail multilingual knowledge coverage is substantially enhanced.

Better alignment with user preferences on subjective and open‑ended tasks, producing more helpful and higher‑quality responses.

Enhanced support for 256K-token long-context understanding.

Model Overview

Qwen3-30B-A3B-Instruct-2507 is a causal language model trained in two stages (pre-training and post-training). It has 30.5B total parameters, of which 3.3B are activated per token and 29.9B are non-embedding; 48 layers; grouped-query attention with 32 query heads and 4 key/value heads; 128 experts with 8 active experts per token; and native support for a 262,144-token context window.

Important note: This model supports only the non-thinking mode and will not generate <think></think> blocks; specifying the enable_thinking=False flag is no longer required.
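These architecture details can be confirmed locally by reading the published configuration, without downloading any weights. The sketch below is illustrative and assumes the attribute names exposed by the Qwen3-MoE config class in recent transformers releases; verify them against your installed version.

from transformers import AutoConfig

# Fetch only the configuration (no weights) for the published checkpoint
config = AutoConfig.from_pretrained("Qwen/Qwen3-30B-A3B-Instruct-2507")

# Attribute names assume the Qwen3-MoE config class
print(config.num_hidden_layers)        # 48 layers
print(config.num_attention_heads)      # 32 query heads (GQA)
print(config.num_key_value_heads)      # 4 key/value heads (GQA)
print(config.num_experts)              # 128 experts
print(config.num_experts_per_tok)      # 8 active experts per token
print(config.max_position_embeddings)  # 262144-token native context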

Performance Overview

The model was evaluated on a broad suite of benchmarks, including MMLU-Pro, MMLU-Redux, GPQA, SuperGPQA, AIME25, HMMT25, ZebraLogic, LiveBench, LiveCodeBench, MultiPL-E, Aider-Polyglot, IFEval, Arena-Hard, Creative Writing, WritingBench, and domain-specific tool-use (TAU) and multilingual instruction-following (MultiIF) tests. Across most metrics, Qwen3-30B-A3B-Instruct-2507 matches or exceeds the original Qwen3-30B-A3B and often outperforms other leading models such as DeepSeek-V3, GPT-4o, and Gemini-2.5-Flash.

Quick‑Start Guide

The model is available on Hugging Face under the repository Qwen/Qwen3-30B-A3B-Instruct-2507. Use the latest transformers library (≥ 4.51.0) to load the tokenizer and model. Example code:

from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "Qwen/Qwen3-30B-A3B-Instruct-2507"
# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)
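Loading the model only prepares the pipeline; producing text requires applying the chat template and decoding the completion. A minimal single-turn sketch follows; the prompt string and max_new_tokens value are illustrative choices, not from the original post.

# Build a single-turn chat prompt with the model's chat template
prompt = "Give me a short introduction to large language models."
messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then strip the prompt tokens before decoding the answer
generated_ids = model.generate(**model_inputs, max_new_tokens=1024)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
print(tokenizer.decode(output_ids, skip_special_tokens=True))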

With transformers earlier than 4.51.0, loading fails with a KeyError for the qwen3_moe architecture; upgrading the library (for example, pip install -U "transformers>=4.51.0") resolves the issue.

Tags: Instruction Tuning, Qwen3, model release, Mixture-of-Experts
Written by

Baobao Algorithm Notes

Author of the BaiMian large model, offering technology and industry insights.
