Qwen 3.5 Launches on New Year’s Eve as DeepSeek Only Sends a Holiday Greeting
On Chinese New Year's Eve, Alibaba released the open-source Qwen 3.5 model under Apache 2.0. Built on a 397-billion-parameter backbone that activates only 17 billion parameters per token, with hybrid linear attention and sparse MoE, it delivers 8.6–19× faster decoding than Qwen3-Max, posts top-tier agent, code, and multimodal scores, and was integrated rapidly across major AI platforms.
Alibaba's Qwen team quietly released the open-source Qwen 3.5 series on New Year's Eve, publishing the first open-weight model, Qwen3.5-397B-A17B, on Hugging Face.
397B Body, 17B Soul
397B: total parameter count close to 400 billion, a true giant.
A17B: only 17 billion parameters are activated per token during inference.
The model combines Hybrid Linear Attention with Sparse MoE, giving it the knowledge of a 400-billion-parameter brain while activating just 17 billion parameters for any specific query.
Result: decoding throughput is 8.6–19× higher than the previous Qwen3‑Max, while maintaining strong reasoning ability.
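To make the "big brain, small activation" idea concrete, the sketch below shows a toy top-k routed sparse-MoE layer in PyTorch. The dimensions, expert count, and routing scheme are illustrative assumptions, not Qwen 3.5's actual architecture (which additionally uses hybrid linear attention); the point is that only the routed experts' weights ever run for a given token.

```python
import torch
import torch.nn as nn

class SparseMoELayer(nn.Module):
    """Toy sparse MoE layer: illustrative only, not Qwen 3.5's design."""
    def __init__(self, d_model=512, n_experts=64, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)    # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model),
                          nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                               # x: (tokens, d_model)
        scores = self.router(x)                         # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # keep only the top-k experts
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e in idx[:, k].unique():                # run just the selected experts
                mask = idx[:, k] == e
                out[mask] += weights[mask, k:k+1] * self.experts[int(e)](x[mask])
        return out

layer = SparseMoELayer()
print(layer(torch.randn(8, 512)).shape)  # torch.Size([8, 512])
```

With 2 of 64 experts active per token, roughly 1/32 of the expert weights participate in any forward pass; this is the same mechanism that lets a 397B-parameter model decode with only 17B parameters active.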
Performance vs. Closed Models
Agent ability (BFCL V4): 72.9, surpassing GPT-5.2 (63.1) and Claude Opus 4.5 (67.7).
Code ability (SWE-bench Verified): 76.4, slightly below GPT-5.2 (80.0) but ahead of Qwen3-Max-Thinking.
Native multimodal (Video-MME): 87.5, showing excellent video understanding thanks to its built-in visual module.
Scaling Law Still Holds
The team published a scaling‑law chart demonstrating that the hybrid architecture continues to improve performance steadily as model size grows.
Apache 2.0 License
The model is released under Apache 2.0, meaning it is completely free for commercial use.
Front‑end Demo: One Prompt Generates a 3D Car Game
In a showcase, a single natural‑language command produced a fully runnable 3D racing game.
Real-time scoring: scores displayed at the top-left.
Lap counter: tracks completed laps.
Timer: millisecond-precision timing.
Speedometer: dynamic speed readout at the bottom-right.
Integration with OpenClaw: Search, Think, Report
"Help me search for AI models released in the past month, compile a report, and generate a PDF."
Automatic search: calls a search tool to retrieve the latest model releases (Claude Sonnet 5, Kimi K2.5, GPT-5.3-Codex-Spark, DeepSeek V3.2, Gemini 3, etc.).
Information integration: structures the results into a report with model name, release date, key features, and benchmark data.
PDF generation: uses a document-generation tool to export a 70 KB PDF.
The final PDF includes detailed model information, professional layout, and data visualizations.
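OpenClaw's actual tool interfaces aren't shown here, so the sketch below is a generic version of that three-step loop; web_search and render_pdf are hypothetical stand-ins for the platform's real search and document tools.

```python
def web_search(query: str) -> list[dict]:
    """Stand-in search tool; a real agent would call a search API here."""
    return [{"model": "DeepSeek V3.2", "released": "2025-12", "features": "..."}]

def render_pdf(markdown: str, path: str) -> None:
    """Stand-in document tool; stubbed to write Markdown instead of a real PDF."""
    with open(path.replace(".pdf", ".md"), "w", encoding="utf-8") as f:
        f.write(markdown)

def run_report_agent() -> str:
    # Step 1: automatic search for recent releases.
    results = web_search("AI models released in the past month")
    # Step 2: integrate results into a structured report.
    rows = "\n".join(
        f"| {r['model']} | {r['released']} | {r['features']} |" for r in results
    )
    report = ("# AI model releases, last 30 days\n\n"
              "| Model | Released | Key features |\n|---|---|---|\n" + rows + "\n")
    # Step 3: export the report as a document.
    render_pdf(report, "model_report.pdf")
    return report

print(run_report_agent())
```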
Multimodal Agent: See → Think → Search → Report
Qwen 3.5‑Plus combines native multimodal vision with agent capabilities.
In a demo, a user uploads a screenshot from the movie “Gone with the Wind” and asks, “This film looks familiar—can you introduce it?”
Step 1: Visual analysis of the image.
Step 2: Search to verify information.
Step 3: Generate a concise report.
The closed loop “see → think → search → generate report” showcases strong multimodal reasoning and tool‑calling.
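In API terms, the entry point of that loop is a single request carrying both the image and the question. Below is a minimal sketch using the standard OpenAI-compatible multimodal message shape; the base URL and model slug are assumptions to verify against Alibaba's documentation.

```python
from openai import OpenAI

# Assumed endpoint (DashScope's OpenAI-compatible mode) and model slug;
# check the provider's docs for the exact values.
client = OpenAI(base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
                api_key="YOUR_API_KEY")

response = client.chat.completions.create(
    model="qwen3.5-plus",  # assumed slug
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url",   # step 1: the model "sees" the screenshot
             "image_url": {"url": "https://example.com/movie_screenshot.png"}},
            {"type": "text",        # then reasons, searches, and reports
             "text": "This film looks familiar—can you introduce it?"},
        ],
    }],
)
print(response.choices[0].message.content)
```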
Ecosystem Blossoms: Rapid Multi‑Platform Support
OpenRouter – API Aggregation Platform
OpenRouter integrated Qwen 3.5 within six hours, allowing developers to call the model via a unified API without self‑hosting.
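Since OpenRouter exposes an OpenAI-compatible endpoint, switching to Qwen 3.5 is mostly a base-URL change. A minimal sketch; the model slug below is an assumption, so check openrouter.ai/models for the exact ID.

```python
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1",  # OpenRouter's documented endpoint
                api_key="YOUR_OPENROUTER_KEY")

resp = client.chat.completions.create(
    model="qwen/qwen3.5-397b-a17b",  # assumed slug; verify on openrouter.ai/models
    messages=[{"role": "user",
               "content": "Summarize hybrid linear attention in two sentences."}],
)
print(resp.choices[0].message.content)
```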
Poe – Conversational Platform
Quora’s Poe added Qwen 3.5‑Plus (1 M‑token context) alongside GPT‑5.2‑Codex, Gemini‑3‑Pro, Claude‑Opus‑4.6, and Kimi‑K2.5, highlighting its competitive capability.
Ollama – Local Deployment Tool
Run the model locally with a single command: ollama run qwen3.5:cloud. The model is also accessible via Ollama's cloud service for immediate testing.
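Beyond the CLI, Ollama also serves a local REST API on port 11434, so the same model can be scripted. A minimal sketch, assuming the tag from the command above:

```python
import requests

# Ollama's local chat endpoint; "stream": False returns one JSON object
# instead of a token stream.
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "qwen3.5:cloud",  # tag from the `ollama run` command above
        "messages": [{"role": "user", "content": "Hello, Qwen!"}],
        "stream": False,
    },
)
print(resp.json()["message"]["content"])
```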
NVIDIA – Hardware Giant Support
NVIDIA Build: try it for free at build.nvidia.com/qwen/qwen3.5-3....
NVIDIA NeMo: supports download and custom deployment from the GitHub repository.
NVIDIA’s backing adds hardware‑level optimization and eases enterprise adoption.
Overall, Qwen 3.5's release is more than a model update: it set off a collective celebration across the open-source AI ecosystem, pairing dramatically faster inference with leading agent ability and native multimodal interaction.
