Su San Talks Tech
Jul 1, 2026 · Artificial Intelligence
Which Domestic Multimodal LLM Is the Most Efficient for Production?
The article benchmarks three Chinese multimodal large models—Step 3.7 Flash, MiniMax M3, and Qwen 3.6‑flash—across two real‑world tasks, measuring output quality, API latency, and token cost, and concludes that Step 3.7 Flash consistently offers the best speed‑cost trade‑off for production use.
API latency · MiniMax M3 · Qwen 3.6 flash
10 min read
