DeepSeek‑V3.1‑Terminus Fixes the ‘Extreme’ Bug and Outperforms Gemini 2.5 Pro
DeepSeek has released V3.1‑Terminus, which fixes the notorious “extreme” (极) character bug and improves language consistency and Agent capabilities. Its benchmark scores improve on the previous version and surpass Gemini 2.5 Pro on several tasks. Download links are available, and speculation continues about upcoming V4/R2 releases.
DeepSeek‑V3.1‑Terminus released
DeepSeek announced the new model DeepSeek‑V3.1‑Terminus, which addresses the previously reported “extreme” character bug and improves language consistency, especially reducing mixed Chinese‑English output and occasional abnormal characters.
All official channels (App, web, mini‑program, and API) have been updated to the new model.
Download links:
Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus
ModelScope: https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.1-Terminus
The update also enhances Agent capabilities, optimizing both Code Agent and Search Agent performance.
Benchmark results show gains over the previous version, and over Gemini 2.5 Pro on several tasks. The Humanity’s Last Exam score rose by 36.48% relative to the previous version, and scores also improved on MMLU‑Pro, GPQA‑Diamond, LiveCodeBench, SimpleQA, and several SWE‑bench variants.
Agent‑related scores also rose: BrowseComp 30.0 → 38.5, SimpleQA 93.4 → 96.8, SWE‑bench Verified 66.0 → 68.4, SWE‑bench Multilingual 54.5 → 57.8, Terminal‑bench 31.3 → 36.7.
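To put the before/after pairs above on a common footing, the following minimal sketch computes the absolute gain (in points) and the relative gain (as a percentage of the old score) for each Agent benchmark. The scores are copied from the article; the names AGENT_SCORES, gains, and summarize are illustrative only and do not come from DeepSeek.

```python
# Scores quoted in the article: (previous version, V3.1-Terminus).
AGENT_SCORES = {
    "BrowseComp": (30.0, 38.5),
    "SimpleQA": (93.4, 96.8),
    "SWE-bench Verified": (66.0, 68.4),
    "SWE-bench Multilingual": (54.5, 57.8),
    "Terminal-bench": (31.3, 36.7),
}


def gains(before, after):
    """Return (absolute gain in points, relative gain in percent), rounded to 1 decimal."""
    absolute = after - before
    relative = absolute / before * 100.0
    return round(absolute, 1), round(relative, 1)


def summarize(scores):
    """Map each benchmark name to its (absolute, relative) gain."""
    return {name: gains(before, after) for name, (before, after) in scores.items()}


if __name__ == "__main__":
    for name, (abs_gain, rel_gain) in summarize(AGENT_SCORES).items():
        print(f"{name}: +{abs_gain} points ({rel_gain:+.1f}% relative)")
```

Seen this way, BrowseComp and Terminal‑bench show the largest relative gains, while SimpleQA was already near its ceiling and moves the least in relative terms.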
The “extreme” bug, which inserted the stray character “极” (Chinese for “extreme”) into outputs (e.g., turning time.Second into time.Se极), appears to be resolved: when tested, the new model produced five correct timer implementations without the anomaly.
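A simple way to check generated code for this kind of corruption is to scan it for CJK ideographs. The sketch below is a hypothetical checker, not a tool DeepSeek provides: the function name find_stray_cjk and the Go sample lines are assumptions made for illustration. It only covers the basic CJK Unified Ideographs block, and it will also flag legitimate Chinese text in comments or string literals, so treat hits as candidates for review rather than confirmed bugs.

```python
import re

# Basic CJK Unified Ideographs block; "极" (U+6781) falls inside this range.
CJK_PATTERN = re.compile(r"[\u4e00-\u9fff]")


def find_stray_cjk(code):
    """Return a list of (line_number, column, character) for each CJK ideograph.

    Line numbers and columns are 1-based.
    """
    hits = []
    for line_no, line in enumerate(code.splitlines(), start=1):
        for match in CJK_PATTERN.finditer(line):
            hits.append((line_no, match.start() + 1, match.group()))
    return hits


if __name__ == "__main__":
    # Illustrative Go lines modeled on the corruption described in the article.
    broken = "timer := time.NewTimer(5 * time.Se极)"
    fixed = "timer := time.NewTimer(5 * time.Second)"
    print(find_stray_cjk(broken))  # one hit, for the character "极"
    print(find_stray_cjk(fixed))   # no hits
```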
Some benchmarks (Codeforces, Aider‑Polyglot, BrowseComp‑zh) showed slight declines, but overall performance is markedly better.
Speculation about the next model, DeepSeek‑V4/R2, is ongoing, with the community eagerly awaiting further announcements.
References:
https://x.com/deepseek_ai/status/1970117808035074215
https://huggingface.co/deepseek-ai/DeepSeek-V3.1-Terminus
https://modelscope.cn/models/deepseek-ai/DeepSeek-V3.1-Terminus
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
