Why DeepSeek V3.1 Keeps Spitting the ‘Extreme’ Token and How to Fix It
Developers using DeepSeek V3.1's API report that the model intermittently inserts the Chinese character “极” (or its variants) into generated code. The bug has surfaced across multiple platforms, threatens high-precision code generation, and has prompted community workarounds and speculation about its root cause.
DeepSeek V3.1 has sparked a wave of discussion after developers repeatedly observed the unexpected appearance of the character “极” (or its variants) in API‑generated code outputs.
Initially discovered on platforms such as Volcano Engine and Chutes, the issue quickly spread to other services, including Tencent's CodeBuddy and even the DeepSeek official interface.
According to reports on Reddit and other forums, the problematic token appears in three forms:
“extreme” (token id 15075)
“极” (token id 2577, simplified Chinese for “extreme”)
“極” (token id 16411, traditional Chinese for “extreme”)
On Tencent CodeBuddy the bug even manifested as an advertisement containing the “极” character.
The DeepSeek team has been contacted and plans to fix the issue in an upcoming release.
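Until that release lands, affected outputs can at least be detected. The sketch below is a community-style client-side check, not an official DeepSeek tool; the `find_suspect_tokens` name and the choice to match raw substrings (rather than token ids) are assumptions of this article.

```python
# Hypothetical post-processing check: scan generated code for the
# problematic variants reported by the community. Note that "extreme"
# is an ordinary English word, so matching it will false-positive on
# legitimate text; it is included here only for illustration.
SUSPECT_SUBSTRINGS = ("极", "極", "extreme")

def find_suspect_tokens(output: str) -> list[tuple[int, str]]:
    """Return (position, substring) pairs for every suspect occurrence."""
    hits = []
    for s in SUSPECT_SUBSTRINGS:
        start = 0
        while (i := output.find(s, start)) != -1:
            hits.append((i, s))
            start = i + 1
    return sorted(hits)

# Example: the stray character injected mid-identifier
sample = "def compute_total(items):\n    return sum(item.pri极ce for item in items)"
print(find_suspect_tokens(sample))  # [(49, '极')]
```

A CI gate or response middleware could call such a check and reject contaminated completions before they reach a codebase.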
How to mitigate the “极” bug?
While awaiting an official fix, some community members suggest a prompt‑engineering workaround: instruct the model to avoid a specific token pattern consisting of a space, a few tokens, and a placeholder/ellipsis.
Prohibited pattern: [space] [few tokens] [placeholder/ellipsis]
This method is intended for third‑party platforms calling the API and is not required when using DeepSeek V3.1 directly.
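For third-party platforms where the bug persists despite prompting, one pragmatic guard is simply to retry a request whose output contains the stray characters. This is a sketch under this article's own assumptions, not vendor guidance; `call_model` is a placeholder for whatever client function you use, and the retry policy is arbitrary.

```python
# Client-side guard: retry the request when the known stray characters
# show up in generated code. `call_model` is a hypothetical callable
# that takes a prompt and returns the model's text output.
SUSPECT_CHARS = ("极", "極")

def generate_with_guard(call_model, prompt: str, max_retries: int = 3) -> str:
    last = ""
    for _ in range(max_retries):
        last = call_model(prompt)
        if not any(c in last for c in SUSPECT_CHARS):
            return last
    # All attempts were contaminated; surface the last output so the
    # caller can decide whether to strip the characters or fail loudly.
    return last
```

Returning the final contaminated output (instead of raising) keeps the decision of strip-versus-fail with the caller, since stripping could silently corrupt code that legitimately contains the character.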
Why does this happen?
According to AI researcher Huang Zhewei, the phenomenon resembles a “malicious pattern” observed in earlier small‑scale models (e.g., R1‑0528), where the model would terminate output with the character “极” after lengthy repetitions, effectively using it as a hidden stop token.
He hypothesizes that incomplete data cleaning during SFT or pre‑training introduced “极长” (“extremely long”) arrays into the training set; through subsequent retrieval‑augmented generation (RAG) and reinforcement learning, the model may then have learned to treat “极” as a termination marker.
If such dirty data persists through model iterations, it can “contaminate” normal generation, leading to the observed bug.
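The hypothesized contamination pattern is concrete enough to illustrate. The toy filter below flags samples where a long repeated run is terminated by “极”, the shape described above; the repetition threshold and regex are illustrative assumptions, not a description of DeepSeek's actual cleaning pipeline.

```python
import re

# Toy data-cleaning filter for the hypothesized pattern: 20+ repetitions
# of the same character immediately followed by "极". A training sample
# matching this could plausibly teach a model to emit "极" as a hidden
# stop marker after lengthy repetition.
REPEAT_THEN_JI = re.compile(r"(.)\1{19,}极")

def looks_contaminated(sample: str) -> bool:
    return bool(REPEAT_THEN_JI.search(sample))

print(looks_contaminated("a" * 30 + "极"))  # True: long run ending in 极
print(looks_contaminated("普通的中文文本"))    # False: ordinary text passes
```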
The issue will likely be resolved once DeepSeek releases a new version that addresses the underlying data‑cleaning problem.