Why DeepSeek V3.1 Randomly Inserts the Chinese Character “极” – Token Bug Explained

DeepSeek’s latest V3.1 model has been reported to inject the Chinese character “极” into generated text unexpectedly. The stray character breaks code compilation, JSON parsing, and academic writing. Users have traced the issue to two adjacent token IDs and propose two main hypotheses: dataset contamination or a learned model shortcut.

Efficient Ops

Several users have reported that DeepSeek V3.1 inserts the Chinese character “极” (in both its simplified form and its traditional form, “極”) into generated text without warning. This stray character causes compilation failures and JSON format errors, and it undermines the rigor of academic writing.

The issue also appears, albeit less frequently, in DeepSeek’s official Playground, indicating it is not limited to third‑party API platforms.

Root Cause Hypotheses

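Users who analyzed the tokenizer report that the token ID for “极” is 2577, while the token ID for the commonly used ellipsis “…” is 2576. The two tokens are therefore adjacent in the model’s vocabulary, which suggests the model may be emitting one in place of the other.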
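The adjacency claim can be illustrated with a small sketch. The two IDs below are the ones reported above and have not been checked against the real tokenizer; the `REPORTED_VOCAB` mapping and the `neighbors` helper are hypothetical names introduced only for this example.

```python
# Token IDs as reported in this article; NOT verified against the real tokenizer.
REPORTED_VOCAB = {2576: "…", 2577: "极"}


def neighbors(token_id, vocab, radius=1):
    """Return vocabulary entries whose IDs lie within `radius` of `token_id`."""
    return {
        i: vocab[i]
        for i in range(token_id - radius, token_id + radius + 1)
        if i != token_id and i in vocab
    }


# The ellipsis token's only known neighbor in this toy mapping is "极".
print(neighbors(2576, REPORTED_VOCAB))
```

With a real tokenizer, the same check would run over its full vocabulary rather than this two-entry mapping.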

1. Dataset Contamination

During data cleaning, some entries containing special or abnormal characters may not have been fully filtered out, leaving spurious occurrences of “极” in the training set that the model then learned to reproduce.

2. Model Shortcut

The model may have learned a shortcut during training and now mistakenly selects the neighboring token in certain contexts. Once triggered, the bug appears to be self-reinforcing (“addictive”): the frequency of “极” increases in subsequent interactions.
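One way to check the "addictive" observation on your own sessions is to count the stray character in each successive response. This is a minimal sketch: `stray_counts` and `is_escalating` are hypothetical helpers, and the sample responses are invented for illustration, not real model output.

```python
STRAY_CHARS = ("极", "極")  # simplified and traditional forms


def stray_counts(responses):
    """Count stray characters in each response, in conversation order."""
    return [sum(r.count(c) for c in STRAY_CHARS) for r in responses]


def is_escalating(counts):
    """True if the counts never decrease and end higher than they started."""
    if len(counts) < 2:
        return False
    non_decreasing = all(b >= a for a, b in zip(counts, counts[1:]))
    return non_decreasing and counts[-1] > counts[0]


# Invented sample responses from one hypothetical session.
session = ["print('ok')", "time.sleep极(1)", "for i in range极(3): 极pass"]
counts = stray_counts(session)
print(counts, is_escalating(counts))
```

Note that this counts every occurrence, so it will also count legitimate uses of the character in Chinese-language output.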

Impact Scope

Code generation: random Chinese characters cause compilation failures.

API calls: JSON and other structured outputs break.

Academic writing: precision and professionalism are compromised.
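Until a fix ships, callers can guard structured output before using it. The sketch below is one possible approach, not an official workaround: `find_stray` and `parse_json_strict` are hypothetical helpers, and the check assumes the expected output never legitimately contains “极” or “極”, which will not hold for Chinese-language content.

```python
import json

STRAY_CHARS = ("极", "極")  # simplified and traditional forms


def find_stray(text):
    """Return (index, char) pairs for each stray character in `text`."""
    return [(i, ch) for i, ch in enumerate(text) if ch in STRAY_CHARS]


def parse_json_strict(raw):
    """Parse model output as JSON, rejecting it if a stray character appears.

    Raises ValueError on a stray character or on invalid JSON
    (json.JSONDecodeError is a subclass of ValueError).
    """
    hits = find_stray(raw)
    if hits:
        raise ValueError(f"stray character(s) found at {hits}")
    return json.loads(raw)


print(parse_json_strict('{"status": "ok"}'))
```

Rejecting the output and retrying the request is a deliberate choice here: silently deleting the character could hide a response that is corrupted in other ways.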

Tencent Cloud CodeBuddy has contacted the DeepSeek team and plans to include a fix in the next version.

Tags: DeepSeek, AI safety, language model, model debugging, token bug
Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
