Aug 27, 2025 · Artificial Intelligence

Why DeepSeek V3.1 Randomly Inserts the Chinese Character “极” – Token Bug Explained

DeepSeek’s latest V3.1 model unexpectedly injects the Chinese character “极” into generated text. The stray character, an apparent token‑ID mix‑up, breaks code compilation, JSON parsing, and academic writing. Users have traced the issue to adjacent token IDs and propose two main hypotheses: dataset contamination or a shortcut learned by the model.

AI safety · DeepSeek · Language Model

