Efficient Ops
Aug 27, 2025 · Artificial Intelligence
Why DeepSeek V3.1 Randomly Inserts the Chinese Character “极” – Token Bug Explained
DeepSeek’s latest V3.1 model unexpectedly injects the Chinese character “极” into generated text, a token‑ID mix‑up that breaks code compilation, JSON parsing, and academic writing, with users tracing the issue to adjacent token IDs and two main hypotheses of dataset contamination or model shortcut.
AI safetyDeepSeeklanguage model
0 likes · 4 min read
