
Why Do You See “锟斤拷”? Unraveling Unicode, UTF‑8 and GBK Encoding Mysteries

This article explains why the garbled string “锟斤拷” appears in Chinese software, covering basic character encoding concepts, the Unicode replacement character, UTF‑8 decoding failures, and how GBK’s double‑byte scheme turns the placeholder bytes into the visible characters 锟、斤、拷.

Liangxu Linux

What is the mysterious character?

In computing, every visible symbol is represented by a binary code. The simplest example is ASCII, where the binary 0100 0001 (decimal 65) maps to the letter A. Unicode also defines a special placeholder, the replacement character (code point U+FFFD, decimal 65533), which a decoder emits when it encounters a byte sequence that it cannot map to any character.
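The two numbers quoted above can be checked with a few lines of Java. This is only a small sketch; the class name CodePoints and the helper describe are made up for illustration and are not part of any library.

```java
public class CodePoints {
    // Formats a code point in the usual Unicode notation plus its decimal value.
    // (Hypothetical helper, named for this example only.)
    static String describe(int codePoint) {
        return String.format("U+%04X (decimal %d)", codePoint, codePoint);
    }

    public static void main(String[] args) {
        // The letter A: binary 1000001, i.e. 0100 0001 when padded to eight bits.
        System.out.println(describe('A'));
        System.out.println(Integer.toBinaryString('A'));

        // The Unicode replacement character used as a placeholder.
        System.out.println(describe(0xFFFD));
    }
}
```

Running it should print U+0041 (decimal 65), then 1000001, then U+FFFD (decimal 65533).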

When a program cannot map a byte sequence to a known character, it substitutes this placeholder, which is usually drawn as “�” (a question mark inside a black diamond) or as an empty box when the font has no glyph for it.

Why does “锟斤拷” appear?

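Consider the byte array new byte[] {-25, -119, -25, -116}, that is, 0xE7 0x89 0xE7 0x8C. In UTF‑8 the byte 0xE7 announces a three‑byte character, but each 0xE7 here is followed by only one continuation byte, so the array holds two truncated sequences and no valid character. A lenient decoder replaces each truncated sequence with the Unicode placeholder �, producing two placeholders. The placeholder itself is encoded in UTF‑8 as the three‑byte sequence 0xEF 0xBF 0xBD (as signed bytes, [-17, -65, -67]).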
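The decoding step can be reproduced roughly as follows. The sketch assumes a recent JDK, where new String(bytes, charset) replaces malformed input instead of throwing; the class and method names are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReplacementDemo {
    // Lenient decode: malformed input is replaced with U+FFFD rather than reported.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = {-25, -119, -25, -116}; // 0xE7 0x89 0xE7 0x8C

        String decoded = decodeUtf8(raw);
        // Print each resulting character as a code point.
        for (char c : decoded.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }

        // Writing the placeholders back out as UTF-8 yields three bytes apiece.
        byte[] reencoded = decoded.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.toString(reencoded));
    }
}
```

This should print U+FFFD twice, followed by [-17, -65, -67, -17, -65, -67]. At that point the original four bytes are gone for good: only the placeholders remain.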

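A single placeholder is not enough to produce the garble; it takes two in a row. Encoded in UTF‑8, two consecutive placeholders occupy six bytes: 0xEF 0xBF 0xBD 0xEF 0xBF 0xBD. When those six bytes are interpreted using GBK, which encodes Chinese characters as two‑byte units, they are split into three two‑byte units: 0xEFBF, 0xBDEF, and 0xBFBD. In GBK these correspond to the Chinese characters 锟, 斤, and 拷 respectively. Thus a pair of placeholders that should appear as “��” is rendered as the visible string “锟斤拷”.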
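The misreading can be sketched in Java as below. It assumes the runtime ships the GBK charset (a full JDK normally does; a trimmed runtime without the extended charsets would throw UnsupportedCharsetException). The class name KunJinKao and the method misreadAsGbk are invented for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class KunJinKao {
    // Encodes the text as UTF-8, then deliberately decodes those bytes as GBK,
    // imitating a program that guesses the wrong charset.
    static String misreadAsGbk(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("GBK")); // assumes GBK is available
    }

    public static void main(String[] args) {
        // Two replacement characters (U+FFFD twice) become six UTF-8 bytes,
        // which GBK pairs up as EFBF, BDEF, BFBD.
        String garbled = misreadAsGbk("\uFFFD\uFFFD");
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
        System.out.println(garbled);
    }
}
```

The three code points printed should be U+951F, U+65A4, and U+62F7, which are 锟, 斤, and 拷; whether the last line displays them correctly depends on the console's own encoding.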

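The phenomenon is common when bytes pass through two mismatched conversions: text in some other charset (e.g., GBK) is first decoded as UTF‑8, which turns the unrecognized bytes into placeholders, and the resulting UTF‑8 output is later read as GBK. Because the placeholders have already overwritten the original bytes, the text cannot be recovered afterwards, and the infamous “锟斤拷” garble is all that remains.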

Understanding the mapping between binary representations, Unicode code points, and specific legacy encodings like GBK helps prevent such mis‑display issues.
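One way to catch the problem early, offered here as a suggestion rather than something the original article prescribes, is to decode strictly so that malformed bytes raise an error instead of silently becoming placeholders. The sketch below uses the standard CharsetDecoder API; the class name StrictDecode and the choice to return null on failure are assumptions made for brevity.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class StrictDecode {
    // Returns the decoded text, or null when the bytes are not valid UTF-8.
    static String decodeStrictUtf8(byte[] raw) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            // The caller can now try another charset or reject the input.
            return null;
        }
    }

    public static void main(String[] args) {
        byte[] bad = {-25, -119, -25, -116}; // truncated UTF-8 sequences
        byte[] good = {65};                  // the letter A
        System.out.println(decodeStrictUtf8(bad));
        System.out.println(decodeStrictUtf8(good));
    }
}
```

With the sample inputs this should print null and then A. A caller that gets null still holds the original bytes and can retry with the correct charset, which is no longer possible once placeholders have been written out.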

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
