Why Do You See “锟斤拷” in Text? Uncover the Encoding Mystery

This article explains how character encoding works, using ASCII, Unicode, UTF‑8 and GBK examples to reveal why the garbled string “锟斤拷” appears when mismatched encodings are processed, and shows the underlying byte‑level transformations.

macrozheng

Feb 8, 2021

What is the mysterious “锟斤拷”?

In computing, every character is stored as a binary number. A character encoding is simply a mapping from symbols to those numbers, and the same bytes can mean different characters under different encodings.

ASCII example

For instance, the ASCII code 0100 0001 (decimal 65) corresponds to the letter A.
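The mapping can be checked directly in Java. The following is a minimal sketch; the class name `AsciiDemo` and the helper `toBinary8` are illustrative names, not something from the original article, and the helper assumes ASCII input.

```java
import java.nio.charset.StandardCharsets;

final class AsciiDemo {
    // Returns the character's code as a zero-padded 8-bit binary string.
    // Assumes the character is in the ASCII range (0..127).
    static String toBinary8(char c) {
        String bits = Integer.toBinaryString(c);
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        char letter = 'A';
        System.out.println((int) letter);      // 65
        System.out.println(toBinary8(letter)); // 01000001

        // Encoding the string as ASCII yields the same single byte.
        byte[] encoded = "A".getBytes(StandardCharsets.US_ASCII);
        System.out.println(encoded[0]);        // 65
    }
}
```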

The Unicode replacement character � (U+FFFD, decimal 65533) is what a decoder emits when it encounters bytes that do not form a valid character in the encoding it expects.
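It helps to see what the replacement character looks like as bytes, because those bytes are what later get misread. This sketch prints the code point of U+FFFD and its UTF‑8 encoding; `ReplacementCharDemo` and `toHex` are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

final class ReplacementCharDemo {
    // Formats bytes as space-separated uppercase hex, e.g. "EF BF BD".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative Java bytes print as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String replacement = "\uFFFD";
        System.out.println((int) replacement.charAt(0)); // 65533
        System.out.println(toHex(replacement.getBytes(StandardCharsets.UTF_8))); // EF BF BD
    }
}
```

In UTF‑8, one replacement character therefore occupies three bytes: 0xEF, 0xBF, 0xBD.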

Why “锟斤拷” appears

When a byte array such as new byte[] {-25, -119, -25, -116} (0xE7 0x89 0xE7 0x8C) is decoded as UTF‑8 but is not valid UTF‑8, the decoder substitutes the replacement character for each malformed sequence, which is displayed as “�”.
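A small sketch of that substitution, assuming the lenient behavior of `new String(bytes, charset)`, which replaces malformed input instead of throwing. The byte 0xE7 announces a three-byte UTF‑8 sequence, but each time only one continuation byte follows, so both sequences are incomplete. The class and method names are illustrative.

```java
import java.nio.charset.StandardCharsets;

final class MalformedUtf8Demo {
    // Decodes bytes as UTF-8; malformed input is replaced with U+FFFD
    // rather than raising an exception.
    static String decodeLeniently(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] broken = {-25, -119, -25, -116}; // 0xE7 0x89 0xE7 0x8C
        String decoded = decodeLeniently(broken);

        // Print each decoded char as a code point instead of relying on
        // the console being able to render it.
        for (char c : decoded.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

On current JDKs this should print U+FFFD twice, one for each truncated sequence. Code that needs to detect bad input rather than silently replace it could use a `CharsetDecoder` configured with `CodingErrorAction.REPORT` instead.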

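Encoded back to UTF‑8, each replacement character becomes the three bytes 0xEF 0xBF 0xBD, so two of them in a row produce the six‑byte sequence 0xEFBFBDEFBFBD. A GBK decoder reads those same six bytes as three two‑byte characters: 0xEFBF, 0xBDEF, and 0xBFBD, which correspond to the Chinese characters “锟”, “斤”, and “拷”.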
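The misreading can be reproduced in a few lines. This is a sketch under two assumptions: the runtime ships the GBK charset (standard JDK distributions do, but a trimmed-down runtime may not), and the helper name `misreadUtf8AsGbk` is hypothetical. The expected characters are written as Unicode escapes so the source file's own encoding cannot interfere.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class GbkMisreadDemo {
    // Encodes the text as UTF-8, then deliberately decodes those bytes as GBK.
    static String misreadUtf8AsGbk(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, Charset.forName("GBK"));
    }

    public static void main(String[] args) {
        // Two replacement characters: EF BF BD EF BF BD in UTF-8.
        String twoReplacements = "\uFFFD\uFFFD";
        String garbled = misreadUtf8AsGbk(twoReplacements);

        // Whether this renders correctly depends on the console encoding.
        System.out.println(garbled);

        // Compare against the code points U+951F, U+65A4, U+62F7.
        System.out.println(garbled.equals("\u951F\u65A4\u62F7")); // true
    }
}
```

The six bytes regroup cleanly into three GBK pairs because each pair happens to be a valid GBK code, which is why the result looks like real text rather than more error symbols.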

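Thus the garbled “锟斤拷” you often see comes from an encoding mismatch: text that failed to decode as UTF‑8 is written back out as UTF‑8 replacement characters, and those bytes are then read as GBK.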

Now you know the reason behind those strange symbols.

