Why Does Text Turn into Gibberish? Mastering Encoding and Fixing Garbled Characters

This article explains why text files can become unreadable garbled characters, explores binary representation, locale settings, and common character sets like ANSI and Unicode, and provides practical command‑line techniques such as using iconv to convert between encodings and avoid display issues.

ITPUB

What is garbled text?

When a text file displays unreadable symbols or alien‑like characters, the phenomenon is called garbled text.

Digits, plain English letters, and half‑width symbols usually remain readable, because most common encodings store them with the same single‑byte ASCII values.

Problems often appear when a file created locally is transferred to a remote server (Linux, AIX, HP‑UX, etc.).

The same binary content can be interpreted differently depending on the system's default encoding.

How characters are displayed

Computers store and transmit data as bits (0s and 1s), grouped into bytes. To display text, software reads the bytes, looks up the corresponding characters in an encoding table, and renders each character's glyph using the selected font.
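The lookup step can be sketched in a few lines of Python (used here only for illustration; the helper name `decode_with` is made up for this example). The same six bytes are looked up in two different tables and produce two different strings:

```python
# UTF-8 bytes of the two characters "话费".
RAW = bytes.fromhex("e8af9de8b4b9")

def decode_with(raw: bytes, encoding: str) -> str:
    """Look up the characters for `raw` in the given encoding table.

    errors="replace" substitutes U+FFFD for byte sequences that are not
    valid in that encoding, instead of raising UnicodeDecodeError.
    """
    return raw.decode(encoding, errors="replace")

as_utf8 = decode_with(RAW, "utf-8")  # the intended text
as_gbk = decode_with(RAW, "gbk")     # same bytes, wrong table: garbled text
```

The bytes never change between the two calls; only the table used to interpret them does, which is exactly how garbled text arises.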

On macOS, the locale settings determine the default encoding:

ChenJacksons-MacBook-Pro:~ jackson$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
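The same information can be read from inside a program. A minimal sketch, assuming a standard CPython 3 installation; the function name `describe_encodings` is hypothetical, and the values it returns depend entirely on the machine and environment it runs in:

```python
import locale
import sys

def describe_encodings() -> dict:
    """Collect the default encodings this Python process will use."""
    return {
        # Encoding implied by the locale (LANG / LC_ALL / LC_CTYPE).
        "locale": locale.getpreferredencoding(False),
        # Encoding used when writing text to standard output
        # (None if stdout has been replaced by an object without one).
        "stdout": getattr(sys.stdout, "encoding", None),
        # Encoding used to translate file names to and from bytes.
        "filesystem": sys.getfilesystemencoding(),
    }

info = describe_encodings()
for name, value in info.items():
    print(f"{name}: {value}")
```

Comparing this output on the local and the remote machine is a quick way to spot a mismatch before any file is transferred.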

Creating a UTF‑8 file and viewing it:

echo '话费余额查询' > utf8.txt
more utf8.txt
话费余额查询
xxd utf8.txt
0000000: e8af 9de8 b4b9 e4bd 99e9 a29d e69f a5e8 ...

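Converting the file to GBK and displaying it in the same UTF‑8 terminal shows garbled output, because the GBK bytes are not valid UTF‑8. Converting back to UTF‑8 restores the text: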

iconv -f utf8 -t gbk utf8.txt > gbk.txt
more gbk.txt
?   // unreadable characters
xxd gbk.txt
0000000: bbb0 b7d1 d3e0 b6ee b2e9 d1af 0a ...
iconv -f gbk -t utf8 gbk.txt
话费余额查询
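For readers who prefer a script to a shell session, the round trip above can be approximated in Python. This is a sketch, not a replacement for iconv: `convert_file` is a hypothetical helper, it reads the whole file into memory, and it uses temporary files rather than the article's utf8.txt and gbk.txt.

```python
import tempfile
from pathlib import Path

def convert_file(src: Path, dst: Path, from_enc: str, to_enc: str) -> None:
    """Rough equivalent of: iconv -f from_enc -t to_enc src > dst"""
    text = src.read_bytes().decode(from_enc)  # bytes -> characters (strict)
    dst.write_bytes(text.encode(to_enc))      # characters -> bytes

with tempfile.TemporaryDirectory() as tmp:
    utf8_file = Path(tmp) / "utf8.txt"
    gbk_file = Path(tmp) / "gbk.txt"
    # Same content as `echo '话费余额查询' > utf8.txt`, including the newline.
    utf8_file.write_bytes("话费余额查询\n".encode("utf-8"))

    convert_file(utf8_file, gbk_file, "utf-8", "gbk")
    gbk_hex = gbk_file.read_bytes().hex()             # compare with the xxd dump
    round_trip = gbk_file.read_bytes().decode("gbk")  # decode with the right table
```

Decoding is strict here, so a file that is not actually in `from_enc` raises `UnicodeDecodeError` instead of silently producing garbage.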

Encoding fundamentals

The mapping between binary numbers and characters is called an encoding. The collection of symbols is a character set. Two major character sets are:

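ANSI (ASCII‑based) – in practice a Windows term for the system's legacy code page, such as Windows‑1252 or, on Simplified Chinese systems, GBK. These code pages extend ASCII, which itself covers basic Latin letters, digits, and common punctuation.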

Unicode – a universal set that includes virtually every written symbol, including Chinese, Cyrillic, Thai, and historic scripts.

Common encodings include ASCII, the Unicode encodings UTF‑8 and UTF‑16, and the Chinese encodings GB2312, GBK, and GB18030. GBK extends GB2312, and GB18030 extends GBK to cover all of Unicode.
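To make the difference between a character and its encodings concrete, the following sketch counts how many bytes several encodings need for the same character. The helper names `encoded_lengths` and `fits_in_ascii` are hypothetical; the encoding names are the ones Python's standard codecs use.

```python
ENCODINGS = ["utf-8", "gbk", "gb18030", "utf-16-le"]

def encoded_lengths(ch: str) -> dict:
    """Map each encoding name to the number of bytes it uses for `ch`."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}

def fits_in_ascii(ch: str) -> bool:
    """True if `ch` exists in the ASCII table at all."""
    try:
        ch.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

latin = encoded_lengths("A")         # 1 byte everywhere except UTF-16
hanzi = encoded_lengths("话")         # 3 bytes in UTF-8, 2 in the others
code_point = f"U+{ord('话'):04X}"     # the Unicode number, independent of encoding
```

The code point identifies the character in the Unicode character set; each encoding is just a different way of writing that number down as bytes.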

Analyzing a mixed‑encoding example

echo 'a话费余额查询b' | iconv -f utf8 -t gbk | xxd
0000000: 61bb b0b7 d1d3 e0b6 eeb2 e9d1 af62 0a   a............b.

Explanation: bytes below 0x80 are plain ASCII, so 61 is ‘a’, 62 is ‘b’, and 0a is the line feed. A byte of 0x81 or above is the lead byte of a two‑byte GBK sequence: bb b0 together represent the Chinese character ‘话’, and the remaining five pairs encode the other five characters.
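The way a decoder walks through such a byte stream can be sketched as follows. This is a simplified illustration, not a full GBK decoder: `split_gbk` is a hypothetical helper that only looks at the lead byte (0x81 to 0xFE) and does not validate the trail byte.

```python
def split_gbk(raw: bytes) -> list:
    """Split a GBK byte string into per-character byte groups.

    Simplified rule: a byte below 0x80 is a single ASCII character;
    a byte in 0x81-0xFE is a lead byte that takes the next byte with it.
    Trail-byte ranges are not checked here.
    """
    groups = []
    i = 0
    while i < len(raw):
        width = 2 if 0x81 <= raw[i] <= 0xFE else 1
        groups.append(raw[i:i + width])
        i += width
    return groups

raw = "a话费余额查询b\n".encode("gbk")
groups = split_gbk(raw)
labels = [g.hex() for g in groups]                # hex of each character's bytes
decoded = "".join(g.decode("gbk") for g in groups)
```

Each group in `labels` corresponds to one character in the xxd dump: single bytes for the ASCII characters and byte pairs for the Chinese ones.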

Practical guidelines to avoid garbled text

Ensure the source and destination systems use the same encoding (check locale).

When transferring files, keep the bytes unchanged (use tar, zip, or binary‑mode FTP).

Install fonts that cover the required character ranges on both ends.

If the source is GBK and the target expects UTF‑8, convert beforehand with iconv -f gbk -t utf8 input.txt > output.txt.

On remote machines, you can also run iconv directly, provided you know the original and desired encodings.
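When the original encoding is not known for certain, a script can try a short list of candidates before converting. The sketch below is a heuristic under stated assumptions, not a reliable detector: `to_utf8` is a hypothetical helper, and the candidate list (UTF‑8 first, then GB18030 as a superset of GBK) is an assumption that fits the GBK/UTF‑8 scenario in this article.

```python
def to_utf8(raw: bytes, candidates=("utf-8", "gb18030")) -> tuple:
    """Return (encoding_used, utf8_bytes) for the first candidate that fits.

    UTF-8 is tried first because text in a legacy encoding is rarely
    valid UTF-8 by accident, while GBK-family decoders accept many
    arbitrary byte sequences. Order therefore matters.
    """
    for enc in candidates:
        try:
            text = raw.decode(enc)  # strict: raises on invalid bytes
        except UnicodeDecodeError:
            continue
        return enc, text.encode("utf-8")
    raise ValueError("none of the candidate encodings matched")

gbk_input = "话费余额查询\n".encode("gbk")
utf8_input = "话费余额查询\n".encode("utf-8")

guess_a, converted_a = to_utf8(gbk_input)    # falls through to gb18030
guess_b, converted_b = to_utf8(utf8_input)   # already UTF-8, unchanged
```

A successful decode only shows that the bytes are valid in that encoding, not that the result is the intended text, so the output still deserves a visual check.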

Work‑arounds for small files

Open the file in a terminal emulator that supports the correct encoding (e.g., CRT) and copy‑paste the content.

Use graphical editors with explicit encoding options, such as gedit --encoding=gbk file.txt, then use Save As and choose UTF‑8 as the character encoding.

For larger files, these manual methods become impractical, so automated conversion with iconv or packaging the file before transfer is recommended.
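For large files, conversion can also be done in chunks so that the whole file never sits in memory. A minimal sketch, assuming the source is GBK and the target is UTF‑8; `convert_stream` and its parameters are hypothetical names, and the demo data is generated in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path

def convert_stream(src, dst, from_enc="gbk", to_enc="utf-8", chunk_chars=64 * 1024):
    """Convert src to dst chunk by chunk instead of reading it all at once."""
    # Text-mode files decode incrementally, so a multibyte character that
    # straddles a chunk boundary is still handled correctly.
    # newline="" leaves the original line endings untouched.
    with open(src, "r", encoding=from_enc, newline="") as fin, \
         open(dst, "w", encoding=to_enc, newline="") as fout:
        shutil.copyfileobj(fin, fout, chunk_chars)

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "big_gbk.txt"
    dst = Path(tmp) / "big_utf8.txt"
    content = "话费余额查询\r\n" * 1000
    src.write_bytes(content.encode("gbk"))

    convert_stream(src, dst, chunk_chars=7)  # tiny chunks to exercise boundaries
    result = dst.read_bytes()
```

On the command line, iconv remains the simpler choice; a script like this is mainly useful when the conversion is one step in a larger automated job.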

Conclusion

Garbled text arises from mismatched encodings between the file’s binary representation and the system’s expected character set. By verifying locale settings, using consistent encodings, and converting when necessary, you can keep text readable across local and remote environments.
