Databases 14 min read

Why MySQL Shows Garbled Text and How to Fix It

This article explains the root causes of MySQL character‑set garbling, details the encoding and decoding steps during data storage and retrieval, and provides practical methods—including dump/reload and binary conversion—to prevent and repair corrupted text.

ITPUB
ITPUB
ITPUB
Why MySQL Shows Garbled Text and How to Fix It

MySQL garbled text causes

Garbling occurs when the character set used by the client, the MySQL server (character_set_client), and the table charset are not consistent during either insertion or retrieval, leading to mismatched encoding/decoding steps.

Encoding process when storing data

The data flow includes:

Input method in the terminal produces characters.

The terminal encodes characters into a binary stream.

The binary stream is sent to the MySQL server via the client.

The server decodes the stream using character_set_client.

If the client charset differs from the table charset, a conversion from client‑charset to table‑charset is performed.

The converted binary stream is written to the table file.

Encoding process when retrieving data

The server reads the binary stream from the table file.

The stream is decoded using the table charset.

The data is re‑encoded to character_set_client.

The server encodes it back to a binary stream.

The stream is transmitted to the client.

The client displays the result using its configured charset.

Root causes of garbled text

Inconsistent charsets between storage and retrieval steps (e.g., UTF‑8 input, GBK client).

Inconsistent charsets within a single three‑step process (e.g., terminal UTF‑8, server GBK, table UTF‑8).

Encoding/decoding in MySQL

MySQL must decode incoming binary streams to parse SQL syntax and must decode stored binary data to allow character‑level operations such as SELECT LEFT(col,2). Without these steps, the storage engine cannot manipulate character data.

Lossless vs. lossy conversion

A conversion is lossless only if the source character exists in the target charset and the target charset can represent unsupported characters without data loss (e.g., Latin‑1 may preserve original bytes). UTF‑8 to GBK is often lossy because GBK lacks many Unicode characters.

How to avoid garbled text

Ensure the client, character_set_client, and table charset are identical. This three‑way consistency eliminates garbling.

How to repair corrupted data

Two correct approaches are recommended:

Method 1: Dump & Reload

Export the data using the original (incorrect) charset, create a new table with the correct charset, then import the dump.

mysqldump -u root -p -d --skip-set-charset \
    --default-character-set=utf8 test charset_test_latin1 > data.sql
# Verify the dump file is UTF‑8 encoded
mysql -uroot -p -e "CREATE TABLE charset_test_latin1 (id INT PRIMARY KEY AUTO_INCREMENT, char_col VARCHAR(50)) CHARSET=utf8" test
mysql -uroot -p --default-character-set=utf8 test < data.sql

Method 2: Convert to Binary & Convert Back

Change the column to a binary type to preserve raw bytes, then alter it back to the desired charset.

ALTER TABLE charset_test_latin1 MODIFY COLUMN char_col VARBINARY(50);
ALTER TABLE charset_test_latin1 MODIFY COLUMN char_col VARCHAR(50) CHARACTER SET utf8;

Incorrect methods to avoid

Using ALTER TABLE ... CHARSET=xxx only changes the default charset for future columns and does not affect existing data. Using ALTER TABLE ... CONVERT TO CHARACTER SET ... on a table that already contains garbled data can further corrupt the data.

encodingMySQLCharacter Setgarbled textcharset conversion
ITPUB
Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.