Databases 7 min read

Understanding MySQL utf8 vs utf8mb4 and How to Store Emoji Correctly

This article explains why MySQL's legacy utf8 charset cannot store four‑byte characters such as emoji, demonstrates the resulting insertion error, and shows how switching the database, table, and column collations to utf8mb4 resolves the issue while also covering the historical reasons behind MySQL's limited utf8 implementation.

Laravel Tech Community
Laravel Tech Community
Laravel Tech Community
Understanding MySQL utf8 vs utf8mb4 and How to Store Emoji Correctly

When attempting to insert emoji characters directly into a MySQL table using the default utf8 charset, the operation fails with an error like Incorrect string value: '\xF0\x9F\x98\x93' for column 'NAME'. The article reproduces this problem with an example INSERT statement:

INSERT INTO `csjdemo`.`student` (`ID`, `NAME`, `SEX`, `AGE`, `CLASS`, `GRADE`, `HOBBY`) VALUES ('20', '陈哈哈😓', '男', '20', '181班', '9年级', '看片儿');

After changing the database, table, and column collations to utf8mb4, the same statement succeeds, allowing the emoji to be stored:

INSERT INTO `student` (`ID`, `NAME`, `SEX`, `AGE`, `CLASS`, `GRADE`, `HOBBY`) VALUES (null, '陈哈哈😓😓', '男', '20', '181班', '9年级', '看片儿');

MySQL's original utf8 charset is not true UTF‑8; it only supports up to three bytes per character, which excludes four‑byte characters like most emoji and some rare Chinese glyphs. The true UTF‑8 support was later added as the utf8mb4 charset in 2010.

The article also provides a brief history: MySQL adopted an early UTF‑8 standard (RFC 2279) that allowed up to six bytes, then deliberately limited its own utf8 to three bytes for performance reasons, resulting in a long‑standing incompatibility. Because changing the charset for existing data would be disruptive, MySQL never fixed the bug but instead introduced utf8mb4 as a separate charset.

Finally, the author urges developers to always use utf8mb4 (or equivalent) for new MySQL/MariaDB databases to avoid the hidden pitfalls of the legacy utf8 charset.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

EmojiencodingmysqlCharacter Setutf8mb4
Laravel Tech Community
Written by

Laravel Tech Community

Specializing in Laravel development, we continuously publish fresh content and grow alongside the elegant, stable Laravel framework.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.