
Why MySQL’s utf8 Fails with Emojis and How utf8mb4 Solves It

This article explains the difference between MySQL’s utf8 and utf8mb4 character sets, why utf8 cannot store emojis or complex Chinese characters, and provides step‑by‑step examples showing how to configure tables and columns with utf8mb4 to avoid encoding errors.

Java Backend Technology

What Is a Character Set?

Characters include letters, symbols, emojis, numbers, etc. A character set is a collection of characters that can be represented, and each set defines a range of characters it can encode.

Computers store data as binary. Mapping characters to binary is called character encoding, and the reverse is character decoding.
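As a small illustration of encoding and decoding, the Python sketch below turns a string into UTF‑8 bytes and back again. The sample text is an arbitrary choice for this example, not something taken from a real dataset.

```python
# Encoding maps characters to bytes; decoding maps bytes back to characters.
text = "A哥"

encoded = text.encode("utf-8")      # str -> bytes
decoded = encoded.decode("utf-8")   # bytes -> str

print(encoded.hex())      # hexadecimal view of the bytes that would be stored
print(len(encoded))       # total number of bytes for the two characters
print(decoded == text)    # the round trip restores the original text
```

The Latin letter takes one byte and the Chinese character takes three, which is why the byte count is larger than the character count.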

Common Character Sets

ASCII – 128 characters, mainly English.

GB2312 – ~6,700 Chinese characters, does not cover rare or traditional characters.

GBK – Extension of GB2312, >20,000 Chinese characters.

GB18030 – Fully compatible with GB2312 and GBK, includes minority scripts and over 70,000 Chinese characters.

BIG5 – Focused on Traditional Chinese, ~13,000 characters.

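Unicode & UTF‑8 – Unicode is a character set that aims to cover virtually all known characters; UTF‑8 is its most widely used encoding and uses 1 to 4 bytes per character.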

Using the wrong encoding to view a file causes garbled text; for example, interpreting GB2312‑encoded data with UTF‑8 yields nonsense characters.
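The mismatch described above can be reproduced in a few lines of Python. This is only a sketch: the sample string is an assumption, and real tools differ in how they handle bytes they cannot decode (some raise an error, others substitute placeholder characters).

```python
original = "中文"
raw = original.encode("gb2312")   # the bytes a GB2312-encoded file would contain

# Strict UTF-8 decoding rejects these bytes outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lenient reader substitutes U+FFFD for whatever it cannot decode.
garbled = raw.decode("utf-8", errors="replace")

print(strict_ok)              # whether strict decoding succeeded
print(garbled)                # what a lenient viewer would display
print(raw.decode("gb2312"))   # the matching codec recovers the text
```

The bytes themselves are never damaged here; only the interpretation is wrong, which is why decoding with the matching character set recovers the original text.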

MySQL Character Sets

MySQL supports many encodings such as UTF‑8, GB2312, GBK, BIG5. You can list them with the SHOW CHARSET command.

UTF‑8 is the recommended default, but MySQL provides two character sets that go by that name:

utf8 – An alias for utf8mb3, which stores at most 3 bytes per character. Common Chinese characters take 3 bytes and fit, but emojis and rare Chinese characters outside the Basic Multilingual Plane need 4 bytes and therefore cannot be stored.

utf8mb4 – The complete UTF‑8 implementation, storing up to 4 bytes per character, so it can hold emojis and every other Unicode character.
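To make the byte counts concrete, the following Python sketch measures how many bytes individual characters need in UTF‑8. The helper fits_mysql_utf8 is a hypothetical name introduced for this example; it only approximates the 3-byte limit described above and is not part of MySQL or any MySQL driver.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_mysql_utf8(text: str) -> bool:
    """Hypothetical helper: True if no character needs more than 3 UTF-8 bytes."""
    return all(utf8_len(ch) <= 3 for ch in text)


# One ASCII letter, one common Chinese character, one emoji.
for ch in "g哥😘":
    print(f"U+{ord(ch):04X} needs {utf8_len(ch)} byte(s)")

print(fits_mysql_utf8("guide哥"))     # only 1-byte and 3-byte characters
print(fits_mysql_utf8("guide哥😘"))   # contains a 4-byte character
```

A check like this could run in application code before writing to a legacy utf8 column, though switching the column to utf8mb4 is usually the simpler fix.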

If you need to store emojis or complex Chinese characters, set the database/table/column charset to utf8mb4 instead of utf8 to avoid errors.

Demonstration (MySQL 5.7+)

Creating a table with the utf8 charset:

-- utf8 here is MySQL's 3-byte character set, so 4-byte characters will not fit
CREATE TABLE `user` (
  `id` varchar(66) CHARACTER SET utf8 NOT NULL,
  `name` varchar(33) CHARACTER SET utf8 NOT NULL,
  `phone` varchar(33) CHARACTER SET utf8 DEFAULT NULL,
  `password` varchar(100) CHARACTER SET utf8 DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Inserting a row that contains emojis while the table uses utf8 results in an error:

INSERT INTO `user` (`id`,`name`,`phone`,`password`) VALUES
('A00003','guide哥😘😘😘','181631312312','123456');

MySQL reports:

Incorrect string value: '\xF0\x9F\x98\x98\xF0\x9F...' for column 'name' at row 1
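The escaped bytes at the start of that message can be decoded to see exactly which character was rejected. The Python sketch below uses only the first four bytes quoted in the error; the rest of the message is truncated and is left as-is.

```python
# First four bytes quoted in the error message: \xF0\x9F\x98\x98
rejected = bytes.fromhex("F09F9898")

ch = rejected.decode("utf-8")
print(ch)                    # the character MySQL refused to store
print(f"U+{ord(ch):04X}")    # its Unicode code point
print(len(rejected))         # number of bytes it needs in UTF-8
```

The character is the first emoji from the INSERT statement, and it needs four bytes, one more than a utf8 column accepts.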

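Changing the charset of the table and its columns to utf8mb4 (for example with ALTER TABLE `user` CONVERT TO CHARACTER SET utf8mb4) resolves the issue. The client connection must also use utf8mb4, otherwise 4-byte characters can still be rejected or corrupted before they reach the table.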
