Databases 7 min read

Why You Should Never Use MySQL “utf8” and Switch to “utf8mb4”

The article explains that MySQL’s legacy “utf8” charset stores at most three bytes per character, so genuine four‑byte UTF‑8 characters such as emoji cause errors, and advises all MySQL/MariaDB users to migrate to the proper “utf8mb4” charset using available conversion guides.

Architect's Guide

Storing a UTF‑8 string that contains an emoji in a MariaDB database configured with the "utf8" charset can fail with an error such as "Incorrect string value", because MySQL’s "utf8" is not true UTF‑8.

MySQL’s "utf8" charset limits each character to a maximum of three bytes, whereas the official UTF‑8 standard (RFC 3629) allows up to four bytes. Three bytes are enough only for code points up to U+FFFF, so every character above that range, including most emoji, cannot be stored.
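
The three‑byte ceiling can be checked without a database. The following is a minimal Python sketch, not part of the original article: it prints how many bytes a few sample characters need in UTF‑8 and tests whether a string would fit in the legacy "utf8" charset. The helper names utf8_len and fits_legacy_utf8 are hypothetical and chosen only for illustration.

```python
# Sketch: bytes needed per character in UTF-8, and a check for characters
# that exceed the three-byte ceiling of MySQL's legacy "utf8" charset.


def utf8_len(ch: str) -> int:
    """Return the number of bytes the single character `ch` occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_legacy_utf8(text: str) -> bool:
    """Return True if every character encodes to at most three UTF-8 bytes.

    Hypothetical helper: code points above U+FFFF need four bytes in UTF-8,
    which the legacy "utf8" charset cannot store.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    # Sample characters: Latin letter, accented letter, euro sign, emoji.
    for ch in ["a", "\u00e9", "\u20ac", "\U0001F603"]:
        print(f"U+{ord(ch):04X} needs {utf8_len(ch)} byte(s)")

    print(fits_legacy_utf8("caf\u00e9 \u20ac5"))    # only one- to three-byte characters
    print(fits_legacy_utf8("hello \U0001F603"))     # contains a four-byte character
```

A check like fits_legacy_utf8 could be used to find existing application data that a "utf8" column would reject, but the real fix remains switching the column to "utf8mb4".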

To address this, MySQL introduced the "utf8mb4" charset in 2010 (version 5.5.3), which fully implements UTF‑8 by supporting four‑byte characters. The article strongly recommends that all MySQL and MariaDB users abandon the legacy "utf8" charset and switch to "utf8mb4".

The piece also provides a brief history: MySQL added UTF‑8 support in version 4.1 (2003), when the standard in force was still the older RFC 2279, which allowed up to six bytes per character. The developers nevertheless restricted the charset to three bytes, apparently for performance reasons tied to fixed‑length CHAR columns, which produced the problematic "utf8" implementation.

Because the incorrect charset was widely documented, many developers continue to use it, resulting in wasted space, slower performance, and data loss for characters outside the three‑byte range.

For migration, the article points to a guide that explains how to convert existing databases from "utf8" to "utf8mb4": https://mathiasbynens.be/notes/mysql-utf8mb4#utf8-to-utf8mb4
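
As a rough illustration of what such a conversion involves, the Python sketch below only builds the SQL statements for a database and its tables; it does not connect to a server or run anything. The function names, the database name "blog", the table names, and the utf8mb4_unicode_ci collation are assumptions made for this example, not details from the article. A real migration also needs a backup first and further steps (connection charset, index key lengths, server configuration), for which the linked guide should be followed.

```python
# Sketch: generate the ALTER statements commonly used to move a schema
# from "utf8" to "utf8mb4". Nothing here talks to a database.
import re


def quote_ident(name: str) -> str:
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def conversion_statements(database, tables, collation="utf8mb4_unicode_ci"):
    """Return SQL strings converting `database` and each table to utf8mb4.

    All names are supplied by the caller; the default collation is an
    assumption and should be chosen to suit the application.
    """
    # Accept only plain utf8mb4 collation names, since the value is
    # interpolated into the SQL text unquoted.
    if not re.fullmatch(r"utf8mb4_[A-Za-z0-9_]+", collation):
        raise ValueError(f"not a utf8mb4 collation: {collation!r}")

    statements = [
        f"ALTER DATABASE {quote_ident(database)} "
        f"CHARACTER SET = utf8mb4 COLLATE = {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE {quote_ident(table)} "
            f"CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical schema used only to show the output shape.
    for sql in conversion_statements("blog", ["posts", "comments"]):
        print(sql)
```

Generating the statements as text keeps the sketch reviewable: the output can be inspected, adjusted per table, and run on a test copy of the database before touching production.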

Tags: Database, MySQL, character encoding, utf8mb4, MariaDB
Written by Architect's Guide

Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.
