Why MySQL 8’s Default utf8mb4 Breaks Legacy utf8 Tables – A Deep Dive

Upgrading MySQL 5.6/5.7 databases to 8.0 can introduce charset mismatches between existing utf8 (utf8mb3) tables and newly created utf8mb4 tables. Joins across the two can no longer use their indexes and slow down noticeably; unifying the character set across all tables resolves the problem.

ITPUB

Many projects still run on MySQL 5.6 or 5.7 with default charset=utf8 (utf8mb3). When these databases are upgraded to MySQL 8.0, new tables inherit the server default utf8mb4, creating a mismatch between old and new tables.
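One way to see how far a schema is affected is to list the columns that still use the legacy charset. The query below is a minimal sketch, not taken from the original article. It assumes the schema to audit is the current default database (DATABASE()), and it matches both 'utf8' and 'utf8mb3' because MySQL 8.0 releases differ in which name they report.

```sql
-- List string columns in the current schema that still use the 3-byte charset.
-- Assumption: the schema to audit is the one selected with USE.
SELECT TABLE_NAME,
       COLUMN_NAME,
       CHARACTER_SET_NAME,
       COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND CHARACTER_SET_NAME IN ('utf8', 'utf8mb3')
ORDER BY TABLE_NAME, ORDINAL_POSITION;
```

Checking at column level matters because a column can carry its own CHARACTER SET clause that differs from the table default.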

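The article builds two sample tables to illustrate the problem. The orders table is created with ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci, while the payments table uses ENGINE=InnoDB DEFAULT CHARSET=utf8mb3. Note that the string columns carry their own column-level CHARACTER SET clauses, which override the table default: the orders columns are declared utf8 (utf8mb3) and the payments columns utf8mb4, so the customernumber and customerNumber columns end up with different charsets: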

CREATE TABLE `orders` (
  `ordernumber` varchar(200) CHARACTER SET utf8 NOT NULL,
  `orderDate` date NOT NULL,
  `requiredDate` date NOT NULL,
  `shippedDate` date DEFAULT NULL,
  `status` varchar(15) CHARACTER SET utf8 NOT NULL,
  `comments` text CHARACTER SET utf8,
  `customernumber` varchar(200) CHARACTER SET utf8 DEFAULT NULL,
  PRIMARY KEY (`ordernumber`),
  KEY `customerNumber` (`customernumber`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
CREATE TABLE `payments` (
  `customerNumber` varchar(200) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL,
  `checkNumber` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NOT NULL,
  `paymentDate` date NOT NULL,
  `amount` decimal(10,2) NOT NULL,
  PRIMARY KEY (`customerNumber`,`checkNumber`),
  KEY `idx_payment` (`paymentDate`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3;

Two LEFT JOIN queries are executed, swapping the driver table. The first query uses payments as the driver (table default utf8mb3) and the second uses orders as the driver (table default utf8mb4). EXPLAIN output shows that when the charset on the driver side differs from that of the joined table, MySQL has to convert one side before comparing, cannot use the index on the converted column, and falls back to a hash join, dramatically increasing execution time.
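The summary does not reproduce the two queries, so the statements below are only a sketch of what such a pair might look like. The join on the customer number columns and the selected columns are assumptions based on the table definitions above.

```sql
-- Query 1 (assumed shape): payments drives, orders is the joined table.
EXPLAIN
SELECT p.customerNumber, p.checkNumber, o.ordernumber
FROM payments AS p
LEFT JOIN orders AS o
       ON o.customernumber = p.customerNumber;

-- Running SHOW WARNINGS right after EXPLAIN prints the rewritten query.
-- With mixed charsets it typically shows a convert(... using utf8mb4)
-- wrapped around the utf8mb3 column, which is the implicit conversion
-- that keeps the index on that column from being used.
SHOW WARNINGS;

-- Query 2 (assumed shape): orders drives, payments is the joined table.
EXPLAIN
SELECT o.ordernumber, p.checkNumber, p.amount
FROM orders AS o
LEFT JOIN payments AS p
       ON p.customerNumber = o.customernumber;

SHOW WARNINGS;
```

In the EXPLAIN output, the key column of the joined table and the Extra column are the places to look for a missing index or a hash join.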

Performance measurements confirm the impact: with payments as driver, execution time is about 700 ms; with orders as driver, it rises to roughly 1742 ms.
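The article does not say how the timings were taken. One possible way to collect comparable numbers is EXPLAIN ANALYZE, available from MySQL 8.0.18, which executes the statement and reports the actual time per plan step. The query shape below is the same assumed join as before, not the article's own statement.

```sql
-- EXPLAIN ANALYZE really runs the query, so use it on a test copy
-- if the tables are large.
EXPLAIN ANALYZE
SELECT o.ordernumber, p.checkNumber, p.amount
FROM orders AS o
LEFT JOIN payments AS p
       ON p.customerNumber = o.customernumber;
```

Running the same statement before and after the charset conversion gives a like-for-like comparison of the plan and its timings.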

To resolve the issue, the article converts the tables so that both share the same charset. The commands used are:

ALTER TABLE orders CONVERT TO CHARACTER SET utf8mb4;
ALTER TABLE payments CONVERT TO CHARACTER SET utf8mb4;

After conversion, the EXPLAIN plans show proper index usage and the execution time improves significantly, demonstrating that a unified charset (preferably utf8mb4) restores optimal query performance.
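A quick way to confirm that the conversion took effect is to inspect the join columns directly. This is a small sketch using the table and column names from the definitions above; the Collation column of each result row should show the same value for both tables.

```sql
-- Show charset-related metadata for the join column of each table.
SHOW FULL COLUMNS FROM orders   LIKE 'customernumber';
SHOW FULL COLUMNS FROM payments LIKE 'customerNumber';
```

If the two collations still differ, the comparison in the join still needs an implicit conversion, and the earlier plan is likely to reappear.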

The key takeaway is that after a migration from MySQL 5.6/5.7 to 8.0, many tables remain in utf8mb3 while new tables default to utf8mb4; without explicit conversion, joins across the mixed charsets cannot use their indexes and queries slow down. Keeping the charset consistent across the schema, ideally utf8mb4, prevents these problems.
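For a schema with many legacy tables, the conversion statements can be generated rather than typed by hand. The following is a hedged sketch, not part of the original article. It assumes the current default database is the one to convert and that utf8mb4 with the utf8mb4_0900_ai_ci collation is the desired target.

```sql
-- Generate one ALTER statement per base table that still has utf8mb3 columns.
-- The output is text to review and run deliberately, not executed here.
SELECT DISTINCT
       CONCAT('ALTER TABLE `', c.TABLE_SCHEMA, '`.`', c.TABLE_NAME,
              '` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;') AS stmt
FROM information_schema.COLUMNS AS c
JOIN information_schema.TABLES AS t
  ON t.TABLE_SCHEMA = c.TABLE_SCHEMA
 AND t.TABLE_NAME   = c.TABLE_NAME
WHERE c.TABLE_SCHEMA = DATABASE()
  AND t.TABLE_TYPE = 'BASE TABLE'          -- skip views
  AND c.CHARACTER_SET_NAME IN ('utf8', 'utf8mb3');
```

CONVERT TO CHARACTER SET rebuilds the table and may widen VARCHAR or TEXT column types to fit the larger charset, so it is worth reviewing the generated statements and scheduling them for a quiet period.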
