Why MySQL Unique Indexes Fail with NULL and How to Fix Them

This article explains why a unique index in MySQL can still allow duplicate rows when indexed columns contain NULL, explores the challenges of adding unique indexes to logically deleted tables, and presents practical solutions such as incremental delete status, timestamps, extra IDs, hash fields, and proper batch insertion techniques.

Java Backend Technology

Oct 22, 2022

Introduction

Recently I encountered a problem: even after adding a UNIQUE index to an InnoDB table in MySQL 8, duplicate data still appeared. This article shares the experience and discusses interesting aspects of unique indexes.

1. Reproducing the Issue

To prevent duplicate product groups, I created a product_group_unique table with the following structure:

CREATE TABLE `product_group_unique` (
  `id` bigint NOT NULL,
  `category_id` bigint NOT NULL,
  `unit_id` bigint NOT NULL,
  `model_hash` varchar(255) COLLATE utf8mb4_bin DEFAULT NULL,
  `in_date` datetime NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin;

I added a unique index on (category_id, unit_id, model_hash) to guarantee uniqueness:

ALTER TABLE product_group_unique ADD UNIQUE INDEX ux_category_unit_model (category_id, unit_id, model_hash);

Although the index prevented duplicates when model_hash had a value, inserting rows where model_hash was NULL succeeded, resulting in duplicate records.
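The behavior can be reproduced with a handful of inserts against the table above. The id values, key values, and dates below are made up for illustration.

```sql
-- Same key with a non-NULL model_hash: the second insert is rejected.
INSERT INTO product_group_unique (id, category_id, unit_id, model_hash, in_date)
VALUES (1, 10, 100, 'abc', NOW());
INSERT INTO product_group_unique (id, category_id, unit_id, model_hash, in_date)
VALUES (2, 10, 100, 'abc', NOW());  -- fails with error 1062 (duplicate entry)

-- Same category_id and unit_id, but model_hash is NULL: both inserts succeed.
INSERT INTO product_group_unique (id, category_id, unit_id, model_hash, in_date)
VALUES (3, 10, 100, NULL, NOW());
INSERT INTO product_group_unique (id, category_id, unit_id, model_hash, in_date)
VALUES (4, 10, 100, NULL, NOW());

-- GROUP BY puts NULLs in one group, so this query exposes the duplicates
-- that the unique index let through.
SELECT category_id, unit_id, model_hash, COUNT(*) AS cnt
FROM product_group_unique
GROUP BY category_id, unit_id, model_hash
HAVING COUNT(*) > 1;
```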

2. Unique Index Columns Containing NULL

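MySQL treats NULL values as distinct for unique constraints: NULL = NULL evaluates to NULL rather than true, so two rows that match on every other indexed column and both hold NULL in the same column are not considered duplicates. A unique index therefore accepts any number of such rows, and the uniqueness guarantee fails whenever model_hash can be NULL.

In short, a unique index enforces uniqueness reliably only when every indexed column is declared NOT NULL.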
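One common way to close this gap is to forbid NULL in the indexed column and store a sentinel value instead. The sketch below assumes the empty string never occurs as a real model_hash; that sentinel is an assumption for illustration, not something taken from the original incident.

```sql
-- Step 1: replace existing NULLs with the sentinel.
-- If NULL duplicates already exist, this UPDATE fails with a duplicate-key
-- error because the unique index now applies to them; clean those up first.
UPDATE product_group_unique
SET model_hash = ''
WHERE model_hash IS NULL;

-- Step 2: forbid NULL from now on so the unique index covers every row.
ALTER TABLE product_group_unique
  MODIFY `model_hash` varchar(255) COLLATE utf8mb4_bin NOT NULL DEFAULT '';
```

After this change, a second row with the same category_id and unit_id and no model hash collides on the key (category_id, unit_id, '') and is rejected.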

3. Unique Index on Logically Deleted Tables

Logical deletion (using an UPDATE to set a delete_status flag) keeps rows in the table, which makes adding a unique index problematic because the deleted rows still occupy the unique key space.

Typical solutions include:

3.1 Incremental Delete Status

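Instead of a binary flag, let delete_status hold one fixed value for the live row (for example 0) and an incrementing value (1, 2, 3, …) for deleted rows. Each logical delete of the same business key writes the next unused value, so the combination of business fields and delete_status remains unique.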
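A minimal sketch of this scheme, assuming a hypothetical table product_group_status_demo in which delete_status = 0 marks the live row. The table, index name, and sample values are illustrative and not part of the original schema.

```sql
CREATE TABLE product_group_status_demo (
  id bigint NOT NULL,
  category_id bigint NOT NULL,
  unit_id bigint NOT NULL,
  delete_status int NOT NULL DEFAULT 0,  -- 0 = live, N > 0 = Nth deleted copy
  PRIMARY KEY (id),
  UNIQUE KEY ux_category_unit_status (category_id, unit_id, delete_status)
) ENGINE=InnoDB;

INSERT INTO product_group_status_demo (id, category_id, unit_id)
VALUES (1, 10, 100);

-- Logical delete of row 1: find the next unused status for its business key.
SELECT COALESCE(MAX(delete_status), 0) + 1 INTO @next_status
FROM product_group_status_demo
WHERE category_id = 10 AND unit_id = 100;

UPDATE product_group_status_demo
SET delete_status = @next_status
WHERE id = 1 AND delete_status = 0;

-- The key (10, 100, 0) is free again, so the same group can be re-created.
INSERT INTO product_group_status_demo (id, category_id, unit_id)
VALUES (2, 10, 100);
```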

3.2 Timestamp Field

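Add a timestamp column and include it in the unique index. Live rows keep a fixed default value (not NULL, which would reintroduce the problem from section 2), and each logical delete writes the current time. Use a precision finer than one second, because two deletes of the same business key within the same second would otherwise collide.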
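The timestamp variant might look like the following sketch. The table name, the deleted_at column, and the '1970-01-01' sentinel for live rows are assumptions chosen for illustration.

```sql
CREATE TABLE product_group_ts_demo (
  id bigint NOT NULL,
  category_id bigint NOT NULL,
  unit_id bigint NOT NULL,
  -- Live rows share one fixed sentinel; the column is NOT NULL on purpose.
  deleted_at datetime(3) NOT NULL DEFAULT '1970-01-01 00:00:00.000',
  PRIMARY KEY (id),
  UNIQUE KEY ux_category_unit_deleted_at (category_id, unit_id, deleted_at)
) ENGINE=InnoDB;

INSERT INTO product_group_ts_demo (id, category_id, unit_id)
VALUES (1, 10, 100);

-- Logical delete: stamp the row with the current time at millisecond precision.
UPDATE product_group_ts_demo
SET deleted_at = NOW(3)
WHERE id = 1 AND deleted_at = '1970-01-01 00:00:00.000';
```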

3.3 Additional ID Field

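Introduce a separate delete_id column and include it in the unique index alongside the business fields. Live rows keep a fixed default value; a logical delete sets delete_id to a value that is unique per row, such as the row's own primary key.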
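A sketch of the delete_id approach, assuming a hypothetical table in which delete_id = 0 marks the live row and no row ever has id = 0:

```sql
CREATE TABLE product_group_delete_id_demo (
  id bigint NOT NULL,
  category_id bigint NOT NULL,
  unit_id bigint NOT NULL,
  delete_id bigint NOT NULL DEFAULT 0,  -- 0 = live, otherwise the row's own id
  PRIMARY KEY (id),
  UNIQUE KEY ux_category_unit_delete_id (category_id, unit_id, delete_id)
) ENGINE=InnoDB;

INSERT INTO product_group_delete_id_demo (id, category_id, unit_id)
VALUES (1, 10, 100);

-- Logical delete: the primary key is unique, so the deleted row can never
-- collide with another deleted row or with the live row.
UPDATE product_group_delete_id_demo
SET delete_id = id
WHERE id = 1 AND delete_id = 0;
```

Compared with the incrementing status, this needs no lookup of the previous maximum before the delete.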

4. Adding a Unique Index When Historical Duplicates Exist

If a table already contains duplicate historical data, a unique index cannot be added directly. One option is to create a new “anti‑duplicate” table, migrate the distinct rows into it, and add the unique index to the original table only after the duplicates have been cleaned up. Alternatively, add a delete_id column: within each group of duplicates, keep the row with the largest id at the default delete_id and set delete_id on every other row to that row's own id, then create the unique index on the business columns plus delete_id.
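The delete_id route can be sketched as follows. The table product_group_legacy, its columns, and the choice of 0 as the default delete_id are hypothetical; the sketch also assumes the row with the largest id in each duplicate group is the one to keep at the default.

```sql
-- Hypothetical legacy table that already holds duplicates.
CREATE TABLE product_group_legacy (
  id bigint NOT NULL,
  category_id bigint NOT NULL,
  unit_id bigint NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

INSERT INTO product_group_legacy (id, category_id, unit_id)
VALUES (1, 10, 100), (2, 10, 100), (3, 10, 100), (4, 20, 200);

-- Step 1: add the disambiguating column; every row starts at the default.
ALTER TABLE product_group_legacy
  ADD COLUMN delete_id bigint NOT NULL DEFAULT 0;

-- Step 2: in each duplicate group, leave the largest id at the default and
-- give every other row its own id as delete_id.
UPDATE product_group_legacy AS p
JOIN (
  SELECT category_id, unit_id, MAX(id) AS keep_id
  FROM product_group_legacy
  GROUP BY category_id, unit_id
  HAVING COUNT(*) > 1
) AS d
  ON p.category_id = d.category_id AND p.unit_id = d.unit_id
SET p.delete_id = p.id
WHERE p.id <> d.keep_id;

-- Step 3: the combined key is now unique, so the index can be created.
ALTER TABLE product_group_legacy
  ADD UNIQUE INDEX ux_category_unit_delete_id (category_id, unit_id, delete_id);
```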

5. Unique Index on Large Columns

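When a column (e.g., model) is too large to be indexed in full, it cannot take part in a unique index directly. InnoDB limits an index key to 3072 bytes for the DYNAMIC and COMPRESSED row formats and to 767 bytes for REDUNDANT and COMPACT; the frequently quoted 1000‑byte figure is the MyISAM limit. In that situation, consider: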

5.1 Adding a Hash Column

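Store a short hash of the large field and create the unique index on the hash together with other identifying columns. Be aware of possible hash collisions: two different values that produce the same hash would be rejected as duplicates, so choose a hash long enough to make this negligible.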
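As an illustration, the hash can be computed with MySQL's built-in SHA2 function at insert time. The table name, the model column type, and the choice of SHA-256 are assumptions for this sketch; the application could compute the hash itself instead.

```sql
CREATE TABLE product_model_demo (
  id bigint NOT NULL,
  category_id bigint NOT NULL,
  unit_id bigint NOT NULL,
  model text NOT NULL,            -- too large to index in full
  model_hash char(64) NOT NULL,   -- SHA2(..., 256) yields 64 hex characters
  PRIMARY KEY (id),
  UNIQUE KEY ux_category_unit_hash (category_id, unit_id, model_hash)
) ENGINE=InnoDB;

INSERT INTO product_model_demo (id, category_id, unit_id, model, model_hash)
VALUES (1, 10, 100, 'a very long model description',
        SHA2('a very long model description', 256));

-- Same model text under the same category and unit: rejected by the index.
INSERT INTO product_model_demo (id, category_id, unit_id, model, model_hash)
VALUES (2, 10, 100, 'a very long model description',
        SHA2('a very long model description', 256));  -- error 1062
```

Declaring model_hash as NOT NULL keeps the NULL loophole from section 2 closed.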

5.2 Not Adding a Unique Index

Rely on application‑level controls such as single‑threaded insertion or message‑queue processing to prevent duplicates.

5.3 Redis Distributed Lock

Generate a hash from the combination of fields and use it as a Redis lock key during insertion to avoid concurrent duplicates.

6. Batch Insertion

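Instead of locking each row individually, rely on the unique index and send the rows in a single multi‑row INSERT. Note that a plain multi‑row INSERT fails as a whole as soon as one row violates the unique index; to skip or merge the duplicates within one statement, use INSERT IGNORE or INSERT ... ON DUPLICATE KEY UPDATE. Either way the database enforces uniqueness, which is both simpler and faster than per‑row locking.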
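The three variants behave differently when a batch contains a duplicate. The sketch below reuses product_group_unique with its unique index in place; the ids and key values are made up, and the statements are alternatives rather than a script to run in sequence.

```sql
-- Variant 1: plain multi-row INSERT. Row 103 repeats the key of row 101,
-- so the statement fails with error 1062 and InnoDB stores none of the rows.
INSERT INTO product_group_unique (id, category_id, unit_id, model_hash, in_date)
VALUES (101, 20, 200, 'h1', NOW()),
       (102, 20, 200, 'h2', NOW()),
       (103, 20, 200, 'h1', NOW());

-- Variant 2: INSERT IGNORE. Rows 101 and 102 are stored; row 103 is skipped
-- and reported as a warning. IGNORE also downgrades some other errors to
-- warnings, so check the warnings afterwards.
INSERT IGNORE INTO product_group_unique (id, category_id, unit_id, model_hash, in_date)
VALUES (101, 20, 200, 'h1', NOW()),
       (102, 20, 200, 'h2', NOW()),
       (103, 20, 200, 'h1', NOW());
SHOW WARNINGS;

-- Variant 3: ON DUPLICATE KEY UPDATE. A conflicting row updates the existing
-- one instead of failing; here the update is a deliberate no-op.
INSERT INTO product_group_unique (id, category_id, unit_id, model_hash, in_date)
VALUES (101, 20, 200, 'h1', NOW()),
       (102, 20, 200, 'h2', NOW()),
       (103, 20, 200, 'h1', NOW())
ON DUPLICATE KEY UPDATE in_date = in_date;
```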
