How I Rescued a MySQL Table After Massive Bad Data Insertion

When a user reported errors, I discovered a MySQL table flooded with millions of bad rows from a new group, causing slow queries and crashes; I detail the step‑by‑step investigation, failed attempts with DELETE and lock‑free changes, and the final successful rebuild using temporary tables, DROP, TRUNCATE, and index reconstruction.

Alibaba Cloud Developer

1. Introduction

As a Java developer I assumed MySQL was just about CRUD and simple DDL, but a production incident revealed deeper complexities. A new bad_group was imported by a DataWorks job, inserting tens of millions of rows, most of which were garbage, causing query timeouts and errors.

"Scatter a handful of freeze-dried treats on the keyboard and even my cat could show up for work." (粥师傅)

2. Problem Investigation and Fix Process

2.1 Initial Problem

The table schema is roughly (id, group, code, name, property1, property2, ...) with primary key on id and a unique key on group + code. The DataWorks job used INSERT IGNORE, so duplicate group+code rows were ignored.
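To make the key layout concrete, here is a minimal sketch of a table with the same shape. The table name MY_TABLE_DEMO, the key name uk_group_code, and every column type except that of id are assumptions for illustration; the article only gives the column list and the two keys.

```sql
-- Hypothetical demo table mirroring the described layout.
-- `group` is a reserved word in MySQL, so it is quoted with backticks.
CREATE TABLE MY_TABLE_DEMO (
  `id`    BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  `group` VARCHAR(64)  NOT NULL,   -- assumed type
  `code`  VARCHAR(64)  NOT NULL,   -- assumed type
  `name`  VARCHAR(255) NULL,       -- assumed type
  PRIMARY KEY (`id`),
  UNIQUE KEY uk_group_code (`group`, `code`)
);

-- The second insert violates uk_group_code; INSERT IGNORE downgrades the
-- error to a warning and skips the row instead of failing the statement.
INSERT IGNORE INTO MY_TABLE_DEMO (`group`, `code`, `name`) VALUES ('g1', 'c1', 'first');
INSERT IGNORE INTO MY_TABLE_DEMO (`group`, `code`, `name`) VALUES ('g1', 'c1', 'second');

SELECT COUNT(*) FROM MY_TABLE_DEMO;  -- 1 row: the duplicate was ignored
```

The deduplication therefore depends entirely on the unique key being present; without it, INSERT IGNORE has nothing to ignore.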

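When the bad data arrived, the first instinct was to delete every row belonging to bad_group. GROUP is a reserved word in MySQL, so the column name has to be quoted with backticks: DELETE FROM MY_TABLE WHERE `group` = 'bad_group'; However, deleting tens of millions of rows in a single statement exceeded the binlog limit and failed.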

2.2 Lock‑Free Change Attempt

We tried the platform’s lock‑free change feature, which runs the same DELETE in batches. The estimated execution time was over two hours, so the operation was aborted.
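For readers without such a platform feature, batched deletion can be approximated by hand. The sketch below is not the platform's implementation; the procedure name purge_bad_group and the batch size of 10000 are assumptions. It relies on autocommit being enabled so that each batch commits on its own and stays small in the binlog.

```sql
-- DELIMITER is a mysql command-line client directive, not server SQL.
DELIMITER //

CREATE PROCEDURE purge_bad_group()   -- hypothetical helper name
BEGIN
  DECLARE affected INT DEFAULT 1;
  -- Delete in small batches until a batch removes nothing.
  WHILE affected > 0 DO
    DELETE FROM MY_TABLE WHERE `group` = 'bad_group' LIMIT 10000;
    SET affected = ROW_COUNT();      -- rows removed by the DELETE above
  END WHILE;
END //

DELIMITER ;

CALL purge_bad_group();
```

Each batch is cheap, but the total work is still proportional to the number of bad rows, which is consistent with the multi-hour estimate described above.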

2.3 Dropping and Re‑creating the Table

Since we only needed to keep fewer than 100,000 valid rows, we copied the good data to a temporary table, dropped the original table, and renamed the temporary table back:

-- Copy only the rows worth keeping (`group` must be backtick-quoted: reserved word)
CREATE TABLE TEMP_TABLE AS SELECT * FROM MY_TABLE WHERE `group` <> 'bad_group';
DROP TABLE MY_TABLE;
RENAME TABLE TEMP_TABLE TO MY_TABLE;

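This worked, but CREATE TABLE ... AS SELECT copies only the column definitions and the data. The renamed table therefore had no primary key, no unique key, and no secondary indexes, and its id column had lost the AUTO_INCREMENT attribute, which caused problems for later imports.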

2.4 Realising Index Loss

After the swap, the import job inserted new rows with id = 0 and with duplicate group + code values, because nothing enforced the primary key or the unique key any more and id was no longer generated automatically. The table still answered queries, but data integrity was compromised.
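Damage of this kind can be confirmed with a few read-only statements. This is a sketch using the table and column names from the article; the result column aliases are made up for readability.

```sql
-- Show the definition the table really has now (keys, AUTO_INCREMENT, etc.).
SHOW CREATE TABLE MY_TABLE;

-- List whatever indexes survived; an empty result means none did.
SHOW INDEX FROM MY_TABLE;

-- Rows that were written without a generated id.
SELECT COUNT(*) AS zero_id_rows FROM MY_TABLE WHERE id = 0;

-- group + code combinations that the unique key would have rejected.
SELECT `group`, code, COUNT(*) AS copies
FROM MY_TABLE
GROUP BY `group`, code
HAVING COUNT(*) > 1;
```

Running the first two statements immediately after any table swap is a cheap way to catch a missing key before the next import does.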

2.5 Alternative: TRUNCATE Instead of DROP

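In hindsight, TRUNCATE TABLE MY_TABLE; would have been the safer choice, because it clears the data while preserving the table definition, including keys, indexes, and AUTO_INCREMENT:

-- Park the good rows, empty the original, then copy them back
CREATE TABLE TEMP_TABLE AS SELECT * FROM MY_TABLE WHERE `group` <> 'bad_group';
TRUNCATE TABLE MY_TABLE;
INSERT INTO MY_TABLE SELECT * FROM TEMP_TABLE;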

2.6 Restoring Primary Key and Unique Key

To fix the broken table we copied rows with id=0 to a temporary table, deleted them from the source, added the missing indexes, and re‑inserted the rows with INSERT IGNORE:

CREATE TABLE TEMP_TABLE AS SELECT * FROM MY_TABLE WHERE id = 0;
DELETE FROM MY_TABLE WHERE id = 0;
ALTER TABLE MY_TABLE ADD INDEX ...;  -- add needed indexes
INSERT IGNORE INTO MY_TABLE SELECT * FROM TEMP_TABLE;
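The index statement above is elided in the original. Assuming the keys are meant to match the schema described in 2.1 (primary key on id, unique key on group + code), that step might look like the sketch below; the key name uk_group_code is hypothetical.

```sql
-- Assumes the id = 0 rows have already been moved out, so id is unique
-- and no duplicate group + code pairs remain in MY_TABLE.
ALTER TABLE MY_TABLE
  ADD PRIMARY KEY (`id`),
  ADD UNIQUE KEY uk_group_code (`group`, `code`);
```

If duplicates are still present, the ALTER fails with a duplicate-entry error, so it is worth checking for them first.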

The re-insertion still failed: the id column had lost its AUTO_INCREMENT attribute in the earlier CREATE TABLE ... AS SELECT, so the re-inserted rows kept id = 0 instead of receiving newly generated values.

2.7 Fixing AUTO_INCREMENT

We restored the auto‑increment attribute explicitly:

ALTER TABLE MY_TABLE MODIFY COLUMN `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT COMMENT '自增ID';

After this change the table behaved normally.
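A quick way to confirm that the attribute is back is to read the column metadata. This sketch assumes the current default database is the one containing MY_TABLE.

```sql
-- EXTRA should contain 'auto_increment' once the MODIFY COLUMN has run.
SELECT COLUMN_NAME, COLUMN_TYPE, EXTRA
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME   = 'MY_TABLE'
  AND COLUMN_NAME  = 'id';
```

With AUTO_INCREMENT in place and the default SQL mode (NO_AUTO_VALUE_ON_ZERO not set), inserting a row with id = 0 generates a new id rather than storing 0.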

3. Summary

The root cause of the follow-up damage was the misuse of CREATE TABLE ... AS SELECT, which copies columns and data but not primary keys, unique keys, secondary indexes, or the AUTO_INCREMENT attribute. Dropping or truncating a table without understanding these side effects can break production systems. The lesson: always verify DDL side effects, and prefer CREATE TABLE ... LIKE when you need an exact structural copy.
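As a sketch of that recommendation, the rebuild could be done with CREATE TABLE ... LIKE and a rename swap. This is not what was run during the incident; the names MY_TABLE_NEW and MY_TABLE_OLD are hypothetical.

```sql
-- Clone the structure: columns, keys, indexes, and AUTO_INCREMENT are kept
-- (foreign key definitions are not copied by LIKE).
CREATE TABLE MY_TABLE_NEW LIKE MY_TABLE;

-- Copy only the good rows; the existing id values are preserved.
INSERT INTO MY_TABLE_NEW
SELECT * FROM MY_TABLE WHERE `group` <> 'bad_group';

-- Swap both names in one statement so the table name never disappears.
RENAME TABLE MY_TABLE TO MY_TABLE_OLD, MY_TABLE_NEW TO MY_TABLE;

-- Drop the old data only after checking the new table.
DROP TABLE MY_TABLE_OLD;
```

Rows written to the original table between the copy and the rename would be left behind in MY_TABLE_OLD, so writers should be paused for that window or the old table reconciled before it is dropped.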

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
