Why ‘INSERT INTO SELECT’ Can Crash Your MySQL Migration and How to Fix It
A nightly MySQL table migration using INSERT INTO SELECT triggered a hidden full‑table scan and lock contention, which made concurrent inserts fail and left payment records missing. Making the SELECT use an index, and understanding how the statement locks rows under the default transaction isolation level, prevented the failure.
Background and Problem Statement
The company processes millions of transactions daily on a single MySQL table without sharding, so a data‑migration job was needed to keep the table performant while preserving recent data.
Proposed Solutions
Programmatically fetch rows, insert them into a history table, then delete the originals.
Use INSERT INTO SELECT so the database performs the whole operation.
The first approach ran out of memory (OOM) when all rows were loaded at once; batching reduced memory pressure but increased I/O and runtime, so the team chose the second approach. Tests in a staging environment passed, and the job was scheduled to run nightly at 20:00.
First Approach – Pseudo‑code and Failure Reason
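// 1. Query the rows to migrate (the whole result set is held in memory)
List<Object> list = selectData();
// 2. Insert them into the history table
insertData(list);
// 3. Delete the original rows by primary key
//    (getIds is a hypothetical helper that collects the ids of the migrated rows)
List<Long> ids = getIds(list);
deleteByIds(ids);
The OOM occurred because the code loaded the entire result set into memory before any deletion.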
Second Approach – What Actually Happened
The job kept only the last ten days of data (≈10 k rows) and executed:
INSERT INTO history_table SELECT * FROM main_table WHERE datetime < NOW() - INTERVAL 10 DAY;
In the test environment the query finished quickly, but in production the nightly run caused intermittent insert failures after midnight, resulting in missing payment records.
Root‑Cause Investigation
Disabling the migration task stopped the failures, indicating the job was the trigger. An EXPLAIN of the statement showed a full‑table scan.
A full scan on a large table makes the migration take about an hour, which explains why the issue only appeared during the night when the job ran for a long time.
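The plan output itself is not reproduced here. As a rough sketch, the check can be repeated by running EXPLAIN on the SELECT part of the migration statement; the table and column names are the ones from the statement above, and the comments describe what to look for rather than actual output from the incident.

```sql
-- Inspect the plan of the SELECT that feeds the migration.
EXPLAIN
SELECT *
FROM main_table
WHERE `datetime` < NOW() - INTERVAL 10 DAY;

-- In the result, check the `type` and `key` columns:
--   type = ALL, key = NULL  -> full-table scan (every row is read)
--   type = range, key = ... -> index range scan on the named index
```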
When the WHERE clause was rewritten to use an indexed column, the plan switched to an index range scan and the problem disappeared.
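The rewritten statement is not shown. One way to reach an index range scan, assuming the filter column can be indexed directly, is sketched here; the index name idx_main_table_datetime is hypothetical, and building an index on a large, busy table has its own cost and should be scheduled with care.

```sql
-- Assumption: `datetime` is the column filtered in the migration statement.
-- The index name is illustrative only.
CREATE INDEX idx_main_table_datetime ON main_table (`datetime`);

-- Re-check the plan; the optimizer may still prefer a full scan
-- if the range covers a large share of the table.
EXPLAIN
SELECT *
FROM main_table
WHERE `datetime` < NOW() - INTERVAL 10 DAY;
```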
Why Full‑Table Scan Causes Failures
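Under MySQL's default isolation level (REPEATABLE READ for InnoDB), INSERT INTO … SELECT holds locks on the target table for the whole statement, while rows read from the source table are locked one by one as the scan proceeds (InnoDB takes shared next‑key locks on them). Those locks are kept until the transaction ends, and they cover every row the scan reads, not only the rows that match the WHERE clause. A full scan therefore forces MySQL to lock a huge number of rows, so other transactions writing to the payment‑flow table wait on those locks, run into lock wait timeouts, and fail intermittently.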
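A minimal sketch of how this locking can be observed from a second session while the migration statement runs. It assumes MySQL 8.0 or later, where performance_schema.data_locks is available; on older versions SHOW ENGINE INNODB STATUS is the usual fallback.

```sql
-- Confirm the isolation level and how long a blocked writer waits
-- before failing with a lock wait timeout error.
SELECT @@transaction_isolation;
SHOW VARIABLES LIKE 'innodb_lock_wait_timeout';

-- Count the InnoDB locks currently held or awaited on both tables
-- (MySQL 8.0+; run while the INSERT INTO ... SELECT is in progress).
SELECT object_name,
       index_name,
       lock_mode,
       lock_status,
       COUNT(*) AS lock_count
FROM performance_schema.data_locks
WHERE object_name IN ('main_table', 'history_table')
GROUP BY object_name, index_name, lock_mode, lock_status;
```

A steadily growing count of shared locks on main_table during the run would match the behavior described above.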
Why the Test Missed the Issue
The test used a realistic data set but did not simulate the concurrent high‑volume inserts that occur in production, nor the long‑running nature of the nightly job. Consequently, the lock contention and timeout behavior were not reproduced.
Solution and Best Practices
To keep using INSERT INTO SELECT safely:
Add an appropriate index on the columns used in the WHERE clause so the SELECT uses an index range scan instead of a full scan.
Verify the execution plan with EXPLAIN before deploying.
Consider breaking the migration into smaller batches if the table is still large.
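The batching idea in the last point can be sketched as follows. This is an illustration under stated assumptions, not the team's actual job: it assumes main_table has an auto‑increment primary key named id (hypothetical), that migrated rows are deleted afterwards as in step 3 of the first approach, and that a scheduler or application loop repeats the iteration until no rows qualify.

```sql
-- Fix the cutoff once so every batch uses the same boundary.
SET @cutoff  = NOW() - INTERVAL 10 DAY;
SET @last_id = 0;

-- One iteration; repeat until @upper_id comes back NULL.

-- 1. Find the upper bound of the next batch of at most 1000 rows.
SELECT MAX(id) INTO @upper_id
FROM (
    SELECT id
    FROM main_table
    WHERE id > @last_id
      AND `datetime` < @cutoff
    ORDER BY id
    LIMIT 1000
) AS batch;

-- 2. Copy and delete that bounded range in one short transaction,
--    so locks are held only briefly.
START TRANSACTION;

INSERT INTO history_table
SELECT *
FROM main_table
WHERE id > @last_id
  AND id <= @upper_id
  AND `datetime` < @cutoff;

DELETE FROM main_table
WHERE id > @last_id
  AND id <= @upper_id
  AND `datetime` < @cutoff;

COMMIT;

-- 3. Advance the cursor for the next iteration.
SET @last_id = @upper_id;
```

Each transaction touches a bounded primary‑key range, which keeps the number of locked rows per batch small and lets concurrent inserts proceed between batches.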
Conclusion
‘INSERT INTO SELECT’ is powerful but must be used with proper indexing; otherwise, a full‑table scan can end up locking nearly every row in the source table, cause lock wait timeouts for concurrent writes, and lead to data loss.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
