Lessons Learned from Misusing INSERT INTO SELECT in MySQL Data Migration
A real‑world MySQL data‑migration case study shows how using INSERT INTO SELECT without proper indexing caused a full table scan, OOM errors, and data loss, and explains how to avoid the pitfall by adding appropriate indexes and understanding transaction locking.
1. Incident cause
The company processes millions of rows daily in a single MySQL table without sharding. To keep the table performant, a data‑migration task was created to move old records to a history table.
2. Proposed solutions
Query the data in application code, insert into the history table, then delete the original rows.
Use INSERT INTO SELECT to let MySQL handle the whole operation.
The first approach caused OOM when loading all data at once; batching reduced I/O but was slow, so the second approach was chosen and deployed.
3. What went wrong – detailed analysis
First solution (pseudocode)
// 1. Query the rows that need to be migrated
List&lt;Object&gt; list = selectData();
// 2. Insert the rows into the history table
insertData(list);
// 3. Delete the migrated rows from the original table
deleteByIds(ids);
The OOM was caused by step 1 loading the entire dataset into memory at once.
Second solution – the problematic INSERT INTO SELECT
The developer added a date filter so that only rows older than ten days were migrated, keeping roughly the last 10k rows in the live table, and executed:
INSERT INTO ... SELECT ... WHERE dateTime &lt; (ten days ago)
This avoided paging queries and seemed to work in testing, so the task was scheduled nightly at 20:00.
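Written out in full, the statement would have looked roughly like the following. The table and column names (orders, orders_history, dateTime) are assumptions for illustration; the original identifiers were not published.

```sql
-- Hypothetical schema: orders is the live table, orders_history the archive.
INSERT INTO orders_history
SELECT * FROM orders
WHERE dateTime < DATE_SUB(NOW(), INTERVAL 10 DAY);

DELETE FROM orders
WHERE dateTime < DATE_SUB(NOW(), INTERVAL 10 DAY);
```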
However, the next day the finance team found missing payment records. The failure started after the nightly job began, with insert operations intermittently failing.
Investigation showed the migration task caused the issue. Stopping the task eliminated the failures.
Examining the EXPLAIN of the INSERT INTO SELECT revealed a full table scan. The massive scan prolonged the migration, leading to lock contention and time‑outs on the payment table.
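Since MySQL 5.6, EXPLAIN also works on INSERT ... SELECT, which is how the scan shows up (table and column names hypothetical, as above):

```sql
EXPLAIN INSERT INTO orders_history
SELECT * FROM orders
WHERE dateTime < DATE_SUB(NOW(), INTERVAL 10 DAY);
-- type: ALL and key: NULL on the orders row of the plan indicate a full
-- table scan; with an index on dateTime, type becomes range and key
-- names the index.
```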
When the WHERE clause was adjusted to use an index, the full scan disappeared and the problem was resolved.
Root cause
Under MySQL's default REPEATABLE READ isolation level, INSERT INTO a SELECT ... FROM b takes ordinary insert locks on the target table a, but it also places shared next-key locks on every row of b that the SELECT scans. Because the statement scanned the entire payment table, effectively all of its rows and gaps were locked for the duration of the migration, so concurrent inserts into that table blocked and timed out.
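You can confirm the isolation level in effect with a one-line query (the variable is named transaction_isolation from MySQL 5.7.20 onward; older servers use tx_isolation):

```sql
SELECT @@transaction_isolation;
-- REPEATABLE-READ is the InnoDB default.
```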
Why testing missed the bug
The test environment used realistic data but did not simulate the high‑volume concurrent inserts that occur in production, so the lock‑contention issue was not reproduced.
4. Solution
Avoid full table scans by adding appropriate indexes on the columns used in the WHERE clause so that the SELECT part of the statement uses an index.
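In this case that means indexing the date column the filter uses, so the statement locks only the rows it actually migrates (column and index names are hypothetical, matching the sketch above):

```sql
-- Adjust names to the real schema.
ALTER TABLE orders ADD INDEX idx_datetime (dateTime);
```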
5. Can INSERT INTO SELECT still be used?
Yes, but only when proper indexes are in place and the impact on locking is understood.
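A common hardening step beyond what the article describes is to migrate in primary-key chunks, so that even an indexed statement holds its locks only briefly. A sketch, again with hypothetical names, where ? is the highest id migrated so far and the loop repeats until no rows are affected:

```sql
INSERT INTO orders_history
SELECT * FROM orders
WHERE dateTime < DATE_SUB(NOW(), INTERVAL 10 DAY)
  AND id > ?
ORDER BY id
LIMIT 1000;
```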
6. Conclusion
When using INSERT INTO SELECT for large-scale data migration, always verify with EXPLAIN that the SELECT is driven by an index. An unindexed scan locks far more of the source table than intended, and under production concurrency that lock contention can fail unrelated writes, as the lost payment records here showed.