How to Efficiently Remove Duplicate Rows in MySQL Tables
This article explains why a naïve Python script for deleting duplicate MySQL rows is too slow, demonstrates the MySQL error caused by deleting from the same table you query, and provides two pure‑SQL solutions: one that removes all duplicates and another that keeps a single row per duplicate key.
During an on‑call incident we needed to clean duplicate rows from several MySQL tables, some of which contained hundreds of thousands of records. A simple Python script that deleted rows one by one proved too slow, so we switched to pure SQL solutions.
Delete all duplicate rows (no rows kept)
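A DELETE whose WHERE clause sub‑queries the same table directly fails with MySQL error 1093 ("You can't specify target table 'student' for update in FROM clause"), because the table being modified is also being read. Wrapping the sub‑query in a derived table avoids the error: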
DELETE FROM student
WHERE name IN (
SELECT name FROM (
SELECT name FROM student GROUP BY name HAVING COUNT(1) > 1
) AS t
);
This works because MySQL first materialises the list of duplicate names in the derived table t, so the DELETE no longer reads student directly.
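For contrast, here is a minimal sketch of the direct form that triggers error 1093. The schema and sample rows are hypothetical (only the id and name columns come from the article), so treat this as an illustration rather than the original incident's table definition.

```sql
-- Hypothetical minimal schema: only id and name are taken from the article.
CREATE TABLE student (
    id   INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(50)
);

-- Illustrative rows: 'alice' and 'bob' are duplicated, 'carol' is unique.
INSERT INTO student (name) VALUES ('alice'), ('bob'), ('alice'), ('carol'), ('bob');

-- Direct form: the sub-query reads student, the same table the DELETE targets.
-- MySQL rejects it with:
-- ERROR 1093 (HY000): You can't specify target table 'student' for update in FROM clause
DELETE FROM student
WHERE name IN (
    SELECT name FROM student GROUP BY name HAVING COUNT(1) > 1
);
```

With the same sample rows, the derived-table version of the statement would remove every 'alice' and 'bob' row and leave only 'carol'.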
Delete duplicates while keeping one row per name
First identify the rows to keep – the smallest id for each name – then delete everything whose id is not in that set.
DELETE FROM student
WHERE id NOT IN (
SELECT t.id FROM (
SELECT MIN(id) AS id FROM student GROUP BY name
) AS t
);
The query runs quickly even on tables with more than 900 000 rows.
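Before and after running a bulk delete like this, it may be worth checking what it touches. The two queries below are a sketch of such a check, assuming the same student table with id and name columns; the aliases rows_to_delete and copies are illustrative names only. Because these are SELECT statements, reading student inside the sub‑query does not raise error 1093.

```sql
-- Preview: how many rows the keep-one DELETE would remove.
SELECT COUNT(*) AS rows_to_delete
FROM student
WHERE id NOT IN (
    SELECT MIN(id) FROM student GROUP BY name
);

-- Verification: any name that still appears more than once.
-- An empty result means no duplicates remain.
SELECT name, COUNT(*) AS copies
FROM student
GROUP BY name
HAVING COUNT(*) > 1;
```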
All done.
Java Backend Technology
