How to Find and Delete Duplicate Records in MySQL Efficiently
This article explains how to identify duplicate rows in a MySQL table using GROUP BY and HAVING, shows several SELECT queries to list duplicates, and provides multiple DELETE strategies—including sub‑queries and multi‑column handling—to safely remove excess records while keeping one copy.
When building a question bank, duplicate entries can cause the same exam question to appear multiple times, so it is necessary to identify and delete duplicate rows, keeping only one copy.
Single‑field operations
This example uses a table named dept, with a unique deptno column and a dname column that may contain duplicates.
Introduction to GROUP BY
SELECT duplicate_column FROM table_name GROUP BY duplicate_column HAVING COUNT(*) > 1;
Use GROUP BY to group rows by the duplicated column, and HAVING to keep only the groups whose count is greater than 1.
GROUP BY <column list>
HAVING <group condition>
This query returns each value of the column (e.g., dname) that appears more than once, one row per duplicated value.
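The grouping step can be tried on a small scratch table. This is a minimal sketch: the table name dept_demo, the column types, and the sample rows are assumptions made for illustration. Only the column names deptno, dname, and db_source come from the article.

```sql
-- Hypothetical scratch table; types and sample rows are assumptions.
DROP TABLE IF EXISTS dept_demo;
CREATE TABLE dept_demo (
    deptno    INT PRIMARY KEY,
    dname     VARCHAR(50),
    db_source VARCHAR(50)
);

INSERT INTO dept_demo (deptno, dname, db_source) VALUES
    (1, 'Sales',   'db01'),
    (2, 'Sales',   'db01'),
    (3, 'Finance', 'db01'),
    (4, 'Sales',   'db02'),
    (5, 'HR',      'db01');

-- One output row per duplicated dname, with the number of occurrences.
SELECT dname, COUNT(*) AS occurrences
FROM dept_demo
GROUP BY dname
HAVING COUNT(*) > 1;
-- With the rows above this returns a single row: Sales, 3
```

Selecting the count alongside the column makes it easier to judge how many rows a later DELETE is going to touch.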
There is no practical difference between COUNT(*) and COUNT(1); either can be used.
COUNT(*) returns the total number of rows, including rows where the column is NULL, while COUNT(column) counts only rows where the column is NOT NULL (default values are counted).
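The difference between the two counts can be seen without any table at all. The sketch below builds three rows inline, one of them NULL. The column alias dname is reused from the article, and the values are made up.

```sql
-- Three inline rows, one with a NULL dname (sample values are assumptions).
SELECT COUNT(*)     AS all_rows,
       COUNT(dname) AS non_null_dname
FROM (
    SELECT 'Sales' AS dname
    UNION ALL SELECT NULL
    UNION ALL SELECT 'Sales'
) AS t;
-- all_rows = 3, non_null_dname = 2
```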
1. Query all duplicate rows
SELECT * FROM table_name WHERE duplicate_column IN (SELECT duplicate_column FROM table_name GROUP BY duplicate_column HAVING COUNT(*) > 1);
2. Delete all duplicate rows
Changing the above SELECT to DELETE directly causes an error.
DELETE
FROM
    dept
WHERE dname IN (
    SELECT dname
    FROM dept
    GROUP BY dname
    HAVING count(1)>1
)
Error:
[Err] 1093 - You can't specify target table 'dept' for update in FROM clause
The error occurs because MySQL does not allow a DELETE or UPDATE statement to modify a table while a subquery in the same statement reads from that same table.
Solution: wrap the subquery in a derived table (a nested SELECT with an alias), so that the rows to be deleted are first selected into a temporary result set, then delete using that set.
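Applied to the failing statement, the workaround could look like the sketch below. It runs against a hypothetical scratch table dept_demo rather than the real dept table, and the sample rows are assumptions. Note that it removes every copy of a duplicated dname, including the one you might want to keep.

```sql
-- Hypothetical scratch table; sample rows are assumptions.
DROP TABLE IF EXISTS dept_demo;
CREATE TABLE dept_demo (
    deptno INT PRIMARY KEY,
    dname  VARCHAR(50)
);
INSERT INTO dept_demo (deptno, dname) VALUES
    (1, 'Sales'), (2, 'Sales'), (3, 'Finance');

-- The inner GROUP BY query is wrapped in a derived table (alias t),
-- so the outer DELETE no longer reads dept_demo directly in its subquery.
DELETE FROM dept_demo
WHERE dname IN (
    SELECT t.dname
    FROM (
        SELECT dname
        FROM dept_demo
        GROUP BY dname
        HAVING COUNT(*) > 1
    ) AS t
);

SELECT deptno, dname FROM dept_demo;
-- Only the Finance row (deptno 3) remains.
```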
3. Query extra duplicate rows (keep the smallest deptno)
a. First method
SELECT * FROM dept
WHERE dname IN (SELECT dname FROM dept GROUP BY dname HAVING COUNT(1)>1)
AND deptno NOT IN (SELECT MIN(deptno) FROM dept GROUP BY dname HAVING COUNT(1)>1);
This works but can be slow.
b. Second method
SELECT * FROM dept
WHERE deptno NOT IN (SELECT dt.minno FROM (SELECT MIN(deptno) AS minno FROM dept GROUP BY dname) dt);
c. Third method (recommended in comments)
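SELECT * FROM table_name AS ta
WHERE ta.unique_key <> (SELECT max(tb.unique_key) FROM table_name AS tb WHERE ta.dup_column = tb.dup_column);
Here unique_key is the table's unique key and dup_column is the column checked for duplicates. Unlike the first two methods, this variant keeps the row with the largest key; use min in place of max to keep the smallest.
4. Delete extra duplicate rows and keep one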
a. First method
DELETE FROM dept
WHERE dname IN (SELECT t.dname FROM (SELECT dname FROM dept GROUP BY dname HAVING count(1)>1) t)
AND deptno NOT IN (SELECT dt.mindeptno FROM (SELECT min(deptno) AS mindeptno FROM dept GROUP BY dname HAVING count(1)>1) dt);
b. Second method (same as query method b, but DELETE)
DELETE FROM dept
WHERE deptno NOT IN (SELECT dt.minno FROM (SELECT MIN(deptno) AS minno FROM dept GROUP BY dname) dt);
c. Third method (comment‑section recommendation)
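The correlated query from the third SELECT method cannot simply be turned into a DELETE, because its subquery reads the table being deleted from and triggers error 1093 again. A self-join DELETE removes the same rows, keeping the row with the largest unique key in each group:
DELETE ta FROM table_name AS ta
JOIN table_name AS tb ON ta.dup_column = tb.dup_column AND ta.unique_key < tb.unique_key;
Multiple‑field operations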
If you can handle a single column, handling multiple columns is straightforward: just add the additional columns to the GROUP BY clause.
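Before deleting on several columns, it can help to preview which combinations are duplicated and which row would survive. The sketch below uses a hypothetical scratch table dept_demo with assumed sample rows. The columns dname, db_source, and deptno are the ones the article uses.

```sql
-- Hypothetical scratch table; types and sample rows are assumptions.
DROP TABLE IF EXISTS dept_demo;
CREATE TABLE dept_demo (
    deptno    INT PRIMARY KEY,
    dname     VARCHAR(50),
    db_source VARCHAR(50)
);
INSERT INTO dept_demo (deptno, dname, db_source) VALUES
    (1, 'Sales', 'db01'),
    (2, 'Sales', 'db01'),
    (3, 'Sales', 'db02'),
    (4, 'HR',    'db01');

-- A row counts as a duplicate only if BOTH columns match another row.
SELECT dname,
       db_source,
       COUNT(*)    AS occurrences,
       MIN(deptno) AS kept_deptno
FROM dept_demo
GROUP BY dname, db_source
HAVING COUNT(*) > 1;
-- Returns one row: Sales, db01, 2, 1
```

The row (3, 'Sales', 'db02') is not reported, because it differs from the other Sales rows in db_source.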
DELETE FROM dept
WHERE (dname, db_source) IN (SELECT t.dname, t.db_source FROM (SELECT dname, db_source FROM dept GROUP BY dname, db_source HAVING count(1)>1) t)
AND deptno NOT IN (SELECT dt.mindeptno FROM (SELECT min(deptno) AS mindeptno FROM dept GROUP BY dname, db_source HAVING count(1)>1) dt);
Summary
Add indexes on columns that are frequently queried.
Replace * with only the columns you actually need.
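Choose between IN and EXISTS by the relative sizes of the two sides: IN tends to suit a small subquery result, because the inner set is evaluated as a whole, while EXISTS tends to suit a small outer table with a large, indexed inner table, because it only checks for a match per outer row. Recent MySQL versions may optimize both forms the same way, so confirm with EXPLAIN.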
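As an illustration of these tips, the duplicate lookup can also be written with EXISTS on an indexed column. This is a sketch on a hypothetical scratch table dept_demo. The index name idx_dept_demo_dname and the sample rows are assumptions, and whether this form is actually faster than IN depends on the data and the MySQL version.

```sql
-- Hypothetical scratch table; sample rows are assumptions.
DROP TABLE IF EXISTS dept_demo;
CREATE TABLE dept_demo (
    deptno INT PRIMARY KEY,
    dname  VARCHAR(50)
);
INSERT INTO dept_demo (deptno, dname) VALUES
    (1, 'Sales'), (2, 'Sales'), (3, 'Finance');

-- Index the column used to detect duplicates (index name is made up).
CREATE INDEX idx_dept_demo_dname ON dept_demo (dname);

-- Select only the needed columns; a row is a duplicate if another row
-- with the same dname but a different deptno exists.
SELECT d.deptno, d.dname
FROM dept_demo AS d
WHERE EXISTS (
    SELECT 1
    FROM dept_demo AS o
    WHERE o.dname = d.dname
      AND o.deptno <> d.deptno
);
-- Returns the two Sales rows (deptno 1 and 2).

-- Inspect the plan MySQL chooses for the same query.
EXPLAIN
SELECT d.deptno, d.dname
FROM dept_demo AS d
WHERE EXISTS (
    SELECT 1
    FROM dept_demo AS o
    WHERE o.dname = d.dname
      AND o.deptno <> d.deptno
);
```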
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
