How to Find and Delete Duplicate Records in SQL – Multiple Proven Techniques
This guide explains how to locate rows with identical values across one or several columns using self‑joins, GROUP BY, DISTINCT, and rowid, and provides step‑by‑step SQL statements to safely delete the extra duplicate entries from a table.
When a table (e.g., persons) contains rows where fields such as name, ID number, and address are exactly the same, you can retrieve those duplicate rows with a self‑join:
SELECT p1.*
FROM persons p1, persons p2
WHERE p1.id <> p2.id
AND p1.cardid = p2.cardid
AND p1.pname = p2.pname
AND p1.address = p2.address;
The article then presents three common ways to delete duplicate rows.
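Before turning to the deletion methods, the self-join search above can be tried end to end. The following is a minimal sketch, assuming an in-memory SQLite database driven from Python's standard sqlite3 module; the sample rows are made up for illustration, and the comma join is written as an explicit JOIN ... ON, which is equivalent. It also shows a side effect worth knowing: when a group holds three or more identical rows, the plain self-join returns each row once per matching partner, so adding DISTINCT may be useful.

```python
import sqlite3

# In-memory SQLite database; table and column names follow the article's query.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE persons (id INTEGER PRIMARY KEY, cardid TEXT, pname TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO persons (id, cardid, pname, address) VALUES (?, ?, ?, ?)",
    [
        (1, "A100", "Ann", "1 Main St"),
        (2, "A100", "Ann", "1 Main St"),  # duplicate of id 1
        (3, "A100", "Ann", "1 Main St"),  # duplicate of id 1
        (4, "B200", "Bob", "2 Oak Ave"),  # unique row
    ],
)

# The same self-join, parameterised only on whether DISTINCT is applied.
self_join = """
    SELECT {distinct} p1.id
    FROM persons p1
    JOIN persons p2
      ON p1.id <> p2.id
     AND p1.cardid = p2.cardid
     AND p1.pname = p2.pname
     AND p1.address = p2.address
    ORDER BY p1.id
"""

# Without DISTINCT, each row appears once per matching partner.
raw_ids = [row[0] for row in conn.execute(self_join.format(distinct=""))]
# With DISTINCT, each duplicated row is listed once.
unique_ids = [row[0] for row in conn.execute(self_join.format(distinct="DISTINCT"))]

print(raw_ids)     # [1, 1, 2, 2, 3, 3]
print(unique_ids)  # [1, 2, 3]
```

The unique row (id 4) never appears because it has no partner with a different id and the same values.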
1. Rowid method (Oracle)
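Oracle exposes a ROWID pseudocolumn that identifies each row by its storage address. Keeping the row with the MAX(ROWID) (or MIN(ROWID)) in each group of identical values and removing the others leaves exactly one row per group. ROWID order reflects where a row is stored, not when it was inserted, so this does not reliably keep the newest or oldest row.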
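To see the keep-one-row-per-group pattern in isolation, here is a minimal sketch that assumes SQLite instead of Oracle. SQLite also gives ordinary tables an implicit rowid, so the same correlated comparison can be run from Python's standard sqlite3 module. The table contents are hypothetical, and SQLite's rowid is a plain integer rather than an Oracle-style address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name1 TEXT, name2 TEXT)")
# Rows receive rowids 1..5 in insertion order.
conn.executemany(
    "INSERT INTO table1 (name1, name2) VALUES (?, ?)",
    [("a", "x"), ("a", "x"), ("b", "y"), ("a", "x"), ("b", "z")],
)

# Delete every row whose rowid is not the largest in its (name1, name2) group.
conn.execute("""
    DELETE FROM table1
    WHERE rowid <> (
        SELECT MAX(b.rowid)
        FROM table1 AS b
        WHERE b.name1 = table1.name1
          AND b.name2 = table1.name2
    )
""")

remaining = conn.execute(
    "SELECT rowid, name1, name2 FROM table1 ORDER BY rowid"
).fetchall()
print(remaining)  # [(3, 'b', 'y'), (4, 'a', 'x'), (5, 'b', 'z')]
```

The Oracle statements that follow apply the same correlated comparison with ROWID.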
SELECT * FROM table1 a
WHERE ROWID <> (
SELECT MAX(ROWID)
FROM table1 b
WHERE a.name1 = b.name1
AND a.name2 = b.name2
/* additional column comparisons */
);
Delete the duplicates:
DELETE FROM table1 a
WHERE ROWID <> (
SELECT MAX(ROWID)
FROM table1 b
WHERE a.name1 = b.name1
AND a.name2 = b.name2
/* additional column comparisons */
);
2. GROUP BY method
Identify duplicate groups with GROUP BY and HAVING COUNT(*) > 1, then delete all rows in those groups:
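SELECT COUNT(num), MAX(name) FROM student
GROUP BY num
HAVING COUNT(num) > 1;
DELETE FROM student
WHERE num IN (
SELECT num FROM student
GROUP BY num
HAVING COUNT(num) > 1
);
This removes every row that participates in a duplicate group, including the first occurrence, so no copy of the duplicated value remains. DELETE does not accept GROUP BY or HAVING directly, which is why the grouped query appears as a subquery.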
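The two steps of the GROUP BY method can be checked on a small data set. This is a minimal sketch, assuming SQLite through Python's standard sqlite3 module and made-up student rows; the delete filters on the grouped keys through an IN subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (num INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO student (num, name) VALUES (?, ?)",
    [(1, "Li"), (1, "Li"), (2, "Wang"), (3, "Zhao"), (3, "Zhao"), (3, "Zhao")],
)

# Step 1: list each duplicated key together with its number of copies.
dup_groups = conn.execute("""
    SELECT num, COUNT(*) AS copies
    FROM student
    GROUP BY num
    HAVING COUNT(*) > 1
    ORDER BY num
""").fetchall()

# Step 2: delete every row whose key belongs to a duplicated group.
conn.execute("""
    DELETE FROM student
    WHERE num IN (
        SELECT num FROM student GROUP BY num HAVING COUNT(*) > 1
    )
""")
left = conn.execute("SELECT num, name FROM student ORDER BY num").fetchall()

print(dup_groups)  # [(1, 2), (3, 3)]
print(left)        # [(2, 'Wang')]
```

Only the key that was never duplicated survives. When one row per key has to remain, a ROWID-based delete is the better fit.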
3. DISTINCT method (small tables)
Create a new table with distinct rows, truncate the original, and copy the data back:
CREATE TABLE table_new AS SELECT DISTINCT * FROM table1;
TRUNCATE TABLE table1;
INSERT INTO table1 SELECT * FROM table_new;
Additional practical examples
Find rows with duplicate peopleId:
SELECT * FROM people
WHERE peopleId IN (
SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1
);
Delete duplicates while keeping the smallest ROWID:
DELETE FROM people
WHERE peopleId IN (
SELECT peopleId FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1
)
AND ROWID NOT IN (
SELECT MIN(ROWID) FROM people GROUP BY peopleId HAVING COUNT(peopleId) > 1
);
Duplicates defined by several columns (e.g., peopleId and seq) are handled the same way: group by all of those columns in each subquery.
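A multi-column version might look like the following minimal sketch. It assumes SQLite's implicit rowid in place of Oracle's ROWID, a hypothetical note column, and invented sample rows. It is a simplified variant, not the article's exact statement: because MIN(rowid) is computed for every (peopleId, seq) group, including groups of one, the separate HAVING COUNT(...) > 1 filter is not needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (peopleId INTEGER, seq INTEGER, note TEXT)")
# Rows receive rowids 1..6 in insertion order.
conn.executemany(
    "INSERT INTO people (peopleId, seq, note) VALUES (?, ?, ?)",
    [
        (1, 1, "first"),
        (1, 1, "copy"),       # same (peopleId, seq) as the row above
        (1, 2, "other seq"),  # same peopleId, different seq: not a duplicate
        (2, 1, "first"),
        (2, 1, "copy"),
        (2, 1, "copy 2"),
    ],
)

# Keep the smallest rowid in each (peopleId, seq) group; delete the rest.
conn.execute("""
    DELETE FROM people
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM people GROUP BY peopleId, seq
    )
""")

kept = conn.execute(
    "SELECT peopleId, seq, note FROM people ORDER BY rowid"
).fetchall()
print(kept)  # [(1, 1, 'first'), (1, 2, 'other seq'), (2, 1, 'first')]
```

Grouping by peopleId alone would have treated the (1, 2) row as a duplicate of (1, 1) and deleted it, which is why every column that defines a duplicate belongs in the GROUP BY.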
Simple query to list duplicate IDs:
SELECT * FROM tablename WHERE id IN (
SELECT id FROM tablename GROUP BY id HAVING COUNT(id) > 1
);
Cursor-based approach (SQL Server)
For cases where you need to delete all but one row per duplicate key, a cursor can iterate over each duplicated key and delete excess rows:
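-- table_name and key_column are placeholders; @id assumes an integer key.
DECLARE @max INT, @id INT;
DECLARE cur_rows CURSOR LOCAL FOR
SELECT key_column, COUNT(*) FROM table_name GROUP BY key_column HAVING COUNT(*) > 1;
OPEN cur_rows;
FETCH cur_rows INTO @id, @max;
WHILE @@FETCH_STATUS = 0
BEGIN
-- SET ROWCOUNT takes a number or a variable, not an expression.
SET @max = @max - 1;
SET ROWCOUNT @max;
DELETE FROM table_name WHERE key_column = @id;
FETCH cur_rows INTO @id, @max;
END;
CLOSE cur_rows;
DEALLOCATE cur_rows;
-- Restore the default so later statements are not limited.
SET ROWCOUNT 0;
A second SQL Server method is to SELECT DISTINCT into a temporary table, drop the original table, and rename the temporary table to the original name, which also eliminates fully identical rows.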
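The temporary-table approach can be sketched as follows. This is an illustration under stated assumptions, not the article's SQL Server script: it uses SQLite through Python's standard sqlite3 module, so the copy is made with CREATE TABLE ... AS SELECT and the rename with ALTER TABLE ... RENAME TO, where SQL Server would typically use SELECT ... INTO and sp_rename. The name table1_dedup and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name1 TEXT, name2 TEXT)")
conn.executemany(
    "INSERT INTO table1 (name1, name2) VALUES (?, ?)",
    [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y"), ("c", "z")],
)

# 1. Copy the distinct rows into a new table.
conn.execute("CREATE TABLE table1_dedup AS SELECT DISTINCT * FROM table1")
# 2. Drop the original table.
conn.execute("DROP TABLE table1")
# 3. Give the deduplicated copy the original name.
conn.execute("ALTER TABLE table1_dedup RENAME TO table1")

rows = conn.execute(
    "SELECT name1, name2 FROM table1 ORDER BY name1, name2"
).fetchall()
print(rows)  # [('a', 'x'), ('b', 'y'), ('c', 'z')]
```

In SQLite, a table created with CREATE TABLE ... AS SELECT carries over the data only; constraints and indexes defined on the original table have to be recreated afterwards.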
Overall, the article provides a comprehensive toolbox of SQL techniques—self‑joins, ROWID, GROUP BY, DISTINCT, cursor loops, and temporary tables—to both locate and purge duplicate records efficiently.
ITPUB