Databases 5 min read

How to Remove Duplicate Records in MySQL Tables

This article explains why duplicate rows appeared in production MySQL tables, demonstrates how to identify them with SELECT queries, and provides two SQL solutions—one to delete all duplicates and another to keep a single record per duplicated key—while preserving data integrity.

Laravel Tech Community
Laravel Tech Community
Laravel Tech Community
How to Remove Duplicate Records in MySQL Tables

The overtime work was caused by a production deployment that revealed duplicate data in the online database and a program bug that was quickly fixed. Six tables contained duplicate rows, two of them very large (over 960,000 and 300,000 rows). A previously used Python deduplication script was reused, but its performance was extremely low (about one row per second), estimating eight hours to clean 20,000+ duplicates.

The article then details the deduplication steps.

CREATE TABLE `animal` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) DEFAULT NULL,
  `age` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

INSERT INTO `pilipa_dds`.`student` (`id`,`name`,`age`) VALUES ('1','cat','12');
INSERT INTO `pilipa_dds`.`student` (`id`,`name`,`age`) VALUES ('2','dog','13');
INSERT INTO `pilipa_dds`.`student` (`id`,`name`,`age`) VALUES ('3','camel','25');
INSERT INTO `pilipa_dds`.`student` (`id`,`name`,`age`) VALUES ('4','cat','32');
INSERT INTO `pilipa_dds`.`student` (`id`,`name`,`age`) VALUES ('5','dog','42');

The goal is to remove rows with the same name value.

First, identify duplicated names:

SELECT name, COUNT(1)
FROM student
GROUP BY name
HAVING COUNT(1) > 1;

The result shows that cat and dog each appear twice.

Delete All Duplicate Rows (Leave None)

Attempting a direct delete fails with MySQL error 1093 because the same table is used in the subquery.

DELETE FROM student
WHERE NAME IN (
  SELECT NAME
  FROM student
  GROUP BY NAME
  HAVING COUNT(1) > 1
);

MySQL returns: 1093 - You can't specify target table 'student' for update in FROM clause . The reason is that MySQL does not allow updating a table while selecting from it in the same statement.

Solution: materialize the subquery as a derived table and then delete:

DELETE FROM student
WHERE NAME IN (
  SELECT t.NAME
  FROM (
    SELECT NAME FROM student GROUP BY NAME HAVING COUNT(1) > 1
  ) t
);

Delete Duplicates While Keeping One Row per Name

First, find the rows to keep (the smallest id for each name ), then delete everything else:

SELECT *
FROM student
WHERE id NOT IN (
  SELECT t.id
  FROM (
    SELECT MIN(id) AS id FROM student GROUP BY `name`
  ) t
);

Replace the SELECT with DELETE to perform the cleanup:

DELETE FROM student
WHERE id NOT IN (
  SELECT t.id
  FROM (
    SELECT MIN(id) AS id FROM student GROUP BY `name`
  ) t
);

Note: The article also includes a promotional line encouraging readers to follow a Java interview public account for additional resources.

SQLMySQLdata deduplicationDatabase CleanupDuplicate Removal
Laravel Tech Community
Written by

Laravel Tech Community

Specializing in Laravel development, we continuously publish fresh content and grow alongside the elegant, stable Laravel framework.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.