How to Efficiently Insert Massive Data Without Duplicates in MySQL

This article explains several MySQL techniques—INSERT IGNORE, ON DUPLICATE KEY UPDATE, INSERT…SELECT WHERE NOT EXISTS, and REPLACE—plus a practical MyBatis batch‑insert example, to handle large‑scale data imports while preventing duplicate records.

Programmer DD

The task is simple: batch insert data that may come from other database tables or an external Excel file.

When inserting millions of rows, checking each row for duplicates beforehand is impractical, so efficient SQL strategies are needed.

1. INSERT IGNORE

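When a row would violate a PRIMARY KEY or UNIQUE constraint, that row is skipped and MySQL returns a warning instead of an error; the remaining rows are still inserted. Use this only if you are sure the statement and its data are otherwise correct.

INSERT IGNORE INTO user (name) VALUES ('telami');

This method is simple, but IGNORE also downgrades other data errors to warnings, not just duplicate-key errors. For example, an invalid or out-of-range value is adjusted to the closest valid value and inserted.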
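
A minimal sketch of how this behaves with a multi-row insert. The table name user_demo, its columns, and the index name are assumptions made for illustration; the article does not show its own table definition.

```sql
-- Hypothetical demo table; the UNIQUE index is what makes a row count as a duplicate.
DROP TABLE IF EXISTS user_demo;
CREATE TABLE user_demo (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    UNIQUE KEY uk_name (name)
);

-- The third value repeats the first, so it is skipped with a warning
-- while 'telami' and 'alice' are inserted.
INSERT IGNORE INTO user_demo (name) VALUES ('telami'), ('alice'), ('telami');

-- Lists the duplicate-entry warning raised for the skipped row.
SHOW WARNINGS;
```

Checking SHOW WARNINGS after a large import is one way to see how many rows were skipped and why.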

2. ON DUPLICATE KEY UPDATE

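If the new row conflicts with an existing row on a PRIMARY KEY or UNIQUE index, MySQL runs the UPDATE clause against the existing row instead of inserting. A no-op assignment such as id = id leaves the existing row unchanged, which mimics INSERT IGNORE while still reporting other errors.

INSERT INTO user (name) VALUES ('telami') ON DUPLICATE KEY UPDATE id = id;

This only works if the column (or columns) used for duplicate detection has a PRIMARY KEY or UNIQUE index; without one, no conflict is ever detected and every row is inserted.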
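
The following sketch shows the no-op update on a multi-row insert. The table user_demo and the index name uk_name are hypothetical and stand in for the article's user table.

```sql
-- Hypothetical demo table with the UNIQUE index that duplicate detection relies on.
DROP TABLE IF EXISTS user_demo;
CREATE TABLE user_demo (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    UNIQUE KEY uk_name (name)
);

INSERT INTO user_demo (name) VALUES ('telami');

-- 'telami' conflicts on uk_name, so the UPDATE clause runs and changes nothing;
-- 'alice' has no conflict and is inserted normally.
INSERT INTO user_demo (name) VALUES ('telami'), ('alice')
ON DUPLICATE KEY UPDATE id = id;

-- The table now holds 'telami' once and 'alice' once.
SELECT name FROM user_demo ORDER BY name;
```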

3. INSERT … SELECT … WHERE NOT EXISTS

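The row is inserted only when the subquery in WHERE NOT EXISTS finds no matching row. Because the subquery can test any condition, not just a PRIMARY KEY or UNIQUE column, this method supports more complex duplicate rules. In the example below, the row is inserted only if no row with id = 1 exists.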

INSERT INTO user (name) SELECT 'telami' FROM dual WHERE NOT EXISTS (SELECT id FROM user WHERE id = 1);

Because this approach runs a subquery, it may be slightly slower than the previous methods, especially when the columns in the subquery's condition are not indexed.
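
When the data comes from another table, the same pattern can copy only the rows that are missing. This is a sketch under assumptions: user_demo and the staging table user_import are hypothetical names, and matching on name is just one possible duplicate rule.

```sql
-- Hypothetical target and staging tables; note that name has no UNIQUE index here.
DROP TABLE IF EXISTS user_demo, user_import;
CREATE TABLE user_demo (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(50) NOT NULL
);
CREATE TABLE user_import (
    name VARCHAR(50) NOT NULL
);

INSERT INTO user_demo (name) VALUES ('telami');
INSERT INTO user_import (name) VALUES ('telami'), ('alice'), ('alice');

-- Copy only names that are not yet in user_demo.
-- DISTINCT removes repeats inside the staging data itself, which
-- NOT EXISTS alone would not catch within a single statement.
INSERT INTO user_demo (name)
SELECT DISTINCT i.name
FROM user_import AS i
WHERE NOT EXISTS (
    SELECT 1 FROM user_demo AS u WHERE u.name = i.name
);
```

After this runs, 'alice' is added once and 'telami' is not duplicated.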

4. REPLACE INTO

If a row with the same PRIMARY KEY or UNIQUE value already exists, it is deleted first, and then the new row is inserted.

REPLACE INTO user (id, name) VALUES (1, 'telami');

Because the old row is deleted rather than updated, any column not listed in the statement is reset to its default value instead of keeping its old value.
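
A short sketch of the delete-then-insert behavior. The table user_demo and its status column are assumptions added for illustration.

```sql
-- Hypothetical demo table with a defaulted column to make the reset visible.
DROP TABLE IF EXISTS user_demo;
CREATE TABLE user_demo (
    id     INT PRIMARY KEY,
    name   VARCHAR(50) NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'new'
);

INSERT INTO user_demo (id, name, status) VALUES (1, 'telami', 'active');

-- id = 1 already exists, so the old row is deleted and a new one is inserted.
-- status is not listed, so the new row gets the default 'new', not 'active'.
REPLACE INTO user_demo (id, name) VALUES (1, 'telami2');

SELECT id, name, status FROM user_demo;
```

This loss of unlisted column values is the main reason to prefer ON DUPLICATE KEY UPDATE when existing rows should be preserved.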

Practical Example

The author chose the second method (ON DUPLICATE KEY UPDATE) and implemented it with MyBatis for batch insertion, where mobile_number has a UNIQUE constraint. The MyBatis XML snippet:

<insert id="batchSaveUser" parameterType="list">
    insert into user (id,username,mobile_number)
    values
    <foreach collection="list" item="item" index="index" separator=",">
        (#{item.id}, #{item.username}, #{item.mobileNumber})
    </foreach>
    ON DUPLICATE KEY UPDATE id = id
</insert>

With this configuration, any row whose mobile_number already exists in the table, or repeats an earlier row in the same batch, is skipped during batch insertion, while the remaining rows are inserted.
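
For a list of three users, the mapper would produce a single multi-row statement along the lines of the sketch below. The table user_demo, the column types, and the literal values are assumptions; in practice MyBatis binds the #{...} values as statement parameters rather than inlining them.

```sql
-- Hypothetical table matching the columns used in the mapper.
DROP TABLE IF EXISTS user_demo;
CREATE TABLE user_demo (
    id            BIGINT PRIMARY KEY,
    username      VARCHAR(50) NOT NULL,
    mobile_number VARCHAR(20) NOT NULL,
    UNIQUE KEY uk_mobile_number (mobile_number)
);

-- One statement for the whole list; literals stand in for bound parameters.
INSERT INTO user_demo (id, username, mobile_number)
VALUES
    (1, 'telami', '13800000001'),
    (2, 'alice',  '13800000002'),
    (3, 'bob',    '13800000001')  -- same mobile_number as row 1, so it is skipped
ON DUPLICATE KEY UPDATE id = id;

-- Only rows 1 and 2 are stored.
SELECT id, username, mobile_number FROM user_demo ORDER BY id;
```

Since the whole list becomes one statement, a very large list would probably need to be split into smaller batches by the caller so that each statement stays under the server's max_allowed_packet limit.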

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
