How to Safely Delete Millions of Rows from a Large Table: Proven Strategies & Code
This article explains why deleting tens of millions of rows is challenging, analyzes transaction, lock, resource, and business impacts, and presents five practical solutions—including batch deletion, table rebuild, partition dropping, online migration, and professional tools—along with best‑practice guidelines and a decision matrix.
Preface
Today we discuss a topic that many DBAs and developers find painful—deleting data from tables with tens of millions of rows.
Some people are at a loss when encountering large‑table deletions: either directly DELETE causing the database to lock, or hesitant to operate.
I have seen many "blood incidents" caused by improper large‑table deletions: long table locks, system outage, and even master‑slave replication lag.
Today we will specifically discuss deleting tens of millions of rows, hoping to help you.
1. Why is large‑table deletion so hard?
Before diving into technical solutions, let's understand why deleting tens of millions of rows is difficult.
Some may think: "It's just a DELETE statement, what's hard?"
Actually there's a lot of theory.
Underlying principles of database delete operation
To intuitively understand how delete works, I drew a flow diagram:
The diagram shows that a simple DELETE hides many complex operations.
Let's analyze each challenge:
1. Transaction and lock challenges
-- A seemingly simple delete operation
DELETE FROM user_operation_log
WHERE create_time < '2023-01-01';
-- Actually MySQL handles it as:
-- 1. Acquire table write lock
-- 2. Scan 10,000,000 rows
-- 3. For each matching row:
-- - Write undo log (for rollback)
-- - Write redo log (for recovery)
-- - Update all related indexes
-- - Mark row as deleted
-- 4. Space is released only after transaction commit2. Resource consumption issues
Disk I/O : massive writes to undo log, redo log, data files, and index files
CPU : index maintenance, condition evaluation, transaction management
Memory : buffer pool management, lock information
Network : huge amount of master‑slave sync data
3. Business impact risks
Lock wait timeout : other queries blocked
Master‑slave lag : replica cannot keep up
Disk space : undo log explosion fills disk
Performance degradation : overall DB performance affected
Some may ask: "We use cloud DB, do these problems still exist?"
My experience: cloud DB reduces operational complexity, but the underlying principles and limits still exist.
2. Solution 1: Batch Deletion (most common)
Batch deletion splits a large operation into many small ones.
Implementation principle
Specific implementation
Method 1: Batch by primary key
-- Stored procedure for batch delete
DELIMITER $$
CREATE PROCEDURE batch_delete_by_id()
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE batch_size INT DEFAULT 1000;
DECLARE max_id BIGINT;
DECLARE min_id BIGINT;
DECLARE current_id BIGINT DEFAULT 0;
-- Get range to delete
SELECT MIN(id), MAX(id) INTO min_id, max_id FROM user_operation_log WHERE create_time < '2023-01-01';
WHILE current_id < max_id DO
-- Delete a batch
DELETE FROM user_operation_log
WHERE id BETWEEN current_id AND current_id + batch_size - 1
AND create_time < '2023-01-01';
COMMIT;
-- Sleep to let DB breathe
DO SLEEP(0.1);
SET current_id = current_id + batch_size;
INSERT INTO delete_progress_log (NOW(), current_id, batch_size);
END WHILE;
END $$
DELIMITER ;Method 2: Batch by time (Java code)
// Java code for time‑based batch delete
@Service
@Slf4j
public void batchDeleteByTime(String tableName, String timeColumn, Date startTime, Date endTime, int batchDays) {
// implementation omitted for brevity
}Method 3: Use LIMIT batch delete
-- Simple LIMIT batch delete
DELIMITER $$
CREATE PROCEDURE batch_delete_with_limit()
BEGIN
DECLARE done INT DEFAULT 0;
DECLARE batch_size INT DEFAULT 1000;
DECLARE total_deleted INT DEFAULT 0;
WHILE done = 0 DO
DELETE FROM user_operation_log
WHERE create_time < '2023-01-01'
LIMIT batch_size;
SET done = ROW_COUNT() = 0;
SET total_deleted = total_deleted + ROW_COUNT();
COMMIT;
DO SLEEP(0.1);
IF total_deleted % 10000 = 0 THEN
SELECT CONCAT('Deleted: ', total_deleted, ' rows') AS progress;
END IF;
END WHILE;
SELECT CONCAT('Delete complete! Total: ', total_deleted, ' rows') AS result;
END $$
DELIMITER ;Best practices for batch deletion
Batch size selection
Small tables: 1,000‑5,000 rows per batch
Large tables: 100‑1,000 rows per batch
Adjust based on actual situation
Sleep time control
Peak hours: sleep 1‑2 seconds
Off‑peak: sleep 100‑500 ms
Night maintenance: no sleep or short sleep
Monitoring and adjustment
Monitor database load
Observe replication lag
Dynamically adjust parameters
3. Solution 2: Create New Table + Rename
When needing to delete most of the data, creating a new table and renaming is often more efficient.
Implementation principle
Specific implementation
-- Step1: Create new table with same structure
CREATE TABLE user_operation_log_new LIKE user_operation_log;
-- Step2: Insert data to keep
INSERT INTO user_operation_log_new SELECT * FROM user_operation_log WHERE create_time >= '2023-01-01';
-- Step3: Create indexes (after data load for efficiency)
ALTER TABLE user_operation_log_new ADD INDEX idx_create_time(create_time);
ALTER TABLE user_operation_log_new ADD INDEX idx_user_id(user_id);
-- Step4: Verify data
SELECT (SELECT COUNT(*) FROM user_operation_log_new) AS new_count,
(SELECT COUNT(*) FROM user_operation_log WHERE create_time >= '2023-01-01') AS expected_count;
-- Step5: Atomic switch (short table lock)
RENAME TABLE user_operation_log TO user_operation_log_old,
user_operation_log_new TO user_operation_log;
-- Step6: Drop old table (optional immediate or delayed)
DROP TABLE user_operation_log_old;Java code to assist
@Service
@Slf4j
public class TableRebuildService {
@Autowired
private JdbcTemplate jdbcTemplate;
/** Rebuild table for deletion */
public void rebuildTableForDeletion(String sourceTable, String condition) {
String newTable = sourceTable + "_new";
String oldTable = sourceTable + "_old";
try {
// 1. Create new table
log.info("Creating new table: {}", newTable);
jdbcTemplate.execute("CREATE TABLE " + newTable + " LIKE " + sourceTable);
// 2. Insert data to keep
log.info("Importing data to keep");
String insertSql = String.format("INSERT INTO %s SELECT * FROM %s WHERE %s", newTable, sourceTable, condition);
int keptCount = jdbcTemplate.update(insertSql);
log.info("Successfully imported {} rows", keptCount);
// 3. Create indexes
log.info("Creating indexes");
createIndexes(newTable);
// 4. Validate data
log.info("Validating data");
if (!validateData(sourceTable, newTable, condition)) {
throw new RuntimeException("Data validation failed");
}
// 5. Atomic switch
log.info("Switching tables");
switchTables(sourceTable, newTable, oldTable);
// 6. Drop old table
log.info("Dropping old table");
dropTableSafely(oldTable);
log.info("Table rebuild deletion completed!");
} catch (Exception e) {
log.error("Exception during table rebuild", e);
cleanupTempTable(newTable);
throw e;
}
}
// helper methods (createIndexes, validateData, switchTables, dropTableSafely, cleanupTempTable) omitted for brevity
}Applicable scenarios
Need to delete more than 50% of data
Business can tolerate a short write pause during rename
Enough disk space for both old and new tables
4. Solution 3: Partition Table Deletion
If the table is already partitioned or can be converted, deletion becomes very simple.
Implementation principle
Specific implementation
Method 1: Use existing partitions
-- View partition information
SELECT table_name, partition_name, table_rows
FROM information_schema.partitions
WHERE table_name = 'user_operation_log';
-- Drop whole partitions (seconds)
ALTER TABLE user_operation_log DROP PARTITION p202201, p202202;
-- Procedure to drop expired partitions automatically
DELIMITER $$
CREATE PROCEDURE auto_drop_expired_partitions()
BEGIN
DECLARE expired_partition VARCHAR(64);
DECLARE done INT DEFAULT FALSE;
DECLARE cur CURSOR FOR
SELECT partition_name
FROM information_schema.partitions
WHERE table_name = 'user_operation_log'
AND partition_name LIKE 'p%'
AND STR_TO_DATE(REPLACE(partition_name,'p',''),'%Y%m') < DATE_SUB(NOW(), INTERVAL 12 MONTH);
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
OPEN cur;
read_loop: LOOP
FETCH cur INTO expired_partition;
IF done THEN LEAVE read_loop; END IF;
SET @sql = CONCAT('ALTER TABLE user_operation_log DROP PARTITION ', expired_partition);
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
INSERT INTO partition_clean_log (NOW(), expired_partition, 'DROPPED');
END LOOP;
CLOSE cur;
END $$
DELIMITER ;Method 2: Convert ordinary table to partitioned
-- Create partitioned table
CREATE TABLE user_operation_log_partitioned (
id BIGINT AUTO_INCREMENT,
user_id BIGINT,
operation VARCHAR(100),
create_time DATETIME,
PRIMARY KEY (id, create_time)
) PARTITION BY RANGE (YEAR(create_time)*100 + MONTH(create_time)) (
PARTITION p202201 VALUES LESS THAN (202202),
PARTITION p202202 VALUES LESS THAN (202203),
PARTITION p202203 VALUES LESS THAN (202204),
PARTITION p202204 VALUES LESS THAN (202205),
PARTITION pfuture VALUES LESS THAN MAXVALUE
);
-- Migrate data
INSERT INTO user_operation_log_partitioned SELECT * FROM user_operation_log;
-- Switch tables
RENAME TABLE user_operation_log TO user_operation_log_old,
user_operation_log_partitioned TO user_operation_log;
-- Add future partitions regularly
ALTER TABLE user_operation_log REORGANIZE PARTITION pfuture INTO (
PARTITION p202205 VALUES LESS THAN (202206),
PARTITION p202206 VALUES LESS THAN (202207),
PARTITION pfuture VALUES LESS THAN MAXVALUE
);Advantages of partition tables
Deletion efficiency extremely high : drop partition files directly
No impact on business : no table‑lock risk
Easy management : can be automated
Query optimization : partition pruning improves performance
5. Solution 4: Use Temporary Table Synchronization
For online deletion without stopping service, use a temporary‑table sync approach.
Implementation principle
Specific implementation
@Service
@Slf4j
public class OnlineTableMigrationService {
@Autowired
private JdbcTemplate jdbcTemplate;
/** Online table migration delete */
public void onlineMigrationDelete(String sourceTable, String condition) {
String newTable = sourceTable + "_new";
String tempTable = sourceTable + "_temp";
try {
// Phase 1: Preparation
log.info("=== Phase 1: Preparation ===");
prepareMigration(sourceTable, newTable, tempTable);
// Phase 2: Double write
log.info("=== Phase 2: Double write ===");
enableDoubleWrite(sourceTable, newTable);
// Phase 3: Data sync
log.info("=== Phase 3: Data sync ===");
syncExistingData(sourceTable, newTable, condition);
// Phase 4: Validation
log.info("=== Phase 4: Validation ===");
if (!validateDataSync(sourceTable, newTable)) {
throw new RuntimeException("Data sync validation failed");
}
// Phase 5: Switch
log.info("=== Phase 5: Switch ===");
switchToNewTable(sourceTable, newTable, tempTable);
// Phase 6: Cleanup
log.info("=== Phase 6: Cleanup ===");
cleanupAfterSwitch(sourceTable, tempTable);
log.info("Online migration delete completed!");
} catch (Exception e) {
log.error("Exception during online migration", e);
disableDoubleWrite();
throw e;
}
}
// Helper methods (prepareMigration, enableDoubleWrite, syncExistingData, validateDataSync,
// switchToNewTable, cleanupAfterSwitch, disableDoubleWrite) omitted for brevity
}6. Solution 5: Professional Tools
For extremely large tables or complex deletion needs, use specialized tools.
1. pt‑archiver (Percona Toolkit)
# Install Percona Toolkit (Ubuntu/Debian)
sudo apt-get install percona-toolkit
# Dry‑run: archive then delete data
pt-archiver \
--source h=localhost,D=test,t=user_operation_log \
--where "create_time < '2023-01-01'" \
--limit 1000 \
--commit-each \
--sleep 0.1 \
--statistics \
--progress 10000 \
--why-not \
--dry-run
# Actual execution
pt-archiver \
--source h=localhost,D=test,t=user_operation_log \
--where "create_time < '2023-01-01'" \
--limit 1000 \
--commit-each \
--sleep 0.1 \
--purge2. Custom tool
@Component
@Slf4j
public class SmartDeleteTool {
@Autowired
private JdbcTemplate jdbcTemplate;
/** Smart delete decision */
public void smartDelete(String tableName, String condition) {
try {
// 1. Analyze table state
TableAnalysisResult analysis = analyzeTable(tableName, condition);
// 2. Choose best strategy
DeleteStrategy strategy = chooseBestStrategy(analysis);
// 3. Execute delete
executeDelete(strategy, tableName, condition);
} catch (Exception e) {
log.error("Smart delete failed", e);
throw e;
}
}
// analysis and strategy methods omitted for brevity
}
enum DeleteStrategy { BATCH_DELETE, BATCH_DELETE_WITH_PAUSE, TABLE_REBUILD, PARTITION_DROP, ONLINE_MIGRATION }
@Data
class TableAnalysisResult {
private long totalRows;
private long deleteRows;
private double deleteRatio;
private boolean hasPartition;
private boolean hasPrimaryKey;
private int indexCount;
private double systemLoad;
}7. Solution Comparison and Selection Guide
To help choose the right solution, here's a detailed comparison table:
Solution
Applicable Scenario
Advantages
Disadvantages
Risk Level
Batch Delete
Small batch deletions, delete ratio <30%
Simple implementation, no downtime
Long execution time, possible locks
Medium
Table Rebuild
Delete ratio >50%, short write pause acceptable
Fast execution, cleans table fragments
Requires pause, extra disk space
High
Partition Delete
Table already partitioned or can be partitioned
Seconds to complete, no performance impact
Needs upfront planning, conversion cost
Low
Online Sync
Zero downtime required, critical tables
No business impact, safe and reliable
Complex implementation, longer duration
Medium
Professional Tools
Complex scenarios, extremely large tables
Powerful features, automatic optimization
Learning curve, external dependency
Medium
Selection decision flowchart
Practical advice
Validate in a test environment before any production change.
Backup data prior to deletion.
Execute during business low‑peak periods.
Monitor database metrics and set alert thresholds.
Prepare a rollback plan.
Conclusion
After the detailed analysis, we summarize the core points for deleting tens of millions of rows.
Core principles
Safety first: ensure data safety for any delete operation.
Impact minimization: reduce effect on business as much as possible.
Efficiency priority: choose the most efficient solution for the scenario.
Observability: the whole process must be monitorable and controllable.
Technical selection mnemonic
Look at partitions, judge ratio, decide plan Has partition → drop partition directly. Delete few → batch delete. Delete many → rebuild table. Cannot pause → online sync.
Final advice
Prevention over cure: use data lifecycle management to clean data regularly.
Reasonable architecture: consider cleanup strategies early in design.
Master the tools: be proficient with various deletion utilities.
Accumulate experience: summarize lessons after each operation.
Remember: there is no universally best solution, only the most suitable one for your context.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
