One‑Order Four‑Split: Solving MySQL Deadlocks in Million‑Row Imports
During massive daily imports of up to one million reconciliation records, MySQL deadlocks can cripple performance. This article dissects the four necessary conditions for deadlock, then presents a systematic “One‑Order Four‑Split” strategy (ordered writes, transaction splitting, index management, partitioning, and gap‑lock removal) backed by benchmarks, code samples, and configuration tweaks that cut lock rates by up to 90% and shrink batch times from hours to well under an hour.
Problem
A data‑reconciliation job imports about one million rows into MySQL each day. When many threads write concurrently, deadlocks frequently occur, causing the import to stall or fail.
Deadlock Basics
A deadlock is a situation where two or more transactions each hold resources the other needs, so none can proceed. The four necessary conditions are:
Mutual exclusion: a resource can be held by only one transaction at a time.
Hold‑and‑wait: a transaction keeps its locked resources while waiting for additional ones.
No preemption: resources cannot be forcibly taken away; they must be released voluntarily.
Circular wait: a cycle of transactions, each waiting for a resource held by the next.
Breaking any one of these conditions prevents deadlock.
One‑Order Four‑Split Solution
1. Ordered Writes (Local or Distributed)
Sort rows by primary‑key order before inserting so that all threads acquire locks in the same sequence.
Collections.sort(dataList, Comparator.comparing(Data::getId));
All threads then access pages in the same order.
Avoids the classic “A locks row 1, waits for row 2; B locks row 2, waits for row 1” pattern.
Benchmark: lock‑rate reduction ≈40%.
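The same idea extends to the distributed case: after the global sort, hand each worker a contiguous, non‑overlapping id range, so threads can never interleave lock acquisitions. A minimal sketch (the `Row` record and `assignRanges` helper are illustrative names, not the article's actual classes):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class OrderedWrites {
    record Row(long id) {}

    // Sort rows by primary key, then slice the sorted list into contiguous
    // ranges, one per worker. Every worker inserts in ascending-id order and
    // no two workers' ranges overlap, so a lock-order cycle cannot form.
    static List<List<Row>> assignRanges(List<Row> rows, int workers) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparingLong(Row::id)); // global ascending order
        List<List<Row>> slices = new ArrayList<>();
        int chunk = (sorted.size() + workers - 1) / workers;
        for (int i = 0; i < sorted.size(); i += chunk) {
            slices.add(sorted.subList(i, Math.min(i + chunk, sorted.size())));
        }
        return slices;
    }
}
```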
2. Split Transactions (Small Batches)
Divide the million rows into batches of 800‑1500 rows and commit after each batch. This limits the time each transaction holds locks.
for (int i = 0; i < dataList.size(); i += batchSize) {
    int end = Math.min(i + batchSize, dataList.size());
    List<Data> subList = dataList.subList(i, end);
    // each batch commits in its own transaction, releasing its locks quickly
    transactionTemplate.execute(status -> {
        dao.batchInsert(subList);
        return null;
    });
}
Each batch finishes within ~50 ms, keeping lock hold time under 2 seconds.
Reduces lock contention and memory pressure.
3. Index Management (Drop‑and‑Rebuild)
Before bulk loading, drop non‑primary indexes that would be updated on every insert, then recreate them after the load.
-- before the bulk load
ALTER TABLE transactions DROP INDEX idx_name;
-- after the bulk load
CREATE INDEX idx_name ON transactions(column_name);
-- example of a unique index
CREATE UNIQUE INDEX uk_txn_no ON transactions(txn_no);
Reduces index‑page locking. Note that while a unique index is dropped, duplicate checking is suspended, so deduplicate the input before loading.
Rebuilding indexes after the load is ~60% faster than updating them incrementally during inserts.
4. Partitioning and Table Slimming
Partition the table by reconciliation date (RANGE partition) and move rarely used large columns (e.g., JSON fields) to a separate table.
Locks are confined to a single partition, cutting deadlock probability by ≈75%.
Row size shrinks from ~2 KB to ~800 B, further lowering page‑lock conflicts.
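As an illustration of the partitioning scheme, here is a small helper that builds the kind of daily RANGE‑partition DDL described above. The table and column names (`transactions`, `recon_date`) are assumptions for the sketch, not the article's actual schema:

```java
import java.time.LocalDate;
import java.util.List;
import java.util.stream.Collectors;

public class PartitionDdl {
    // Build ALTER TABLE ... PARTITION BY RANGE (TO_DAYS(dateCol)) with one
    // partition per reconciliation day, so each day's import touches only
    // its own partition's pages.
    static String rangeByDay(String table, String dateCol, List<LocalDate> days) {
        String parts = days.stream()
                .map(d -> String.format("PARTITION p%s VALUES LESS THAN (TO_DAYS('%s'))",
                        d.toString().replace("-", ""), d.plusDays(1)))
                .collect(Collectors.joining(",\n  "));
        return "ALTER TABLE " + table
                + " PARTITION BY RANGE (TO_DAYS(" + dateCol + ")) (\n  "
                + parts + "\n)";
    }
}
```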
5. Remove Gap Locks
Switch the isolation level from REPEATABLE READ to READ COMMITTED for the import workload; under READ COMMITTED, InnoDB takes no ordinary gap locks (they remain only for foreign‑key and duplicate‑key checks).
SET GLOBAL transaction_isolation = 'READ-COMMITTED';
Note that SET GLOBAL changes the default for all new connections; to limit the change to the import job, run SET SESSION transaction_isolation = 'READ-COMMITTED'; on its connections instead.
Lock range shrinks by ~60%, improving throughput.
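If the import runs through JDBC, the isolation level can also be scoped per connection in code rather than via SQL. A minimal sketch, assuming a `Connection` obtained from the import job's DataSource:

```java
import java.sql.Connection;
import java.sql.SQLException;

public class ImportIsolation {
    // Apply READ COMMITTED to this connection only, leaving the server-wide
    // default untouched for other workloads.
    static void useReadCommitted(Connection conn) throws SQLException {
        conn.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        conn.setAutoCommit(false); // batch commits are driven by the import job
    }
}
```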
Extended Strategies
Avoid Mixed Query‑and‑Insert
Use a Redis‑based distributed lock (Redisson) keyed by merchant_id + date to serialize work for the same merchant on the same day.
String lockKey = "lock:merchant_" + merchantId + ":" + date;
RLock lock = redisson.getLock(lockKey);
lock.lock();
try {
    // critical section
} finally {
    lock.unlock(); // always release, even if the critical section throws
}
Granularity moves from a single row to a merchant‑date dimension, reducing contention by ~60%.
Retry on Deadlock
Catch SQLException with error code 1213 and retry using Guava’s Retryer (max 3 attempts, exponential back‑off).
Retryer<Boolean> retryer = RetryerBuilder.<Boolean>newBuilder()
    .retryIfException(t -> t instanceof SQLException
            && ((SQLException) t).getErrorCode() == 1213) // ER_LOCK_DEADLOCK only
    .withWaitStrategy(WaitStrategies.exponentialWait())
    .withStopStrategy(StopStrategies.stopAfterAttempt(3))
    .build();
retryer.call(() -> {
    // database operation
    return true;
});
Successful retry rate exceeds 90%.
Deadlock Logging and Analysis
Enable innodb_print_all_deadlocks = ON so MySQL records full deadlock traces. A Logstash pipeline extracts the offending SQL, tables, and lock types for automated analysis.
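The kind of extraction the pipeline performs can be sketched in plain Java: pull the table name and lock mode out of a RECORD LOCKS line from a deadlock trace. The pattern follows the standard InnoDB trace format; the sample table name is hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DeadlockTraceParser {
    // Matches lines like:
    // RECORD LOCKS space id 58 ... of table `db`.`tbl` trx id 30076 lock_mode X ...
    private static final Pattern RECORD_LOCK = Pattern.compile(
            "RECORD LOCKS .* of table `([^`]+)`\\.`([^`]+)` .*?(lock_mode \\S+)");

    // Return "schema.table (lock_mode X)" for a record-lock trace line.
    static String describe(String traceLine) {
        Matcher m = RECORD_LOCK.matcher(traceLine);
        if (!m.find()) return "no record lock found";
        return m.group(1) + "." + m.group(2) + " (" + m.group(3) + ")";
    }
}
```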
# my.cnf
innodb_print_all_deadlocks = ON
Configuration Tuning
Lock wait timeout: innodb_lock_wait_timeout = 10 (seconds) forces long‑waiting transactions to abort early instead of queueing indefinitely.
Deadlock detection: innodb_deadlock_detect = ON makes InnoDB check for a wait‑for cycle whenever a transaction blocks on a lock, immediately rolling back the smaller transaction instead of waiting for the timeout.
Results
Deadlock occurrence reduced from 5 per 100 rows to near zero.
Import speed improved from 2 hours to 40 minutes for one million rows.
System availability reached 99.99% over a full year.
