Why Does MySQL Replication Lag? Causes and Practical Fixes
This article explains what MySQL master‑slave replication lag is, walks through the replication workflow, identifies the main technical reasons for delay such as single‑threaded replay and lock contention, and provides concrete configuration and architectural solutions to reduce or eliminate the lag.
What is replication lag?
Replication lag is the time between the master writing a transaction to its binary log (binlog) and the slave finishing the replay of that transaction from its relay log. During this window, data read from the slave can be out of sync with the master.
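The definition above can be written as a small calculation. This is a minimal sketch, not how MySQL measures lag internally: the function name `replication_lag` and the sample timestamps are made up for illustration. In practice operators usually read the `Seconds_Behind_Master` field from `SHOW SLAVE STATUS` rather than computing it by hand.

```python
from datetime import datetime, timedelta


def replication_lag(master_commit: datetime, replica_apply: datetime) -> timedelta:
    """Lag for one transaction: replica apply time minus master binlog write time.

    Hypothetical helper for illustration; both timestamps are assumed to come
    from clocks that are roughly in sync.
    """
    lag = replica_apply - master_commit
    # Clock skew can make the raw difference negative; clamp it to zero.
    return max(lag, timedelta(0))


# Sample values (invented): the replica applies the transaction 7 seconds late.
master_commit = datetime(2024, 1, 1, 12, 0, 0)
replica_apply = datetime(2024, 1, 1, 12, 0, 7)
print(replication_lag(master_commit, replica_apply).total_seconds())  # 7.0
```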
Why does lag occur?
Replication workflow
Master writes binlog: All DML/DDL statements (INSERT, UPDATE, DELETE, ALTER, etc.) are recorded sequentially in the binlog.
Master streams binlog: A dump thread sends the binlog to each replica.
Slave IO thread writes relay log: The replica’s IO thread receives the binlog and stores it in a relay log.
Slave SQL thread replays: The SQL thread reads the relay log and applies the changes to the replica’s data.
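The four steps above can be sketched as a toy pipeline. This is only a model of the data flow, assuming in-memory lists stand in for the binlog, the relay log, and the replica's data; the function name `replicate` and the sample statements are hypothetical, and no real MySQL threads or network transfer are involved.

```python
from collections import deque


def replicate(statements):
    """Toy model of the master -> replica flow; returns (binlog, replica_data)."""
    binlog = []          # master: events recorded in commit order
    relay_log = deque()  # replica: events written by the IO thread
    replica_data = []    # replica: events applied by the SQL thread

    # Step 1: the master records every statement sequentially in the binlog.
    for stmt in statements:
        binlog.append(stmt)

    # Steps 2 and 3: the dump thread streams the binlog and the replica's
    # IO thread appends each received event to the relay log.
    for event in binlog:
        relay_log.append(event)

    # Step 4: the SQL thread replays the relay log in order.
    while relay_log:
        replica_data.append(relay_log.popleft())

    return binlog, replica_data


binlog, replica_data = replicate(["INSERT ...", "UPDATE ...", "DELETE ..."])
print(replica_data == binlog)  # True: the replica ends up with the same ordered events
```

Lag appears whenever step 4 runs slower than step 1, because the relay log then grows faster than it is drained.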
Root causes of lag
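Unless parallel replication is enabled, the SQL thread on the slave is single‑threaded; it must apply events sequentially, even though the master executed them concurrently.
Writing the binlog on the master is sequential I/O, whereas applying DML and DDL on the replica is random‑access I/O, which is slower.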
Lock contention on the replica (e.g., long‑running DDL or heavy read queries) blocks the SQL thread.
Large transactions or massive DDL statements keep the SQL thread busy for minutes.
Network latency between master and replica adds further delay.
Older MySQL versions (before 5.6) support only single‑threaded replication, limiting throughput.
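To see why a single applier thread falls behind, here is a rough queueing sketch. It assumes every event is independent and can be applied by any free worker, which real parallel replication does not guarantee; the function name `apply_lags` and the arrival and cost numbers are invented for illustration.

```python
import heapq


def apply_lags(arrivals, costs, workers=1):
    """Per-event lag (finish time minus arrival time) in a simple worker model.

    arrivals: times at which events reach the relay log, in ascending order.
    costs:    time needed to apply each event.
    workers:  number of applier threads (1 models single-threaded replay).
    """
    free_at = [0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    lags = []
    for arrive, cost in zip(arrivals, costs):
        # The event starts when it has arrived AND the earliest worker is free.
        start = max(arrive, heapq.heappop(free_at))
        finish = start + cost
        heapq.heappush(free_at, finish)
        lags.append(finish - arrive)
    return lags


# One event arrives per second, but each takes two seconds to apply.
print(apply_lags([0, 1, 2, 3], [2, 2, 2, 2], workers=1))  # [2, 3, 4, 5]
print(apply_lags([0, 1, 2, 3], [2, 2, 2, 2], workers=2))  # [2, 2, 2, 2]
```

With one worker the lag grows with every event; with two workers in this idealized model it stays constant, which is the effect multi‑threaded replication aims for.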
Typical conditions that increase lag
Inferior replica hardware: CPU, disk I/O, or memory lower than the master.
High replica load: Read‑heavy workloads compete with the SQL thread.
Too many replicas: The master runs a dump thread for each connected replica, so every additional replica adds load on the master and the network.
Large transactions: Batch deletes, bulk inserts, or big ALTER statements.
Network bottlenecks: Limited bandwidth or high‑latency links.
Old MySQL version: Lack of parallel binlog replication.
Mitigation strategies
Upgrade replica hardware to match or exceed the master’s CPU, RAM, and storage performance.
Split large transactions into smaller batches; avoid massive DDL on busy tables.
Scale out reads by adding more replicas (one‑master‑many‑slaves) and distributing query traffic.
Improve network bandwidth between master and replicas (e.g., upgrade from 20 Mbps to 100 Mbps).
Upgrade MySQL to a version that supports multi‑threaded (parallel) replication.
Adjust configuration for safety vs. performance:
# Master (high durability)
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1
On a replica used only for backup or low‑latency reads, safety can be relaxed:
# Replica (performance‑focused)
sync_binlog = 0 # or disable binlog
innodb_flush_log_at_trx_commit = 0
Architectural pattern: Reserve one low‑safety replica solely for backup or heavy‑read offloading, keeping its sync settings relaxed while other replicas serve production reads.
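The advice to split large transactions into smaller batches can be sketched as follows. This is one possible approach, assuming the table has a numeric primary key named `id`; the function name `batched_delete_statements` and the table name `orders_archive` are hypothetical. The `%s` placeholders follow the parameter style used by common Python MySQL drivers, and each yielded statement is meant to be executed and committed as its own transaction.

```python
def batched_delete_statements(table, min_id, max_id, batch_size):
    """Yield (sql, params) pairs, each covering at most batch_size ids.

    The table name is interpolated directly, so it must come from trusted
    code and never from user input.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield (f"DELETE FROM {table} WHERE id BETWEEN %s AND %s", (start, end))
        start = end + 1


# Ids 1..10 in batches of 4 produce three small transactions instead of one big one.
for sql, params in batched_delete_statements("orders_archive", 1, 10, 4):
    print(sql, params)
```

Because each batch is a short transaction, the replica's SQL thread is never tied up for minutes by a single event, and other changes can be applied between batches.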
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.