
Why Does MySQL Replication Lag? Causes and Practical Fixes

This article explains what MySQL master‑slave replication lag is, walks through the replication workflow, identifies the main technical reasons for delay such as single‑threaded replay and lock contention, and provides concrete configuration and architectural solutions to reduce or eliminate the lag.

ITPUB

What is replication lag?

Replication lag is the time difference between when the master finishes writing a transaction to its binary log (binlog) and when the slave finishes replaying that transaction from its relay log. During the lag window, reads served by the slave can return data that is stale relative to the master.
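
As a rough illustration of how this lag is usually observed, the sketch below reads the lag figure from one row of SHOW REPLICA STATUS (or the older SHOW SLAVE STATUS). It assumes the row has already been fetched into a dict by some client library. The helper names and the 5-second threshold are hypothetical, and the server's figure is only an approximation of the definition above.

```python
from typing import Mapping, Optional


def lag_seconds(status: Mapping[str, object]) -> Optional[int]:
    """Return the lag reported by the replica in seconds, or None if unknown.

    `status` is assumed to be one status row already converted to a dict.
    """
    # Newer servers name the column Seconds_Behind_Source; older ones use
    # Seconds_Behind_Master. Check both so the helper works with either.
    for column in ("Seconds_Behind_Source", "Seconds_Behind_Master"):
        if column in status:
            value = status[column]
            # NULL (None) means the server could not compute a figure,
            # for example because the SQL thread is not running.
            return None if value is None else int(value)
    return None


def is_stale(status: Mapping[str, object], max_lag_seconds: int = 5) -> bool:
    """Treat an unknown lag as stale so callers fall back to the master."""
    lag = lag_seconds(status)
    return lag is None or lag > max_lag_seconds
```

A read router could call is_stale() before sending a query to a replica; treating an unknown lag as stale is a conservative design choice, not something the server mandates.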

Why does lag occur?

Replication workflow

Master writes binlog : All DML/DDL statements (INSERT, UPDATE, DELETE, ALTER, etc.) are recorded sequentially in the binlog.

Master streams binlog : The master runs one dump thread per connected replica, and each dump thread sends binlog events to its replica.

Slave IO thread writes relay log : The replica’s IO thread receives the binlog and stores it in a relay log.

Slave SQL thread replays : The SQL thread reads the relay log and applies the changes to the replica’s data.
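
The four steps above can be sketched as a toy model in which two queues stand in for the binlog and the relay log. This is only an illustration: the function name, the tick-based timing, and the batch sizes are assumptions, and real replication is event-driven rather than tick-driven.

```python
from collections import deque


def simulate_pipeline(transactions, io_batch=2, sql_batch=1, ticks=3):
    """Toy model of the workflow; returns (applied, relay_backlog, unsent)."""
    binlog = deque(transactions)  # steps 1-2: master binlog waiting to be streamed
    relay_log = deque()           # step 3: events stored by the replica's IO thread
    applied = []                  # step 4: events replayed by the SQL thread

    for _ in range(ticks):
        # IO thread: copy up to io_batch events from the binlog to the relay log.
        for _ in range(min(io_batch, len(binlog))):
            relay_log.append(binlog.popleft())
        # SQL thread: replay up to sql_batch events from the relay log, in order.
        for _ in range(min(sql_batch, len(relay_log))):
            applied.append(relay_log.popleft())

    return applied, list(relay_log), list(binlog)


applied, relay_backlog, unsent = simulate_pipeline(
    ["t1", "t2", "t3", "t4", "t5", "t6"]
)
# After three ticks everything has been received, but only half is applied:
# the relay-log backlog is the replication lag made visible.
```

With these assumed rates the IO thread keeps up while the SQL thread falls behind, which is the common pattern when replay rather than the network is the bottleneck.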

Root causes of lag

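By default in older MySQL versions, the slave applies the relay log with a single SQL thread, so events are replayed one at a time even though the master executed them concurrently across many client connections.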

Writing the binlog on the master is cheap sequential I/O, whereas replaying DML and DDL on the replica touches data and index pages with random-access I/O, which is slower.

Lock contention on the replica (e.g., long‑running DDL or heavy read queries) blocks the SQL thread.

Large transactions or massive DDL statements keep the SQL thread busy for minutes.

Network latency between master and replica delays the arrival of binlog events in the relay log.

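Older MySQL versions support only single-threaded replication, limiting throughput: releases before 5.6 have no parallel apply at all, 5.6 parallelizes only across different schemas, and 5.7 adds transaction-level parallelism (LOGICAL_CLOCK).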
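
To see why a serial applier falls behind, a back-of-the-envelope model helps. The sketch below assumes steady rates and perfectly parallel workers, which real workloads will not achieve; the function names and all throughput numbers are made up for illustration.

```python
def backlog_after(seconds, master_tps, apply_tps_per_worker, workers=1):
    """Transactions still waiting on the replica after `seconds` of steady load."""
    replica_tps = apply_tps_per_worker * workers
    # The backlog grows only while the master commits faster than the replica applies.
    return max(0, (master_tps - replica_tps) * seconds)


def catch_up_seconds(backlog, master_tps, apply_tps_per_worker, workers=1):
    """Seconds needed to drain `backlog`, or None if the replica never catches up."""
    spare_tps = apply_tps_per_worker * workers - master_tps
    if spare_tps <= 0:
        return None
    return backlog / spare_tps


# Assumed numbers: the master commits 1000 tx/s, one applier replays 400 tx/s.
single = backlog_after(60, master_tps=1000, apply_tps_per_worker=400)
parallel = backlog_after(60, master_tps=1000, apply_tps_per_worker=400, workers=4)
drain = catch_up_seconds(single, master_tps=1000, apply_tps_per_worker=400, workers=4)
```

Under these assumptions a single applier accumulates 36000 pending transactions in one minute, four appliers accumulate none, and switching to four appliers would drain the existing backlog in about a minute.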

Typical conditions that increase lag

Inferior replica hardware : CPU, disk I/O, or memory lower than the master.

High replica load : Read‑heavy workloads compete with the SQL thread.

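Too many replicas : The master runs a dump thread for every replica, so each additional replica adds network and I/O load on the master while still having to replay the full binlog stream itself.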

Large transactions : Batch deletes, bulk inserts, or big ALTER statements.

Network bottlenecks : Limited bandwidth or high latency links.

Old MySQL version : No support for parallel replay of the relay log.
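
For the network item in the list above, a rough estimate shows how link speed bounds how quickly a burst of binlog data can even reach the replica. The sketch ignores protocol overhead, latency, and compression. The function name, the 500 MB burst, and the link speeds are illustrative assumptions.

```python
def transfer_seconds(binlog_megabytes, link_mbps):
    """Seconds needed to ship `binlog_megabytes` over a link of `link_mbps` megabits/s."""
    if link_mbps <= 0:
        raise ValueError("link_mbps must be positive")
    megabits = binlog_megabytes * 8  # 1 byte = 8 bits
    return megabits / link_mbps


# A hypothetical 500 MB burst of binlog data (for example, a bulk load):
slow_link = transfer_seconds(500, link_mbps=20)
fast_link = transfer_seconds(500, link_mbps=100)
```

On the slower link the burst alone accounts for 200 seconds of delay before the replica can start replaying the last event; on the faster link it is 40 seconds.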

Mitigation strategies

Upgrade replica hardware to match or exceed the master’s CPU, RAM, and storage performance.

Split large transactions into smaller batches; avoid massive DDL on busy tables.

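Scale out reads by adding more replicas (one-master-many-slaves) and distributing query traffic, so that each replica's SQL thread competes with fewer read queries. Keep in mind that every extra replica also adds streaming load on the master.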

Improve network bandwidth between master and replicas (e.g., upgrade from 20 Mbps to 100 Mbps).

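Upgrade MySQL to a version that supports multi-threaded (parallel) replication and enable it, for example by setting slave_parallel_workers together with slave_parallel_type = LOGICAL_CLOCK in MySQL 5.7 (these variables are named replica_parallel_workers and replica_parallel_type from MySQL 8.0.26).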

Adjust configuration for safety vs. performance :

# Master (high durability)
sync_binlog = 1                      # sync the binlog to disk at every commit
innodb_flush_log_at_trx_commit = 1   # write and flush the InnoDB redo log at every commit

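On a replica used only for backup or read offloading, safety can be relaxed. The trade-off is that a crash can lose roughly the last second of applied transactions, so use these settings only on a replica that can be resynchronized or rebuilt and that will not be promoted to master: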

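# Replica (performance-focused)
sync_binlog = 0                      # let the OS decide when to flush the binlog (or disable the binlog)
innodb_flush_log_at_trx_commit = 0   # write and flush the redo log about once per second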

Architectural pattern : Reserve one low‑safety replica solely for backup or heavy‑read offloading, keeping its sync settings relaxed while other replicas serve production reads.
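
The advice above to split large transactions into smaller batches can be sketched as follows. This is a minimal illustration that assumes the table has a dense integer primary key named id. The table name old_events and both helper names are hypothetical, and the %s placeholders follow the parameter style used by common Python MySQL drivers.

```python
def batch_ranges(min_id, max_id, batch_size):
    """Yield inclusive (low, high) id ranges that cover min_id..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


def batched_delete_statements(table, min_id, max_id, batch_size):
    """Return (sql, params) pairs, one small DELETE per id range.

    `table` must be a trusted identifier: it is interpolated into the SQL
    text, while the id bounds are passed as query parameters.
    """
    sql = f"DELETE FROM {table} WHERE id BETWEEN %s AND %s"
    return [(sql, bounds) for bounds in batch_ranges(min_id, max_id, batch_size)]


statements = batched_delete_statements("old_events", 1, 10, batch_size=4)
# Each pair is meant to be executed and committed as its own transaction,
# ideally with a short pause in between so the replica can keep up.
```

Committing each range separately means the replica replays many short transactions instead of one that occupies the SQL thread for minutes.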
