Detect and Cure MySQL Replication Lag: Step‑by‑Step Guide
This article explains how to identify whether MySQL replication delay originates from the I/O thread or SQL thread, demonstrates diagnostic commands, and provides practical tips, configuration tweaks, and monitoring tools such as pt‑stalk to reduce lag and improve replication performance.
How to Discover Replication Lag
MySQL replication uses two threads on the slave: IO_THREAD , which reads the master’s binary log and writes it to a local relay log, and SQL_THREAD , which reads the relay log and applies the events. Determining which thread is lagging is the first step in troubleshooting.
Example of Replication Lag
Run the following commands on the master and slave to compare positions:
mysql-master> SHOW MASTER STATUS;
+------------------+--------------+------------------+------------------------------------------------------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+--------------+------------------+------------------------------------------------------------------+
| mysql-bin.018196 | 15818564 | | bb11b389-d2a7-11e3-b82b-5cf3fcfc8f58:1-2331947 |
+------------------+--------------+------------------+------------------------------------------------------------------+
mysql-slave> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
Slave_IO_State: Queueing master event to the relay log
Master_Host: master.example.com
Master_User: repl
Master_Port: 3306
...
Seconds_Behind_Master: 230775
...The output shows that Slave_IO_Thread is reading an older binlog file (mysql-bin.018192) while the master has already advanced to mysql-bin.018196, indicating I/O lag. The SQL thread is also behind, as seen from the difference between Read_Master_Log_Pos and Exec_Master_Log_Pos.
Tips and Recommendations – Causes and Fixes
IO_THREAD lag : Usually caused by slow network or bandwidth congestion. Enable slave_compressed_protocol or disable binary logging on the slave if point‑in‑time recovery is not needed.
SQL_THREAD lag : Often due to long‑running queries on the slave. Enable log_slow_slave_statements and set log_slow_verbosity='full' to capture slow queries.
Ensure every table has a primary or unique key, especially when using row‑based replication, to avoid full‑table scans.
For large row‑based events, Seconds_Behind_Master can be misleading; consider using pt‑heartbeat for more accurate latency measurement.
When deleting massive amounts of data, consider partitioning and dropping old partitions instead of large DELETE statements.
Automated Monitoring with pt‑stalk
Use pt‑stalk to collect diagnostic data whenever replication lag exceeds a threshold. Example script:
#!/bin/bash
trg_plugin() {
mysqladmin $EXT_ARGV ping &>/dev/null
mysqld_alive=$?
if [[ $mysqld_alive == 0 ]]; then
seconds_behind_master=$(mysql $EXT_ARGV -e "show slave status" --vertical | grep Seconds_Behind_Master | awk '{print $2}')
echo $seconds_behind_master
else
echo 1
fi
}
# Example pt‑stalk invocation
/usr/bin/pt-stalk \
--function=/root/pt-plug.sh \
--variable=seconds_behind_master \
--threshold=300 \
--cycles=60 \
[email protected] \
--log=/root/pt-stalk.log \
--pid=/root/pt-stalk.pid \
--daemonizeAdjust the --threshold and --cycles values to suit your environment; the example triggers data collection when the lag stays above 300 seconds for 60 seconds.
Conclusion
Replication lag is a common but tricky issue in MySQL. By checking the I/O and SQL thread positions, using slow‑query logging, ensuring proper indexing, and employing tools like pt‑heartbeat and pt‑stalk, you can pinpoint the cause and apply targeted fixes to keep your replica up‑to‑date.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
