Analyzing MySQL Slow Log: Identifying Disk I/O and Semi‑Sync Replication Bottlenecks
This article walks through a systematic analysis of MySQL slow‑log entries, correlating them with error‑log flush events, disk‑I/O monitoring, and semi‑synchronous replication traces to pinpoint performance bottlenecks and provide practical troubleshooting steps.
The author, a DBA specializing in MySQL, shares a real‑world case where a client reported slow INSERT operations; inspection of slow.log revealed widespread latency across multiple SQL types.
Step 1 – Macro statistics from slow.log – By aggregating timestamps, the author identified periods with batch‑slow queries, primarily INSERT, DELETE, UPDATE, and COMMIT statements. Example command used:
[root@big hao]# less slow.log |grep "# Tim"|awk -F "." '{print $1}'|termsql -0 "select COL2 time,COUNT(COL2) sql_count from tbl group by COL2"
... (output omitted for brevity) ...Another aggregation showed the distribution of SQL types:
[root@big hao]# less slow.log |grep "# Time" -A 6 |egrep -i "insert|delete|update|select|commit|show" |awk '{print $1}'|sed -e 's/;/ /g' |termsql -0 "select COL0 sql_stat,count(COL0) sql_count from tbl group by COL0"
sql_stat|sql_count
COMMIT|59
DELETE|11
INSERT|321
SELECT|18
UPDATE|6
...Step 2 – Correlating with error.log flush information – The error log showed numerous flush events whose timestamps matched the slow‑log spikes, indicating disk‑write latency during InnoDB page cleaning.
[root@big hao]# less mysql-error.log |grep "2022-08-30T" |grep flush
2022-08-30T00:05:56.685140+08:00 0 [Note] [MY-011953] [InnoDB] Page cleaner took 5301ms to flush 121 and evict 0 pages
... (additional lines omitted) ...Step 3 – Deploying a custom disk‑I/O monitoring script to capture real‑time I/O metrics, confirming abnormal disk performance during the problematic periods.
//磁盘io监控脚本
#!/bin/bash
while sleep 1;
do
echo "########`date '+%F %H:%M:%S'`" >> /app/monitor/diskmoni$(date "+%Y%m%d").log
sar -d -p 1 10 >> /app/monitor/diskmoni$(date "+%Y%m%d").log
doneAdditional analysis of other timestamps (e.g., 2022‑08‑30T11:04:27) showed that when the master’s disk I/O was normal, the slowdown was linked to semi‑synchronous replication latency rather than I/O.
Step 4 – Examining semi‑sync replication strace logs – By enabling rpl_semi_sync_slave_trace_level, the author captured the delay between the master sending binlog events and receiving ACKs, which could reach several seconds due to network or slave‑disk issues.
//开启半同步strace日志
mysql> show variables like 'rpl_semi_sync_slave_trace_level';
+---------------------------------+-------+
| Variable_name | Value |
+---------------------------------+-------+
| rpl_semi_sync_slave_trace_level | 32 |
+---------------------------------+-------+
mysql> set global rpl_semi_sync_slave_trace_level = 16;
... (output omitted) ...Network ping logs confirmed stable connectivity, so the primary culprit remained disk I/O on the slave.
Summary
Commit statements dominate the slow‑log entries, followed by UPDATE and INSERT, which are I/O‑intensive.
When a batch contains simple SELECTs, the slowdown is less likely related to semi‑sync replication.
Even administrative commands like SHOW MASTER STATUS can appear slow under heavy I/O pressure.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
