How to Diagnose and Fix Common MySQL Performance and Replication Failures
This guide outlines step‑by‑step methods to identify high load causes, troubleshoot replication interruptions, resolve instance hangs, and address common MySQL errors using commands like SHOW PROCESSLIST, EXPLAIN, iostat, and GTID management.
1. Database Host LOAD Spikes
Full‑table scans
Wrong execution plan for SELECT statements
UPDATE/DELETE without usable indexes, blocking other queries
ALTER TABLE or OPTIMIZE operations
Large data imports/exports, especially logical backups
Business load exceeding server capacity
To pinpoint the real cause, log into the MySQL server and run: SHOW PROCESSLIST; or query the INFORMATION_SCHEMA:
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND <> 'sleep' AND TIME > 100;If a query runs for a long time, examine its execution plan: EXPLAIN <em>your_sql_statement</em>; Check I/O status with Linux iostat to see read/write rates, average request size, and await times. Identify the offending PID (often a DELETE or UPDATE) and consider killing it: KILL <em>pid</em>; 2. Request‑Handling Fluctuations and Timeouts
Inspect database response using tcprstat Monitor key metrics such as com_select/insert/update/delete, innodb_rows_read, and innodb_logical_read via SHOW STATUS or orzdb -com Analyze slow‑query logs for high Query_time and Rows_examined Use pt‑query‑digest to evaluate index usage
Check General log and Binlog (via mysqlbinlog) for problematic DML
Capture network packets with tcpdump if needed
3. MySQL Thread Blocking
Run processlist and focus on the Time and State columns
Save a stack trace with pstack for deeper analysis
4. Instance‑Level Issues
MySQL appears dead : Verify SHOW PROCESSLIST for a flood of waiting requests; test with a simple DML on a test database to confirm responsiveness.
Check memory and swap usage via free -m and top. If memory is exhausted and load is low, prioritize slow‑query investigation.
Inspect disk space with df -hT and I/O performance with iostat -x. Clean up logs or expand logical volumes if needed.
Review CPU usage of mysqld using top or ps -eo pid,user,comm,pcpu,pmem,vsz | grep mysqld. High CPU under low load also points to inefficient queries.
Validate network latency with ping and interface bandwidth with iptraf. Use tcpdump for packet‑level analysis when anomalies appear.
Perform slow‑query analysis: enable the slow‑query log, set an appropriate long_query_time, and kill offending queries if necessary.
5. Replication Failures
Replication interruption causes:
Network issues or stopped mysqld on the master
Primary‑key conflicts
Data inconsistency between master and slave
Other unknown reasons
To resume replication:
STOP SLAVE; START SLAVE; SHOW SLAVE STATUS\GIf primary‑key conflicts or data drift occur, stop the slave, set the correct GTID, optionally start an empty transaction, then reset GTID handling:
SET @@SESSION.GTID_NEXT='xxxxxxxxxxx'; BEGIN; COMMIT; SET SESSION GTID_NEXT=AUTOMATIC; START SLAVE;6. Miscellaneous MySQL Issues
Log file locations: find error log path with SHOW VARIABLES LIKE '%log_error%'; Slow‑query log path and long_query_time with SHOW VARIABLES LIKE '%slow_query_log_file%'; and SHOW VARIABLES LIKE '%long_query_time%'; Binary log inspection: mysqlbinlog mysql-bin.000001 | more Typical MySQL error 127 (record file crashed): run perror 127 to see OS error details, then repair the table with REPAIR TABLE table_name; Root password reset procedure:
/etc/init.d/mysql stop mysqld_safe --skip-grant-tables & mysql -uroot -p UPDATE mysql.user SET password=PASSWORD('yourpasswd') WHERE user='root'; FLUSH PRIVILEGES; QUITSigned-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Full-Stack DevOps & Kubernetes
Focused on sharing DevOps, Kubernetes, Linux, Docker, Istio, microservices, Spring Cloud, Python, Go, databases, Nginx, Tomcat, cloud computing, and related technologies.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
