Master MySQL Slow Query Analysis: Proven SQL Optimization Techniques to Boost Performance
This comprehensive guide walks you through diagnosing MySQL slow queries, from identifying root causes and configuring slow‑query logs to applying advanced indexing, query‑rewriting, and monitoring techniques—complete with real‑world case studies that demonstrate how to cut query times from seconds to milliseconds.
Database Slow Query Analysis and SQL Optimization Practical Techniques: From Beginner to Expert Performance Tuning Guide
Introduction: A Real Production Incident
At 3 a.m., you are woken up by an urgent call: "Online database response time has spiked to 30 seconds, users are complaining, and the order system is almost down!" This is every ops engineer's nightmare, and a scenario I have personally experienced. The root cause was a seemingly simple SQL query, which after optimization reduced the query time from 30 seconds to 0.3 seconds—a 100× improvement.
In this article I share the practical experience accumulated from handling thousands of database performance issues, helping you systematically master slow‑query analysis and core SQL optimization techniques. Whether you are a newcomer to ops or an experienced engineer, this article provides actionable solutions.
1. The Essence of Slow Queries: Why Does Your Database Slow Down?
1.1 Definition and Impact of Slow Queries
A slow query is an SQL statement whose execution time exceeds a predefined threshold. In MySQL, queries taking longer than 10 seconds are logged as slow queries by default, but in production I often set the threshold to 1 second or lower.
The impact of slow queries is far more serious than it appears. A single slow query can consume massive database resources, leading to connection‑pool exhaustion, increased lock wait, soaring memory usage, and ultimately a system‑wide avalanche. I have seen many cases where an unoptimized SQL completely crippled an e‑commerce platform during a promotion peak.
1.2 Root Causes of Slow Queries
Analyzing tens of thousands of slow‑query logs, I identified five fundamental causes:
Missing appropriate indexes : This accounts for over 60 % of slow‑query problems. A full table scan without an index is like searching a dictionary without a table of contents.
Index loss : Even with an index, improper query writing can cause the index to be ignored, e.g., using functions on indexed columns, implicit type conversion, or NOT/!= operators.
Excessive data volume : As business grows, single‑table row counts can reach tens of millions or even billions. Even with indexes, scanning such large data sets degrades performance.
Lock contention : In high‑concurrency scenarios, multiple transactions competing for the same resource cause lock waiting, manifesting as slow queries.
Hardware resource bottlenecks : CPU, memory, or disk I/O saturation will affect database performance.
1.3 Indicators of Slow Queries
Key metrics to quickly identify slow‑query problems:
CPU usage continuously above 80 %
Database connections approaching the maximum
Disk I/O wait time noticeably increasing
Application response time suddenly lengthening
Slow‑query log file growing rapidly
When these signs appear, immediate slow‑query analysis is required.
2. Slow‑Query Analysis Tools and Methodology
2.1 Enabling and Configuring the Slow‑Query Log
-- View current slow‑query configuration
SHOW VARIABLES LIKE 'slow_query%';
SHOW VARIABLES LIKE 'long_query_time';
-- Dynamically enable slow‑query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';
SET GLOBAL long_query_time = 1; -- set to 1 second
SET GLOBAL log_queries_not_using_indexes = 'ON'; -- log queries that do not use indexesIn production, permanently set these parameters in my.cnf:
[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = 1
log_throttle_queries_not_using_indexes = 10 -- limit recordings per minute2.2 Using pt‑query‑digest for Deep Analysis
pt‑query‑digestfrom Percona Toolkit is my most frequently used tool for analyzing slow‑query logs.
Installation:
# CentOS/RHEL
yum install percona-toolkit
# Ubuntu/Debian
apt-get install percona-toolkitBasic usage:
# Analyze the entire slow‑query log
pt-query-digest /var/log/mysql/slow.log > slow_analysis.txt
# Analyze only the last hour
pt-query-digest --since '1h' /var/log/mysql/slow.log
# Show top 10 slowest queries
pt-query-digest --limit=10 /var/log/mysql/slow.logThe report includes execution count, total time, average time, lock wait, etc., allowing rapid identification of queries that need optimization.
2.3 Analyzing Execution Plans with EXPLAIN
EXPLAINis essential for SQL optimization. It shows how MySQL will execute a statement.
EXPLAIN SELECT u.name, o.order_no, o.amount
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at > '2024-01-01';Key fields:
type : connection type, from best to worst (system, const, eq_ref, ref, range, index, ALL).
key : the index actually used; NULL means no index.
rows : estimated rows scanned; larger numbers indicate slower queries.
Extra : additional important information (e.g., Using index, Using where, Using temporary, Using filesort).
2.4 Real‑Time Monitoring with Performance Schema
Performance Schema, introduced in MySQL 5.5, provides powerful real‑time performance data.
-- Check if enabled
SHOW VARIABLES LIKE 'performance_schema';
-- View currently executing SQL
SELECT * FROM performance_schema.events_statements_current\G;
-- Top 10 longest‑running SQL
SELECT DIGEST_TEXT, COUNT_STAR as exec_count,
SUM_TIMER_WAIT/1000000000000 as total_latency_sec,
AVG_TIMER_WAIT/1000000000000 as avg_latency_sec
FROM performance_schema.events_statements_summary_by_digest
ORDER BY AVG_TIMER_WAIT DESC
LIMIT 10;3. Practical SQL Optimization Techniques
3.1 Index Optimization Strategies
Index optimization is the core of SQL tuning. Proper indexing can improve performance by hundreds of times.
Creating appropriate indexes
Basic principle: create indexes on columns involved in WHERE, JOIN, ORDER BY, GROUP BY.
-- Single‑column index
CREATE INDEX idx_created_at ON orders(created_at);
-- Composite index (order matters)
CREATE INDEX idx_user_status_created ON orders(user_id, status, created_at);
-- Covering index (includes all columns needed by the query)
CREATE INDEX idx_covering ON orders(user_id, status, amount, created_at);Best practices
Choose high‑selectivity columns first (selectivity = distinct values / total rows).
Follow the left‑most prefix rule for composite indexes.
Avoid redundant indexes; if (a,b) exists, a separate index on (a) is usually unnecessary.
Limit the number of indexes per table (recommended ≤ 5) to avoid write‑performance penalties.
Identify and clean up unused indexes regularly:
-- Find unused indexes
SELECT s.table_schema, s.table_name, s.index_name
FROM information_schema.statistics s
LEFT JOIN performance_schema.table_io_waits_summary_by_index_usage t
ON s.table_schema = t.object_schema
AND s.table_name = t.object_name
AND s.index_name = t.index_name
WHERE t.count_star IS NULL
AND s.table_schema NOT IN ('mysql','performance_schema','information_schema')
AND s.index_name != 'PRIMARY';3.2 Query Rewriting Techniques
Often, simply rewriting SQL yields huge performance gains.
Avoid SELECT * Never use SELECT * in production because it transfers unnecessary data, prevents index covering, and can break when schema changes.
-- Bad example
SELECT * FROM users WHERE status = 'active';
-- Good example
SELECT id, name, email FROM users WHERE status = 'active';Prefer JOIN over subqueries
-- Inefficient subquery
SELECT name FROM users WHERE id IN (
SELECT user_id FROM orders WHERE amount > 1000
);
-- Efficient JOIN
SELECT DISTINCT u.name FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE o.amount > 1000;Use EXISTS instead of IN for large subquery results
-- IN (less efficient when subquery returns many rows)
SELECT * FROM users u WHERE EXISTS (
SELECT 1 FROM orders o WHERE o.user_id = u.id AND o.status = 'completed'
);Pagination Optimization
Large offsets kill performance. Use delayed join or cursor pagination.
-- Inefficient pagination with large offset
SELECT * FROM orders ORDER BY id LIMIT 1000000, 20;
-- Delayed join optimization
SELECT o.* FROM orders o
INNER JOIN (
SELECT id FROM orders ORDER BY id LIMIT 1000000, 20
) t ON o.id = t.id;
-- Cursor pagination (recommended)
SELECT * FROM orders WHERE id > 1000000 ORDER BY id LIMIT 20;3.3 Transaction and Lock Optimization
In high‑concurrency scenarios, transaction and lock optimization is crucial.
Shorten transaction duration
Long transactions are performance killers. Keep transactions as short as possible.
# Bad example: time‑consuming API call inside transaction
def process_order(order_id):
with transaction():
order = get_order(order_id)
payment_result = call_payment_api(order) # should be outside
update_order_status(order_id, payment_result)
# Correct example: move external call out of transaction
def process_order(order_id):
order = get_order(order_id)
payment_result = call_payment_api(order)
with transaction():
update_order_status(order_id, payment_result)Avoid lock escalation
Create indexes for UPDATE WHERE clauses to lock only necessary rows.
-- Index for UPDATE
CREATE INDEX idx_status ON orders(status);
UPDATE orders SET processed = 1 WHERE status = 'pending';Optimistic locking
For infrequent conflicts, add a version column.
ALTER TABLE products ADD COLUMN version INT DEFAULT 0;
UPDATE products SET stock = stock - 1, version = version + 1
WHERE id = 100 AND version = 5;
-- Check affected rows; if 0, retry because version changed4. Building Performance Monitoring and Alerting Systems
4.1 Comprehensive Monitoring Metrics
Key system‑level metrics:
CPU usage and load average
Memory usage and swap
Disk I/O (IOPS, throughput, latency)
Network traffic and connection count
MySQL‑specific metrics:
QPS and TPS
Slow‑query count and ratio
Connection and thread count
InnoDB buffer‑pool hit rate
Lock wait and deadlock count
Replication lag (if master‑slave)
4.2 Monitoring Stack with Prometheus and Grafana
Prometheus + Grafana is the most popular open‑source monitoring solution.
Install mysqld_exporter to collect MySQL metrics:
# Download exporter
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.14.0/mysqld_exporter-0.14.0.linux-amd64.tar.gz
# Create monitoring user
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'password';
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';
# Start exporter
./mysqld_exporter --config.my-cnf=".my.cnf"Prometheus scrape config:
# prometheus.yml
scrape_configs:
- job_name: 'mysql'
static_configs:
- targets: ['localhost:9104']
labels:
instance: 'prod-mysql-01'Key Grafana dashboards:
MySQL Overview (ID 7362)
MySQL Query Response Time (ID 11226)
MySQL InnoDB Metrics (ID 7365)
4.3 Reasonable Alert Rules
Alert rules should follow "no missed alerts, few false positives".
# Prometheus alert examples
groups:
- name: mysql_alerts
rules:
- alert: MySQLDown
expr: mysql_up == 0
for: 5m
annotations:
summary: "MySQL service down"
- alert: SlowQueries
expr: rate(mysql_global_status_slow_queries[5m]) > 10
for: 5m
annotations:
summary: "Too many slow queries"
- alert: HighConnections
expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.8
for: 5m
annotations:
summary: "Connection count near limit"
- alert: InnoDBBufferPoolHitRate
expr: rate(mysql_global_status_innodb_buffer_pool_reads[5m]) / rate(mysql_global_status_innodb_buffer_pool_read_requests[5m]) > 0.1
for: 10m
annotations:
summary: "InnoDB buffer pool hit rate too low"5. Real‑World Case Studies
Case 1: E‑commerce Order Query Optimization
Problem : Order query API response time reached 15 seconds.
SELECT o.*, u.name AS user_name, p.name AS product_name
FROM orders o
LEFT JOIN users u ON o.user_id = u.id
LEFT JOIN order_items oi ON o.id = oi.order_id
LEFT JOIN products p ON oi.product_id = p.id
WHERE o.created_at BETWEEN '2024-01-01' AND '2024-12-31'
AND o.status IN ('pending','processing','shipped')
AND u.region = 'North'
ORDER BY o.created_at DESC
LIMIT 20;Analysis (via EXPLAIN): full table scan on orders, no index used, estimated 5 million rows scanned.
Optimization :
-- Create composite indexes
CREATE INDEX idx_orders_created_status ON orders(created_at, status);
CREATE INDEX idx_users_region ON users(region);
-- Rewrite query to use covering index
SELECT o.*, u.name AS user_name, p.name AS product_name
FROM (
SELECT id FROM orders
WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31'
AND status IN ('pending','processing','shipped')
ORDER BY created_at DESC
LIMIT 20
) t
INNER JOIN orders o ON t.id = o.id
LEFT JOIN users u ON o.user_id = u.id AND u.region = 'North'
LEFT JOIN order_items oi ON o.id = oi.order_id
LEFT JOIN products p ON oi.product_id = p.id
ORDER BY o.created_at DESC;Result : query time dropped from 15 seconds to 0.2 seconds (≈ 75× improvement).
Case 2: User Points Ranking Optimization
Problem : Points ranking query took over 30 seconds.
SELECT user_id,
SUM(points) AS total_points,
COUNT(*) AS transaction_count
FROM point_transactions
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY user_id
ORDER BY total_points DESC
LIMIT 100;Analysis : point_transactions holds 200 million rows; query scans ~30 million rows each time.
Optimization :
-- Create summary table (updated hourly)
CREATE TABLE user_points_summary (
user_id INT PRIMARY KEY,
total_points DECIMAL(10,2),
transaction_count INT,
last_30days_points DECIMAL(10,2),
last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
INDEX idx_last30_points (last_30days_points DESC)
);
-- Hourly upsert
INSERT INTO user_points_summary (user_id, last_30days_points, transaction_count)
SELECT user_id,
SUM(points),
COUNT(*)
FROM point_transactions
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY user_id
ON DUPLICATE KEY UPDATE
last_30days_points = VALUES(last_30days_points),
transaction_count = VALUES(transaction_count);
-- Query from summary table
SELECT user_id,
last_30days_points AS total_points,
transaction_count
FROM user_points_summary
ORDER BY last_30days_points DESC
LIMIT 100;Result : query time reduced from 30 seconds to 0.01 seconds (≈ 3000× improvement).
6. Best‑Practice Summary for Performance Optimization
6.1 Establish a Performance Baseline
Before any optimization, record key baseline metrics:
Average and peak QPS
Slow‑query ratio (target < 0.1 %)
Average response time, P95, P99
InnoDB buffer‑pool hit rate (target > 95 %)
6.2 Prioritize Optimizations
Not every slow query needs immediate fixing. Prioritize by:
Total consumption = execution frequency × average time.
Impact on core business flows.
Low‑effort, high‑gain "quick wins".
6.3 Code Review Mechanism
Implement SQL review before deployment:
All new SQL must provide EXPLAIN output.
Prohibit SELECT * in production.
Use pt‑online‑schema‑change for DDL on large tables.
Batch large operations to avoid long‑running locks.
6.4 Continuous Optimization Process
Performance tuning is ongoing:
Weekly slow‑query log analysis to spot new issues.
Monthly index usage audit.
Quarterly evaluation of sharding or partitioning needs.
Maintain a knowledge base to avoid repeated pitfalls.
Conclusion: Ongoing Learning and Practice
Database performance optimization is a discipline that requires continuous practice and experience accumulation. No one‑size‑fits‑all solution exists; each system has its own characteristics. As ops engineers, we should stay aware of new MySQL features, build a personal case library, collaborate closely with developers to prevent performance problems at the source, and regularly participate in technical exchanges to learn from others' experiences.
Remember, performance optimization never ends, but with the right methodology and tools you can confidently tackle any challenge. I hope this article helps you advance further on the path of database optimization.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
MaGe Linux Operations
Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
