
How to Diagnose and Optimize MySQL Slow Queries Beyond Adding Indexes

This guide walks through a systematic approach to identify, analyze, and fix MySQL slow queries by enabling the slow‑query log, interpreting its format, using tools like mysqldumpslow and pt‑query‑digest, examining execution plans with EXPLAIN, designing proper indexes, rewriting SQL, tuning server parameters, and establishing continuous monitoring to prevent regressions.

MaGe Linux Operations

Background and Problem

Slow queries are a common cause of sluggish application responses. When users report "page loading slowly" or "query timeout", many ops engineers immediately tell developers to "add an index". However, adding an index is only one of many possible optimizations, and doing so blindly can be ineffective or even harmful.

Effective slow‑query optimization requires a systematic method: first confirm the existence of a slow query, then analyze its execution plan, understand the optimizer’s decisions, locate the real bottleneck, and finally choose the most appropriate optimization technique. Techniques include creating suitable indexes, rewriting SQL, adjusting database configuration, redesigning table structures, sharding, or adding caches.

This article uses MySQL as an example but the methodology applies to PostgreSQL, Oracle, and other mainstream databases, with only syntax differences.

1. Discovering and Confirming Slow Queries

1.1 Enable the Slow‑Query Log

The MySQL slow‑query log is the foundation for diagnosing slow queries. It is disabled by default and must be turned on via configuration.

-- View slow‑query‑related variables
SHOW VARIABLES LIKE 'slow_query%';
SHOW VARIABLES LIKE 'long_query_time';
SHOW VARIABLES LIKE 'log_output';

-- Check if the slow‑query log is enabled
SHOW VARIABLES LIKE 'slow_query_log';

-- View the slow‑query log file path
SHOW VARIABLES LIKE 'slow_query_log_file';

Configure the log (changes are lost after restart unless added to the config file):

-- Enable the slow‑query log
SET GLOBAL slow_query_log = 'ON';

-- Set the slow‑query threshold to 1 second (can be a float)
SET GLOBAL long_query_time = 1;

-- Set log output format (TABLE or FILE)
SET GLOBAL log_output = 'FILE';

-- Set the log file path
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';

Edit /etc/mysql/my.cnf or /etc/my.cnf to make the settings permanent, then restart MySQL:

# Service name varies by distribution
systemctl restart mysql    # Debian/Ubuntu
systemctl restart mysqld   # RHEL/CentOS

1.2 Slow‑Query Log Format

A typical entry looks like:

# Time: 2024-01-15T10:30:45.123456Z
# User@Host: app_user[app_user] @ localhost []
# Query_time: 5.234567  Lock_time: 0.001234 Rows_sent: 100  Rows_examined: 50000
SET timestamp=1705315845;
SELECT * FROM orders WHERE user_id = 12345 AND status = 'paid' ORDER BY created_at DESC LIMIT 20;

Key fields:

Time: timestamp when the query finished.
User@Host: user and host that ran the query.
Query_time: actual execution time in seconds – the core metric.
Lock_time: time spent waiting for locks.
Rows_sent: rows returned to the client.
Rows_examined: rows scanned to produce the result.

When Rows_examined is much larger than Rows_sent, the query scans many rows before finding matches, a typical optimization target.
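As a quick illustration, that ratio can be pulled straight from the log file with a few lines of awk. This is a sketch: `ratio_report` is a made-up name, the threshold of 100 is an arbitrary choice, and it assumes the FILE log format shown above.

```shell
# Flag slow-log entries whose Rows_examined / Rows_sent ratio is high.
# Parses header lines of the form:
#   # Query_time: 5.2  Lock_time: 0.0 Rows_sent: 100  Rows_examined: 50000
ratio_report() {
  awk '/^# Query_time:/ {
    qt = $3; sent = $7; examined = $9
    # Avoid division by zero when a query returns no rows
    ratio = (sent > 0) ? examined / sent : examined
    if (ratio >= 100)
      printf "query_time=%s sent=%s examined=%s ratio=%.0f\n", qt, sent, examined, ratio
  }' "$1"
}
```

Entries it prints are the first candidates for indexing or rewriting.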

1.3 Analyze the Log with mysqldumpslow

Reading the raw log is hard; mysqldumpslow aggregates it:

# mysqldumpslow ships with the MySQL server/client packages; no extra install needed
mysqldumpslow -t 10 /var/log/mysql/slow.log

# Options:
# -t N    show top N queries
# -s ORD  sort order: c=count, t=query time, at=average query time,
#         l=lock time, r=rows sent, ar=average rows sent
# -a      do not abstract numbers to N and strings to 'S' (less grouping)
# -g PAT  only include queries matching the grep-style pattern PAT

# Examples:
mysqldumpslow -t 10 -s t /var/log/mysql/slow.log    # top 10 by total query time
mysqldumpslow -t 10 -s at /var/log/mysql/slow.log   # top 10 by average query time
mysqldumpslow -t 10 -s c /var/log/mysql/slow.log    # top 10 by count
mysqldumpslow -t 10 -g 'orders' /var/log/mysql/slow.log   # filter by table name
mysqldumpslow -t 10 -a -g 'SELECT.*FROM.*WHERE' /var/log/mysql/slow.log   # pattern filter, literals kept
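Under the hood, mysqldumpslow groups "similar" queries by abstracting literals (numbers become N, quoted strings become 'S'). A rough shell approximation of that idea, to make the grouping concrete: this is not the real algorithm, `fingerprint_count` is a made-up name, and the \b word boundary assumes GNU sed.

```shell
# Normalize literals in a slow log, then count identical fingerprints.
fingerprint_count() {
  grep -v '^#' "$1" \
    | grep -iv '^SET timestamp' \
    | sed -E "s/'[^']*'/'S'/g; s/\b[0-9]+\b/N/g" \
    | sort | uniq -c | sort -rn
}
```

Two queries differing only in a literal (user_id = 1 vs user_id = 2) collapse to one fingerprint with count 2, which is exactly why aggregated views surface hot query shapes that raw logs hide.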

1.4 Deep Analysis with pt‑query‑digest

pt‑query‑digest (from Percona Toolkit) provides richer analysis than mysqldumpslow:

# Install Percona Toolkit
apt-get install percona-toolkit   # Debian/Ubuntu
yum install percona-toolkit       # RHEL/CentOS

# Basic usage
pt-query-digest /var/log/mysql/slow.log

# Output to file
pt-query-digest /var/log/mysql/slow.log > /tmp/query_analysis.txt

# Analyze last 24 h (requires timestamps in the log)
pt-query-digest --since '24h' /var/log/mysql/slow.log

# Analyze a specific time window
pt-query-digest --since '2024-01-15 10:00:00' --until '2024-01-15 12:00:00' /var/log/mysql/slow.log

# Analyze a general query log instead of a slow log
pt-query-digest --type genlog /var/log/mysql/general.log

The output includes a histogram of execution times, a list of queries sorted by response time, occurrence counts, average execution time, rows examined/sent, a summary of the execution plan, and flags that may indicate problems.

2. Analyzing Execution Plans with EXPLAIN

2.1 Basic EXPLAIN

EXPLAIN shows how MySQL’s optimizer will execute a query and is essential for slow‑query diagnosis.

-- Simple format
EXPLAIN SELECT * FROM orders WHERE user_id = 12345;

-- JSON format for more detail
EXPLAIN FORMAT=JSON SELECT * FROM orders WHERE user_id = 12345;

-- UPDATE, DELETE, INSERT examples
EXPLAIN UPDATE orders SET status = 'shipped' WHERE order_id = 100;
EXPLAIN DELETE FROM orders WHERE status = 'cancelled';

2.2 EXPLAIN Output Fields

Each column provides optimization clues:

+----+-------------+-------+-------+----------+------+----------+-------------+
| id | select_type | table | type  | key      | rows | filtered | Extra       |
+----+-------------+-------+-------+----------+------+----------+-------------+
|  1 | SIMPLE      | o     | ref   | idx_user |   50 |   100.00 | Using where |
|  1 | SIMPLE      | u     | const | PRIMARY  |    1 |   100.00 |             |
+----+-------------+-------+-------+----------+------+----------+-------------+

id: order of SELECT statements in a complex query; rows with the same id execute top to bottom, and a larger id executes earlier.
select_type: query type (SIMPLE, PRIMARY, SUBQUERY, DERIVED, UNION, etc.).
table: table name or alias.
type: access method – the most critical field. Values from best to worst: system, const, eq_ref, ref, ref_or_null, index_merge, unique_subquery, index_subquery, range, index, ALL.
possible_keys: indexes that could be used.
key: index actually chosen.
key_len: number of index bytes actually used.
ref: columns or constants compared against the index.
rows: estimated rows to scan.
filtered: estimated percentage of rows remaining after filtering.
Extra: additional info such as Using filesort, Using temporary, Using index, Using where, etc.

2.3 type Field Details

From best to worst: system, const, eq_ref, ref, range, index, ALL. When type is ALL, the query performs a full table scan and needs optimization.

-- Query with type ALL (needs index)
EXPLAIN SELECT * FROM orders WHERE created_at > '2024-01-01';

-- After creating an index, type becomes range
CREATE INDEX idx_orders_created_at ON orders(created_at);
EXPLAIN SELECT * FROM orders WHERE created_at > '2024-01-01';

2.4 Extra Field Details

Common values and meanings:

Using filesort: MySQL cannot produce the requested order from an index; an extra sort pass is required.
Using temporary: an internal temporary table is created, often for GROUP BY or DISTINCT.
Using index: a covering index satisfies the query without reading the table rows.
Using index condition: index condition pushdown is active (filtering inside the storage engine).
Using where: the server filters rows after the storage engine returns them.
Using join buffer: a join buffer is used, typically because no usable index exists on the join column.

-- Filesort example (needs optimization)
EXPLAIN SELECT * FROM orders WHERE user_id = 123 ORDER BY created_at DESC;

-- Fix by creating a composite index
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at DESC);
EXPLAIN SELECT * FROM orders WHERE user_id = 123 ORDER BY created_at DESC;

2.5 EXPLAIN ANALYZE (MySQL 8.0+)

MySQL 8.0 introduced EXPLAIN ANALYZE, which actually runs the query and reports real execution statistics, allowing comparison between estimated and actual rows and times.

EXPLAIN ANALYZE SELECT o.*, u.name FROM orders o JOIN users u ON o.user_id = u.id WHERE o.user_id = 12345 AND o.status = 'paid' ORDER BY o.created_at DESC LIMIT 20;

The output shows estimated rows, actual rows, and real execution time, helping to spot stale statistics or sub‑optimal plans.

3. Index Creation and Optimization

3.1 Index Fundamentals

MySQL mainly uses B‑Tree indexes. In InnoDB’s clustered index, leaf nodes contain the full row; in a secondary index, leaf nodes store the primary‑key value, which is then used to look up the row (a second lookup back to the table). Characteristics:

All leaf nodes are on the same level → low height → stable lookup performance.

Leaf nodes linked by a list → efficient range scans.

Each node stores many keys → wide nodes, shallow trees.

Benefits: dramatically reduces scanned rows, avoids sorting and temporary tables, enables covering scans. Costs: extra disk space, slower writes (INSERT/UPDATE/DELETE must maintain the index), and building an index on a large table is expensive (though InnoDB online DDL lets most index creation proceed without blocking writes).

3.2 Principles for Creating Indexes

Guidelines:

Choose columns with high selectivity (distinct/total ratio). Compute with SELECT COUNT(DISTINCT col) / COUNT(*) FROM tbl.

-- Check selectivity of columns
SELECT COUNT(DISTINCT status) / COUNT(*) FROM orders;   -- low selectivity
SELECT COUNT(DISTINCT user_id) / COUNT(*) FROM orders; -- high selectivity
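The same distinct/total ratio can be sanity‑checked outside the server, for example on a column dump produced by `mysql -N -e "SELECT status FROM orders"` (one value per line). A sketch, with `selectivity` as a hypothetical helper name:

```shell
# Compute distinct/total ratio over a one-value-per-line column dump.
selectivity() {
  awk '{
    total++
    if (!seen[$0]++) distinct++   # count each value once
  } END {
    printf "%.4f\n", (total ? distinct / total : 0)
  }' "$1"
}
```

Values near 1.0 (like user_id) make good index candidates; values near 0 (like a three-state status column) rarely help on their own.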

Composite indexes follow the left‑most prefix rule. An index INDEX idx(a,b,c) can serve queries filtering on a, a,b, or a,b,c, but not on b alone.

-- Composite index example
CREATE INDEX idx_orders_user_status_created ON orders(user_id, status, created_at);

-- Queries that can use the index
SELECT * FROM orders WHERE user_id = 123;
SELECT * FROM orders WHERE user_id = 123 AND status = 'paid';
SELECT * FROM orders WHERE user_id = 123 AND status = 'paid' AND created_at > '2024-01-01';

-- Query that cannot use the index (leftmost column missing)
SELECT * FROM orders WHERE status = 'paid';

3.3 Index Creation in Practice

Based on log analysis and EXPLAIN results, create indexes that match the WHERE conditions and ORDER BY columns:

-- Example slow query
-- SELECT * FROM orders WHERE user_id = 12345 AND status = 'paid' ORDER BY created_at DESC LIMIT 20;

-- Composite index with high‑selectivity column first, then ORDER BY column
CREATE INDEX idx_orders_user_status_created ON orders(user_id, status, created_at DESC);

-- MySQL 5.7 parses but ignores DESC in index definitions (descending indexes
-- arrived in 8.0); an ascending index still serves ORDER BY ... DESC via a
-- backward index scan
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at);

3.4 Index Usage Caveats

Avoid functions or calculations on indexed columns, as they invalidate the index:

-- Index disabled
SELECT * FROM orders WHERE YEAR(created_at) = 2024;
SELECT * FROM orders WHERE created_at + INTERVAL 1 DAY > NOW();

-- Correct usage
SELECT * FROM orders WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';

Avoid leading wildcards in LIKE patterns; they force a full scan. A pattern with a fixed prefix (wildcard only at the end) can use an index.

-- Index disabled
SELECT * FROM orders WHERE order_no LIKE '%123%';
SELECT * FROM orders WHERE order_no LIKE '%123';

-- Index usable
SELECT * FROM orders WHERE order_no LIKE 'ABC123%';

Match data types when comparing columns. The dangerous case is a string column compared with a number: MySQL must convert every row’s value, which disables the index. (An integer column compared with a string constant usually still uses the index, but is best avoided.)

-- Index disabled (order_no is VARCHAR, compared with a number)
SELECT * FROM orders WHERE order_no = 123456;

-- Correct type
SELECT * FROM orders WHERE order_no = '123456';

3.5 Verifying Index Usage

After creating an index, confirm it is used:

-- List indexes on a table
SHOW INDEX FROM orders;

-- Check usage with EXPLAIN
EXPLAIN SELECT * FROM orders WHERE user_id = 12345;

-- The Cardinality column in SHOW INDEX approximates selectivity
SHOW INDEX FROM orders;

-- MySQL 5.7+/8.0 index usage statistics via the sys schema
SELECT * FROM sys.schema_index_statistics WHERE table_name = 'orders';

4. SQL Statement Optimization

4.1 Common Inefficient Patterns

Avoid SELECT *; specify needed columns.

-- Inefficient
SELECT * FROM orders WHERE order_id = 12345;

-- Efficient
SELECT order_id, user_id, status, total_amount, created_at FROM orders WHERE order_id = 12345;

Use LIMIT to bound result sets.

-- Inefficient (may return huge set)
SELECT * FROM orders WHERE user_id = 12345 ORDER BY created_at DESC;

-- Efficient
SELECT * FROM orders WHERE user_id = 12345 ORDER BY created_at DESC LIMIT 20;

Break large queries into smaller steps to reduce scanned rows.

-- Inefficient single large join
SELECT o.*, u.*, p.* FROM orders o JOIN users u ON o.user_id = u.id JOIN products p ON o.product_id = p.id WHERE o.created_at > '2024-01-01';

-- Efficient two‑step approach
SELECT * FROM orders WHERE created_at > '2024-01-01' LIMIT 1000;
SELECT * FROM users WHERE id IN (SELECT DISTINCT user_id FROM orders WHERE created_at > '2024-01-01');

4.2 Optimizing ORDER BY and GROUP BY

Make ORDER BY use an index to avoid filesort:

CREATE INDEX idx_orders_user_status_created ON orders(user_id, status, created_at DESC);
SELECT * FROM orders WHERE user_id = 123 ORDER BY status, created_at DESC LIMIT 20;

Similarly, GROUP BY can benefit from an index on the grouping column. Note that wrapping the column in a function, as below, prevents the plain index from being used for the grouping itself; MySQL 8.0.13+ supports a functional index on (DATE(created_at)) for this case.

CREATE INDEX idx_orders_created_at ON orders(created_at);
SELECT DATE(created_at), COUNT(*), SUM(total_amount) FROM orders GROUP BY DATE(created_at);

4.3 Optimizing JOINs

Ensure join columns are indexed:

CREATE INDEX idx_orders_user_id ON orders(user_id);
-- users.id is the primary key, so it is already indexed
EXPLAIN SELECT o.*, u.name FROM orders o JOIN users u ON o.user_id = u.id WHERE o.user_id = 12345;

Prefer driving the join from the smaller, filtered table to reduce scans. Note that the MySQL optimizer normally chooses the join order itself, so the two FROM-clause orderings usually produce the same plan; use STRAIGHT_JOIN when you need to force a specific order.

-- Small table drives large table
SELECT o.*, u.name FROM users u JOIN orders o ON o.user_id = u.id WHERE u.status = 'vip';

-- Force that order explicitly if the optimizer chooses badly
SELECT o.*, u.name FROM users u STRAIGHT_JOIN orders o ON o.user_id = u.id WHERE u.status = 'vip';

4.4 Use EXISTS Instead of IN for Large Subqueries

-- IN with large subquery (slow)
SELECT * FROM orders WHERE user_id IN (SELECT id FROM users WHERE status = 'vip');

-- EXISTS (often faster on older versions; since MySQL 5.6 the optimizer
-- rewrites both forms as a semijoin, so benchmark before choosing)
SELECT * FROM orders o WHERE EXISTS (SELECT 1 FROM users u WHERE u.id = o.user_id AND u.status = 'vip');

5. Database Configuration Tuning

5.1 Key Configuration Parameters

[mysqld]
# InnoDB buffer pool (70‑80% of RAM)
innodb_buffer_pool_size = 12G

# Log file size
innodb_log_file_size = 1G

# Flush log strategy (affects durability vs. write performance)
innodb_flush_log_at_trx_commit = 1

# Max connections
max_connections = 500

# Query cache (deprecated in 5.7, removed in MySQL 8.0 – delete these
# lines entirely from an 8.0 config or the server will refuse to start)
query_cache_size = 0
query_cache_type = 0

# Temp table sizes
tmp_table_size = 256M
max_heap_table_size = 256M

# Slow‑query log settings
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = 1
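For the 70‑80% rule of thumb on innodb_buffer_pool_size, a small helper can derive a starting value from the machine’s RAM. A sketch: `buffer_pool_gib` is a made‑up name, and it takes memory in KiB, the unit MemTotal uses in /proc/meminfo.

```shell
# Compute ~75% of total RAM, rounded down to whole GiB,
# as a starting point for innodb_buffer_pool_size.
buffer_pool_gib() {
  local mem_kib=$1
  # KiB -> 75% -> GiB (integer division rounds down)
  echo $(( mem_kib * 75 / 100 / 1024 / 1024 ))
}

# On a live Linux host you might call it as:
#   buffer_pool_gib "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

On a dedicated 16 GiB database server this suggests a 12G setting; on a shared host, leave more headroom for the OS and other processes.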

5.2 InnoDB Buffer Pool Optimization

-- View buffer‑pool usage
SHOW STATUS LIKE 'Innodb_buffer_pool%';

-- View configured size
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Dynamically resize (MySQL 5.7+)
SET GLOBAL innodb_buffer_pool_size = 12884901888;  -- 12 GiB

The following are my.cnf settings (take effect after restart):

# Number of buffer-pool instances (commonly up to the CPU core count)
innodb_buffer_pool_instances = 8

# Dump hot pages at shutdown and reload them at startup
innodb_buffer_pool_dump_at_shutdown = 1
innodb_buffer_pool_load_at_startup = 1

5.3 Connection Management

-- Current connections
SHOW STATUS LIKE 'Threads_connected';
SHOW STATUS LIKE 'Max_used_connections';

-- Max connections variable
SHOW VARIABLES LIKE 'max_connections';

-- Increase max connections
SET GLOBAL max_connections = 1000;

-- List processes
SHOW PROCESSLIST;
SHOW FULL PROCESSLIST;

-- Kill idle connections (>1 h)
SELECT CONCAT('KILL ',id,';') FROM information_schema.processlist WHERE Command='Sleep' AND Time>3600;
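The SELECT above only prints KILL statements; it does not execute them. If you export the same columns with `mysql -N` (tab‑separated id, command, time), a small filter can generate the statements in shell. A sketch, with `kill_idle` as a hypothetical name:

```shell
# Read processlist rows (id<TAB>command<TAB>time) on stdin and emit
# KILL statements for connections sleeping longer than the threshold.
kill_idle() {
  awk -F'\t' -v limit="$1" '
    $2 == "Sleep" && $3 > limit { printf "KILL %s;\n", $1 }
  '
}

# Typical use (review the output before piping it back into mysql):
#   mysql -N -e "SELECT id, command, time FROM information_schema.processlist" | kill_idle 3600
```

Keeping generation and execution separate gives you a chance to review which sessions are about to be killed.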

6. Real‑World Slow‑Query Optimization Cases

6.1 Case 1 – Pagination Optimization

Deep pagination with large offsets is costly because MySQL must scan offset + limit rows.

-- Inefficient deep pagination
SELECT * FROM orders ORDER BY order_id DESC LIMIT 1000000, 20;

-- Analysis shows huge rows_examined.
EXPLAIN SELECT * FROM orders ORDER BY order_id DESC LIMIT 1000000, 20;

-- Solution 1: cursor‑based pagination (use last seen ID)
SELECT * FROM orders WHERE order_id < 1234567 ORDER BY order_id DESC LIMIT 20;

-- Solution 2: client stores last_id and uses it in the next query.

-- Solution 3: if an offset must be used, page through a covering index
-- first, then join back for the full rows (deferred join)
SELECT o.* FROM orders o
JOIN (SELECT order_id FROM orders ORDER BY order_id DESC LIMIT 1000000, 20) t
  ON o.order_id = t.order_id;
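The cursor-based approach in Solution 1 is driven from the client: each request remembers the last order_id it received and substitutes it into the next query. A sketch of that SQL generation with the actual mysql call left out; `next_page_sql` is a made-up helper.

```shell
# Build the next page's SQL from the last order_id seen by the client.
# An empty last_id means "first page".
next_page_sql() {
  local last_id=$1 page_size=${2:-20}
  if [ -z "$last_id" ]; then
    echo "SELECT * FROM orders ORDER BY order_id DESC LIMIT ${page_size};"
  else
    echo "SELECT * FROM orders WHERE order_id < ${last_id} ORDER BY order_id DESC LIMIT ${page_size};"
  fi
}
```

Each page costs the same regardless of depth, because the WHERE clause seeks directly into the index instead of discarding a million rows.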

6.2 Case 2 – Aggregation Query Optimization

Scanning the whole table for daily totals is expensive.

-- Inefficient real‑time aggregation
SELECT DATE(created_at) AS day, COUNT(*) AS order_count, SUM(total_amount) AS total_amount FROM orders WHERE created_at >= '2024-01-01' GROUP BY DATE(created_at);

-- Solution 1: summary table updated periodically (event or cron)
CREATE TABLE orders_daily_summary (
    stat_date DATE PRIMARY KEY,
    order_count INT NOT NULL DEFAULT 0,
    total_amount DECIMAL(15,2) NOT NULL DEFAULT 0,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
);

-- Periodic aggregation (the range predicate keeps the created_at index usable)
INSERT INTO orders_daily_summary (stat_date, order_count, total_amount)
SELECT DATE(created_at), COUNT(*), SUM(total_amount) FROM orders
WHERE created_at >= '2024-01-15' AND created_at < '2024-01-16'
GROUP BY DATE(created_at)
ON DUPLICATE KEY UPDATE order_count = VALUES(order_count), total_amount = VALUES(total_amount);

-- Query the summary table
SELECT * FROM orders_daily_summary WHERE stat_date >= '2024-01-01';

6.3 Case 3 – Fuzzy Search Optimization

-- Inefficient leading‑wildcard LIKE
SELECT * FROM users WHERE name LIKE '%zhang%';

-- Solution 1: Full‑text index (MySQL 5.6+)
ALTER TABLE users ADD FULLTEXT INDEX ft_users_name(name);
SELECT * FROM users WHERE MATCH(name) AGAINST('zhang');

-- Solution 2: External search engine (e.g., Elasticsearch) for complex fuzzy search.

-- Solution 3: Prefix index when searches are prefix‑based
CREATE INDEX idx_users_name_prefix ON users(name(10));
SELECT * FROM users WHERE name LIKE 'zhang%';  -- uses the prefix index

7. Monitoring and Prevention

7.1 Continuous Slow‑Query Monitoring

#!/bin/bash
# Daily analysis of yesterday's slow log
DATE=$(date -d "yesterday" +%Y-%m-%d)
SLOW_LOG="/var/log/mysql/slow.log"
REPORT="/var/log/mysql/slow_query_report_${DATE}.txt"

pt-query-digest --since "$(date -d 'yesterday 00:00:00' '+%Y-%m-%d %H:%M:%S')" \
    --until "$(date -d 'yesterday 23:59:59' '+%Y-%m-%d %H:%M:%S')" \
    --report-format=query_report "$SLOW_LOG" > "$REPORT"

# Alert if more than 10 slow queries were found
if [ -s "$REPORT" ]; then
    count=$(grep -c "Query" $REPORT || true)
    if [ "$count" -gt 10 ]; then
        echo "Found $count slow queries in the report" | mail -s "Slow Query Alert" [email protected]
    fi
fi

7.2 Using Performance Schema

-- Enable statement instruments
UPDATE performance_schema.setup_instruments SET ENABLED = 'YES' WHERE NAME LIKE 'statement/%';

-- Enable consumers
UPDATE performance_schema.setup_consumers SET ENABLED = 'YES' WHERE NAME LIKE 'events_statements%';

-- Show top slow queries
SELECT DIGEST,
       COUNT_STAR,
       SUM_TIMER_WAIT/1000000000000 AS total_time_sec,
       AVG_TIMER_WAIT/1000000000000 AS avg_time_sec,
       SUM_ROWS_EXAMINED,
       SUM_ROWS_SENT,
       SUBSTR(DIGEST_TEXT,1,100) AS query_sample
FROM performance_schema.events_statements_summary_by_digest
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;

7.3 Establishing a Slow‑Query Governance Process

Daily monitoring: check newly logged slow queries.

Root‑cause analysis: run EXPLAIN on each.

Design optimization: choose index changes, rewrite SQL, or tweak configuration.

Deploy and verify: confirm performance improvement.

Archive solutions: document the problem and fix to build a knowledge base.

8. Conclusion

Optimizing MySQL slow queries is a systematic effort that combines log collection, tool‑assisted analysis, execution‑plan inspection, appropriate indexing, SQL rewriting, and server‑parameter tuning. Adding an index alone is not a cure‑all; the correct workflow is to analyze first, decide on the best remedy, then implement. Ops engineers should master the entire chain—from enabling and reading the slow‑query log, using EXPLAIN, creating and evaluating indexes, spotting common inefficient patterns, to establishing continuous monitoring—to keep database performance healthy.


Tags: Performance Tuning, MySQL, Index Optimization, EXPLAIN, Slow Query, pt-query-digest, mysqldumpslow
Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.
