Master MySQL Slow Query Optimization: Proven Methods & Pitfall Guide
This comprehensive guide walks you through a systematic methodology for diagnosing and fixing MySQL slow queries, covering proper log configuration, EXPLAIN analysis, index design, query rewriting, table restructuring, and practical caching techniques, while highlighting common misconceptions and real‑world case studies.
Introduction
"Again this query! Optimized three times, still slow?" The author, a DBA, shares a three‑year journey of systematic slow‑query optimization that reduced P99 query time from 2 seconds to 200 ms and cut slow‑query count by 90%.
Technical Background: What Is a Slow Query?
MySQL’s default slow‑query threshold (long_query_time) is 10 seconds, which is far too lax for production. Reasonable thresholds depend on the query type:
Primary‑key lookup: acceptable <10 ms, slow >50 ms.
Regular range query: acceptable <100 ms, slow >500 ms.
Aggregate query: acceptable <1 s, slow >5 s.
Report query: acceptable <10 s, slow >60 s.
Batch operation: acceptable <30 s, slow >300 s.
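The per‑type thresholds above can be encoded as a small helper for monitoring or alerting. A minimal sketch; the function name, the three‑way "ok / watch / slow" labels, and the dictionary layout are my own, only the millisecond values come from the list above:

```python
# Thresholds from the list above, in milliseconds: (acceptable_below, slow_above).
THRESHOLDS_MS = {
    "pk_lookup": (10, 50),
    "range":     (100, 500),
    "aggregate": (1_000, 5_000),
    "report":    (10_000, 60_000),
    "batch":     (30_000, 300_000),
}

def classify_latency(query_type, elapsed_ms):
    """Return 'ok', 'watch' (grey zone), or 'slow' for a measured latency."""
    acceptable, slow = THRESHOLDS_MS[query_type]
    if elapsed_ms < acceptable:
        return "ok"
    if elapsed_ms > slow:
        return "slow"
    return "watch"
```

The grey zone between "acceptable" and "slow" is useful for trend alerts before a query actually crosses the slow line.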
Root Causes of Slow Queries
Doing too much useless work – scanning massive unnecessary data.
Long wait times – lock wait, I/O wait, network wait.
All slow‑query optimization ultimately solves these two problems.
Common Misconceptions
Misconception 1: Every slow query needs an index. Truth: Blindly adding indexes can worsen performance and increase storage and write overhead.
Misconception 2: More indexes are better. Truth: A table should have no more than five indexes (except special cases); excess indexes slow writes and increase optimizer cost.
Misconception 3: EXPLAIN type=ALL always means slow. Truth: For tiny tables, a full table scan can be faster than using an index.
Misconception 4: Optimization only looks at SQL, not business logic. Truth: Changing business requirements (e.g., adding pagination) often yields the biggest gains.
Case Study: System Background
System architecture
Database: MySQL 8.0.32, master‑slave.
Server: 16 CPU 64 GB RAM, SSD.
Data volume: 20 million rows in core tables, total 500 GB.
Load: 10 million queries per day, peak QPS 5000.
Initial problems
Slow‑query count: >8000 per day.
P99 response time: 2.3 s.
Slow‑query proportion: 0.08 %.
Key pain points: order queries, user statistics, report generation.
Core Content: Systematic Slow‑Query Optimization Methodology
Step 1: Identify and Analyze Slow Queries
Configure Slow Query Log
-- View current slow‑query settings
SHOW VARIABLES LIKE 'slow%';
SHOW VARIABLES LIKE 'long_query_time';
-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
-- Set slow‑query threshold (recommended 1 s)
SET GLOBAL long_query_time = 1;
-- Log queries not using indexes
SET GLOBAL log_queries_not_using_indexes = 'ON';
-- Throttle logging of such queries
SET GLOBAL log_throttle_queries_not_using_indexes = 10;
-- Persist in /etc/my.cnf
[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = 1
log_throttle_queries_not_using_indexes = 10
Analyze Slow Query Log (recommended: pt‑query‑digest)
# Install Percona Toolkit
yum install percona-toolkit
# Digest top 20 slow queries
pt-query-digest --limit 20 /var/log/mysql/slow.log > slow_query_report.txt
# View report
less slow_query_report.txt
Report example:
# Query 1: 234.5 QPS, 12.3s avg, ID 0x123ABC...
# Calls: 20842 Total time: 256341s Mean: 12.3s
# Rows sent: 4168400 Rows examined: 208420000
SELECT o.order_id, o.user_id, o.total_amount, u.username
FROM orders o
LEFT JOIN users u ON o.user_id = u.user_id
WHERE o.status = 'pending'
AND o.create_time > '2024-01-01'
ORDER BY o.create_time DESC
LIMIT 20\G
Key interpretation:
Calls : execution count – prioritize high‑frequency queries.
Total time : overall impact on performance.
Rows examined : high ratio of examined to sent rows signals need for optimization.
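The examined‑to‑sent ratio is easy to compute from the digest numbers. A small sketch (the function name is mine; the figures come from the report example above):

```python
def examined_to_sent_ratio(rows_examined, rows_sent):
    """High ratios mean the server reads and discards most rows it touches."""
    return rows_examined / max(rows_sent, 1)  # guard against zero rows sent

# From the digest example above: 208,420,000 examined vs 4,168,400 sent.
ratio = examined_to_sent_ratio(208_420_000, 4_168_400)  # -> 50.0
```

A ratio of 50 means the server examines 50 rows for every row it returns, a strong signal that an index or a rewrite is needed.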
Build Slow‑Query Priority
High priority (immediate)
Core business impact (order, payment, login).
High frequency (≥10 per minute).
Single execution >5 s.
Causes lock or deadlock.
Medium priority (plan)
Medium frequency (10–100 per hour).
Non‑core business impact.
Can be mitigated by cache or async processing.
Low priority (defer)
Very low frequency (<10 per day).
Background statistics or reports.
Can run in off‑peak windows.
Step 2: Deep EXPLAIN Analysis
EXPLAIN is the core tool, but looking only at the type column is insufficient.
EXPLAIN type values, from fastest to slowest:
system: single‑row system table – fastest.
const: primary‑key or unique‑index equality – very fast.
eq_ref: one row read via a unique index per row from the preceding table – typical in JOINs.
ref: non‑unique index equality.
range: index range scan (>, <, BETWEEN, IN).
index: full index scan.
ALL: full table scan – slowest.
Optimization suggestions:
const, eq_ref, ref: usually fine.
range: acceptable; watch the number of scanned rows.
index, ALL: need improvement unless the table is tiny.
Key columns
type: access type, ordered from best to worst.
key: the index actually used.
rows: estimated rows scanned – a crucial metric.
Extra: additional info (Using filesort and Using temporary are performance killers).
EXPLAIN ANALYZE (MySQL 8.0+) provides real execution times and loop counts, allowing comparison between estimated cost and actual time.
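Checking EXPLAIN output for the red flags above can be automated. A minimal sketch that inspects one EXPLAIN row represented as a dict (e.g. from a connector's dictionary cursor); the function name and the 10,000‑row threshold are illustrative assumptions:

```python
BAD_TYPES = {"index", "ALL"}
KILLER_EXTRAS = ("Using filesort", "Using temporary")

def explain_red_flags(row, row_threshold=10_000):
    """Return a list of warnings for one EXPLAIN row (dict of column -> value)."""
    flags = []
    if row.get("type") in BAD_TYPES:
        flags.append("access type " + row["type"])
    if row.get("rows", 0) > row_threshold:
        flags.append("estimated %d rows examined" % row["rows"])
    extra = row.get("Extra") or ""
    flags += [k for k in KILLER_EXTRAS if k in extra]
    return flags
```

Running this over the EXPLAIN rows of every query in the slow log turns manual inspection into a quick report.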
Step 3: Proper Index Optimization
Golden Rules for Index Design
Rule 1: Leftmost Prefix
Composite index (a, b, c) speeds queries that filter on a, a + b, or a + b + c.
Cannot speed queries that skip earlier columns (e.g., WHERE b = ?).
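The leftmost‑prefix rule is mechanical enough to check in code. A sketch, assuming equality filters only; the function name is mine:

```python
def usable_prefix(index_cols, filter_cols):
    """Return the leading index columns a query's equality filters can use.

    index_cols: ordered columns of a composite index, e.g. ["a", "b", "c"].
    filter_cols: set of columns the WHERE clause filters on.
    """
    used = []
    for col in index_cols:
        if col not in filter_cols:
            break          # a gap in the prefix stops index usage
        used.append(col)
    return used
```

For index (a, b, c): filtering on {a, b} uses two columns, but filtering on {b, c} uses none, exactly the rule stated above.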
Rule 2: Column Order Matters
High‑cardinality columns first.
Equality columns before range columns.
Frequently queried columns first.
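The three ordering rules combine into a simple heuristic: put equality‑filtered columns first (highest cardinality leading), and the range‑filtered column last. A sketch; the inputs and function name are illustrative:

```python
def order_index_columns(equality, range_col=None):
    """Suggest a composite index column order.

    equality: dict of equality-filtered column -> cardinality (distinct values).
    range_col: the column used in a range predicate or ORDER BY, if any.
    """
    # Equality columns sorted by descending cardinality, range column last.
    ordered = sorted(equality, key=equality.get, reverse=True)
    if range_col:
        ordered.append(range_col)
    return ordered

# The order-query case below: user_id (high cardinality) and status (few values)
# are equality filters, create_time drives ORDER BY.
cols = order_index_columns({"status": 5, "user_id": 1_000_000}, "create_time")
```

This reproduces the (user_id, status, create_time) index used in the order‑query example later in the article.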
Rule 3: Covering Indexes Are Best
-- Query only needs user_id and username
SELECT user_id, username FROM users WHERE status = 'active';
-- Create covering index
CREATE INDEX idx_status_id_name ON users(status, user_id, username);
-- EXPLAIN shows Extra: Using index
Practical Example: Order Query Optimization
Original slow query
-- Query pending orders for a user
SELECT order_id, total_amount, create_time
FROM orders
WHERE user_id = 12345
AND status = 'pending'
ORDER BY create_time DESC
LIMIT 20;
EXPLAIN analysis
type: ref
key: idx_user_id
rows: 50000 -- scans 50 k rows, takes 850 ms
Extra: Using where; Using filesort
Optimization 1: Composite Index
CREATE INDEX idx_user_status_time ON orders(user_id, status, create_time);
DROP INDEX idx_user_id ON orders;
DROP INDEX idx_status ON orders;
After re‑EXPLAIN:
type: range
key: idx_user_status_time
rows: 234 -- scans only 234 rows
Extra: Using index condition -- no filesort
Performance comparison:
Before: 50000 rows scanned, 850 ms.
After: 234 rows scanned, 12 ms (≈70× faster).
Optimization 2: Covering Index
CREATE INDEX idx_user_status_time_amount ON orders(user_id, status, create_time, total_amount);
EXPLAIN shows Extra: Using index – no table lookup needed.
When Not to Add Indexes
Scenario 1: Frequently Updated Columns
CREATE INDEX idx_update_time ON orders(update_time);
-- Every UPDATE must also update this index, hurting write performance.
Scenario 2: Low‑Cardinality Columns
CREATE INDEX idx_gender ON users(gender);
-- Gender has ~50 % selectivity; a full scan may be faster.
Assess cardinality:
SELECT COUNT(DISTINCT status)/COUNT(*) AS selectivity FROM orders;
-- Rule of thumb: selectivity <10 % → avoid the index; >80 % → add it.
Step 4: Query Rewrite and Optimization Techniques
Avoid SELECT *
-- Bad
SELECT * FROM orders WHERE order_id = 12345;
-- Good
SELECT order_id, total_amount, status FROM orders WHERE order_id = 12345;
Reasons: reduces I/O, enables covering indexes, lowers network payload.
Avoid Functions on Indexed Columns
-- Bad
SELECT * FROM orders WHERE DATE(create_time) = '2024-01-01';
-- Good
SELECT * FROM orders WHERE create_time >= '2024-01-01 00:00:00'
AND create_time < '2024-01-02 00:00:00';
Avoid Implicit Type Conversion
-- Bad (string forces conversion)
SELECT * FROM orders WHERE user_id = '12345';
-- Good
SELECT * FROM orders WHERE user_id = 12345;
Optimize OR Conditions
-- Slow
SELECT * FROM orders WHERE user_id = 12345 OR order_id = 67890;
-- Faster (UNION)
SELECT * FROM orders WHERE user_id = 12345
UNION
SELECT * FROM orders WHERE order_id = 67890;
-- Or use IN when same column
SELECT * FROM orders WHERE user_id IN (12345, 67890);
Optimize Pagination
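The cursor (keyset) technique replaces OFFSET with a remembered boundary value: the application keeps the last create_time it saw and filters past it. A minimal in‑memory sketch; the rows, page size, and fetch_page helper are all illustrative:

```python
def fetch_page(rows, cursor=None, size=2):
    """Keyset pagination over rows sorted descending by create_time.

    rows: list of (create_time, order_id) tuples.
    cursor: last create_time of the previous page, or None for the first page.
    """
    page = [r for r in rows if cursor is None or r[0] < cursor]
    page = page[:size]
    next_cursor = page[-1][0] if page else None  # remember the boundary
    return page, next_cursor

orders = [(5, "e"), (4, "d"), (3, "c"), (2, "b"), (1, "a")]
page1, cur = fetch_page(orders)        # first page, no cursor yet
page2, cur = fetch_page(orders, cur)   # next page starts past the cursor
```

In a real table create_time can repeat, so production keyset pagination pairs it with a unique tiebreaker such as order_id in both the ORDER BY and the cursor condition.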
-- Deep pagination (slow)
SELECT * FROM orders ORDER BY create_time DESC LIMIT 1000000, 20;
-- Cursor‑based pagination
SELECT * FROM orders ORDER BY create_time DESC LIMIT 20;
-- Remember last create_time, then:
SELECT * FROM orders WHERE create_time < '2024-01-15 10:30:00'
ORDER BY create_time DESC LIMIT 20;
Optimize COUNT Queries
-- Slow full count
SELECT COUNT(*) FROM orders WHERE status = 'pending';
-- Approximate total rows from table statistics (an estimate; ignores the WHERE filter)
SELECT TABLE_ROWS FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'orders';
-- For exact filtered counts, maintain a separate counter table or ensure status is indexed.
Step 5: Table Structure Optimization
Issue 1: Wide tables (many columns)
CREATE TABLE users (
user_id INT PRIMARY KEY,
username VARCHAR(50),
email VARCHAR(100),
-- ... 98 more columns ...
last_login_time DATETIME
);
Solution: vertical partitioning – split core fields and profile fields into separate tables.
CREATE TABLE users (
user_id INT PRIMARY KEY,
username VARCHAR(50),
email VARCHAR(100),
status TINYINT,
create_time DATETIME
);
CREATE TABLE user_profile (
user_id INT PRIMARY KEY,
real_name VARCHAR(100),
phone VARCHAR(20),
address TEXT,
FOREIGN KEY (user_id) REFERENCES users(user_id)
);
Issue 2: Very large tables (billions of rows)
Solution 1: Partitioned tables (e.g., yearly partitions).
CREATE TABLE orders (
order_id BIGINT,
user_id INT,
create_time DATETIME,
-- other columns ...
PRIMARY KEY (order_id, create_time) -- the partitioning column must be part of every unique key
) PARTITION BY RANGE (YEAR(create_time)) (
PARTITION p2022 VALUES LESS THAN (2023),
PARTITION p2023 VALUES LESS THAN (2024),
PARTITION p2024 VALUES LESS THAN (2025)
);
Solution 2: Archive historical data to a separate table and query with UNION when needed.
CREATE TABLE orders_history LIKE orders;
INSERT INTO orders_history SELECT * FROM orders WHERE create_time < DATE_SUB(NOW(), INTERVAL 1 YEAR);
-- In production, copy and delete in batches (e.g. with pt-archiver) to avoid long transactions
DELETE FROM orders WHERE create_time < DATE_SUB(NOW(), INTERVAL 1 YEAR);
Issue 3: Improper column types
-- Bad definitions
user_id VARCHAR(50), -- actually numeric
status VARCHAR(20), -- few enum values
amount DOUBLE, -- monetary value
-- Better definitions
user_id BIGINT,
status TINYINT,
amount DECIMAL(10,2);
Recommendations summary:
Primary‑key IDs → BIGINT.
Status/enum → TINYINT.
Amount → DECIMAL(10,2).
Date/time → DATETIME/TIMESTAMP.
Phone numbers → VARCHAR(20).
IP addresses → INT UNSIGNED.
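The IP‑address recommendation works because an IPv4 address is exactly a 32‑bit unsigned integer; MySQL's INET_ATON/INET_NTOA do this conversion server‑side. The same round trip with the Python standard library:

```python
import socket
import struct

def ip_to_int(ip):
    """IPv4 dotted-quad string -> unsigned 32-bit int (like MySQL INET_ATON)."""
    return struct.unpack("!I", socket.inet_aton(ip))[0]

def int_to_ip(n):
    """Unsigned 32-bit int -> IPv4 dotted-quad string (like MySQL INET_NTOA)."""
    return socket.inet_ntoa(struct.pack("!I", n))

n = ip_to_int("192.168.1.1")   # 3232235777
```

Storing 4 bytes instead of a VARCHAR(15) also makes range predicates (subnet checks) plain integer comparisons.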
Step 6: Other Optimization Techniques
Use Cache to Reduce DB Load
async function getUserInfo(userId) {
  // Check Redis first (cache-aside pattern)
  const cached = await redis.get('user:' + userId);
  if (cached !== null) return JSON.parse(cached);
  // Miss – query MySQL
  const info = await mysql.query('SELECT * FROM users WHERE user_id = ?', [userId]);
  // Store in cache for 1 hour
  await redis.setex('user:' + userId, 3600, JSON.stringify(info));
  return info;
}
Read/Write Splitting
-- Writes go to master
INSERT INTO orders (...) VALUES (...);
UPDATE orders SET status='paid' WHERE order_id=12345;
-- Reads go to replica (allowing slight lag)
SELECT * FROM orders WHERE user_id=12345;
-- Critical reads still go to master
SELECT * FROM orders WHERE order_id=12345 FOR UPDATE;
Use Connection Pool
// Without pool – high overhead
Connection conn = DriverManager.getConnection(url, user, password);
... // use
conn.close();
// With HikariCP pool
HikariConfig config = new HikariConfig();
config.setJdbcUrl(url);
config.setUsername(user);
config.setPassword(password);
config.setMaximumPoolSize(20);
config.setMinimumIdle(5);
config.setConnectionTimeout(30000);
HikariDataSource ds = new HikariDataSource(config);
Connection conn = ds.getConnection();
Practical Case: E‑commerce Order Query from 3 s to 50 ms
Problem Background
Average response: 1.2 s.
P99 response: 3.5 s.
Peak QPS: 500.
Slow queries: >80 per minute.
Original SQL
SELECT o.order_id, o.order_no, o.total_amount, o.status, o.create_time,
GROUP_CONCAT(p.product_name) AS products,
GROUP_CONCAT(oi.quantity) AS quantities
FROM orders o
LEFT JOIN order_items oi ON o.order_id = oi.order_id
LEFT JOIN products p ON oi.product_id = p.product_id
WHERE o.user_id = ?
AND o.status IN ('pending','paid','shipped')
AND o.create_time >= DATE_SUB(NOW(), INTERVAL 6 MONTH)
GROUP BY o.order_id
ORDER BY o.create_time DESC
LIMIT 20;First Analysis: EXPLAIN
id: 1 type: ref key: idx_user_id rows: 12000 Extra: Using where; Using temporary; Using filesort
Findings:
Scans 12 k orders, returns only 20.
Uses temporary table and filesort (GROUP_CONCAT & ORDER BY).
Three‑table join adds complexity.
Second Optimization: Index Optimization
CREATE INDEX idx_user_status_time ON orders(user_id, status, create_time);
EXPLAIN shows rows reduced to 580, but the temporary table and filesort remain.
Effect: average latency drops from 1.2 s to 600 ms.
Third Optimization: Query Rewrite
Problem: GROUP_CONCAT forces temporary table.
Solution: split into two queries – first fetch order list, then batch fetch items.
-- Query 1: order list (fast)
SELECT order_id, order_no, total_amount, status, create_time
FROM orders
WHERE user_id = ?
AND status IN ('pending','paid','shipped')
AND create_time >= DATE_SUB(NOW(), INTERVAL 6 MONTH)
ORDER BY create_time DESC
LIMIT 20;
-- Query 2: items for those orders
SELECT oi.order_id, p.product_name, oi.quantity
FROM order_items oi
JOIN products p ON oi.product_id = p.product_id
WHERE oi.order_id IN (list of 20 ids)
ORDER BY oi.order_id, oi.item_id;
EXPLAIN shows type: range, rows ≈ 20 for orders and ≈ 60 for items.
Effect: average latency ↓ to 120 ms, P99 ↓ to 350 ms.
Fourth Optimization: Cache Hot Data
public List<Order> getUserOrders(Long userId, int page, int size) {
String cacheKey = "user:orders:" + userId + ":" + page;
List<Order> orders = redis.get(cacheKey);
if (orders != null) return orders;
orders = orderDao.queryUserOrders(userId, page, size);
redis.setex(cacheKey, 300, orders); // 5 min TTL
return orders;
}
public void updateOrderStatus(Long orderId, String newStatus) {
orderDao.updateStatus(orderId, newStatus);
Long userId = orderDao.getUserIdByOrderId(orderId);
// deletePattern is an application-side helper (SCAN + DEL); Redis has no native pattern delete
redis.deletePattern("user:orders:" + userId + ":*");
}
Effect: cache hit rate 85 %, hit latency 8 ms, overall average 30 ms, P99 150 ms, DB QPS reduced from 500 to 75.
Final Results
Average response time: 1200 ms → 30 ms (97.5 % improvement).
P99 response time: 3500 ms → 150 ms (95.7 % improvement).
Slow queries: 80/min → 2/min (97.5 % reduction).
Database QPS: 500 → 75 (85 % reduction).
Key Takeaways
Master the tools (slow‑query log, EXPLAIN, pt‑query‑digest) before changing anything.
Indexes are essential but not a cure‑all; sometimes query rewrite or caching yields bigger gains.
Data‑driven decisions (EXPLAIN, monitoring) beat intuition.
Understanding business requirements often leads to the best optimizations.
Performance tuning is continuous – monitor, iterate, and document.
Future Trends
Self‑adaptive indexes that create/drop automatically.
AI‑powered optimizers using machine learning.
Columnar storage engines for analytical workloads.
Distributed SQL databases (TiDB, CockroachDB) offering horizontal scalability.
Regardless of emerging technologies, a solid grasp of SQL execution principles and a systematic optimization workflow remain the foundation for high‑performance database systems.