Master MySQL Slow Query Fixes: From Index Tuning to Distributed Sharding
This comprehensive guide walks MySQL administrators through diagnosing slow queries, configuring the slow‑query log, leveraging EXPLAIN, optimizing indexes, rewriting joins, implementing pagination tricks, and scaling with read‑write splitting, master‑slave replication, and horizontal sharding to dramatically improve performance in high‑traffic production environments.
MySQL Slow Query Optimization: From Index Tuning to Distributed Sharding
An operations veteran's hard‑earned summary: from the performance hell of single tables holding millions of rows to a distributed architecture, this article walks through the complete MySQL optimization path.
Prelude: Common Slow‑Query Pitfalls
As a seasoned operations engineer, I have witnessed many production incidents caused by slow queries: a midnight outage with 1000+ connections, urgent tuning just before Double Eleven, and performance problems that snowballed after a feature launch.
Slow‑Query Diagnosis: Equip the Right Tools
1. Enable and Analyze the Slow‑Query Log
First, turn on MySQL’s slow‑query log to capture performance problems:
-- Enable slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 2;
SET GLOBAL log_queries_not_using_indexes = 'ON';
-- View slow‑query settings
SHOW VARIABLES LIKE '%slow_query%';
SHOW VARIABLES LIKE 'long_query_time';
Production tip: setting long_query_time to 2 seconds captures problem queries without generating excessive logs.
Use mysqldumpslow for log analysis:
mysqldumpslow -s c -t 10 /var/lib/mysql/slow.log
2. Real‑Time Monitoring with Performance Schema
Performance Schema is a powerful monitoring tool in MySQL 5.6+:
-- Enable statement instrumentation in Performance Schema
UPDATE performance_schema.setup_instruments SET ENABLED='YES', TIMED='YES' WHERE NAME LIKE '%statement/%';
-- Retrieve top‑10 slowest queries (timer columns are in picoseconds, hence the division by 10^12)
SELECT DIGEST_TEXT, COUNT_STAR, AVG_TIMER_WAIT/1000000000000 AS avg_time_sec,
MAX_TIMER_WAIT/1000000000000 AS max_time_sec
FROM performance_schema.events_statements_summary_by_digest
ORDER BY AVG_TIMER_WAIT DESC
LIMIT 10;
3. Deep Dive with EXPLAIN
EXPLAIN is essential for every DBA, but many only glance at the output. The key fields are:
type: access type, from best to worst (system > const > eq_ref > ref > range > index > ALL).
key: the index actually used.
rows: estimated number of rows scanned.
filtered: estimated percentage of rows that remain after the table conditions are applied.
Extra: look for “Using filesort” or “Using temporary”.
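As a rough illustration of how these fields can be read together, the following minimal sketch flags suspicious rows in tabular EXPLAIN output. It assumes each row has already been fetched as a dict keyed by the EXPLAIN column names; the function name `explain_warnings` and the 10,000‑row threshold are hypothetical choices for illustration, not part of MySQL.

```python
def explain_warnings(row, max_rows=10_000):
    """Return human-readable warnings for one row of tabular EXPLAIN output.

    `row` is assumed to be a dict keyed by EXPLAIN column names
    (type, key, rows, Extra); `max_rows` is an arbitrary illustrative threshold.
    """
    warnings = []
    access_type = row.get("type")
    if access_type == "ALL":
        warnings.append("full table scan (type=ALL)")
    elif access_type == "index":
        warnings.append("full index scan (type=index)")
    if row.get("key") is None:
        warnings.append("no index used (key is NULL)")
    if (row.get("rows") or 0) > max_rows:
        warnings.append(f"estimated rows scanned exceeds {max_rows}")
    extra = row.get("Extra") or ""
    for marker in ("Using filesort", "Using temporary"):
        if marker in extra:
            warnings.append(f"Extra contains '{marker}'")
    return warnings


if __name__ == "__main__":
    # Made-up sample row, shaped like EXPLAIN output for an unindexed query.
    sample = {"table": "o", "type": "ALL", "key": None, "rows": 5_000_000,
              "filtered": 10.0, "Extra": "Using where; Using filesort"}
    for warning in explain_warnings(sample):
        print(warning)
```

A check like this is only a first filter; the estimates in rows and filtered come from index statistics and can be far from the real numbers.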
EXPLAIN FORMAT=JSON
SELECT o.order_id, u.username, p.product_name
FROM orders o
JOIN users u ON o.user_id = u.user_id
JOIN products p ON o.product_id = p.product_id
WHERE o.created_at > '2024-01-01'
AND o.status = 'completed';
Index Optimization in Practice
1. Single‑Table Index Strategies
The left‑most prefix rule of composite indexes is a frequent interview question and a common source of mistakes.
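To make the rule concrete, here is a minimal sketch that models which leading columns of a composite index a query can use. It is a deliberate simplification (it ignores optimizer features such as index skip scan in newer MySQL versions), and the function name `usable_index_prefix` is hypothetical. The example assumes the composite index (user_id, status, created_at) used in this section.

```python
def usable_index_prefix(index_columns, equality_columns, range_columns=()):
    """Return the leading index columns a query can use for lookups.

    Simplified model of the left-most prefix rule: walk the index columns in
    order, keep each column that has an equality condition, stop after the
    first column with a range condition, and stop at the first column that has
    no condition at all.
    """
    usable = []
    for column in index_columns:
        if column in equality_columns:
            usable.append(column)
        elif column in range_columns:
            usable.append(column)
            break  # columns after a range condition cannot narrow the lookup
        else:
            break  # gap in the prefix: later columns are unusable
    return usable


if __name__ == "__main__":
    index = ("user_id", "status", "created_at")
    print(usable_index_prefix(index, {"user_id"}))
    print(usable_index_prefix(index, {"user_id", "status"}, {"created_at"}))
    print(usable_index_prefix(index, {"status"}))
```

An empty result means the query has no condition on the leading column, so it cannot seek into the index at all.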
-- Create a composite index
ALTER TABLE orders ADD INDEX idx_user_status_time (user_id, status, created_at);
-- Queries that can use the index
SELECT * FROM orders WHERE user_id = 123;
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending';
SELECT * FROM orders WHERE user_id = 123 AND status = 'pending' AND created_at > '2024-01-01';
-- Queries that cannot use the index efficiently (the leading column user_id is missing)
SELECT * FROM orders WHERE status = 'pending';
SELECT * FROM orders WHERE created_at > '2024-01-01';
Covering Indexes
Covering indexes avoid the need for a table lookup, dramatically improving performance:
-- Original query (requires table lookup)
SELECT user_id, status, created_at FROM orders WHERE user_id = 123;
-- Optimized covering index
ALTER TABLE orders ADD INDEX idx_cover (user_id, status, created_at);
-- Now the query can be satisfied entirely from the index.
2. Multi‑Table JOIN Optimization
Driver Table Selection
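-- Large table written first (note that for inner joins the optimizer may reorder the tables anyway)
SELECT * FROM orders o JOIN users u ON o.user_id = u.user_id WHERE u.city = 'Beijing';
-- Drive from the small, filtered table through an index on users.city (idx_city is assumed to exist);
-- if EXPLAIN still shows a poor join order, STRAIGHT_JOIN forces the written order
SELECT * FROM users u USE INDEX (idx_city) JOIN orders o ON u.user_id = o.user_id WHERE u.city = 'Beijing';
Subquery vs JOIN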
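-- Subquery form (often slow on older MySQL versions that lack semijoin optimization)
SELECT * FROM orders WHERE user_id IN (SELECT user_id FROM users WHERE city = 'Shanghai');
-- JOIN form (DISTINCT can be dropped when users.user_id is unique, which avoids a deduplication pass)
SELECT DISTINCT o.* FROM orders o JOIN users u ON o.user_id = u.user_id WHERE u.city = 'Shanghai';
3. Common Index‑Loss Traps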
Functions on Columns
-- Index disabled by function
SELECT * FROM orders WHERE DATE(created_at) = '2024-01-01';
SELECT * FROM orders WHERE UPPER(status) = 'PENDING';
-- Index‑friendly version
SELECT * FROM orders WHERE created_at >= '2024-01-01' AND created_at < '2024-01-02';
-- Equivalent to the UPPER() version only if status uses a case‑insensitive collation
SELECT * FROM orders WHERE status = 'PENDING';
Data‑Type Mismatch
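-- Mismatched type: a string literal compared with a numeric column.
-- MySQL converts the literal here, but in the reverse case (a string column compared with a number)
-- every row is cast and the index is skipped.
SELECT * FROM orders WHERE user_id = '123';
-- Matching type
SELECT * FROM orders WHERE user_id = 123;
SQL Rewrite and Optimization Techniques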
1. Pagination Optimization
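-- Bad deep pagination (MySQL reads and discards the first 100000 rows)
SELECT * FROM orders ORDER BY created_at LIMIT 100000, 20;
-- Subquery rewrite (deferred join: page through the index first, then fetch the full rows)
SELECT * FROM orders o
JOIN (SELECT order_id FROM orders ORDER BY created_at LIMIT 100000, 20) t ON o.order_id = t.order_id
ORDER BY o.created_at;
-- Cursor‑based pagination (recommended); 1000000 stands for the last order_id of the previous page
SELECT * FROM orders WHERE order_id > 1000000 ORDER BY order_id LIMIT 20;
2. COUNT Query Optimization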
-- Full table scan
SELECT COUNT(*) FROM orders WHERE status = 'pending';
-- Use index
ALTER TABLE orders ADD INDEX idx_status (status);
-- Or, for the whole table only (no WHERE filter), read the approximate row count from information_schema
SELECT table_rows FROM information_schema.TABLES WHERE table_schema = DATABASE() AND table_name = 'orders';
3. Batch Operations
-- Inefficient row‑by‑row inserts
INSERT INTO orders (user_id, product_id, amount) VALUES (1,100,99.99);
INSERT INTO orders (user_id, product_id, amount) VALUES (2,101,199.99);
-- ...
-- Efficient batch insert
INSERT INTO orders (user_id, product_id, amount) VALUES
(1,100,99.99),
(2,101,199.99),
(3,102,299.99);
Architecture‑Level Optimizations: Read‑Write Splitting and Replication
1. Master‑Slave Replication Setup
Master configuration:
[mysqld]
server-id = 1
log-bin = mysql-bin
binlog-format = ROW
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1
Slave configuration:
[mysqld]
server-id = 2
relay-log = mysql-relay-bin
read_only = 1
Application‑level read‑write splitting (Python example):
import random

# MySQLConnection, read_from_slave, write_to_master and db are placeholders
# for the application's own connection wrapper and routing decorators.
class DatabaseRouter:
    def __init__(self):
        self.master = MySQLConnection('master_host')
        self.slaves = [MySQLConnection('slave1_host'), MySQLConnection('slave2_host')]

    def get_connection(self, is_write=False):
        # Writes go to the master; reads are spread randomly across the slaves
        if is_write:
            return self.master
        return random.choice(self.slaves)

@read_from_slave
def get_user_orders(user_id):
    return db.query("SELECT * FROM orders WHERE user_id = %s", user_id)

@write_to_master
def create_order(order_data):
    return db.execute("INSERT INTO orders (...) VALUES (...)", order_data)
2. Master‑Slave Lag Monitoring
SHOW SLAVE STATUS\G
-- Important fields:
-- Seconds_Behind_Master, Slave_IO_Running, Slave_SQL_Running
Solutions include parallel replication (slave_parallel_workers), semi‑synchronous replication, and forcing critical reads to the master.
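One way to apply the idea of sending critical reads to the master is to route by measured lag. The sketch below is a minimal illustration under stated assumptions: lag values are supplied by the caller (for example, collected from Seconds_Behind_Master), and the names `choose_read_target` and `max_lag_seconds`, the 1‑second default, and the host strings are all hypothetical.

```python
import random


def choose_read_target(master, replica_lag, max_lag_seconds=1.0, critical=False, rng=random):
    """Pick a host for a read query.

    replica_lag maps replica host -> lag in seconds, or None when replication
    is stopped and the lag is unknown.
    """
    if critical:
        return master  # read-your-own-writes paths bypass replicas entirely
    healthy = [host for host, lag in replica_lag.items()
               if lag is not None and lag <= max_lag_seconds]
    if not healthy:
        return master  # every replica is too far behind: fall back to the master
    return rng.choice(healthy)


if __name__ == "__main__":
    lag = {"slave1_host": 0.2, "slave2_host": 45.0}
    print(choose_read_target("master_host", lag))
    print(choose_read_target("master_host", lag, critical=True))
```

Falling back to the master keeps reads correct but shifts load onto it, so the threshold is a trade‑off between freshness and master capacity.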
Distributed Database Sharding Strategies
1. Vertical Sharding
Separate databases by business domain:
-- Original monolithic schema
database: ecommerce
├── users
├── orders
├── products
├── payments
├── inventory
└── logs
-- After vertical sharding
database: user_service    → users
database: order_service   → orders, order_items
database: product_service → products, categories
database: payment_service → payments
2. Horizontal Sharding
Time‑Based Partitioning
-- Create monthly tables
CREATE TABLE orders_202401 LIKE orders;
CREATE TABLE orders_202402 LIKE orders;
# Routing logic (Python sketch); the format matches the table names above, e.g. orders_202401
def get_table_name(date):
    month = date.strftime('%Y%m')
    return f"orders_{month}"
Hash‑Based Partitioning by User ID
-- Create 16 shards
CREATE TABLE orders_00 LIKE orders;
CREATE TABLE orders_01 LIKE orders;
-- ...
CREATE TABLE orders_15 LIKE orders;
# Routing algorithm (Python sketch)
def get_table_name(user_id):
    shard_id = user_id % 16
    return f"orders_{shard_id:02d}"
3. Middleware Selection
ShardingSphere configuration example:
rules:
- !SHARDING
  tables:
    t_order:
      actualDataNodes: ds_${0..1}.t_order_${0..15}
      tableStrategy:
        standard:
          shardingColumn: order_id
          shardingAlgorithmName: t_order_inline
      keyGenerateStrategy:
        column: order_id
        keyGeneratorName: snowflake
  shardingAlgorithms:
    t_order_inline:
      type: INLINE
      props:
        algorithm-expression: t_order_${order_id%16}
Mycat XML snippet (attributes removed):
<table name="orders" primaryKey="order_id" dataNode="dn1,dn2,dn3,dn4" rule="mod-long">
    <childTable name="order_items" primaryKey="item_id" joinKey="order_id" parentKey="order_id"/>
</table>
<function name="mod-long" class="io.mycat.route.function.PartitionByMod">
    <property name="count">4</property>
</function>
4. Cross‑Database Queries
Distributed transaction with Seata:
@GlobalTransactional
public void createOrderWithPayment(OrderDTO order, PaymentDTO payment) {
    orderService.createOrder(order);
    paymentService.processPayment(payment);
    inventoryService.reduceStock(order.getProductId(), order.getQuantity());
}
Data aggregation across shards (Python example):
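from concurrent.futures import ThreadPoolExecutor

class OrderAnalysisService:
    # get_user_shards, query_shard and merge_results are placeholders for application code
    def get_user_order_summary(self, user_id):
        # Query every relevant shard in parallel, then merge the partial results
        with ThreadPoolExecutor(max_workers=4) as executor:
            futures = [executor.submit(self.query_shard, shard, user_id)
                       for shard in self.get_user_shards(user_id)]
        results = []
        for future in futures:
            results.extend(future.result())
        return self.merge_results(results)
Performance Monitoring and Alerting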
1. Key Metrics
-- QPS monitoring (average since startup; on MySQL 5.7+ the status tables live in performance_schema,
-- and INFORMATION_SCHEMA.GLOBAL_STATUS no longer exists in 8.0)
SELECT VARIABLE_NAME, VARIABLE_VALUE / (SELECT VARIABLE_VALUE FROM performance_schema.global_status WHERE VARIABLE_NAME='Uptime') AS per_second
FROM performance_schema.global_status
WHERE VARIABLE_NAME IN ('Com_select','Com_insert','Com_update','Com_delete');
-- Connection count
SHOW STATUS LIKE 'Threads_connected';
SHOW STATUS LIKE 'Max_used_connections';
-- InnoDB status
SHOW ENGINE INNODB STATUS;
2. Prometheus + Grafana Dashboard
# mysqld_exporter metrics
mysql_up
mysql_global_status_threads_connected
mysql_global_status_slow_queries
mysql_global_status_queries
mysql_slave_lag_seconds
3. Automated Alert Rules
# Alert for excessive slow queries
groups:
  - name: mysql
    rules:
      - alert: MySQLSlowQueries
        expr: rate(mysql_global_status_slow_queries[5m]) > 0.1
        labels:
          severity: warning
        annotations:
          summary: "MySQL slow queries too high"
      - alert: MySQLConnectionsHigh
        expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.8
        labels:
          severity: critical
        annotations:
          summary: "MySQL connection count too high"
Real‑World Case: E‑Commerce Order System Optimization
Background
An e‑commerce platform’s order table grew to 50 million rows, causing query timeouts.
Problem Diagnosis
Slow‑query log revealed a full‑table scan on a query joining orders, users, and products with a date range filter.
EXPLAIN showed no usable indexes; only the primary key existed.
Optimization Steps
Phase 1 – Index Tuning
ALTER TABLE orders ADD INDEX idx_status_created (status, created_at);
ALTER TABLE orders ADD INDEX idx_user_created (user_id, created_at);
-- Query time dropped from 30 s to ~100 ms
Phase 2 – Query Rewrite
SELECT o.order_id, o.user_id, o.product_id, o.amount, o.status, o.created_at,
       u.username, p.product_name
FROM (
    -- Limit to 20 rows first, so the joins touch only those rows
    SELECT order_id, user_id, product_id, amount, status, created_at
    FROM orders
    WHERE status IN ('pending','processing')
      AND created_at >= '2024-01-01' AND created_at < '2024-02-01'
    ORDER BY created_at DESC
    LIMIT 20
) o
LEFT JOIN users u ON o.user_id = u.user_id
LEFT JOIN products p ON o.product_id = p.product_id
ORDER BY o.created_at DESC;
Phase 3 – Sharding Implementation
# Table naming: orders_2024_01_0 … orders_2024_01_15
def get_table_name(user_id, created_at):
    month = created_at.strftime('%Y_%m')
    shard = user_id % 16
    return f"orders_{month}_{shard}"
Results
Metric               | Before | After | Improvement
Query response time  | 30 s   | 50 ms | 99.8 %
QPS                  | 10     | 500   | 5000 %
CPU usage            | 80 %   | 20 %  | 75 %
Memory usage         | 90 %   | 40 %  | 55 %
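The Improvement figures can be reproduced with simple arithmetic. The sketch below assumes that the response time, CPU, and memory rows are relative reductions and that the QPS row is the after/before ratio expressed as a percentage; that reading is an interpretation of the table, not something the original states, and both function names are hypothetical.

```python
def relative_reduction_pct(before, after):
    """Percentage by which a value shrank relative to its starting point."""
    return (before - after) / before * 100


def ratio_pct(before, after):
    """The new value expressed as a percentage of the old one."""
    return after / before * 100


if __name__ == "__main__":
    # Response time is converted to seconds on both sides: 30 s -> 0.050 s.
    print(f"Response time: {relative_reduction_pct(30.0, 0.050):.1f} %")
    print(f"QPS:           {ratio_pct(10, 500):.0f} %")
    print(f"CPU usage:     {relative_reduction_pct(80, 20):.1f} %")
    print(f"Memory usage:  {relative_reduction_pct(90, 40):.1f} %")
```

Under this reading the memory figure works out to about 55.6 %, so the table value appears to be truncated rather than rounded, and the QPS change is equivalently a 50‑fold increase.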
Takeaways and Best Practices
Optimization Pyramid
                      Distributed Architecture
                     /                        \
              Sharding                    Read‑Write Splitting
             /        \                     /            \
      Indexing      Query Tuning      Master‑Slave      Caching
      /  |  \         /  |  \            /  \            /  \
Single Composite Covering  Pagination   Monitoring   Redis/Memcached
Checklist
Enable slow‑query log and analyze with mysqldumpslow.
Monitor with Performance Schema.
Run EXPLAIN on critical statements.
Inspect server resource usage.
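For a sense of what the slow‑query log contains before handing it to mysqldumpslow, here is a minimal parsing sketch. It relies on the statistics line that precedes each entry in the log (Query_time, Lock_time, Rows_sent, Rows_examined); the sample log text is made up for illustration, and `parse_slow_log` is a hypothetical helper, not a MySQL tool.

```python
import re

# Matches the statistics line MySQL writes before each slow-log entry.
STATS_RE = re.compile(
    r"^# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: (?P<lock_time>[\d.]+)"
    r"\s+Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)


def parse_slow_log(text):
    """Return one dict per slow-log entry with its timing stats and SQL text."""
    entries = []
    current = None
    for line in text.splitlines():
        match = STATS_RE.match(line)
        if match:
            current = {
                "query_time": float(match["query_time"]),
                "rows_examined": int(match["rows_examined"]),
                "sql": "",
            }
            entries.append(current)
        elif current is not None and not line.startswith("#"):
            # Skip the bookkeeping statements MySQL adds before the query itself.
            if line.startswith("SET timestamp=") or line.lower().startswith("use "):
                continue
            current["sql"] = (current["sql"] + " " + line.strip()).strip()
    return entries


# Made-up sample entries, for illustration only.
SAMPLE_LOG = """\
# Time: 2024-01-15T03:12:45.123456Z
# User@Host: app[app] @ localhost []  Id:    42
# Query_time: 12.345678  Lock_time: 0.000123 Rows_sent: 20  Rows_examined: 5000000
SET timestamp=1705288365;
SELECT * FROM orders ORDER BY created_at LIMIT 100000, 20;
# Time: 2024-01-15T03:13:02.000001Z
# User@Host: app[app] @ localhost []  Id:    43
# Query_time: 2.500000  Lock_time: 0.000050 Rows_sent: 1  Rows_examined: 800000
SET timestamp=1705288382;
SELECT COUNT(*) FROM orders
WHERE status = 'pending';
"""

if __name__ == "__main__":
    slowest_first = sorted(parse_slow_log(SAMPLE_LOG), key=lambda e: e["query_time"], reverse=True)
    for entry in slowest_first:
        print(f'{entry["query_time"]:.2f}s rows_examined={entry["rows_examined"]} {entry["sql"]}')
```

A large gap between Rows_examined and Rows_sent is usually the first hint that an index is missing, which is why both are worth extracting alongside the query time.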
Index Guidelines
Create indexes that match WHERE conditions.
Order columns in composite indexes so that equality‑filtered, highly selective columns come first and range‑filtered columns come last.
Use covering indexes to avoid back‑table lookups.
Remove unused indexes.
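The covering‑index guideline can be checked mechanically. The sketch below assumes InnoDB, where secondary index entries also carry the primary key columns; the function name `is_covering` and the choice of order_id as primary key are assumptions for illustration.

```python
def is_covering(index_columns, primary_key_columns, referenced_columns):
    """True if every column the query touches is stored in the secondary index.

    InnoDB secondary index entries also carry the primary key columns, so those
    count as available without a lookup into the clustered index.
    """
    available = set(index_columns) | set(primary_key_columns)
    return set(referenced_columns) <= available


if __name__ == "__main__":
    index = ("user_id", "status", "created_at")
    primary_key = ("order_id",)  # assumed primary key of the orders table
    print(is_covering(index, primary_key, {"user_id", "status", "created_at"}))
    print(is_covering(index, primary_key, {"user_id", "amount"}))
```

In practice, EXPLAIN confirms the same thing by showing “Using index” in the Extra column.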
Query Guidelines
Avoid SELECT *; fetch only needed columns.
Prefer JOIN over subqueries; choose the right driver table.
Rewrite deep pagination with keyset pagination.
Limit result sets with LIMIT.
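The keyset pagination guideline can be sketched from the application side as follows. To keep the example runnable without a MySQL server, it uses Python's built‑in sqlite3 module as a stand‑in; the query shape (WHERE order_id > last key, ORDER BY order_id, LIMIT) is the same one shown earlier for MySQL. The helper names and the reduced two‑column orders table are assumptions for illustration.

```python
import sqlite3


def fetch_page(conn, last_order_id, page_size=20):
    """Return the next page of orders after last_order_id (keyset pagination)."""
    cursor = conn.execute(
        "SELECT order_id, amount FROM orders WHERE order_id > ? ORDER BY order_id LIMIT ?",
        (last_order_id, page_size),
    )
    return cursor.fetchall()


def iterate_all(conn, page_size=20):
    """Yield every row page by page, carrying the last seen key forward."""
    last_order_id = 0  # assumes order_id values are positive
    while True:
        page = fetch_page(conn, last_order_id, page_size)
        if not page:
            return
        yield from page
        last_order_id = page[-1][0]  # cursor for the next page


if __name__ == "__main__":
    # sqlite3 stands in for MySQL so the sketch runs anywhere.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany("INSERT INTO orders (order_id, amount) VALUES (?, ?)",
                     [(i, i * 1.5) for i in range(1, 101)])
    print(fetch_page(conn, last_order_id=40, page_size=3))
```

Each page costs the same regardless of depth, because the database seeks directly to the last key instead of reading and discarding the skipped rows. The trade‑off is that clients can only move to the next page, not jump to an arbitrary page number.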
Architecture Guidelines
Implement read‑write splitting.
Consider sharding when data volume exceeds a few hundred million rows.
Deploy a caching layer (Redis, Memcached).
Set up monitoring and alerting.
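For monitoring and alerting, rates over a recent window are usually more useful than averages since server startup. The sketch below shows the underlying arithmetic: subtract two snapshots of cumulative counters (such as Com_select or Slow_queries) and divide by the interval. The function name `per_second_rates` and the sample numbers are hypothetical.

```python
def per_second_rates(previous, current, interval_seconds):
    """Compute per-second rates from two snapshots of cumulative counters.

    previous and current map counter name -> cumulative value, for example the
    Com_select or Slow_queries values reported by SHOW GLOBAL STATUS.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    rates = {}
    for name, now in current.items():
        before = previous.get(name)
        if before is None or now < before:
            continue  # new counter or server restart: no meaningful delta
        rates[name] = (now - before) / interval_seconds
    return rates


if __name__ == "__main__":
    # Made-up snapshots taken 60 seconds apart.
    earlier = {"Com_select": 1000, "Com_insert": 200, "Slow_queries": 5}
    later = {"Com_select": 4000, "Com_insert": 260, "Slow_queries": 11}
    for name, rate in per_second_rates(earlier, later, 60).items():
        print(f"{name}: {rate:.2f}/s")
```

A counter that goes backwards is treated as a restart and skipped, since a negative rate would otherwise hide a real incident.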
Common Mistakes to Avoid
Over‑indexing: each index adds write overhead.
Ignoring data skew: ensure sharding keys distribute data evenly.
Cache over‑reliance: the database itself must still perform well when the cache misses.
Blind sharding: small tables don’t need horizontal partitioning.
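The data‑skew point can be checked before committing to a sharding key. The sketch below counts rows per shard for the user_id % 16 routing rule used in this article and reports how far the largest shard deviates from the average; the function names and the heavy‑buyer sample data are hypothetical.

```python
from collections import Counter


def shard_distribution(user_ids, shard_count=16):
    """Count rows per shard for a modulo-based routing rule."""
    counts = Counter(user_id % shard_count for user_id in user_ids)
    return [counts.get(shard, 0) for shard in range(shard_count)]


def skew_ratio(distribution):
    """Largest shard size divided by the mean shard size (1.0 = perfectly even)."""
    mean = sum(distribution) / len(distribution)
    return max(distribution) / mean if mean else 0.0


if __name__ == "__main__":
    # One entry per order; user 7 is a hypothetical heavy buyer with 5000 orders.
    order_user_ids = list(range(1, 1601)) + [7] * 5000
    distribution = shard_distribution(order_user_ids)
    print(distribution)
    print(f"skew ratio: {skew_ratio(distribution):.2f}")
```

Running this against a sample of real keys shows whether a few heavy users would concentrate load on one shard, in which case a different key or a composite key may distribute the data better.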
Final Thoughts
MySQL slow‑query optimization is a systematic engineering effort that requires attention at the query, index, and architecture levels. As operations engineers, we must not only fix current bottlenecks but also design scalable, maintainable systems for the future.
MaGe Linux Operations
