Databases 28 min read

Master MySQL Slow Query Analysis: Proven SQL Optimization Techniques to Boost Performance

This comprehensive guide walks you through diagnosing MySQL slow queries, from identifying root causes and configuring slow‑query logs to applying advanced indexing, query‑rewriting, and monitoring techniques—complete with real‑world case studies that demonstrate how to cut query times from seconds to milliseconds.

MaGe Linux Operations
MaGe Linux Operations
MaGe Linux Operations
Master MySQL Slow Query Analysis: Proven SQL Optimization Techniques to Boost Performance

Database Slow Query Analysis and SQL Optimization Practical Techniques: From Beginner to Expert Performance Tuning Guide

Introduction: A Real Production Incident

At 3 a.m., you are woken up by an urgent call: "Online database response time has spiked to 30 seconds, users are complaining, and the order system is almost down!" This is every ops engineer's nightmare, and a scenario I have personally experienced. The root cause was a seemingly simple SQL query, which after optimization reduced the query time from 30 seconds to 0.3 seconds—a 100× improvement.

In this article I share the practical experience accumulated from handling thousands of database performance issues, helping you systematically master slow‑query analysis and core SQL optimization techniques. Whether you are a newcomer to ops or an experienced engineer, this article provides actionable solutions.

1. The Essence of Slow Queries: Why Does Your Database Slow Down?

1.1 Definition and Impact of Slow Queries

A slow query is an SQL statement whose execution time exceeds a predefined threshold. In MySQL, queries taking longer than 10 seconds are logged as slow queries by default, but in production I often set the threshold to 1 second or lower.

The impact of slow queries is far more serious than it appears. A single slow query can consume massive database resources, leading to connection‑pool exhaustion, increased lock wait, soaring memory usage, and ultimately a system‑wide avalanche. I have seen many cases where an unoptimized SQL completely crippled an e‑commerce platform during a promotion peak.

1.2 Root Causes of Slow Queries

Analyzing tens of thousands of slow‑query logs, I identified five fundamental causes:

Missing appropriate indexes : This accounts for over 60 % of slow‑query problems. A full table scan without an index is like searching a dictionary without a table of contents.

Index loss : Even with an index, improper query writing can cause the index to be ignored, e.g., using functions on indexed columns, implicit type conversion, or NOT/!= operators.

Excessive data volume : As business grows, single‑table row counts can reach tens of millions or even billions. Even with indexes, scanning such large data sets degrades performance.

Lock contention : In high‑concurrency scenarios, multiple transactions competing for the same resource cause lock waiting, manifesting as slow queries.

Hardware resource bottlenecks : CPU, memory, or disk I/O saturation will affect database performance.

1.3 Indicators of Slow Queries

Key metrics to quickly identify slow‑query problems:

CPU usage continuously above 80 %

Database connections approaching the maximum

Disk I/O wait time noticeably increasing

Application response time suddenly lengthening

Slow‑query log file growing rapidly

When these signs appear, immediate slow‑query analysis is required.

2. Slow‑Query Analysis Tools and Methodology

2.1 Enabling and Configuring the Slow‑Query Log

-- View current slow‑query configuration
SHOW VARIABLES LIKE 'slow_query%';
SHOW VARIABLES LIKE 'long_query_time';

-- Dynamically enable slow‑query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';
SET GLOBAL long_query_time = 1;  -- set to 1 second
SET GLOBAL log_queries_not_using_indexes = 'ON';  -- log queries that do not use indexes

In production, permanently set these parameters in my.cnf:

[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = 1
log_throttle_queries_not_using_indexes = 10  -- limit recordings per minute

2.2 Using pt‑query‑digest for Deep Analysis

pt‑query‑digest

from Percona Toolkit is my most frequently used tool for analyzing slow‑query logs.

Installation:

# CentOS/RHEL
yum install percona-toolkit

# Ubuntu/Debian
apt-get install percona-toolkit

Basic usage:

# Analyze the entire slow‑query log
pt-query-digest /var/log/mysql/slow.log > slow_analysis.txt

# Analyze only the last hour
pt-query-digest --since '1h' /var/log/mysql/slow.log

# Show top 10 slowest queries
pt-query-digest --limit=10 /var/log/mysql/slow.log

The report includes execution count, total time, average time, lock wait, etc., allowing rapid identification of queries that need optimization.

2.3 Analyzing Execution Plans with EXPLAIN

EXPLAIN

is essential for SQL optimization. It shows how MySQL will execute a statement.

EXPLAIN SELECT u.name, o.order_no, o.amount
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.created_at > '2024-01-01';

Key fields:

type : connection type, from best to worst (system, const, eq_ref, ref, range, index, ALL).

key : the index actually used; NULL means no index.

rows : estimated rows scanned; larger numbers indicate slower queries.

Extra : additional important information (e.g., Using index, Using where, Using temporary, Using filesort).

2.4 Real‑Time Monitoring with Performance Schema

Performance Schema, introduced in MySQL 5.5, provides powerful real‑time performance data.

-- Check if enabled
SHOW VARIABLES LIKE 'performance_schema';

-- View currently executing SQL
SELECT * FROM performance_schema.events_statements_current\G;

-- Top 10 longest‑running SQL
SELECT DIGEST_TEXT, COUNT_STAR as exec_count,
       SUM_TIMER_WAIT/1000000000000 as total_latency_sec,
       AVG_TIMER_WAIT/1000000000000 as avg_latency_sec
FROM performance_schema.events_statements_summary_by_digest
ORDER BY AVG_TIMER_WAIT DESC
LIMIT 10;

3. Practical SQL Optimization Techniques

3.1 Index Optimization Strategies

Index optimization is the core of SQL tuning. Proper indexing can improve performance by hundreds of times.

Creating appropriate indexes

Basic principle: create indexes on columns involved in WHERE, JOIN, ORDER BY, GROUP BY.

-- Single‑column index
CREATE INDEX idx_created_at ON orders(created_at);

-- Composite index (order matters)
CREATE INDEX idx_user_status_created ON orders(user_id, status, created_at);

-- Covering index (includes all columns needed by the query)
CREATE INDEX idx_covering ON orders(user_id, status, amount, created_at);

Best practices

Choose high‑selectivity columns first (selectivity = distinct values / total rows).

Follow the left‑most prefix rule for composite indexes.

Avoid redundant indexes; if (a,b) exists, a separate index on (a) is usually unnecessary.

Limit the number of indexes per table (recommended ≤ 5) to avoid write‑performance penalties.

Identify and clean up unused indexes regularly:

-- Find unused indexes
SELECT s.table_schema, s.table_name, s.index_name
FROM information_schema.statistics s
LEFT JOIN performance_schema.table_io_waits_summary_by_index_usage t
  ON s.table_schema = t.object_schema
 AND s.table_name = t.object_name
 AND s.index_name = t.index_name
WHERE t.count_star IS NULL
  AND s.table_schema NOT IN ('mysql','performance_schema','information_schema')
  AND s.index_name != 'PRIMARY';

3.2 Query Rewriting Techniques

Often, simply rewriting SQL yields huge performance gains.

Avoid SELECT * Never use SELECT * in production because it transfers unnecessary data, prevents index covering, and can break when schema changes.

-- Bad example
SELECT * FROM users WHERE status = 'active';

-- Good example
SELECT id, name, email FROM users WHERE status = 'active';

Prefer JOIN over subqueries

-- Inefficient subquery
SELECT name FROM users WHERE id IN (
  SELECT user_id FROM orders WHERE amount > 1000
);

-- Efficient JOIN
SELECT DISTINCT u.name FROM users u
INNER JOIN orders o ON u.id = o.user_id
WHERE o.amount > 1000;

Use EXISTS instead of IN for large subquery results

-- IN (less efficient when subquery returns many rows)
SELECT * FROM users u WHERE EXISTS (
  SELECT 1 FROM orders o WHERE o.user_id = u.id AND o.status = 'completed'
);

Pagination Optimization

Large offsets kill performance. Use delayed join or cursor pagination.

-- Inefficient pagination with large offset
SELECT * FROM orders ORDER BY id LIMIT 1000000, 20;

-- Delayed join optimization
SELECT o.* FROM orders o
INNER JOIN (
  SELECT id FROM orders ORDER BY id LIMIT 1000000, 20
) t ON o.id = t.id;

-- Cursor pagination (recommended)
SELECT * FROM orders WHERE id > 1000000 ORDER BY id LIMIT 20;

3.3 Transaction and Lock Optimization

In high‑concurrency scenarios, transaction and lock optimization is crucial.

Shorten transaction duration

Long transactions are performance killers. Keep transactions as short as possible.

# Bad example: time‑consuming API call inside transaction
def process_order(order_id):
    with transaction():
        order = get_order(order_id)
        payment_result = call_payment_api(order)  # should be outside
        update_order_status(order_id, payment_result)

# Correct example: move external call out of transaction
def process_order(order_id):
    order = get_order(order_id)
    payment_result = call_payment_api(order)
    with transaction():
        update_order_status(order_id, payment_result)

Avoid lock escalation

Create indexes for UPDATE WHERE clauses to lock only necessary rows.

-- Index for UPDATE
CREATE INDEX idx_status ON orders(status);

UPDATE orders SET processed = 1 WHERE status = 'pending';

Optimistic locking

For infrequent conflicts, add a version column.

ALTER TABLE products ADD COLUMN version INT DEFAULT 0;

UPDATE products SET stock = stock - 1, version = version + 1
WHERE id = 100 AND version = 5;
-- Check affected rows; if 0, retry because version changed

4. Building Performance Monitoring and Alerting Systems

4.1 Comprehensive Monitoring Metrics

Key system‑level metrics:

CPU usage and load average

Memory usage and swap

Disk I/O (IOPS, throughput, latency)

Network traffic and connection count

MySQL‑specific metrics:

QPS and TPS

Slow‑query count and ratio

Connection and thread count

InnoDB buffer‑pool hit rate

Lock wait and deadlock count

Replication lag (if master‑slave)

4.2 Monitoring Stack with Prometheus and Grafana

Prometheus + Grafana is the most popular open‑source monitoring solution.

Install mysqld_exporter to collect MySQL metrics:

# Download exporter
wget https://github.com/prometheus/mysqld_exporter/releases/download/v0.14.0/mysqld_exporter-0.14.0.linux-amd64.tar.gz

# Create monitoring user
CREATE USER 'exporter'@'localhost' IDENTIFIED BY 'password';
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter'@'localhost';

# Start exporter
./mysqld_exporter --config.my-cnf=".my.cnf"

Prometheus scrape config:

# prometheus.yml
scrape_configs:
  - job_name: 'mysql'
    static_configs:
      - targets: ['localhost:9104']
        labels:
          instance: 'prod-mysql-01'

Key Grafana dashboards:

MySQL Overview (ID 7362)

MySQL Query Response Time (ID 11226)

MySQL InnoDB Metrics (ID 7365)

4.3 Reasonable Alert Rules

Alert rules should follow "no missed alerts, few false positives".

# Prometheus alert examples
groups:
- name: mysql_alerts
  rules:
  - alert: MySQLDown
    expr: mysql_up == 0
    for: 5m
    annotations:
      summary: "MySQL service down"

  - alert: SlowQueries
    expr: rate(mysql_global_status_slow_queries[5m]) > 10
    for: 5m
    annotations:
      summary: "Too many slow queries"

  - alert: HighConnections
    expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.8
    for: 5m
    annotations:
      summary: "Connection count near limit"

  - alert: InnoDBBufferPoolHitRate
    expr: rate(mysql_global_status_innodb_buffer_pool_reads[5m]) / rate(mysql_global_status_innodb_buffer_pool_read_requests[5m]) > 0.1
    for: 10m
    annotations:
      summary: "InnoDB buffer pool hit rate too low"

5. Real‑World Case Studies

Case 1: E‑commerce Order Query Optimization

Problem : Order query API response time reached 15 seconds.

SELECT o.*, u.name AS user_name, p.name AS product_name
FROM orders o
LEFT JOIN users u ON o.user_id = u.id
LEFT JOIN order_items oi ON o.id = oi.order_id
LEFT JOIN products p ON oi.product_id = p.id
WHERE o.created_at BETWEEN '2024-01-01' AND '2024-12-31'
  AND o.status IN ('pending','processing','shipped')
  AND u.region = 'North'
ORDER BY o.created_at DESC
LIMIT 20;

Analysis (via EXPLAIN): full table scan on orders, no index used, estimated 5 million rows scanned.

Optimization :

-- Create composite indexes
CREATE INDEX idx_orders_created_status ON orders(created_at, status);
CREATE INDEX idx_users_region ON users(region);

-- Rewrite query to use covering index
SELECT o.*, u.name AS user_name, p.name AS product_name
FROM (
  SELECT id FROM orders
  WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31'
    AND status IN ('pending','processing','shipped')
  ORDER BY created_at DESC
  LIMIT 20
) t
INNER JOIN orders o ON t.id = o.id
LEFT JOIN users u ON o.user_id = u.id AND u.region = 'North'
LEFT JOIN order_items oi ON o.id = oi.order_id
LEFT JOIN products p ON oi.product_id = p.id
ORDER BY o.created_at DESC;

Result : query time dropped from 15 seconds to 0.2 seconds (≈ 75× improvement).

Case 2: User Points Ranking Optimization

Problem : Points ranking query took over 30 seconds.

SELECT user_id,
       SUM(points) AS total_points,
       COUNT(*) AS transaction_count
FROM point_transactions
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY user_id
ORDER BY total_points DESC
LIMIT 100;

Analysis : point_transactions holds 200 million rows; query scans ~30 million rows each time.

Optimization :

-- Create summary table (updated hourly)
CREATE TABLE user_points_summary (
    user_id INT PRIMARY KEY,
    total_points DECIMAL(10,2),
    transaction_count INT,
    last_30days_points DECIMAL(10,2),
    last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    INDEX idx_last30_points (last_30days_points DESC)
);

-- Hourly upsert
INSERT INTO user_points_summary (user_id, last_30days_points, transaction_count)
SELECT user_id,
       SUM(points),
       COUNT(*)
FROM point_transactions
WHERE created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY user_id
ON DUPLICATE KEY UPDATE
    last_30days_points = VALUES(last_30days_points),
    transaction_count = VALUES(transaction_count);

-- Query from summary table
SELECT user_id,
       last_30days_points AS total_points,
       transaction_count
FROM user_points_summary
ORDER BY last_30days_points DESC
LIMIT 100;

Result : query time reduced from 30 seconds to 0.01 seconds (≈ 3000× improvement).

6. Best‑Practice Summary for Performance Optimization

6.1 Establish a Performance Baseline

Before any optimization, record key baseline metrics:

Average and peak QPS

Slow‑query ratio (target < 0.1 %)

Average response time, P95, P99

InnoDB buffer‑pool hit rate (target > 95 %)

6.2 Prioritize Optimizations

Not every slow query needs immediate fixing. Prioritize by:

Total consumption = execution frequency × average time.

Impact on core business flows.

Low‑effort, high‑gain "quick wins".

6.3 Code Review Mechanism

Implement SQL review before deployment:

All new SQL must provide EXPLAIN output.

Prohibit SELECT * in production.

Use pt‑online‑schema‑change for DDL on large tables.

Batch large operations to avoid long‑running locks.

6.4 Continuous Optimization Process

Performance tuning is ongoing:

Weekly slow‑query log analysis to spot new issues.

Monthly index usage audit.

Quarterly evaluation of sharding or partitioning needs.

Maintain a knowledge base to avoid repeated pitfalls.

Conclusion: Ongoing Learning and Practice

Database performance optimization is a discipline that requires continuous practice and experience accumulation. No one‑size‑fits‑all solution exists; each system has its own characteristics. As ops engineers, we should stay aware of new MySQL features, build a personal case library, collaborate closely with developers to prevent performance problems at the source, and regularly participate in technical exchanges to learn from others' experiences.

Remember, performance optimization never ends, but with the right methodology and tools you can confidently tackle any challenge. I hope this article helps you advance further on the path of database optimization.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

monitoringindexingmysqlSQL Optimizationslow-query
MaGe Linux Operations
Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.