Databases 23 min read

How I Cut MySQL Query Time from 20 Seconds to 200 Milliseconds

In this detailed case study, the author walks through a real‑world MySQL slow‑query incident, analyzes execution plans, logs, and profiling data, and applies a four‑step optimization strategy—including index creation, query rewriting, covering indexes, and materialized views—to reduce execution time from 20 seconds to 200 milliseconds, achieving over 100× speedup.

MaGe Linux Operations
MaGe Linux Operations
MaGe Linux Operations
How I Cut MySQL Query Time from 20 Seconds to 200 Milliseconds

MySQL Slow Query Optimization: From 20 Seconds to 200 Milliseconds

Introduction: A Production Incident

At 3 AM, an alarm indicated that a core business API response time exceeded 20 seconds, triggering a flood of user complaints. The culprit was a seemingly simple SQL query. After a series of optimizations, the query time dropped from 20 seconds to 200 milliseconds—a 100× performance boost.

1. Problem Context: The Slow Query

The e‑commerce platform needs to aggregate order statistics per merchant in real time. The relevant tables are:

-- orders table (5 million rows)
CREATE TABLE orders (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    order_no VARCHAR(32),
    merchant_id INT,
    user_id INT,
    amount DECIMAL(10,2),
    status TINYINT,
    created_at DATETIME,
    updated_at DATETIME
) ENGINE=InnoDB;

-- order_items table (20 million rows)
CREATE TABLE order_items (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    order_id BIGINT,
    product_id INT,
    quantity INT,
    price DECIMAL(10,2),
    created_at DATETIME
) ENGINE=InnoDB;

-- merchants table (100 000 rows)
CREATE TABLE merchants (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100),
    category VARCHAR(50),
    city VARCHAR(50),
    level TINYINT
) ENGINE=InnoDB;

The original SQL requested, for the past 30 days, statistics of VIP merchants (level = 3) in Beijing, including total orders, total amount, average order value, etc. The query took 20.34 seconds to execute.

SELECT
    m.id,
    m.name,
    COUNT(DISTINCT o.id) AS order_count,
    COUNT(DISTINCT o.user_id) AS user_count,
    SUM(o.amount) AS total_amount,
    AVG(o.amount) AS avg_amount,
    SUM(oi.quantity) AS total_items
FROM merchants m
LEFT JOIN orders o ON m.id = o.merchant_id
LEFT JOIN order_items oi ON o.id = oi.order_id
WHERE m.city = '北京'
  AND m.level = 3
  AND o.status = 1
  AND o.created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY m.id, m.name
ORDER BY total_amount DESC
LIMIT 100;

2. Problem Analysis: Diagnosis Steps

2.1 Examine the Execution Plan

The EXPLAIN output showed:

+----+-------------+-------+------+---------------+------+---------+------+----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows     | Extra                                        |
+----+-------------+-------+------+---------------+------+---------+------+----------+----------------------------------------------+
| 1  | SIMPLE      | m     | ALL  | NULL          | NULL | NULL    | NULL | 100000   | Using where; Using temporary; Using filesort |
| 1  | SIMPLE      | o     | ALL  | NULL          | NULL | NULL    | NULL | 5000000  | Using where; Using join buffer               |
| 1  | SIMPLE      | oi    | ALL  | NULL          | NULL | NULL    | NULL | 20000000 | Using where; Using join buffer               |
+----+-------------+-------+------+---------------+------+---------+------+----------+----------------------------------------------+

Key issues identified:

All three tables performed full table scans (type=ALL).

No indexes were used (key=NULL).

Temporary tables and filesort were created.

Cartesian‑product‑like row count: 100 000 × 5 000 000 × 20 000 000.

2.2 Slow‑Query Log Analysis

Enabling the slow‑query log revealed that the query examined 25 438 921 rows but returned only 100 rows.

# Time: 2024-03-15T03:25:41.123456Z
# Query_time: 20.342387  Lock_time: 0.000123  Rows_sent: 100  Rows_examined: 25438921
# Rows_affected: 0  Bytes_sent: 15234

2.3 Profile Analysis

Profiling showed the majority of time spent sending data (≈90 %).

Sending data: 18.5 s (90 %).

Creating sort index: 1.2 s (6 %).

Copying to temporary table: 0.6 s (3 %).

3. Optimization Strategy: Four Steps

Step 1 – Add Essential Indexes

-- Index on merchants (city, level)
ALTER TABLE merchants ADD INDEX idx_city_level (city, level);

-- Composite index on orders (merchant_id, status, created_at)
ALTER TABLE orders ADD INDEX idx_merchant_status_created (merchant_id, status, created_at);

-- Index on order_items (order_id)
ALTER TABLE order_items ADD INDEX idx_order_id (order_id);

Result: execution time reduced to 8.5 seconds (≈2.4× faster).

Step 2 – Rewrite SQL to Reduce Join Size

Instead of joining all tables before filtering, filter first and then join.

SELECT
    m.id,
    m.name,
    t.order_count,
    t.user_count,
    t.total_amount,
    t.avg_amount,
    t.total_items
FROM merchants m
INNER JOIN (
    SELECT
        o.merchant_id,
        COUNT(DISTINCT o.id) AS order_count,
        COUNT(DISTINCT o.user_id) AS user_count,
        SUM(o.amount) AS total_amount,
        AVG(o.amount) AS avg_amount,
        SUM(items.item_count) AS total_items
    FROM orders o
    LEFT JOIN (
        SELECT order_id, SUM(quantity) AS item_count
        FROM order_items
        GROUP BY order_id
    ) items ON o.id = items.order_id
    WHERE o.status = 1
      AND o.created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
    GROUP BY o.merchant_id
) t ON m.id = t.merchant_id
WHERE m.city = '北京' AND m.level = 3
ORDER BY t.total_amount DESC
LIMIT 100;

Result: execution time dropped to 2.3 seconds (≈8.8× faster).

Step 3 – Use Covering Indexes

-- Covering index that includes all columns needed for the query
ALTER TABLE orders ADD INDEX idx_covering (merchant_id, status, created_at, id, user_id, amount);

Result: execution time reduced to 0.8 seconds (≈25.4× faster).

Step 4 – Materialized View (Summary Table)

For reporting queries that can tolerate slight data latency, a pre‑aggregated summary table provides the best performance.

CREATE TABLE merchant_order_summary (
    merchant_id INT,
    summary_date DATE,
    order_count INT,
    user_count INT,
    total_amount DECIMAL(10,2),
    avg_amount DECIMAL(10,2),
    total_items INT,
    PRIMARY KEY (merchant_id, summary_date),
    INDEX idx_date (summary_date)
) ENGINE=InnoDB;

-- Hourly job to refresh the summary
INSERT INTO merchant_order_summary
SELECT
    merchant_id,
    DATE(created_at) AS summary_date,
    COUNT(DISTINCT id) AS order_count,
    COUNT(DISTINCT user_id) AS user_count,
    SUM(amount) AS total_amount,
    AVG(amount) AS avg_amount,
    (SELECT SUM(quantity) FROM order_items WHERE order_id IN (
        SELECT id FROM orders WHERE merchant_id = o.merchant_id AND DATE(created_at) = DATE(o.created_at)
    )) AS total_items
FROM orders o
WHERE status = 1 AND created_at >= CURDATE()
GROUP BY merchant_id, DATE(created_at)
ON DUPLICATE KEY UPDATE
    order_count = VALUES(order_count),
    user_count = VALUES(user_count),
    total_amount = VALUES(total_amount),
    avg_amount = VALUES(avg_amount),
    total_items = VALUES(total_items);

-- Query the summary table directly
SELECT
    m.id,
    m.name,
    SUM(s.order_count) AS order_count,
    SUM(s.user_count) AS user_count,
    SUM(s.total_amount) AS total_amount,
    AVG(s.avg_amount) AS avg_amount,
    SUM(s.total_items) AS total_items
FROM merchants m
INNER JOIN merchant_order_summary s ON m.id = s.merchant_id
WHERE m.city = '北京' AND m.level = 3 AND s.summary_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
GROUP BY m.id, m.name
ORDER BY SUM(s.total_amount) DESC
LIMIT 100;

Result: final execution time 0.2 seconds (≈200 ms), achieving a 101.7× speedup over the original query.

4. Performance Comparison

Optimization Stage      Execution Time   Speedup   Key Optimizations
--------------------------------------------------------------------
Original SQL            20.34 s          -         Full table scans, no indexes
Add Indexes            8.50 s           2.4×      Basic index creation
SQL Rewrite            2.30 s           8.8×      Filter before join
Covering Index         0.80 s           25.4×     Avoided row look‑ups
Materialized View      0.20 s           101.7×    Pre‑aggregated summary table

5. General Optimization Methodology

5.1 Diagnosis Three‑Step Framework

EXPLAIN analysis – check type, key, and Extra fields.

Slow‑query log analysis – use pt‑query‑digest or similar tools.

Profile analysis – enable profiling and examine each stage.

5.2 Six‑Step Optimization Process

Index optimization – create appropriate indexes, follow left‑most prefix rule.

SQL rewrite – drive small tables, filter early, avoid SELECT *.

Table structure tuning – consider denormalization, proper data types, partitioning.

Cache strategy – use Redis or application‑level cache for hot data.

Read‑write separation – master‑slave replication, load balancing.

Data archiving – periodic archiving of historical data.

5.3 Index Design Principles

-- Good index design example
ALTER TABLE orders ADD INDEX idx_merchant_status_created (merchant_id, status, created_at);
-- Reasons:
-- 1. merchant_id used for JOIN
-- 2. status has high selectivity
-- 3. created_at is a range filter, placed last

Put equality columns first.

Put range columns later.

Consider ordering columns.

Higher selectivity first.

5.4 Common Pitfalls

Implicit type conversion

-- Wrong: comparing VARCHAR column with a number
WHERE phone = 13812345678  -- phone is VARCHAR
-- Correct:
WHERE phone = '13812345678'

Functions on indexed columns

-- Wrong: applying DATE() on indexed column
WHERE DATE(created_at) = '2024-03-15'
-- Correct:
WHERE created_at >= '2024-03-15' AND created_at < '2024-03-16'

OR conditions preventing index use

-- Bad:
WHERE merchant_id = 100 OR user_id = 200
-- Better (UNION):
SELECT * FROM orders WHERE merchant_id = 100
UNION
SELECT * FROM orders WHERE user_id = 200

6. Monitoring and Prevention

6.1 Monitoring Setup

-- Create a view for slow‑query monitoring
CREATE VIEW slow_query_monitor AS
SELECT
    DATE(start_time) AS query_date,
    LEFT(sql_text, 100) AS query_sample,
    COUNT(*) AS exec_count,
    AVG(query_time) AS avg_time,
    MAX(query_time) AS max_time,
    SUM(rows_examined) AS total_rows_examined
FROM mysql.slow_log
GROUP BY DATE(start_time), LEFT(sql_text, 100)
ORDER BY avg_time DESC;

6.2 Automated Alert Script (Python)

#!/usr/bin/env python3
import MySQLdb, smtplib
from email.mime.text import MIMEText

def check_slow_queries():
    db = MySQLdb.connect(host='localhost', user='monitor', passwd='password', db='mysql')
    cur = db.cursor()
    cur.execute('''
        SELECT COUNT(*) AS slow_count, AVG(query_time) AS avg_time
        FROM mysql.slow_log
        WHERE start_time >= DATE_SUB(NOW(), INTERVAL 1 HOUR)
    ''')
    slow_count, avg_time = cur.fetchone()
    if slow_count > 100 or avg_time > 5:
        send_alert(f"Slow query alert: count={slow_count}, avg_time={avg_time}s")
    cur.close()
    db.close()

def send_alert(message):
    msg = MIMEText(message)
    msg['Subject'] = 'MySQL Slow Query Alert'
    msg['From'] = '[email protected]'
    msg['To'] = '[email protected]'
    s = smtplib.SMTP('localhost')
    s.send_message(msg)
    s.quit()

if __name__ == '__main__':
    check_slow_queries()

6.3 Routine Optimization Checklist

Weekly

Analyze top‑10 slow queries.

Check index usage.

Assess data growth.

Monthly

Rebuild heavily fragmented tables.

Update statistics.

Remove unused indexes.

Quarterly

Evaluate architectural changes (sharding, partitioning).

Plan hardware upgrades.

7. Lessons Learned

7.1 Golden Rules

Measure before you optimize

Base‑line performance with real data.

Quantify each change.

80/20 principle

80 % of latency comes from 20 % of queries.

Prioritize the most frequent and costly queries.

Iterative optimization

Change one thing at a time.

Record impact and keep rollback plans.

7.2 Team Collaboration

Establish an SQL review process with a checklist:

-- SQL Review Checklist
- [ ] Appropriate indexes exist?
- [ ] Full table scan avoided?
- [ ] JOIN count ≤ 3?
- [ ] No SELECT *
- [ ] Subqueries can be rewritten as JOINs?
- [ ] Future data growth considered?

Define coding standards for naming, index conventions, and SQL style. Conduct regular knowledge‑sharing sessions and maintain an internal wiki of optimization cases.

7.3 Common Pitfalls

Over‑indexing – creating separate indexes for each column leads to write‑performance degradation. Use composite indexes instead.

-- Bad: three separate indexes
ALTER TABLE orders ADD INDEX idx_merchant (merchant_id);
ALTER TABLE orders ADD INDEX idx_status (status);
ALTER TABLE orders ADD INDEX idx_created (created_at);

-- Good: single composite index
ALTER TABLE orders ADD INDEX idx_merchant_status_created (merchant_id, status, created_at);

Ignoring write impact – more indexes increase insert/update latency; balance read and write needs.

Cache stampede – simultaneous cache expiration can overload DB. Randomize TTLs.

8. Final Thoughts

Reducing query time from 20 seconds to 200 milliseconds is more than a numeric achievement; it reflects a disciplined approach to performance engineering. Remember:

Optimization is continuous, not a one‑off task.

Monitoring beats reactive fixing.

Combine theory with hands‑on practice.

By applying systematic diagnosis, thoughtful indexing, query refactoring, and leveraging materialized views, you can turn sluggish queries into lightning‑fast responses and keep your systems running smoothly.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

SQLindexingmysqlslow-query
MaGe Linux Operations
Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.