
How to Slash MySQL Slow Queries on a 100M‑Row Table: Index Tuning and Batch Deletion

The article walks through a real‑world MySQL performance case where a 100‑million‑row table caused SLA alerts, analyzes slow‑query logs, demonstrates index redesign, compares online DDL with pt‑osc, and shows how batch deletions by primary key dramatically reduce delete time and replication lag.

Architect

Apr 29, 2024


Background

When the author joined a new company, a MySQL instance with one master and one slave was raising an SLA alert at midnight. The alert reported replication lag on the slave: if a master-to-slave failover were needed at that moment, it would have to wait for the slave to catch up.

Investigation showed that the slow‑query log contained many queries that scanned tens of millions of rows, especially select count(*) from arrival_record … and a daily delete from arrival_record where receive_time < … that each took hundreds of seconds.

Analysis

Running pt-query-digest --since=148h mysql-slow.log over roughly the last week of logs, the author measured a total slow-query time of 25 403 s, a longest query of 266 s, an average of 5 s per slow query, and an average of 17.66 M rows scanned per query.

The select queries against arrival_record scanned up to 56 M rows (1.72 M on average) because the composite index

IXFK_arrival_record(product_id,station_no,sequence,receive_time,arrival_time)

could serve them only through its leftmost column, product_id. Because that column has low cardinality, each lookup still read millions of index entries instead of a narrow range.

The EXPLAIN output showed type: ref, rows: 32261320, and Extra: Using index condition; Using where. The SHOW INDEX output confirmed that only one composite index existed and that product_id had a cardinality of 1 344, far too low to be selective.

The author concluded that a separate index on receive_time would let the query use a more selective range scan.
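One way to sanity-check that conclusion is to compare how selective each candidate column is. The sketch below is illustrative only: the selectivity query does not come from the original investigation, and because it scans the whole table it is better run on a test copy than on the production master.

```sql
# List the existing indexes together with their estimated cardinality.
SHOW INDEX FROM arrival_record;

# Selectivity = distinct values / total rows; values closer to 1 filter better.
# Assumption: a full table scan is acceptable on the instance where this runs.
SELECT COUNT(DISTINCT product_id)   / COUNT(*) AS product_id_selectivity,
       COUNT(DISTINCT receive_time) / COUNT(*) AS receive_time_selectivity
FROM arrival_record;
```

A column whose ratio is close to zero, as product_id's cardinality of 1 344 suggests here, makes a poor leading column when it is the only usable part of an index.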

Testing

The table contains about 112 M rows. It occupies roughly 48 GB on disk, while MySQL reports only 31 GB of table data; the gap of about 17 GB is fragmentation left behind by previous large-scale deletions.
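A rough way to see such a gap from inside MySQL is to read the table's size statistics from information_schema. This is a minimal sketch, assuming the session's default database is the one that holds arrival_record. For InnoDB, table_rows is an estimate and data_free only approximates the reclaimable space.

```sql
# Size statistics for the table, converted to GB.
# Assumption: USE <schema> has been run, so DATABASE() returns the right schema.
SELECT table_name,
       table_rows,
       ROUND(data_length  / 1024 / 1024 / 1024, 2) AS data_gb,
       ROUND(index_length / 1024 / 1024 / 1024, 2) AS index_gb,
       ROUND(data_free    / 1024 / 1024 / 1024, 2) AS free_gb
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name = 'arrival_record';
```

Comparing data_gb plus index_gb with the size of the table's file on disk gives a figure comparable to the 48 GB versus 31 GB quoted above.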

The table was backed up with mydumper (32 parallel threads, 2 M rows per chunk), which produced a 1.2 GB compressed dump in 52 s. The dump was copied to a test node and re-imported with myloader, which took 126 min 42 s.

Two DDL approaches were compared on the test instance: MySQL's native online DDL and the pt-osc tool (pt-online-schema-change). Online DDL completed in 34 minutes, while pt-osc took 57 minutes, so online DDL needed roughly 40 % less time.

Implementation

On the replica the author dropped the original composite index and created a new composite index

idx_product_id_sequence_station_no(product_id,sequence,station_no)

plus a single‑column index idx_receive_time(receive_time). The DDL script also removed the foreign key, performed the index changes, and re‑added the foreign key after the operation.
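The order of operations matters, presumably because MySQL refuses to drop an index that a foreign key constraint still depends on. A sketch of what such a script might look like follows. The constraint name fk_arrival_record_product and the referenced product (id) column are hypothetical placeholders, since the original foreign-key definition is not given here; the ALGORITHM and LOCK clauses are likewise an assumption about how the online DDL was requested.

```sql
# Hypothetical names: fk_arrival_record_product and product (id) are placeholders.
# Step 1: drop the foreign key so the old index is no longer required.
ALTER TABLE arrival_record DROP FOREIGN KEY fk_arrival_record_product;

# Step 2: swap the indexes in place while still allowing concurrent reads and writes.
ALTER TABLE arrival_record
  DROP INDEX IXFK_arrival_record,
  ADD INDEX idx_product_id_sequence_station_no (product_id, sequence, station_no),
  ADD INDEX idx_receive_time (receive_time),
  ALGORITHM = INPLACE, LOCK = NONE;

# Step 3: re-add the foreign key. With foreign_key_checks = 0 InnoDB can add it
# in place, but existing rows are not validated against the parent table.
SET foreign_key_checks = 0;
ALTER TABLE arrival_record
  ADD CONSTRAINT fk_arrival_record_product
  FOREIGN KEY (product_id) REFERENCES product (id);
SET foreign_key_checks = 1;
```

Because the new composite index also starts with product_id, it can back a foreign key on that column once the constraint is re-added.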

After the change, EXPLAIN for the same select showed type: range, key: idx_receive_time, and rows reduced to 7.5 M, confirming that the new index was used.

Index‑Optimized Delete

Even after adding idx_receive_time, the daily delete still took 77 s because it scanned 110 M rows. The author therefore switched to batch deletion by primary key:

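# Find the highest id among the rows to purge.
# This assumes id grows together with receive_time, so every row up to that id is old enough.
SELECT MAX(id) INTO @need_delete_max_id FROM arrival_record WHERE receive_time < '2019-03-01';
# Delete in small chunks; <= includes the row that holds the maximum id itself
DELETE FROM arrival_record WHERE id <= @need_delete_max_id LIMIT 20000;
SELECT ROW_COUNT();  # returns 20000 while full batches remain
# Repeat the DELETE until ROW_COUNT() returns 0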
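The loop that the snippet leaves as a comment could be written as a stored procedure. The sketch below is not the author's script: the procedure name purge_arrival_record, its parameters, the BIGINT type for id, and the half-second pause are all assumptions made for illustration. It also relies on id increasing together with receive_time, and it uses <= so that the boundary row is removed as well.

```sql
DELIMITER $$

# Hypothetical helper: deletes rows older than p_cutoff in batches of p_batch.
CREATE PROCEDURE purge_arrival_record(IN p_cutoff DATETIME, IN p_batch INT)
BEGIN
  DECLARE v_max_id BIGINT;          # assumption: id fits in BIGINT
  DECLARE v_rows   INT DEFAULT 1;

  # Highest id that is old enough to delete; NULL when nothing qualifies.
  SELECT MAX(id) INTO v_max_id
  FROM arrival_record
  WHERE receive_time < p_cutoff;

  WHILE v_max_id IS NOT NULL AND v_rows > 0 DO
    # ORDER BY id keeps each batch deterministic.
    DELETE FROM arrival_record
    WHERE id <= v_max_id
    ORDER BY id
    LIMIT p_batch;

    SET v_rows = ROW_COUNT();       # rows removed by the DELETE above
    DO SLEEP(0.5);                  # brief pause so the slave can keep up
  END WHILE;
END$$

DELIMITER ;

CALL purge_arrival_record('2019-03-01', 20000);
```

With autocommit enabled and no surrounding transaction, each batch commits on its own, so the slave replays many short transactions instead of one very long one.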

This approach reduced the impact on the master and eliminated the SLA alerts.

Summary

When a table grows beyond tens of millions of rows, both query latency and maintenance cost (DDL time, delete time) must be considered.

Choose the appropriate DDL method based on table size, foreign‑key constraints, and required downtime.

For massive deletes, use small‑batch primary‑key deletes to lower load and avoid replication lag.

