How to Optimize Massive MySQL Tables: Index Tuning, Fast Deletes, and Efficient DDL
This article analyzes a large MySQL table that caused SLA alerts due to master‑slave lag, examines slow‑query logs with pt‑query‑digest, identifies index and delete inefficiencies, proposes index redesign and batch‑delete strategies, and validates the improvements with testing and DDL tools.
Background
The XX instance (primary‑replica) generated daily SLA alerts indicating master‑slave lag, caused by long‑running master‑slave switches. The instance also accumulated many slow queries (execution time >1 s) due to a nightly job deleting data older than one month.
Analysis
We used pt-query-digest to analyze the last week’s mysql-slow.log. The first part of the report shows total slow‑query time 25 403 s, longest query 266 s, average 5 s, and 17.66 M rows scanned.
The second part lists the most frequent slow statements, e.g. select arrival_record … executed over 40 k times with average 4 s, and delete arrival_record … executed 6 times with average 258 s.
Sample SELECT
select count(*) from arrival_record where product_id=26 and receive_time between '2019-03-25 14:00:00' and '2019-03-25 15:00:00' and receive_spend_ms>=0\GThe query scans up to 56 M rows; the execution plan shows it uses the composite index IXFK_arrival_record but only the first column product_id is selective, causing large row scans.
We observed that the table has about 100 M rows and only one composite index (product_id, station_no, sequence, receive_time, arrival_time). The product_id column has low cardinality, while receive_time has high cardinality.
Optimization suggestions:
Drop the existing composite index IXFK_arrival_record.
Create a new composite index idx_sequence_station_no_product_id (product_id, sequence, station_no).
Create a single‑column index idx_receive_time on receive_time.
DELETE Statement
The delete operation scans about 110 M rows, taking ~262 s because it performs a full table scan (no usable index).
After adding idx_receive_time, the DELETE on receive_time reduces scanned rows dramatically, but still takes ~77 s for 300 M rows.
Testing
We copied the arrival_record table (≈48 GB on disk, 31 GB in MySQL) to a test instance, performed backup with mydumper, restored with myloader, and compared online DDL vs pt‑osc. Online DDL completed in 34 min, pt‑osc in 57 min.
We also tried small‑batch deletes using primary‑key ranges (LIMIT 20 000) and observed that deleting 10 minutes of data per batch takes ~1 s, eliminating SLA alerts.
Conclusion
When a table grows large, both query latency and maintenance cost (DDL time, delete time) must be considered. Proper indexing—especially on high‑cardinality columns—and batch‑delete strategies dramatically improve performance and reduce master‑slave lag.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
