Databases 13 min read

How to Tackle MySQL Slow Queries: Metrics, Strategies, and Real Cases

This article explains what constitutes a MySQL slow query, why they cause failures, defines quantitative metrics such as micro‑average and macro‑average to assess severity, outlines target goals, presents concrete optimization examples, and shares operational practices for ongoing slow‑SQL governance.

Alibaba Cloud Developer
Alibaba Cloud Developer
Alibaba Cloud Developer
How to Tackle MySQL Slow Queries: Metrics, Strategies, and Real Cases

Why Address Slow SQL?

What Is a Slow SQL?

In MySQL, a slow query is any statement whose execution time exceeds the long_query_time threshold.

MySQL records such statements in the slow‑query log (slowlog), which captures not only SELECTs but also INSERT, UPDATE, and other DML statements that surpass the threshold.

How to Check Slow‑Query Settings

# Check if slow query logging is enabled
show variables like "slow_query_log%";

# Check the threshold (seconds)
show variables like "long_query_time";

In AliSQL‑X‑Cluster (XDB), slow‑query logging is enabled by default with long_query_time set to 1 second.

Why Slow Queries Cause Failures

Real slow SQL often triggers massive row scans, temporary file sorting, or frequent disk flushes, raising disk I/O and turning normal queries into slow ones, leading to widespread timeouts.

After the 2022 Double‑11 incident, the CTO‑D team launched a dedicated slow‑SQL remediation project, with my team as the sponsor.

What Level Should Be Reached?

Measuring Severity

Micro‑average

sum(app_slow_sql_executions) / sum(app_total_sql_executions)

A higher value indicates greater impact; a value of 1 means every execution is slow, 0 means none are.

However, this metric can be diluted for high‑QPS apps where occasional slow queries are hidden.

Macro‑average

(sum(slow_sql1_executions) + … + sum(slow_sql_n_executions)) / (sum(sql1_executions) + … + sum(sql_n_executions))

This reduces the influence of rare, false‑positive slow queries but may be biased for low‑QPS applications.

Execution Count sum(app_slow_sql_executions) / 7 Using the average daily slow‑SQL executions over the past week mitigates macro‑average distortion.

Slow‑SQL Template Count

count(distinct(app_slow_sql_template))

Targets

Core applications: eliminate all slow SQL.

Regular applications: reduce the micro‑average metric by 50%.

CTO Report

Metrics are aggregated per CTO‑D line, weighted, and ranked from low to high. The top‑3 and bottom‑3 are highlighted in a weekly report.

Actionable Guidelines

MySQL Standards

Prohibit joins on more than three tables. Ensure join fields have identical data types and are indexed.

When indexing VARCHAR columns, specify an appropriate prefix length based on field selectivity (e.g., a 20‑character prefix often yields >90% selectivity).

Avoid left‑most or full wildcard searches; use a search engine for such cases.

Prevent implicit type conversion that can invalidate indexes.

Do not over‑index nor under‑index; balance index coverage with update overhead.

DB Change Standards

DDL changes must be paced, respect gray‑release windows, and control concurrency.

Real‑World Optimization Cases

Uneven Data Distribution

Eight databases with sixteen tables each were found to concentrate data in only two tables per database, indicating a flawed sharding strategy.

Index Issues

A composite index idx_logistics_corp_id_special_id had low selectivity. Switching to a more selective combination (logistics_corp_id, transport_type_id) improved performance.

Problematic Query Example

SELECT COUNT(0) AS `tmp_count`
FROM (
  SELECT `table_holder`.`user_id`,
         `table_holder`.`sc_item_id`,
         SUM(CASE `table_holder`.`inventory_type`
               WHEN 1 THEN `table_holder`.`quantity`
               ELSE 0 END) AS `saleable_quantity`,
         SUM(CASE `table_holder`.`inventory_type`
               WHEN 1 THEN `table_holder`.`lock_quantity`
               ELSE 0 END) AS `saleable_lock_quantity`,
         SUM(CASE `table_holder`.`inventory_type`
               WHEN 401 THEN `table_holder`.`quantity`
               ELSE 0 END) AS `transfer_on_way_quantity`,
         `table_holder`.`store_code`,
         MAX(`table_holder`.`gmt_modified`) AS `gmt_modified`
  FROM `table_holder`
  WHERE `table_holder`.`is_deleted` = 0
    AND `table_holder`.`quantity` > 0
    AND `table_holder`.`user_id` IN (3405569954)
    AND `table_holder`.`store_code` IN ('ZJJHBHYTJJ0001', '...1000+ others')
  GROUP BY `table_holder`.`user_id`, `table_holder`.`sc_item_id`
  ORDER BY `table_holder`.`user_id` ASC, `table_holder`.`sc_item_id` ASC
) `a`;

The table had a store_code index, yet the optimizer chose a full table scan because the IN list exceeded 200 values, causing the index to be ignored.

Adjusting the query to use a composite index on (is_deleted, quantity) resolved the issue.

External Influence

High‑selectivity index fields were still causing scans when many instances shared the same physical host, leading to disk I/O contention and temporary file generation.

Multiple MySQL leader nodes co‑located on the same machine shared disk buffers, causing occasional slow‑SQL spikes that could not be eliminated solely by indexing.

Unsolvable Cases

Some scenarios, such as overly complex multi‑field queries without mandatory fields, cannot be efficiently handled by indexes alone and may require a dedicated search engine.

Daily Operations

Across CTO‑D lines, metrics have dramatically improved: core apps have cleared all slow queries, and our team moved from the bottom‑3 to the top‑3 in weekly rankings.

We now run a weekly slow‑SQL ticket pipeline: owners receive assignments, resolve issues, and close tickets, establishing a routine of zero‑slow‑SQL weeks.

Conclusion

This retrospective outlines the strategy formulation, problem analysis, and resolution process for MySQL slow‑SQL governance, offering valuable lessons for similar large‑scale database operations.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

metricsmysqlIndex OptimizationDatabase Performanceslow-query
Alibaba Cloud Developer
Written by

Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.