Databases 23 min read

How Meituan Optimizes MySQL Slow Queries with Cost‑Based Index Recommendations

This article explains how Meituan tackles MySQL slow‑query problems by leveraging the database cost‑based optimizer to generate index suggestions, evaluate their quality, and operate a full‑stack governance workflow that includes data collection, statistical modeling, validation, and continuous improvement.

dbaplus Community
dbaplus Community
dbaplus Community
How Meituan Optimizes MySQL Slow Queries with Cost‑Based Index Recommendations

Background

Slow queries in MySQL consume excessive CPU and I/O, often causing business outages; Meituan processes over a hundred million slow queries daily, with more than 10% of database failures linked to them.

Cost‑Based Optimizer Overview

SQL Execution Flow

The MySQL execution pipeline includes parsing, logical transformation, cost preparation, cost‑based optimization, additional rewrites, and plan execution.

Cost Model

Costs are split into server‑side CPU cost and engine‑side I/O cost. Default MySQL 5.7 parameters include io_block_read_cost=1, row_evaluate_cost=0.2, and others such as memory_temptable_create_cost, key_compare_cost, etc.

Cost‑Based Index Selection

Using the same cost model as the optimizer, Meituan evaluates all possible index combinations for a given SQL. Example SQL:

SELECT * FROM sync_test1 WHERE name LIKE 'Bobby%' AND dt > '2021-07-06';

Four candidate indexes (IX_name, IX_dt, IX_dt_name, IX_name_dt) are created, and EXPLAIN shows the optimizer choosing IX_name with the lowest cost (687.41 vs higher costs for other indexes).

+----+-------------+------------+-------+-------------------------------------+--------+---------+------+--------+------------------------------------+
| id | select_type | table      | type  | possible_keys                       | key    | key_len | ref  | rows   | Extra                              |
+----+-------------+------------+-------+-------------------------------------+--------+---------+------+--------+------------------------------------+
|  1 | SIMPLE      | sync_test1 | range | IX_name,IX_dt,IX_dt_name,IX_name_dt | IX_name| 12      | NULL | 572    | Using index condition; Using where |
+----+-------------+------------+-------+-------------------------------------+--------+---------+------+--------+------------------------------------+

Tracing the optimizer confirms that other indexes are rejected because their range‑scan cost exceeds that of IX_name.

Index Recommendation Architecture

Meituan splits the system into a lightweight Fakeindex storage engine (C++) and a Go‑based recommendation service. The engine forwards index‑related calls to the Go server via cgo‑based RPC, avoiding heavyweight frameworks.

The Go server performs the core logic:

Pre‑validation to filter unsupported statements.

Extract key columns from WHERE, JOIN, ORDER BY, GROUP BY, and aggregation functions.

Generate candidate index permutations, then prune existing indexes, overly long indexes, and unsupported types.

Collect metadata (table definition, existing indexes), statistics (row count, data size, cardinality), and sample data using block‑based sampling to avoid heavy I/O.

Compute statistical metrics such as records_in_range and cardinality using sample scaling and slope‑based adjustments.

Evaluate candidate index costs by creating a fake table with those indexes in the Fakeindex engine, running EXPLAIN FORMAT=JSON, and selecting the lowest‑cost plan.

Because MySQL limits a table to 64 secondary indexes, Meituan adopts a merge‑batch strategy to handle thousands of candidates, supporting up to 4096 candidates and 17 columns.

Recommendation Quality Assurance

Testing on 246 GB of recent slow‑query logs (≈30 k SQL templates) shows high coverage, though occasional invalid recommendations arise due to sample bias, system bugs, or optimizer limitations.

To mitigate this, Meituan adds two post‑recommendation steps:

Effectiveness Validation: Run the recommended index on a sampled replica and verify cost reduction via EXPLAIN.

Effect Tracking: After deployment, monitor real‑time query execution with Flink; if performance regresses, alert DBAs immediately.

Simulation Environment & Test Case Library

A offline simulation environment reproduces production data and the full recommendation pipeline, allowing safe debugging of failures. Test cases are continuously harvested from production slow‑query logs, incident tickets, and manually crafted edge cases, forming a growing repository.

Slow‑Query Governance Operations

Meituan classifies slow queries into three time‑based groups:

Historical: Long‑standing slow queries, managed via periodic reports.

New: Recently appearing queries, prioritized by impact and fed into real‑time alerts.

Potential: Queries not yet slow but likely to become so as data grows; proactive recommendations are generated.

Each group follows a tailored workflow, from automatic index suggestion to manual DBA approval.

Project Status & Future Plans

To date, the system has analyzed >6 k slow queries, issued >1 k index recommendations, and is gradually rolled out to production. Future work includes scaling to handle billions of daily slow queries, incorporating index‑maintenance cost (disk usage, write overhead), and moving from per‑SQL to workload‑wide global optimization similar to Alibaba Cloud DAS.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

mysqlDatabase Performanceslow-querycost‑based optimizerIndex Recommendation
dbaplus Community
Written by

dbaplus Community

Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.