Databases 16 min read

AI-Driven Index Recommendation for Slow Queries at Meituan

This article details a joint research effort between Meituan and East China Normal University that combines cost‑based methods with AI‑driven, data‑centric models to automatically generate and evaluate missing indexes for billions of daily slow queries, improving recommendation accuracy and query performance.

Meituan Technology Team

Apr 6, 2023

AI-Driven Index Recommendation for Slow Queries at Meituan

Background

Meituan processes over a hundred million slow queries per day, making manual index creation infeasible. The existing Database Autonomy Service (DAS) platform uses a cost‑based optimizer to suggest indexes, but the optimizer’s cost estimates are imperfect and the lack of built‑in fake‑index support in MySQL makes large‑scale index testing costly.

To address these issues, Meituan partnered with East China Normal University to integrate an AI‑plus‑data‑driven index recommendation approach alongside the cost‑based method.

Index Recommendation Overview

Index recommendation can be performed at the workload level (optimizing a set of queries under storage constraints) or at the query level (suggesting missing indexes for a single slow query).

2.1 Cost‑Based Index Recommendation

Cost‑based techniques focus on workload‑level optimization, treating every column or column combination as a candidate index, which creates a massive search space. The problem is NP‑hard, so greedy heuristics are commonly used, potentially yielding sub‑optimal index sets.

2.2 AI‑Plus‑Data‑Driven Index Recommendation

The AI approach targets query‑level recommendations. It treats index suggestion as a binary classification problem, training XGBoost models to predict whether adding an index on specific columns will improve execution time. Models are built for single‑column, two‑column, and three‑column indexes. For a given query, all candidate indexes are generated, features are extracted, and the appropriate model predicts the effectiveness of each candidate.

Overall Architecture

The system consists of two main components: model training and model deployment.

3.1 Model Training

Daily cost‑based recommendations from DAS provide labeled training data (queries and validated effective indexes). Candidate indexes are generated, feature vectors are constructed via feature engineering, and labels are assigned based on whether the index was used in the query. Separate training sets are created for single‑, double‑, and triple‑column models.

3.2 Model Deployment

For incoming slow queries, candidate indexes are generated and transformed into feature vectors. The corresponding classification model predicts the probability of each candidate being effective. The highest‑probability single‑, double‑, and triple‑column candidates are selected, built on a sampled replica of the production database, and evaluated for actual performance gain before being presented to users.

Modeling Process

4.1 Generating Candidate Indexes

Columns appearing in aggregation, WHERE, JOIN, ORDER BY, or GROUP BY clauses become single‑column candidates. These are combined to form two‑ and three‑column candidates, excluding any that duplicate existing indexes or violate the left‑most prefix rule.

4.2 Feature Engineering

Each candidate’s feature vector includes statement features (one‑hot encoding of column positions) and statistical features (table row count, cardinality, selectivity, etc.). For multi‑column candidates, features are concatenated and shared statistics (e.g., combined cardinality) are added.

4.3 Modeling Example

For a query joining customer and warehouse, four single‑column candidates from customer and one from warehouse are generated. The used index Idx(w_id) marks its candidate as a positive sample; the rest are negative. Since the query only uses a three‑column index Idx(c_d_id, c_id, c_last), two‑column candidates are omitted to avoid class imbalance. The final training set contributes five samples (one positive, four negative) for the single‑column model and six samples (one positive, five negative) for the three‑column model.

4.4 Prediction and Evaluation

All candidate indexes for a slow query are scored by their respective models. The top‑scoring single, double, and triple candidates are built on a sampled replica and their execution times are measured. Only indexes that demonstrably reduce query latency are recommended to the user, preventing unnecessary storage and update overhead.

Project Results

In March, the AI models contributed an additional 12.16% of index recommendations that were adopted by users, beyond those generated by the cost‑based method. These extra indexes improved roughly 5.2 billion query executions and saved about 4,632 hours of execution time.

Future Plans

The team intends to explore fine‑tuning open‑source large language models (e.g., Google’s T5) to directly generate index suggestions from a slow query text. When index recommendation fails to improve performance, the system will also provide textual SQL tuning advice such as column pruning or replacing sub‑queries with joins.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

machine learning feature engineering AI MySQL XGBoost slow query optimization Cost Model Index Recommendation

Written by

Meituan Technology Team

Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.