Optimizing Meituan Search Ranking with BERT: Methods and Practices

The Meituan Search team boosted ranking relevance by training a domain‑specific BERT, applying data augmentation, brand‑sample optimization, knowledge‑graph fusion, multi‑task and pairwise fine‑tuning, joint end‑to‑end training with LambdaLoss ranking models, and compressing the model for low‑latency inference, delivering up to +925 BP offline accuracy gains and measurable CTR and NDCG improvements in production.

Meituan Technology Team

Jul 9, 2020

Introduction

Meituan Search is the main entry point for the life services offered in the Meituan app. To improve deep semantic relevance between user queries and candidate documents, the Search & NLP team began applying BERT in late 2019 and, after three months of iteration, achieved notable offline and online gains.

BERT Overview

BERT, built on the Transformer architecture, has become the backbone of many NLP tasks. The team trained a domain-specific MT-BERT on massive Meituan data and applied it to intent recognition, sentiment analysis, recommendation reasons, and fine-grained classification.

Relevance Modeling

Relevance is defined as the semantic match between a user query and a candidate document (usually a merchant or product). Traditional term-based features (TF-IDF, BM25) capture literal similarity but fail on synonyms and long-tail queries. BERT provides stronger semantic features and supports two approaches: feature-based, where query and document are encoded separately and compared by vector similarity, and fine-tune-based, where query and document are encoded jointly as a single input.
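As a rough illustration of the feature-based approach, the sketch below scores a query against a document by the cosine similarity of their vectors. The helper name `cosine_similarity` and the toy vectors are assumptions made for illustration; in practice the vectors would come from an encoder such as MT-BERT.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_u * norm_v)

# Toy embeddings (hypothetical values, not real model output).
query_vec = [0.2, 0.9, 0.1]
doc_vec = [0.1, 0.8, 0.3]
relevance_feature = cosine_similarity(query_vec, doc_vec)
```

Because query and document are encoded independently, document vectors can be precomputed offline. The fine-tune-based approach gives up that convenience in exchange for letting the model attend across query and document tokens.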

Algorithm Exploration

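1. Data Augmentation – Weakly supervised click logs were used as training data. Noisy single-character queries and brand-only matches were filtered out, and a "Skip-Above" strategy discarded unclicked documents ranked below the clicked one, since those were likely never exposed to the user.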

2. Brand Sample Optimization – POI names were mapped to brand names to eliminate noise from location suffixes; negative samples were restricted to other brands.

3. Knowledge Fusion – Structured knowledge from Meituan’s “Meituan Brain” knowledge graph (categories, brands, tags) was concatenated to the document text and encoded with an additional segment embedding.

4. Multi‑Task Fine‑Tuning – Joint training of relevance classification and named entity recognition (NER) improved semantic alignment.

5. Pairwise Fine‑Tuning – Triplet samples (query, positive doc, negative doc) were used with a RankNet‑style loss to capture relative ordering, yielding the largest accuracy boost.

6. Joint Training with Ranking Model – A partition‑model framework combined BERT relevance features with the L2 ranking model (LambdaDNN, TransformerDNN, MultiTaskDNN) in an end‑to‑end fashion, optimizing NDCG via LambdaLoss.
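The Skip-Above filtering in step 1 can be sketched as follows. This is a minimal version under stated assumptions: the function name `skip_above_samples`, the `(doc_id, clicked)` input layout, and the choice of the last click as the cutoff are illustrative and not taken from the team's pipeline.

```python
def skip_above_samples(impressions):
    """Build (doc_id, label) pairs from one ranked result list.

    `impressions` is a list of (doc_id, clicked) tuples in display order.
    Clicked docs become positives, unclicked docs at or above the last
    click become negatives, and unclicked docs below the last click are
    dropped because the user may never have seen them.
    """
    clicked_positions = [i for i, (_, clicked) in enumerate(impressions) if clicked]
    if not clicked_positions:
        return []  # no click, so no reliable label for this list
    last_click = clicked_positions[-1]
    return [
        (doc_id, 1 if clicked else 0)
        for doc_id, clicked in impressions[: last_click + 1]
    ]
```

For a list in which only the second result was clicked, the first result becomes a negative and everything after the second is ignored.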
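The RankNet-style loss in step 5 penalizes a triplet whenever the negative document scores close to or above the positive one. A minimal sketch, assuming the model emits one scalar score per (query, doc) pair; the function name is illustrative.

```python
import math

def ranknet_pairwise_loss(score_pos, score_neg):
    """RankNet loss for one triplet: -log(sigmoid(score_pos - score_neg)).

    Written as softplus(-(score_pos - score_neg)) and branched on the sign
    of the difference so that exp() never overflows.
    """
    diff = score_pos - score_neg
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))
```

The loss equals log 2 when both documents receive the same score, approaches 0 as the positive pulls ahead, and grows roughly linearly when the order is wrong.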
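Step 6 optimizes NDCG. The sketch below computes NDCG and the change in NDCG caused by swapping two positions, which is the weight that LambdaRank-style methods attach to each document pair. It is not the team's LambdaLoss implementation: the gain `2^rel - 1` and the `log2` discount are the common convention and are assumed here.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain of relevance labels in ranked order."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances)
    )

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def delta_ndcg(relevances, i, j):
    """Absolute NDCG change if the documents at positions i and j swap."""
    swapped = list(relevances)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return abs(ndcg(swapped) - ndcg(relevances))
```

Pairs whose swap would move NDCG the most receive the largest weight, so training effort concentrates on mistakes near the top of the list.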

Model Compression

To meet latency constraints, knowledge distillation reduced a 12-layer MT-BERT to a 2-layer student model without significant performance loss. Additional techniques such as model pruning and low-precision quantization were evaluated.
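The article does not say which distillation objective was used. As one common choice, the sketch below shows a soft-target loss: the cross-entropy between temperature-softened teacher and student distributions. The function names and the default temperature are assumptions.

```python
import math

def log_softmax(logits, temperature):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_z for x in scaled]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy of the student against the softened teacher distribution."""
    teacher_log_probs = log_softmax(teacher_logits, temperature)
    student_log_probs = log_softmax(student_logits, temperature)
    return -sum(
        math.exp(t) * s for t, s in zip(teacher_log_probs, student_log_probs)
    )
```

A higher temperature flattens the teacher distribution, exposing more of the relative preferences among classes that the teacher learned. In practice this term is often combined with the ordinary loss on ground-truth labels.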

Online System Optimizations

The ranking service was refactored with Augur (feature-stacking), Poker (offline training platform), and TF-Serving with FasterTransformer for accelerated inference. Caching of head queries further reduced latency, achieving 50 QPS with only a 2 ms increase in TP99 (99th-percentile latency).
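The head-query caching can be pictured as a bounded least-recently-used cache in front of the model. The class name `HeadQueryCache`, the `(query, doc_id)` key, and the `score_fn` callback are hypothetical; the article does not describe the actual cache design.

```python
from collections import OrderedDict

class HeadQueryCache:
    """Bounded LRU cache for relevance scores keyed by (query, doc_id)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()

    def get_or_compute(self, query, doc_id, score_fn):
        key = (query, doc_id)
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            return self._store[key]
        score = score_fn(query, doc_id)  # cache miss: run the model
        self._store[key] = score
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used
        return score
```

Because a small set of head queries accounts for a large share of traffic, even a modest cache can absorb many model calls, while long-tail queries still go through full inference.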

Results

Offline benchmarks showed accuracy improvements of up to +925 BP (basis points, i.e. 9.25 percentage points) for pairwise fine-tuning, while online A/B tests demonstrated consistent gains in CTR and NDCG across all optimization stages.

Future Work

The team plans to (1) integrate more knowledge-graph signals for long-tail queries, (2) jointly optimize relevance with intent and category prediction, and (3) deepen end-to-end fusion of BERT relevance and ranking models.
