Augur: An Online Model Inference Framework and Poker Platform for Meituan Search

Meituan's AI-driven search pairs two systems. Augur is an online inference framework that offers stateless, distributed feature operators, transformers, and a DSL for rapid, high-throughput model scoring. Poker is the platform for model training, versioning, and experimentation. Together they accelerate iteration, improve performance, and enable advanced model-as-feature ensembles.

Meituan Technology Team

Over the past decade, machine learning has made significant breakthroughs in both academia and industry. Meituan has been exploring various machine learning models for its search platform, ranging from linear and tree models to deep neural networks, BERT, and DQN. The two core components enabling AI-driven search at Meituan are the model training platform Poker and the online inference framework Augur.

1. Background

Search optimization is a typical AI application problem that requires a system-level solution. After nearly ten years of technical accumulation, Meituan's search architecture has evolved from a traditional retrieval engine to an AI-powered search engine. The current architecture consists of three major subsystems: a search data platform, an online retrieval & cloud search platform, and an AI service & experimentation platform. Within the AI platform, Poker and Augur address the end-to-end workflow from offline model training to online serving, dramatically improving iteration speed, inference performance, and ranking stability.

2. What is model inference?

From an engineering perspective, a model can be seen as a function f(x₁, x₂, …) = y. Inference consists of two simple steps: (1) feature extraction (identifying the inputs x₁, x₂, …) and (2) applying the model function to compute the output score y. While real-world models may output vectors or matrices, the core idea remains a straightforward computation.
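The two steps can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the function names (`extract_features`, `linear_model`, `infer`), the feature names, and the weights are not part of Augur.

```python
from typing import Callable, Dict, List

# Hypothetical names throughout; this is an illustration, not Augur's API.

def extract_features(raw: Dict[str, float], names: List[str]) -> List[float]:
    """Step 1: turn raw data into the ordered inputs x1, x2, ..."""
    return [float(raw.get(name, 0.0)) for name in names]

def linear_model(weights: List[float], bias: float) -> Callable[[List[float]], float]:
    """Build a toy model function f(x1, x2, ...) = y."""
    def f(xs: List[float]) -> float:
        return sum(w * x for w, x in zip(weights, xs)) + bias
    return f

def infer(raw: Dict[str, float], names: List[str],
          model: Callable[[List[float]], float]) -> float:
    """Step 2: apply the model function to the extracted features."""
    return model(extract_features(raw, names))

model = linear_model(weights=[0.5, 2.0], bias=1.0)
score = infer({"distance_km": 4.0, "rating": 3.0}, ["distance_km", "rating"], model)
print(score)
```

Keeping extraction and scoring as separate functions mirrors the split the text describes: the same `infer` wrapper works for any model function and any feature list.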

3. Changes in the inference framework

The legacy inference framework suffered from three main limitations: performance bottlenecks when handling thousands of documents, difficulty reusing inference logic across many business scenarios, and lack of a management platform for rapid feature and model iteration. Augur was designed to overcome these issues by decoupling business logic from inference, providing a stateless, distributed architecture capable of handling large-scale deep-model inference.

4. Building the inference platform

Augur introduces two key abstractions:

Operator (OP): generic feature processing logic, divided into IO OPs (e.g., fetching raw data from KV stores) and Calc OPs (e.g., parsing JSON, performing arithmetic).

Transformer: model-specific feature transformations such as bucketing or low-frequency filtering.

These abstractions enable asynchronous I/O, automatic RPC aggregation, and parallel computation, greatly improving performance. Augur also provides a weakly typed, human-readable feature expression language (DSL) built on Bison & JFlex. An example of the DSL is shown below:

// IO Feature: binaryBusinessTime; ReadKV is an IO OP
ReadKV('mtptpoionlinefeatureexp','_id',_id,'ba_search.platform_poi_business_hour_new.binarybusinesstime','STRING')
// FeatureA: CtxDateInfo; ParseJSON is a Calc OP
ParseJSON(_ctx['dateInfo']);
// FeatureB: isTodayWeekend uses the parsed date to check weekend
IsWeekend(CtxDateInfo['date'])
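To make the IO OP / Calc OP split and the idea of RPC aggregation more concrete, here is a rough Python sketch that mirrors the three features above. It is not Augur's implementation: the in-memory `FAKE_KV` store, the `read_kv_batch` function, the `rpc_calls` counter, and the sample values are all assumptions made for illustration.

```python
import asyncio
import json
from datetime import date
from typing import Dict, List

# Stand-in for a remote KV store; a real IO OP would issue network calls.
FAKE_KV: Dict[str, str] = {"poi_1": "0110", "poi_2": "1111"}
rpc_calls = 0  # counts simulated round trips

async def read_kv_batch(keys: List[str]) -> Dict[str, str]:
    """IO OP (hypothetical): fetch many keys in one simulated RPC."""
    global rpc_calls
    rpc_calls += 1
    await asyncio.sleep(0)  # yield control, as an async network call would
    return {k: FAKE_KV.get(k, "") for k in keys}

def parse_json(text: str) -> dict:
    """Calc OP: pure computation, no I/O."""
    return json.loads(text)

def is_weekend(iso_date: str) -> bool:
    """Calc OP: weekday() returns 5 for Saturday and 6 for Sunday."""
    return date.fromisoformat(iso_date).weekday() >= 5

async def load_features(doc_ids: List[str], ctx: Dict[str, str]) -> Dict[str, dict]:
    # Aggregate all per-document reads into a single batched call.
    business_time = await read_kv_batch(doc_ids)
    # Request-level values are computed once and shared by every document.
    ctx_date_info = parse_json(ctx["dateInfo"])
    weekend = is_weekend(ctx_date_info["date"])
    return {
        d: {"binaryBusinessTime": business_time[d], "isTodayWeekend": weekend}
        for d in doc_ids
    }

features = asyncio.run(
    load_features(["poi_1", "poi_2"], {"dateInfo": '{"date": "2024-06-01"}'})
)
print(features, rpc_calls)
```

The point of the sketch is that two documents need only one simulated round trip, while the Calc OPs stay free of I/O and can be reused in any expression.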

Features can also be expressed via a JSON configuration that combines a feature name with a list of Transformers. A representative JSON snippet is:

{
  "tf_input_config": {"otherconfig"},
  "tf_input_name": "modulestyle",
  "name": "moduleStyle",
  "transforms": [
    {
      "name": "BUCKETIZE",
      "params": {"bins": [0,1,2,3,4]}
    }
  ],
  "default_value": -1
}
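One way such a configuration could be interpreted is sketched below in Python. The bucketing rule is an assumption (a value maps to the number of bin edges less than or equal to it, computed with `bisect_right`); the source does not specify Augur's exact semantics. The `TRANSFORMS` registry and `apply_feature_config` are hypothetical names, and the placeholder `tf_input_config` entry is omitted.

```python
from bisect import bisect_right
from typing import Any, Callable, Dict

# Hypothetical registry mapping transformer names to functions.
# Assumed BUCKETIZE rule: bucket index = number of bin edges <= value.
TRANSFORMS: Dict[str, Callable[..., Any]] = {
    "BUCKETIZE": lambda value, bins: bisect_right(bins, value),
}

def apply_feature_config(config: Dict[str, Any], raw: Dict[str, Any]) -> Any:
    """Look up the raw feature, then run each configured transformer in order."""
    value = raw.get(config["name"])
    if value is None:
        # A missing feature falls back to the configured default.
        return config["default_value"]
    for step in config["transforms"]:
        value = TRANSFORMS[step["name"]](value, **step["params"])
    return value

config = {
    "tf_input_name": "modulestyle",
    "name": "moduleStyle",
    "transforms": [{"name": "BUCKETIZE", "params": {"bins": [0, 1, 2, 3, 4]}}],
    "default_value": -1,
}

print(apply_feature_config(config, {"moduleStyle": 2.5}))
print(apply_feature_config(config, {}))
```

Because transformers are looked up by name, adding a new one in this sketch only means registering another function, which is in the spirit of configuring features without changing inference code.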

Performance benchmarks show that parsing a two‑level expression 100,000 times costs only about 1.6 ms, which is negligible compared to the overall inference latency.

_I(_I('xxx'))
Benchmark              Mode  Cnt  Score   Error  Units
AbsBenchmarkTest.test  avgt   25  1.644 ± 0.009 ms/op

Augur's inference pipeline distinguishes between ContextLevel features (global to a request) and DocLevel features (specific to each document). The system shards document lists, performs feature loading and scoring in parallel, and aggregates results, achieving a capacity improvement of more than 100% on a modest machine with 16 CPU cores and 16 GB of memory.
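The shard, score in parallel, and aggregate flow might look roughly like the following Python sketch. It only illustrates the structure: the shard size, the toy scoring formula, and the names `shard`, `score_shard`, and `score_request` are assumptions, and Python threads here stand in for whatever parallelism the real system uses.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

def shard(items: List[str], shard_size: int) -> List[List[str]]:
    """Split the document list into fixed-size shards."""
    return [items[i:i + shard_size] for i in range(0, len(items), shard_size)]

def score_shard(doc_ids: List[str],
                ctx_features: Dict[str, float],
                doc_store: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Load DocLevel features for one shard and score each document."""
    scores = {}
    for doc_id in doc_ids:
        doc_features = doc_store[doc_id]  # per-document feature lookup
        # Toy scoring formula; context features are shared across documents.
        scores[doc_id] = ctx_features["is_weekend"] * 0.5 + doc_features["rating"]
    return scores

def score_request(doc_ids: List[str],
                  ctx_features: Dict[str, float],
                  doc_store: Dict[str, Dict[str, float]],
                  shard_size: int = 2,
                  workers: int = 4) -> Dict[str, float]:
    """Score shards concurrently, then merge the per-shard results."""
    merged: Dict[str, float] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(score_shard, s, ctx_features, doc_store)
            for s in shard(doc_ids, shard_size)
        ]
        for future in futures:
            merged.update(future.result())
    return merged

doc_store = {"d1": {"rating": 4.0}, "d2": {"rating": 3.0}, "d3": {"rating": 5.0}}
result = score_request(["d1", "d2", "d3"], {"is_weekend": 1.0}, doc_store)
print(result)
```

Computing ContextLevel features once and passing them to every shard avoids repeating request-wide work per document, which is the reason for separating the two feature levels.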

5. Platform construction (Poker)

To maximize the value of Augur, Meituan built the Poker platform, a product-grade management system for models, features, and experiments. Poker offers capabilities such as feature creation, testing, online rollout, version control, risk alerts, and model configuration (including gray-release, validation, and debugging). It also provides rich APIs for model serving, feature extraction, and monitoring.

6. Advanced usage

Augur supports "Model as a Feature", where the output of one model becomes an input feature for another model (e.g., BERT-as-Feature). Two specialized OPs enable efficient implementation of this pattern: LocalModelFeature for same-dimension stacking and RemoteModelFeature for cross-dimension stacking. Augur also facilitates online model ensembles by allowing multiple models to score a document and combining their scores via configurable weights.
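Both patterns can be sketched in a few lines of Python. This is a conceptual illustration only: it does not model the LocalModelFeature or RemoteModelFeature OPs, and the helper names (`with_model_feature`, `weighted_ensemble`), the toy models, and the weights are hypothetical.

```python
from typing import Callable, Dict, List

Model = Callable[[Dict[str, float]], float]

def with_model_feature(base: Model, feature_name: str, downstream: Model) -> Model:
    """Stack two models: base's score becomes a feature for downstream."""
    def stacked(features: Dict[str, float]) -> float:
        enriched = dict(features)  # copy so the caller's dict is not mutated
        enriched[feature_name] = base(features)
        return downstream(enriched)
    return stacked

def weighted_ensemble(models: List[Model], weights: List[float]) -> Model:
    """Combine several model scores with configurable weights."""
    if len(models) != len(weights):
        raise ValueError("models and weights must have the same length")
    def combined(features: Dict[str, float]) -> float:
        return sum(w * m(features) for m, w in zip(models, weights))
    return combined

def relevance(features: Dict[str, float]) -> float:
    """Toy first-stage model."""
    return 2.0 * features["text_match"]

def ranker(features: Dict[str, float]) -> float:
    """Toy second-stage model that consumes the first model's score."""
    return features["relevance_score"] + features["rating"]

stacked = with_model_feature(relevance, "relevance_score", ranker)
ensemble = weighted_ensemble([stacked, relevance], [0.5, 0.5])

doc = {"text_match": 1.5, "rating": 4.0}
print(stacked(doc), ensemble(doc))
```

Treating every model as a plain function from features to a score is what lets stacking and ensembling compose freely in this sketch.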

7. Future outlook

The team plans to extend Augur to higher-throughput ranking stages, further integrate it with the Poker experimentation platform, and continue improving performance, scalability, and usability.

For readers interested in trying Augur, recruitment information is provided at the end of the original article.

