A General Feature Production Framework for Meituan Delivery Ranking System

This article presents a generic feature-production framework for Meituan's food-delivery ranking system. It abstracts statistical feature generation, storage, retrieval, and online loading into configurable dimensions, metrics, and operators, so developers can add new features with minimal code and iterate on machine-learning models faster.

Meituan Technology Team

Dec 9, 2016

The article introduces a generic feature production framework designed to improve the iteration efficiency of Meituan's food delivery ranking system, which is driven by machine‑learning models (GBDT). Feature engineering is identified as the bottleneck for rapid model updates.

Feature statistics are the foundation: offline pipelines compute statistical features (e.g., merchant sales, user category preferences) from raw exposure, click, and order tables stored in Hive. The framework reduces every such statistic to three elements: a statistical object (for example a user), a statistical dimension (for example a food category), and a metric (for example an order count). Each element is configurable, and metrics can be weighted with time decay so that recent behavior counts more than older behavior.
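The object / dimension / metric pattern with time decay can be illustrated with a minimal Python sketch. This is not the framework's Spark implementation: the function name `decayed_stat`, the `days_ago` column, and the exponential form of the decay are assumptions made for illustration, since the article does not specify the decay formula.

```python
from collections import defaultdict


def decayed_stat(rows, object_key, dimension_key, metric_key, decay=0.9):
    """Aggregate a metric per (object, dimension) with time-decay weighting.

    Assumption: a row that is d days old contributes metric * decay**d.
    The exponential form is illustrative, not taken from the article.
    """
    totals = defaultdict(float)
    for row in rows:
        key = (row[object_key], row[dimension_key])
        totals[key] += row[metric_key] * decay ** row["days_ago"]
    return dict(totals)


# Hypothetical order log: object = user, dimension = category, metric = orders.
rows = [
    {"user_id": "u1", "category": "noodles", "orders": 2, "days_ago": 0},
    {"user_id": "u1", "category": "noodles", "orders": 4, "days_ago": 1},
    {"user_id": "u1", "category": "pizza", "orders": 1, "days_ago": 2},
]

stats = decayed_stat(rows, "user_id", "category", "orders", decay=0.5)
```

With `decay=0.5`, yesterday's four noodle orders count as two, so the user's noodle preference is 2 + 2 = 4.0, while the two-day-old pizza order counts as 0.25.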

Four main steps constitute the feature production pipeline:

1. Feature statistics (ETL or Spark-based aggregation).

2. Feature push (mapping Hive rows to Domain objects, serializing them, and storing them in a KV store).

3. Feature fetch (the online service retrieves serialized data from the KV store and deserializes it).

4. Feature load (online feature operators derive high-level features from the raw ones).

The Spark-based statistical engine supports custom dimension and metric operators, aggregation functions (sum, average, concat, ratio, quantile), and time-decay weighting. TOML configuration files define the objects, dimensions, metrics, and operators, so a new feature can be added with a few configuration entries and minimal code.
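To make the configuration-driven idea concrete, here is a small sketch in plain Python rather than Spark. The `CONFIG` dictionary stands in for what a TOML file might contain; its keys, the `AGGREGATORS` table, and the `run` function are all hypothetical names, and only three of the aggregation functions listed above are implemented.

```python
from collections import defaultdict

# Aggregation functions named in the article; these implementations are
# illustrative. Ratio and quantile are omitted to keep the sketch short.
AGGREGATORS = {
    "sum": lambda values: sum(values),
    "average": lambda values: sum(values) / len(values),
    "concat": lambda values: ",".join(str(v) for v in values),
}

# Hypothetical configuration, mirroring what a TOML file could declare.
CONFIG = {
    "object": "merchant_id",
    "dimension": "meal_period",
    "features": [
        {"name": "order_sum", "metric": "orders", "agg": "sum"},
        {"name": "price_avg", "metric": "price", "agg": "average"},
    ],
}


def run(config, rows):
    """Group rows by (object, dimension) and apply each configured aggregation."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[config["object"]], row[config["dimension"]])].append(row)
    result = {}
    for key, members in groups.items():
        result[key] = {
            feature["name"]: AGGREGATORS[feature["agg"]](
                [member[feature["metric"]] for member in members]
            )
            for feature in config["features"]
        }
    return result


rows = [
    {"merchant_id": "m1", "meal_period": "lunch", "orders": 3, "price": 20},
    {"merchant_id": "m1", "meal_period": "lunch", "orders": 5, "price": 30},
    {"merchant_id": "m1", "meal_period": "dinner", "orders": 2, "price": 40},
]

result = run(CONFIG, rows)
```

Adding a feature in this sketch means appending one entry to `CONFIG["features"]`; the aggregation code itself does not change, which is the property the article attributes to the framework.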

For feature synchronization, an ORM layer maps Hive rows to Java Domain objects, which are then serialized (JSON or Protostuff) and stored in KV with a prefixed key. A unified KvService handles serialization, deserialization, and KV read/write.
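The push and fetch steps can be sketched as follows. The article describes Java Domain objects; this Python version uses a dataclass as a stand-in, an in-memory dictionary in place of the real KV store, and JSON as the serialization format. `MerchantStats`, `row_to_domain`, the key format, and the `KvService` method names are assumptions, not the framework's actual API.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class MerchantStats:
    """Hypothetical Domain object holding offline features for one merchant."""
    merchant_id: str
    orders_7d: int
    click_rate: float


def row_to_domain(domain_cls, row):
    """ORM-style mapping: copy the columns whose names match Domain fields."""
    names = [field.name for field in fields(domain_cls)]
    return domain_cls(**{name: row[name] for name in names})


class KvService:
    """Handles key prefixing, (de)serialization, and KV read/write in one place."""

    def __init__(self, store=None):
        # A plain dict stands in for the real KV store.
        self._store = {} if store is None else store

    @staticmethod
    def _key(domain_cls, object_id):
        # Assumed key format: the Domain class name acts as the prefix.
        return f"{domain_cls.__name__}:{object_id}"

    def put(self, object_id, domain_obj):
        key = self._key(type(domain_obj), object_id)
        self._store[key] = json.dumps(asdict(domain_obj))

    def get(self, domain_cls, object_id):
        raw = self._store.get(self._key(domain_cls, object_id))
        return None if raw is None else domain_cls(**json.loads(raw))


# Offline side: a Hive-like row (with an extra partition column) is pushed.
svc = KvService()
row = {"merchant_id": "m1", "orders_7d": 42, "click_rate": 0.125, "dt": "2016-12-01"}
svc.put("m1", row_to_domain(MerchantStats, row))

# Online side: the same Domain class is used to fetch and deserialize.
fetched = svc.get(MerchantStats, "m1")
```

Because both sides share the Domain class, adding a new offline feature amounts to adding a field to the class, which matches the article's claim that a Domain definition plus configuration is enough.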

Online feature loading uses FeatureOperator annotations to declare required offline features (@Fetchers) and produced online features (@Features). A DataPortal caches fetched Domain objects, and a mapping Model → Feature → FeatureOperator → DataFetcher ensures efficient data reuse across multiple models.
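The loading step can be sketched in Python, with a decorator standing in for the Java @Fetchers and @Features annotations. Everything below (the `feature_operator` decorator, the `DataPortal` methods, the operator functions, and the sample numbers) is a hypothetical illustration of the caching and resolution idea, not the framework's real interface.

```python
class DataPortal:
    """Caches fetched data so each fetcher runs at most once per request."""

    def __init__(self, fetchers):
        self._fetchers = fetchers  # name -> zero-argument callable
        self._cache = {}
        self.fetch_count = 0

    def get(self, name):
        if name not in self._cache:
            self._cache[name] = self._fetchers[name]()
            self.fetch_count += 1
        return self._cache[name]


OPERATORS = []


def feature_operator(fetchers, features):
    """Python stand-in for declaring @Fetchers (inputs) and @Features (outputs)."""
    def register(func):
        func.fetchers = tuple(fetchers)
        func.features = tuple(features)
        OPERATORS.append(func)
        return func
    return register


@feature_operator(fetchers=["merchant_stats"], features=["order_per_click"])
def order_per_click(portal):
    stats = portal.get("merchant_stats")
    return {"order_per_click": stats["orders"] / stats["clicks"]}


@feature_operator(fetchers=["merchant_stats"], features=["is_popular"])
def is_popular(portal):
    return {"is_popular": portal.get("merchant_stats")["orders"] >= 100}


def load_features(model_features, portal):
    """Resolve Model -> Feature -> FeatureOperator -> DataFetcher."""
    values = {}
    for operator in OPERATORS:
        # Run only the operators that produce a feature the model needs.
        if set(operator.features) & set(model_features):
            values.update(operator(portal))
    return {name: values[name] for name in model_features}


# Hypothetical fetcher returning already-deserialized offline data.
portal = DataPortal({"merchant_stats": lambda: {"orders": 120, "clicks": 480}})
features = load_features(["order_per_click", "is_popular"], portal)
```

Both operators depend on `merchant_stats`, yet the portal fetches it only once; the same cache would serve several models that share underlying data within a request.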

In summary, by abstracting each stage of feature production, the proposed framework allows developers to add new features by defining a Domain class and a few configuration entries, dramatically reducing development effort and accelerating model iteration cycles.
