Online Feature System: Architecture, Storage, and High‑Concurrency Techniques

Using Meituan’s hotel‑travel platform as a case study, the article details a scalable online feature system architecture that combines layered storage, efficient compression, and robust synchronization to meet extreme concurrency, throughput, terabyte‑scale data, and sub‑10 ms latency demands for AI‑driven strategy services.

Meituan Technology Team

Jul 6, 2017

In modern Internet products, AI‑driven strategy systems such as advertising, search, recommendation, routing, driver dispatch, and intelligent design rely heavily on online feature services. Each strategy system needs a large volume of online features to support model inference or rule‑based decisions, which makes the feature system a critical backbone.

The article uses Meituan’s online feature system for hotels and travel as a case study and focuses on practical technical points for high‑concurrency scenarios.

1.1 Integrated Feature System Framework

The online feature system provides a KV‑style service for feature queries and can be extended to cover feature production, scheduling, and real‑time monitoring. Its architecture consists of four parts:

Data Sources: Hive, MySQL, Kafka, etc.

Feature Production: Reads from sources and computes features using multiple frameworks.

Feature Import: Writes computed features to online storage, handling dependencies, write speed, and consistency.

Feature Service: Core component that serves features to upstream strategy systems.

The feature lifecycle can be abstracted as five steps: read → compute → write → store → retrieve.

1.2 Core of Feature System – Store and Retrieve

The service behaves like a large HashMap but must meet four challenges:

High Concurrency: Peak QPS exceeds 10 k at the strategy systems and 1 M at the database layer.

High Throughput: Thousands of dimensions per request, network I/O up to 1.5 Gbps.

Big Data: Feature rows exceed 1 billion, total size at TB level.

Low Latency: 99th‑percentile latency < 10 ms.

Solutions must balance these goals according to business priorities.

2.1 Data Layering

When feature data reaches TB scale, a single storage medium cannot satisfy all requirements. Hot data is kept in memory or cache; warm data uses distributed KV stores (Redis, Memcached, HBase, Tair); cold data relies on disk‑backed stores. Layering allows each engine to serve the same logical feature API while optimizing performance and cost.
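A minimal sketch of the layering idea, assuming three tiers that expose the same `get`/`put` interface. The `Tier` and `LayeredFeatureStore` names are hypothetical, plain dicts stand in for the real engines, and promoting a hit into the hotter tiers is an assumed policy that the article does not prescribe.

```python
from typing import Dict, List, Optional


class Tier:
    """One storage layer; a dict stands in for the real engine."""

    def __init__(self, name: str, data: Optional[Dict[str, dict]] = None):
        self.name = name
        self.data = dict(data or {})

    def get(self, key: str) -> Optional[dict]:
        return self.data.get(key)

    def put(self, key: str, value: dict) -> None:
        self.data[key] = value


class LayeredFeatureStore:
    """Queries tiers from hottest to coldest behind a single get() call."""

    def __init__(self, tiers: List[Tier]):
        self.tiers = tiers

    def get(self, key: str) -> Optional[dict]:
        for depth, tier in enumerate(self.tiers):
            value = tier.get(key)
            if value is not None:
                # Assumed policy: copy the hit into every hotter tier
                # so that the next read of this key is cheaper.
                for hotter in self.tiers[:depth]:
                    hotter.put(key, value)
                return value
        return None


hot = Tier("local-memory")
warm = Tier("distributed-kv", {"poi:1": {"ctr_7d": 0.12}})
cold = Tier("disk-store", {"poi:2": {"ctr_7d": 0.03}})
store = LayeredFeatureStore([hot, warm, cold])

first = store.get("poi:2")   # found in the cold tier, then promoted
second = store.get("poi:2")  # now answered by the hot tier
```

Callers only see `store.get`, so tiers can be added or swapped without touching the strategy code.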

2.2 Data Compression

Compression reduces memory and bandwidth consumption. Three storage formats are discussed:

JSON (feature name‑value pairs as strings).

Metadata extraction (store names separately, values as strings).

Metadata solidification (strongly typed values).

In practice, metadata extraction yields 2‑10× space savings for homogeneous feature rows.
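The three formats can be compared on a single row. This is an illustrative sketch: the feature names and values are made up, the tab‑separated encoding and the `float32` layout are assumptions, and the actual savings depend on how long the names are relative to the values.

```python
import json
import struct

# Hypothetical feature row; the values are exactly representable as float32.
row = {"ctr_7d": 0.125, "orders_30d": 42.0, "avg_price": 318.5}

# Format 1: JSON. Every row repeats both names and values as text.
as_json = json.dumps(row, separators=(",", ":")).encode("utf-8")

# Format 2: metadata extraction. Names live once in a shared schema;
# each row keeps only the values, still encoded as strings.
schema = ["ctr_7d", "orders_30d", "avg_price"]
as_strings = "\t".join(str(row[name]) for name in schema).encode("utf-8")

# Format 3: metadata solidification. The schema also fixes each type,
# so values can be packed as binary (three little-endian float32 here).
layout = "<3f"
as_typed = struct.pack(layout, *(row[name] for name in schema))


def decode_typed(blob: bytes) -> dict:
    """Rebuild the name-value mapping from the packed values and the schema."""
    return dict(zip(schema, struct.unpack(layout, blob)))


sizes = {
    "json": len(as_json),
    "extracted": len(as_strings),
    "typed": len(as_typed),
}
```

The schema has to be versioned and shared between writer and reader, which is the price paid for dropping the names from every row.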

Byte‑level compression algorithms (Gzip, Snappy, Deflate, LZ4) were benchmarked on two real‑world feature datasets (≈100 k records, 300‑400 bytes each). The results show that Snappy offers the best trade‑off: decompression fast enough for the online read path, with a reasonable compression ratio.
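A benchmark of this kind could be structured as below. This sketch covers only the two codecs available in the Python standard library (`zlib` for Deflate and `gzip`). Snappy and LZ4 need third‑party packages and would be added to the `CODECS` table as further compress/decompress pairs. The records are synthetic, so the numbers it produces say nothing about the article's datasets.

```python
import gzip
import random
import time
import zlib
from typing import Callable, Dict, List, Tuple

Codec = Tuple[Callable[[bytes], bytes], Callable[[bytes], bytes]]

CODECS: Dict[str, Codec] = {
    "deflate": (zlib.compress, zlib.decompress),
    "gzip": (gzip.compress, gzip.decompress),
}


def make_records(count: int, seed: int = 0) -> List[bytes]:
    """Synthetic rows of 300-400 bytes each; not real feature data."""
    rng = random.Random(seed)
    records = []
    for _ in range(count):
        width = rng.randint(43, 57)  # 6-char values joined by tabs
        values = [f"{rng.random():.4f}" for _ in range(width)]
        records.append("\t".join(values).encode("ascii"))
    return records


def benchmark(records: List[bytes],
              codecs: Dict[str, Codec] = CODECS) -> Dict[str, Dict[str, float]]:
    """Compress each record on its own, as a KV value would be stored."""
    raw_bytes = sum(len(r) for r in records)
    results = {}
    for name, (compress, decompress) in codecs.items():
        packed = [compress(r) for r in records]
        start = time.perf_counter()
        unpacked = [decompress(p) for p in packed]
        elapsed = time.perf_counter() - start
        assert unpacked == records  # the round trip must be lossless
        results[name] = {
            "ratio": raw_bytes / sum(len(p) for p in packed),
            "decompress_seconds": elapsed,
        }
    return results


report = benchmark(make_records(1000))
```

Decompression is timed separately because it sits on the read path of every request, while compression happens once at import time.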

2.3 Data Synchronization

Two main techniques are used to keep online copies up‑to‑date:

2.3.1 Memory‑Copy Synchronization

Full copies of feature data are kept locally. Synchronization follows a push‑pull model. Push (via message queues) improves timeliness but still needs pull for initial bulk sync. Pull protocols compute diffs based on version numbers (RedoLog) or Merkle Trees.

Version‑based sync assigns a monotonically increasing version to each change, so a replica can request every update after its last known version. To bound the size of the update log, adjacent entries are merged; the log gets shorter, but the diff a replica receives becomes less exact.
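The version‑based pull can be sketched as follows. `VersionedSource` and `Replica` are hypothetical names, and the merge policy shown (keep only the newest log entry per key) is one assumed way to shorten the log, not necessarily the one used in production. A replica still converges to the latest state but no longer sees intermediate values.

```python
from typing import Dict, List, Optional, Tuple

# (version, key, value); a value of None marks a deletion.
Change = Tuple[int, str, Optional[dict]]


def apply_change(data: Dict[str, dict], key: str, value: Optional[dict]) -> None:
    if value is None:
        data.pop(key, None)
    else:
        data[key] = value


class VersionedSource:
    def __init__(self) -> None:
        self.version = 0
        self.data: Dict[str, dict] = {}
        self.log: List[Change] = []

    def write(self, key: str, value: Optional[dict]) -> None:
        self.version += 1
        apply_change(self.data, key, value)
        self.log.append((self.version, key, value))

    def changes_since(self, version: int) -> List[Change]:
        return [change for change in self.log if change[0] > version]

    def compact(self) -> None:
        """Assumed merge policy: keep only the newest entry for each key."""
        latest: Dict[str, Change] = {}
        for change in self.log:
            latest[change[1]] = change
        self.log = sorted(latest.values(), key=lambda change: change[0])


class Replica:
    def __init__(self) -> None:
        self.version = 0
        self.data: Dict[str, dict] = {}

    def pull(self, source: VersionedSource) -> None:
        for version, key, value in source.changes_since(self.version):
            apply_change(self.data, key, value)
            self.version = version


source = VersionedSource()
source.write("poi:1", {"ctr": 0.1})
source.write("poi:2", {"ctr": 0.2})
replica = Replica()
replica.pull(source)                  # replica is now at version 2

source.write("poi:1", {"ctr": 0.3})
source.write("poi:2", None)           # deletion, kept as a tombstone
source.write("poi:1", {"ctr": 0.4})
source.compact()                      # five log entries merged into two
replica.pull(source)
```

Deletions stay in the log as tombstones so that a replica pulling after compaction still learns that the key is gone.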

Merkle‑Tree sync hashes data blocks hierarchically; mismatched hashes trigger recursive comparison until differing leaf nodes are identified, achieving O(log N) round‑trips.
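A compact sketch of the Merkle comparison, assuming keys are hashed into a fixed number of buckets that form the leaves. The bucket count, the `key=value;` leaf encoding, and the function names are illustrative choices. Both trees live in one process here; in a real protocol each level of the descent would cost one round trip between replica and source.

```python
import hashlib
from typing import Dict, List

BUCKETS = 8  # leaf count; a power of two keeps the tree complete


def bucket_of(key: str) -> int:
    return hashlib.sha256(key.encode("utf-8")).digest()[0] % BUCKETS


def build_tree(data: Dict[str, str]) -> List[bytes]:
    """Heap-style array: node i has children 2i+1 and 2i+2, leaves come last."""
    leaves = [hashlib.sha256() for _ in range(BUCKETS)]
    for key in sorted(data):  # sorted so that equal data gives equal hashes
        leaves[bucket_of(key)].update(f"{key}={data[key]};".encode("utf-8"))
    tree = [b""] * (BUCKETS - 1) + [leaf.digest() for leaf in leaves]
    for i in range(BUCKETS - 2, -1, -1):
        tree[i] = hashlib.sha256(tree[2 * i + 1] + tree[2 * i + 2]).digest()
    return tree


def diff_buckets(local: List[bytes], remote: List[bytes], node: int = 0) -> List[int]:
    """Descend only into subtrees whose hashes differ; return stale bucket ids."""
    if local[node] == remote[node]:
        return []
    first_leaf = BUCKETS - 1
    if node >= first_leaf:
        return [node - first_leaf]
    return (diff_buckets(local, remote, 2 * node + 1)
            + diff_buckets(local, remote, 2 * node + 2))


source_data = {f"poi:{i}": str(i) for i in range(100)}
replica_data = dict(source_data)
replica_data["poi:42"] = "stale"

stale = diff_buckets(build_tree(replica_data), build_tree(source_data))
```

Only the buckets listed in `stale` need to be transferred, so a replica that differs in one key re‑fetches one bucket rather than the full dataset.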

2.3.2 Client‑Side Cache

When data cannot fit entirely in memory, client caches store hot subsets. A unified cache interface separates caching logic from business code, supporting both synchronous and asynchronous APIs.
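One way such a unified interface might look is sketched below. The `FeatureCache` class, the LRU eviction policy, and the dict used as the backing store are all assumptions made for illustration. The asynchronous entry point simply runs the synchronous path in a worker thread.

```python
import asyncio
import threading
from collections import OrderedDict
from typing import Callable, Optional


class FeatureCache:
    """LRU cache in front of a loader, with sync and async entry points."""

    def __init__(self, loader: Callable[[str], Optional[dict]], capacity: int):
        self._loader = loader      # fetches from the remote store on a miss
        self._capacity = capacity
        self._entries: "OrderedDict[str, dict]" = OrderedDict()
        self._lock = threading.Lock()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[dict]:
        # One coarse lock keeps the sketch simple; it also serializes loads.
        with self._lock:
            if key in self._entries:
                self._entries.move_to_end(key)  # mark as most recently used
                self.hits += 1
                return self._entries[key]
            self.misses += 1
            value = self._loader(key)
            if value is not None:
                self._entries[key] = value
                if len(self._entries) > self._capacity:
                    self._entries.popitem(last=False)  # evict the LRU entry
            return value

    async def get_async(self, key: str) -> Optional[dict]:
        # Run the blocking path in the default executor so the event loop stays free.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.get, key)


backend = {f"poi:{i}": {"ctr_7d": i / 10} for i in range(1, 4)}
cache = FeatureCache(backend.get, capacity=2)

cache.get("poi:1")   # miss
cache.get("poi:2")   # miss
cache.get("poi:1")   # hit, poi:2 becomes the least recently used entry
cache.get("poi:3")   # miss, evicts poi:2
via_async = asyncio.run(cache.get_async("poi:1"))  # hit through the async API
```

Business code depends only on `get` and `get_async`, so the eviction policy or the backing store can change without touching callers.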

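Cache eviction can interact badly with JVM garbage collection: cached entries that live long enough to be promoted to the old generation become garbage when they are evicted, which under high QPS leads to frequent Full GCs. Mitigations include off‑heap storage (e.g., Ehcache, BigMemory) and custom JVM tuning.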

Conclusion

The article presents practical techniques for building a scalable online feature system that satisfies high concurrency, high throughput, big data, and low latency requirements. It outlines system architecture, data layering, compression choices, and synchronization mechanisms, providing a reference for architects facing similar challenges.

