Design and Implementation of an SSD‑Based Application‑Layer Cache Architecture for Kafka in Meituan Data Platform
Meituan built an SSD‑based application‑layer cache for Kafka that bypasses PageCache contention between real‑time and delayed jobs, classifies log segments across SSD and HDD, limits flush rates, and achieves up to 80% latency reduction while guaranteeing stable real‑time consumption.
Kafka plays a critical role in Meituan's data platform as a unified data cache and distribution system. Because many real‑time jobs share the same PageCache, competition for PageCache resources can cause latency spikes for real‑time workloads when delayed jobs trigger unexpected disk reads.
The article introduces a self‑developed SSD‑based application‑layer cache architecture for Kafka that eliminates PageCache contention and improves service quality for real‑time consumption.
Current Kafka Situation
Meituan operates a massive Kafka deployment (6000+ nodes, 100+ clusters, >6 × 10⁴ topics, >4 × 10⁵ partitions) handling up to 8 × 10¹³ messages per day with peak throughput of 1.8 × 10⁸ msgs/s. Real‑time jobs and delayed jobs compete for PageCache, leading to degraded latency and throughput.
Pain‑Point Analysis & Goals
Real‑time and delayed jobs compete for PageCache, causing unexpected disk reads for real‑time jobs.
HDD performance drops sharply under high read concurrency.
About 20% of jobs experience delayed consumption.
Goal: guarantee that real‑time consumption is not affected by delayed consumption and provide stable QoS.
Why SSD?
Two directions were considered: (1) eliminate PageCache competition (e.g., avoid writing delayed data back to PageCache) and (2) introduce a faster device between HDD and memory. The second direction was chosen because PageCache policies are hard to modify and memory is costly. SSD offers orders‑of‑magnitude higher IOPS and bandwidth, making it suitable as an intermediate cache.
Architecture Decision
Two candidate solutions were evaluated:
Kernel‑level caching (FlashCache, OpenCAS, etc.) – limited by kernel version and not fully aligned with Kafka’s access patterns.
Application‑layer caching – integrates directly with Kafka’s read/write semantics.
The application‑layer approach was selected.
New Architecture Overview
LogSegments are classified by time into three states:
OnlyCache : stored exclusively on SSD.
Cached : synchronized from HDD to SSD (inactive segments).
WithoutCache : resides only on HDD.
Background threads periodically sync inactive segments to SSD, move aged segments back to HDD, and enforce space thresholds.
Read path: if a requested offset belongs to a Cached or OnlyCache segment, data is served from SSD; otherwise it falls back to HDD. Write path still writes to PageCache first, then flushes to SSD when thresholds are met.
Key Optimizations
LogSegment Synchronization
Two aspects are tuned: synchronization method (active vs. inactive) and rate‑limiting to avoid overwhelming HDD or SSD during bulk transfers.
Append Flush Strategy
Original Kafka flushes based on message count (e.g., 50 000 msgs). The new design limits flush rate to 2 MB/s per segment, smoothing I/O spikes. The flush call is performed via fileChannel.force.
Evaluation
Four clusters were built: new SSD‑cache architecture, plain HDD, FlashCache, and OpenCAS (each with three nodes). Tests measured write/read latency, HDD/SSD hit rates, and SLA impact on real‑time jobs under delayed‑consumption scenarios.
Results:
Before flush‑rate optimization, the SSD‑cache cluster consistently outperformed other solutions.
After optimization, advantages remained significant under high write loads (>170 MB/s).
Delayed jobs never impacted real‑time jobs in the SSD‑cache cluster.
Conclusion & Future Work
The SSD‑based application‑layer cache reduces read/write latency by up to 80% and isolates real‑time consumption from delayed consumption. The design is in gray‑scale rollout and will be contributed to the Kafka community. Future work includes broader deployment to high‑priority clusters.
Authors
Shi Ji and Shi Lu, engineers in Meituan's data platform.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform. Supporting hundreds of millions of consumers, millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
