How BMQ’s Cloud‑Native Compute‑Storage Separation Revolutionizes Message Queues
This article explains how ByteDance’s BMQ, a cloud‑native message engine with a compute‑storage separated architecture, overcomes Kafka’s scalability and operational limits by using Proxy, Broker, Coordinator, and Controller modules, a distributed storage model, and advanced caching to achieve rapid scaling, high throughput, and resilient operations.
BMQ Architecture Overview
As internal business grew rapidly, the limitations of classic Kafka in elasticity, scale, cost, and operations became apparent, prompting ByteDance's messaging team to develop BMQ, a cloud-native message engine with compute-storage separation. BMQ consists of four core modules: Proxy, Broker, Coordinator, and Controller.
Module Functions
Proxy: receives all client requests; forwards production requests to the appropriate Broker, forwards consumer-related requests (e.g., commit offset, join group) to the Coordinator, and handles read requests directly.
Broker: handles write requests; all other request types are served by Proxy or the Coordinator.
Coordinator: runs as a process separate from the Broker, fully isolating consumer coordination from read/write traffic, and can be scaled independently.
Controller: manages heartbeats, load balancing, fault detection, and control commands. Because data resides in a distributed storage system, BMQ does not need to manage replicas, which lets the Controller focus on cluster-wide traffic balancing and fault detection.
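The division of labor among the modules can be sketched as a simple routing table. This is an illustrative sketch only: the request names and the `route` function are hypothetical, not BMQ's actual API.

```python
from enum import Enum, auto

class RequestType(Enum):
    PRODUCE = auto()        # write a batch of messages
    FETCH = auto()          # read messages
    COMMIT_OFFSET = auto()  # consumer progress tracking
    JOIN_GROUP = auto()     # consumer group membership

def route(request_type: RequestType) -> str:
    """Return which BMQ module handles a given client request."""
    if request_type is RequestType.PRODUCE:
        return "broker"          # writes are forwarded to a Broker
    if request_type in (RequestType.COMMIT_OFFSET, RequestType.JOIN_GROUP):
        return "coordinator"     # consumer coordination is isolated
    return "proxy"               # reads are served by Proxy directly

print(route(RequestType.PRODUCE))  # broker
```

Because each request type lands in exactly one module, write traffic, read traffic, and coordination load can each be scaled by adding instances of just that module.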
Advantages of the Layered Architecture
All user requests enter through Proxy, so BMQ's metadata lists the Proxy nodes as the 'Brokers'. Clients send production and consumption requests to Proxy, which processes or forwards them. This design enables advanced fault tolerance, such as back-off retries during Broker restarts, monitoring of Proxy errors, dynamic fault diagnosis, and automatic isolation of faulty nodes.
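The back-off retry behavior mentioned above can be sketched as a small client-side helper. This is a generic pattern, assuming a `send` callable that raises `ConnectionError` while a Broker restarts; the names and parameters are illustrative, not BMQ's.

```python
import random
import time

def send_with_backoff(send, max_retries=5, base_delay=0.05):
    """Retry a produce call with capped exponential backoff plus jitter,
    as a client might do while a Broker behind the Proxy restarts."""
    for attempt in range(max_retries):
        try:
            return send()
        except ConnectionError:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            delay = min(base_delay * 2 ** attempt, 1.0)
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids retry storms
```

Jitter matters here: if every client retried on the same schedule during a restart, the recovering Broker would be hit by synchronized waves of requests.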
Data Storage Model
In BMQ, each partition is split into segments stored across a distributed storage pool rather than a single local disk. This pool‑based model mitigates hotspot issues, as segments are evenly distributed across many disks, reducing contention during data back‑tracking and improving overall throughput.
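A minimal sketch of pool-based placement: hash each (partition, segment) pair to a storage node so consecutive segments of one partition land on different disks. The placement function and node names are assumptions for illustration; real replication is handled by the underlying storage layer and is omitted here.

```python
import zlib

def place_segments(partition: str, num_segments: int, nodes: list) -> dict:
    """Map each segment of a partition to a storage node in the pool.

    Deterministic CRC32 hashing spreads segments across the pool, so a
    back-tracking consumer reading many segments fans out over many disks
    instead of hammering the single disk that holds the whole partition.
    """
    placement = {}
    for seg in range(num_segments):
        key = f"{partition}/{seg}".encode()
        placement[seg] = nodes[zlib.crc32(key) % len(nodes)]
    return placement

pool = [f"node{i}" for i in range(4)]
print(place_segments("topic-a/0", 8, pool))
```

Contrast this with a local-disk model, where all segments of a partition live on one broker's disk and a historical read competes directly with live writes on that disk.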
Operations and Fault Impact
The distributed storage model simplifies scaling operations: adding storage nodes provides immediate read/write capacity without data copying, and node replacement or pool shrinkage has minimal impact because I/O is spread across many nodes. Even a fault affecting two storage nodes does not block new writes, since fresh segments can be allocated on healthy nodes; only historical segments residing on the failed nodes may be temporarily unavailable.
Challenges of the Layered Architecture
Access latency increases because data resides in a remote storage system, and the system must handle metadata pressure on the storage cluster. BMQ addresses these challenges in both production and consumption paths.
Production Path
Data writes are handled by Brokers, with the Controller assigning partitions based on load. Each partition recovers from its latest checkpoint, creates new segments, and flushes data to the distributed storage via an Inflight Buffer. If a flush times out, BMQ creates a new segment and records only the successfully flushed data, avoiding duplicate reads; this is the "Failover" mechanism.
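The Failover idea can be sketched as follows, assuming a `flush` callable that returns `False` on timeout. Class and attribute names are hypothetical; BMQ's real write path differs in detail.

```python
class SegmentWriter:
    """Sketch of flush-timeout failover: seal the current segment at the
    last durably flushed offset and continue in a fresh segment, so no
    partially flushed tail is ever exposed to readers."""

    def __init__(self, flush):
        self.flush = flush        # callable: (segment_id, batch) -> bool
        self.segment_id = 0
        self.flushed_offset = 0   # records known durable in storage
        self.inflight = []        # the Inflight Buffer

    def append(self, record):
        self.inflight.append(record)

    def try_flush(self) -> bool:
        if self.flush(self.segment_id, self.inflight):
            self.flushed_offset += len(self.inflight)
            self.inflight.clear()
            return True
        # Flush timed out: abandon this segment's unacknowledged tail and
        # retry the buffered records in a brand-new segment.
        self.segment_id += 1
        return False
```

Because the sealed segment's metadata records only the acknowledged offset, a consumer replaying history never sees the same record twice, even though the retried batch was written into two physical segments.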
The underlying storage system, provided on Volcano Engine, is a high-performance distributed file service implemented in C++, supporting up to 50,000 write QPS and 150,000 read QPS, with p99 write latency around 10 ms and p99 read latency in the sub-millisecond range.
Consumption Path
To avoid overwhelming the storage metadata service, Proxy employs two caches:
Message Cache: stores recent message data in memory, allowing multiple consumer groups to share a single storage read.
File Cache: caches file handles for segments, reducing metadata requests by up to 70% for sequential consumption patterns.
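The file-handle cache can be sketched as a small LRU keyed by segment ID. This is a minimal illustration, assuming each cache miss costs one round trip to the storage metadata service; the class and the `metadata_lookups` counter are hypothetical.

```python
from collections import OrderedDict

class FileHandleCache:
    """LRU cache of open segment handles: sequential consumers reuse the
    cached handle instead of asking the storage metadata service again."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self.handles = OrderedDict()
        self.metadata_lookups = 0  # proxy for load on the metadata service

    def get(self, segment_id):
        if segment_id in self.handles:
            self.handles.move_to_end(segment_id)  # mark as recently used
            return self.handles[segment_id]
        self.metadata_lookups += 1                # miss: hit metadata service
        handle = f"handle:{segment_id}"           # stand-in for a real open()
        self.handles[segment_id] = handle
        if len(self.handles) > self.capacity:
            self.handles.popitem(last=False)      # evict least recently used
        return handle
```

LRU fits the workload: a sequential consumer reads one segment many times before moving to the next, so nearly every read after the first per segment is a cache hit.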
Additional mechanisms detect slow storage nodes and switch reads to healthier nodes, leveraging NVMe‑tiered storage for low‑latency consumption.
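Slow-node avoidance can be approximated with a per-node latency estimate that steers reads toward the currently fastest replica. The class below is an illustrative sketch using an exponentially weighted moving average; it is not BMQ's actual selection algorithm.

```python
class ReplicaSelector:
    """Track a moving average of read latency per storage node and route
    reads to the currently fastest node, so a degraded node is skipped."""

    def __init__(self, nodes, alpha=0.3):
        self.alpha = alpha                       # weight of the newest sample
        self.latency = {n: 0.0 for n in nodes}   # 0.0 means "not yet measured"

    def record(self, node, latency_ms):
        old = self.latency[node]
        if old == 0.0:
            self.latency[node] = latency_ms      # first sample taken as-is
        else:
            self.latency[node] = (1 - self.alpha) * old + self.alpha * latency_ms

    def pick(self):
        """Choose the node with the lowest estimated latency."""
        return min(self.latency, key=self.latency.get)
```

The moving average smooths out one-off spikes while still reacting within a few samples when a node genuinely degrades, which is the behavior the consumption path needs.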
Summary and Outlook
The layered architecture delivers significant performance gains and operational benefits for BMQ, supporting terabytes‑per‑second ingress and peak throughput of hundreds of GB/s. Future work includes merging Proxy and Broker for lower deployment cost and latency, enhancing automated fault detection, and improving elastic scaling to further reduce costs while maintaining high tenant throughput.
BMQ, as a fully managed cloud‑native service, enables dynamic scaling and unified stream‑batch processing, serving as a central “nervous system” for real‑time log collection, data aggregation, and offline analytics.