Design Principles of RocketMQ: Broker Architecture, Persistence, High Performance and High Availability
The article explains how RocketMQ tackles growing business traffic by introducing an asynchronous broker layer, using commitlog and consumeQueue files, page‑cache, mmap, topic/tag routing, and a nameserver to achieve high‑throughput, low‑latency, and highly available message delivery.
The Happiness Dilemma
Zhang Dapeng is both excited and worried: the business volume has surged, turning previously trivial problems—such as new‑user registration that only required an SMS—into major issues that now need push notifications, coupons, and other activities.
Each registration now calls several downstream services in sequence; at roughly 50 ms per call, four calls already add up to 200 ms, and each additional feature (e.g., sending a newcomer red-packet) adds both latency and integration effort. Zhang asks CTO Bill for a one-stop solution.
Bill quickly identifies three core problems in the current system:
Synchronous calls: the registration flow blocks on each downstream service in turn, which is the main cause of the high latency.
Tight coupling: the registration code must embed and redeploy code for every dependent module, so the whole flow fails if any secondary service fails.
Overload risk during traffic spikes: a sudden surge (e.g., a red-packet promotion) can overwhelm the long registration pipeline.
Bill suggests adding an intermediate layer—a queue—so that the registration event is placed into the queue and other modules consume it asynchronously.
This is a classic producer-consumer model. By returning as soon as the event is enqueued, the system turns synchronous calls into asynchronous ones, decouples registration from the other services, and smooths traffic spikes ("peak shaving"). Total response time drops from ~200 ms to ~55 ms, cutting latency nearly fourfold.
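The producer-consumer decoupling described above can be sketched in a few lines. This is an illustrative toy, not RocketMQ code: the event shape and worker names are assumptions, and a plain in-process queue is exactly what the next paragraph will critique.

```python
import queue
import threading

registration_events = queue.Queue()   # the intermediate buffer
handled = []

def register_user(user_id):
    """Producer: enqueue the event and return immediately instead of
    calling SMS/push/coupon services synchronously."""
    registration_events.put({"event": "USER_REGISTERED", "user_id": user_id})
    return "registered"

def notification_worker():
    """Consumer: a downstream module drains events at its own pace."""
    while True:
        event = registration_events.get()
        if event is None:              # sentinel: stop the worker
            break
        handled.append(event["user_id"])

worker = threading.Thread(target=notification_worker)
worker.start()
register_user(42)
register_user(43)
registration_events.put(None)          # shut the worker down
worker.join()
```

The registration path now only pays for the enqueue; the slow downstream work happens on the worker's thread.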
Bill then asks which queue implementation to use. Using a simple JDK Queue has three drawbacks:
Producer and consumer are tightly coupled because the queue lives in the producer’s memory.
Messages can be lost if the machine crashes, as the queue is in‑memory only.
Each consumer would need its own queue to avoid message loss, leading to duplication and high implementation complexity.
Broker
Bill and Zhang decide to design an independent message broker that sits between producers and consumers, solving the coupling problem.
The broker must satisfy:
Message persistence: write messages to disk (e.g., a file) so they survive broker crashes.
High availability: the broker must remain reachable even if a node fails.
High performance: achieve at least 100,000 TPS, which requires fast producer writes, fast disk persistence, and fast consumer reads.
Messages are appended sequentially to a commitlog file. Sequential writes avoid seek overhead, so they approach the speed of writing to memory.
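The append-only commitlog idea can be sketched as follows. The length-prefixed framing here is illustrative, not RocketMQ's actual record layout; the point is that every write lands at the current end of the file, and a message can later be fetched directly by its offset.

```python
import struct
import tempfile

def append_message(log, payload: bytes) -> int:
    """Append one length-prefixed message at the current end of the log
    and return the offset where it starts (its 'commitlog offset')."""
    offset = log.tell()
    log.write(struct.pack(">I", len(payload)))  # 4-byte big-endian length header
    log.write(payload)
    return offset

def read_message(log, offset: int) -> bytes:
    """Seek straight to a known offset and read one message back."""
    log.seek(offset)
    (length,) = struct.unpack(">I", log.read(4))
    return log.read(length)

with tempfile.TemporaryFile() as log:
    off_a = append_message(log, b"user 42 registered")
    off_b = append_message(log, b"user 43 registered")
    msg = read_message(log, off_b)
```

Because writes never seek, the disk head (or SSD write path) works at streaming speed; reads by offset stay cheap as long as some index records the offsets, which is exactly what the ConsumeQueue below provides.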
To avoid costly disk I/O on every read, the broker uses the OS page cache and mmap to map files directly into memory.
Page Cache
When a file is accessed, the kernel loads its blocks into page‑cache (4 KB pages). Reads first check the cache; if missing, a page‑fault loads the block. Writes go to page‑cache first and are later flushed to disk.
mmap
mmap maps a file into a process’s virtual address space, allowing the program to read/write the file directly in memory without an extra copy between kernel and user space.
Advantages:
Eliminates the user‑space copy, saving CPU cycles.
Shares the same page‑cache between kernel and user, reducing memory usage.
Drawbacks include fixed file size, mapping overhead (vm_area_struct structures), page‑fault cost for large files, and the need for contiguous virtual address space.
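A small demonstration of the mmap behavior described above, using Python's `mmap` module: the file is accessed through ordinary memory operations, and because regular I/O goes through the same page cache, both views stay consistent. The 4 KB size and file contents are arbitrary.

```python
import mmap
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)              # mmap needs the file pre-sized

with open(path, "r+b") as f:
    view = mmap.mmap(f.fileno(), 4096)   # map the file into our address space
    view[0:5] = b"hello"                 # write via memory, no write() syscall
    first_read = bytes(view[0:5])        # read via memory, no read() syscall
    view.flush()                         # ask the kernel to write dirty pages back
    view.close()

with open(path, "rb") as f:              # ordinary I/O sees the same bytes,
    on_disk = f.read(5)                  # since both paths share the page cache
os.remove(path)
```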
ConsumeQueue File
Because messages have variable sizes, a separate ConsumeQueue index file stores a fixed-size 20-byte record per message: the commitlog offset (8 bytes), the message size (4 bytes), and the tag hashcode (8 bytes). Fixed-size records enable fast random access: entry i sits at byte offset i × 20.
Both commitlog and ConsumeQueue files are memory‑mapped, so reads and writes happen in page‑cache, eliminating disk latency as long as the data stays cached.
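The 20-byte ConsumeQueue record can be sketched with `struct`; the field widths match the layout above (8 + 4 + 8 bytes), while the sample values are made up.

```python
import struct

# 8-byte commitlog offset + 4-byte message size + 8-byte tag hashcode = 20 bytes
CQ_RECORD = struct.Struct(">QIQ")

def encode_record(commitlog_offset: int, size: int, tag_hash: int) -> bytes:
    """Pack one fixed-size index entry."""
    return CQ_RECORD.pack(commitlog_offset, size, tag_hash)

def record_position(index: int) -> int:
    """Fixed-size records mean message i's index entry is at byte i * 20."""
    return index * CQ_RECORD.size

entry = encode_record(commitlog_offset=1024, size=256, tag_hash=7)
decoded = CQ_RECORD.unpack(entry)
```

To read message i, a consumer jumps to `record_position(i)` in the ConsumeQueue, unpacks the entry, then jumps to that commitlog offset and reads `size` bytes.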
Topic & Tag
All messages are persisted in the same commitlog, but they are categorized by Topic (business type) and further by Tag (sub‑type). Producers specify topic, queueId, and tag; the broker writes the tag’s hashcode into the ConsumeQueue, allowing consumers to filter by topic + tag efficiently.
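Since the ConsumeQueue stores only the tag's hashcode, the broker can filter without reading the message body. A sketch of that logic, assuming Java-style `String.hashCode()` for the tag hash (the helper names are ours):

```python
def java_string_hash(s: str) -> int:
    """Java-style String.hashCode() over a tag string (signed 32-bit)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h

def broker_side_match(stored_tag_hash: int, subscribed_tags: set) -> bool:
    """The broker compares hashcodes only (the index holds no tag strings);
    the consumer must re-check the actual tag afterwards, because two
    different tags can collide on the same hashcode."""
    return stored_tag_hash in {java_string_hash(t) for t in subscribed_tags}
```

This split keeps the hot filtering path inside the small, cache-friendly index file, at the cost of a second, exact check on the consumer side.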
Broker High Availability
To avoid a single point of failure, brokers are deployed in a master-slave (or multi-master) configuration. The master handles client traffic while slaves replicate its commitlog. Since RocketMQ 4.5, a Raft-based DLedger mode provides automatic leader election among at least three nodes.
Nameserver
Instead of hard‑coding broker addresses, a set of Nameservers acts as a service registry. Brokers periodically register their routing information (topic list, queue counts, etc.) with the nameservers. Producers and consumers pull this routing data from the nameservers, enabling automatic discovery, dynamic scaling, and failover handling.
Summary
RocketMQ’s design focuses on three goals: durable message storage, high performance, and high availability. By using a sequential commitlog, memory‑mapped files, page‑cache, and a decoupled broker layer, the system achieves low latency and high throughput. Topic/Tag routing and a nameserver provide flexible, scalable message distribution, while master‑slave or Raft‑based clusters ensure resilience.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.