How RocketMQ’s CommitLog Powers Million‑Level Concurrency

This article explains how RocketMQ’s CommitLog architecture—sequential writes, mmap zero‑copy, PageCache acceleration, fixed‑size log files, flexible flushing strategies, and efficient ConsumeQueue indexing—enables the system to sustain million‑level QPS with high reliability and low latency.

Lobster Programming

RocketMQ is an open‑source distributed messaging middleware from Alibaba, designed for high throughput, high reliability, and low‑latency message delivery. This article analyzes how RocketMQ leverages the CommitLog to sustain million‑level concurrency.

1. Understanding CommitLog

CommitLog stores the full message content, both body and metadata. Producers send messages to the broker, and the broker appends messages for all topics, in arrival order, to the same CommitLog. Each CommitLog file is 1 GB by default (configurable). Files are named by their starting byte offset, zero-padded to 20 digits: 00000000000000000000 for the first file, 00000000001073741824 for the second, and so on.
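The naming rule above is plain arithmetic, so a short sketch can make it concrete. The function names below are hypothetical and chosen for illustration; only the 1 GB size and the 20-digit offset naming come from the text.

```python
FILE_SIZE = 1024 * 1024 * 1024  # 1 GB, the default file size stated above


def commitlog_file_name(physical_offset, file_size=FILE_SIZE):
    """Return the 20-digit name of the file that contains physical_offset."""
    if physical_offset < 0:
        raise ValueError("offset must be non-negative")
    # Round down to the start of the containing file.
    start = physical_offset - physical_offset % file_size
    return f"{start:020d}"


def position_in_file(physical_offset, file_size=FILE_SIZE):
    """Return the byte position of physical_offset inside its file."""
    return physical_offset % file_size


if __name__ == "__main__":
    print(commitlog_file_name(0))           # 00000000000000000000
    print(commitlog_file_name(1073741824))  # 00000000001073741824
    print(position_in_file(1073741824 + 5)) # 5
```

Because every file has the same size, locating a message from its global offset needs no lookup table, only a division and a remainder.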

2. Reasons for Million‑Concurrency Capability

(1) Sequential Write

RocketMQ appends every message, regardless of topic, to the end of the one CommitLog file that is currently writable. Because writes never jump between per-topic or per-queue files, the disk sees a sequential write pattern instead of random writes, which is far cheaper and significantly improves write throughput.

(2) Zero‑Copy (mmap)

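RocketMQ uses memory-mapped files (mmap) to map each CommitLog file into the broker process's virtual address space. Reads and writes then operate directly on pages in the kernel's PageCache, which avoids the extra copy between a user-space buffer and the kernel buffer that ordinary read/write calls require. Fewer copies and fewer system calls mean higher throughput.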
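To illustrate the idea, here is a minimal sketch that writes messages through a memory-mapped file using Python's standard mmap module. It is not RocketMQ's implementation (which is Java); the function name write_messages_mmap and the tiny file size in the usage note are assumptions made for the example.

```python
import mmap
import os
import tempfile


def write_messages_mmap(path, file_size, messages):
    """Write messages back to back into a pre-allocated, memory-mapped file.

    Returns the write position after the last message that fit.
    """
    # Pre-allocate the file at its fixed size so the whole file can be mapped.
    with open(path, "wb") as f:
        f.truncate(file_size)

    write_pos = 0
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), file_size, access=mmap.ACCESS_WRITE) as mapped:
            for msg in messages:
                end = write_pos + len(msg)
                if end > file_size:
                    break  # file is full; a real log would roll to a new file
                # Slice assignment copies bytes into the mapped pages;
                # no write() system call is issued per message.
                mapped[write_pos:end] = msg
                write_pos = end
            # Ask the OS to write the dirty pages back to the file.
            mapped.flush()
    return write_pos
```

For example, calling write_messages_mmap(os.path.join(tempfile.mkdtemp(), "00000000000000000000"), 64, [b"hello", b"world"]) leaves the ten message bytes at the start of a 64-byte file. The point of the design is that appending becomes a memory copy into mapped pages, and the operating system decides when those pages reach the disk unless a flush is requested.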

(3) PageCache Acceleration

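PageCache is the operating system's in-memory cache of file pages. CommitLog writes land in PageCache first and are flushed to disk later, and recently written messages are usually still cached when consumers read them. As a result, most CommitLog reads and writes avoid disk I/O, which improves overall system performance.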

(4) Fixed‑Size CommitLog File Management

Each CommitLog file defaults to 1 GB. When a file is full, RocketMQ automatically creates a new file, and at any moment only one file is writable. This keeps file management simple: writes stay strictly sequential, a message's file can be found from its offset by arithmetic alone, and the uniform file size avoids fragmentation and improves storage efficiency.
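The rolling behavior can be sketched as a small append-only log. This is a simplified illustration, not RocketMQ's code: the class name SegmentedLog is hypothetical, and padding the unused tail of a full file with zero bytes is an assumption of this sketch.

```python
import os
import tempfile


class SegmentedLog:
    """Append-only log split into fixed-size files named by starting offset."""

    def __init__(self, directory, file_size):
        self.directory = directory
        self.file_size = file_size
        self.base_offset = 0  # starting offset of the single writable file
        self.write_pos = 0    # next write position inside that file

    def _path(self, base_offset):
        return os.path.join(self.directory, f"{base_offset:020d}")

    def append(self, payload):
        """Append payload and return its global offset in the log."""
        if len(payload) > self.file_size:
            raise ValueError("payload larger than one file")
        remaining = self.file_size - self.write_pos
        if len(payload) > remaining:
            # Pad the tail so the file reaches its fixed size (sketch-only
            # assumption), then roll over to a new writable file.
            with open(self._path(self.base_offset), "ab") as f:
                f.write(b"\0" * remaining)
            self.base_offset += self.file_size
            self.write_pos = 0
        with open(self._path(self.base_offset), "ab") as f:
            f.write(payload)
        offset = self.base_offset + self.write_pos
        self.write_pos += len(payload)
        return offset


if __name__ == "__main__":
    log = SegmentedLog(tempfile.mkdtemp(), file_size=16)
    print(log.append(b"a" * 10))  # 0
    print(log.append(b"b" * 10))  # 16, written to the second file
    print(sorted(os.listdir(log.directory)))
```

Only the newest file ever receives writes, so older files are immutable and can be handled as whole units.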

(5) Flushing Mechanism

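RocketMQ provides synchronous and asynchronous flushing. With synchronous flushing, the broker acknowledges a message only after it has been flushed to disk, which offers higher reliability at the cost of performance. With asynchronous flushing, the broker acknowledges the message once it is in PageCache, and a background thread flushes to disk later. This gives higher throughput, but messages that have not yet been flushed can be lost if the machine crashes, so asynchronous flushing is commonly paired with replication to protect durability. By default, RocketMQ uses asynchronous flushing.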
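The difference between the two modes can be sketched as follows. This is a simplified model rather than RocketMQ's implementation: the class LogWriter, its flush_interval parameter, and the fsync_calls counter are hypothetical names introduced for the example.

```python
import os
import tempfile
import threading


class LogWriter:
    """Append to a file with either synchronous or asynchronous flushing."""

    def __init__(self, path, sync_flush, flush_interval=0.05):
        self.file = open(path, "ab", buffering=0)
        self.sync_flush = sync_flush
        self.lock = threading.Lock()
        self.stop = threading.Event()
        self.fsync_calls = 0
        self.thread = None
        if not sync_flush:
            # Asynchronous mode: a background thread flushes periodically.
            self.thread = threading.Thread(
                target=self._flush_loop, args=(flush_interval,), daemon=True
            )
            self.thread.start()

    def _fsync(self):
        with self.lock:
            os.fsync(self.file.fileno())  # force cached pages to disk
            self.fsync_calls += 1

    def _flush_loop(self, interval):
        # wait() returns True once stop is set, which ends the loop.
        while not self.stop.wait(interval):
            self._fsync()

    def append(self, payload):
        self.file.write(payload)  # data reaches the OS page cache
        if self.sync_flush:
            self._fsync()         # synchronous mode: wait for the disk

    def close(self):
        self.stop.set()
        if self.thread is not None:
            self.thread.join()
        self._fsync()             # final flush before closing
        self.file.close()
```

In synchronous mode every append pays for one fsync, while in asynchronous mode many appends share a single fsync issued by the background thread. That is where the throughput gain comes from, and also why unflushed data is at risk on a crash.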

(6) Efficient ConsumeQueue Indexing

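ConsumeQueue is a logical index over the CommitLog, organized per topic and queue. It does not store message bodies; each fixed-size 20-byte entry records only the message's CommitLog offset (8 bytes), its size (4 bytes), and the hash code of its tag (8 bytes). Consumers read the ConsumeQueue sequentially, and because the entries are small they tend to stay in PageCache, so consumers avoid frequent disk access and relieve pressure on CommitLog reads.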
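A fixed-size index entry of this kind can be sketched with Python's struct module. The layout used here (8-byte CommitLog offset, 4-byte message size, 8-byte tag hash code, big-endian, 20 bytes in total) follows RocketMQ's commonly documented ConsumeQueue format, but treat it as an assumption of this sketch; the function names are hypothetical.

```python
import struct

# Big-endian: 8-byte offset, 4-byte size, 8-byte tag hash code.
ENTRY = struct.Struct(">qiq")


def encode_entry(commitlog_offset, size, tags_code):
    """Pack one index entry into its fixed 20-byte form."""
    return ENTRY.pack(commitlog_offset, size, tags_code)


def read_entry(queue_bytes, logical_index):
    """Return (commitlog_offset, size, tags_code) for the n-th entry."""
    # Fixed-size entries make the lookup a single multiplication.
    return ENTRY.unpack_from(queue_bytes, logical_index * ENTRY.size)


if __name__ == "__main__":
    queue = encode_entry(0, 128, 111) + encode_entry(128, 256, 222)
    print(ENTRY.size)            # 20
    print(read_entry(queue, 1))  # (128, 256, 222)
```

A consumer that knows its logical position in the queue can jump straight to the entry, read the CommitLog offset and size, and then fetch exactly that byte range from the CommitLog.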

IndexFile is a hash index that supports looking up messages by key within a time range. Each file is about 400 MB and can store roughly 20 million index entries.
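The "about 400 MB for roughly 20 million indexes" figure can be checked with a short calculation. The component sizes below (a 40-byte header, 5 million 4-byte hash slots, and 20-byte index entries) are the commonly cited RocketMQ defaults and are assumptions here, not values stated in this article.

```python
HEADER_BYTES = 40           # assumed file header size
HASH_SLOTS = 5_000_000      # assumed number of hash slots
SLOT_BYTES = 4              # assumed size of one hash slot
MAX_INDEXES = 20_000_000    # roughly 20 million entries, as stated above
INDEX_ENTRY_BYTES = 20      # assumed size of one index entry


def index_file_bytes():
    """Total size of one IndexFile under the assumptions above."""
    return (
        HEADER_BYTES
        + HASH_SLOTS * SLOT_BYTES
        + MAX_INDEXES * INDEX_ENTRY_BYTES
    )


if __name__ == "__main__":
    total = index_file_bytes()
    print(total)                     # 420000040
    print(round(total / 2**20, 1))   # size in MiB
```

Under these assumptions the index entries dominate the file size (400,000,000 of the 420,000,040 bytes), which matches the article's figure of about 400 MB.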

Conclusion

Through sequential writes, zero‑copy, PageCache acceleration, fixed‑size file management, flexible flushing, and efficient consumption indexing, RocketMQ maximizes CommitLog read/write performance, enabling the system to achieve million‑level QPS throughput.
