Deep Dive into RocketMQ Architecture, Principles, and Source Code
This article provides a comprehensive overview of RocketMQ's evolution, core architecture, communication flow, storage mechanisms, flushing strategies, high‑availability synchronization, and transaction message implementation, illustrating each component with diagrams and key source‑code excerpts for developers seeking an in‑depth understanding of the system.
1. History of RocketMQ
RocketMQ originated from Alibaba's internal messaging projects, evolving from Notify (2007) to MetaQ (2011) and finally RocketMQ 3.0 (2012), with later commercial products such as Aliware MQ and the open‑source donation to Apache in 2016.
2. Core Architecture
The cluster consists of NameServer (service discovery), Broker‑Master/Slave (message hosts), Producer (message sender) and Consumer (message receiver). Communication between these roles follows a request‑response model built on the rocketmq‑remoting module.
3. Client‑Side Flow
Messages are sent via MQProducer, which calls MQClientAPIImpl.sendMessage(). This method eventually invokes RemotingClient.invokeAsync() of the Netty‑based NettyRemotingClient to transmit the request to a broker.
4. Remoting Layer
The rocketmq‑remoting module defines the network protocol and uses Netty for asynchronous I/O. NettyRemotingAbstract.invokeAsyncImpl() writes the request to a channel and waits for the selector to process it.
5. Broker Processing
On the broker side, SendMessageProcessor.processRequest() delegates to DefaultMessageStore.putMessage(), which finally calls CommitLog.putMessage() to append the message to a mapped file.
6. Storage Layer
RocketMQ stores data in three large file types (CommitLog, ConsumeQueue, Index) using MappedByteBuffer for high‑performance I/O. DefaultAppendMessageCallback.doAppend() writes the message bytes and returns an AppendMessageResult. After writing, the system performs flushing and optional HA synchronization.
7. Flushing Strategies
Three flushing modes exist: synchronous (force to disk on each write), asynchronous real‑time (periodic FlushRealTimeService), and commit‑real‑time (buffered writes with CommitRealTimeService when transientStorePoolEnable=true).
8. High‑Availability (HA)
Two replication modes are supported: SYNC_MASTER (master waits for slave acknowledgment) and ASYNC_MASTER (master returns immediately). The GroupTransferService handles synchronous replication, while asynchronous replication uses a background HAConnection thread.
9. Transaction Messages
RocketMQ implements distributed transactions via a two‑phase protocol: a prepare message (invisible to consumers), execution of the local transaction, and a commit or rollback command sent back to the broker. If the broker does not receive a final decision, it performs a status check and compensates accordingly.
10. Summary
RocketMQ’s performance hinges on its efficient Netty‑based remoting, memory‑mapped file storage, configurable flushing/HA mechanisms, and robust transaction support, making it a powerful backend messaging solution for large‑scale distributed systems.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
IT Architects Alliance
Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
