
Understanding RocketMQ Master‑Slave Architecture and High‑Availability Mechanisms

This article explains how RocketMQ achieves high availability and data reliability through its master‑slave broker design, covering synchronous and asynchronous replication, flush strategies, transaction messaging, automatic failover with DLedger, and read‑write separation for load balancing in distributed systems.

Cognitive Technology Team

In modern distributed systems, message middleware is a core component, so its availability is crucial. RocketMQ, a high‑performance message middleware, achieves high availability and data reliability through a master‑slave broker architecture.

Master‑Slave Overview

RocketMQ brokers are divided into Master and Slave roles. The Master receives messages from producers and replicates them to its Slave nodes. A Slave acts as a backup: if the Master fails, consumers can keep reading from the Slave, and with DLedger (covered under Failover Mechanism) a Slave can be promoted to Master automatically. Data redundancy and failover together provide the high‑availability guarantee.

Master Node: handles message writes, consumption scheduling, and metadata management, and serves as the cluster’s “brain”.

Slave Node: continuously synchronizes data from the Master as a hot‑standby replica. When the Master crashes, the Slave keeps serving reads; in a DLedger deployment it can also be elected Master and take over writes.

Data Synchronization Mechanisms

Synchronous Replication

After a producer sends a message, the Master stores it and returns a success response only after the data has been replicated to the Slave. Every acknowledged message therefore exists on both nodes, at the cost of added write latency, which makes this mode suitable for scenarios with stringent consistency requirements such as financial services. The process is:

1. The producer sends a message to the Master, which writes it to the local CommitLog.
2. The Master transfers the new log data to the Slave over the HA replication connection.
3. The Slave writes the data to its own CommitLog and acknowledges the Master.
4. The Master receives the ACK and replies to the producer with a success response.
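
The four steps above can be sketched as a small in‑memory simulation. This is an illustration only, not RocketMQ code: the `Slave` and `SyncMaster` classes are hypothetical, and the network hop is reduced to a direct method call. The status strings reuse the names of RocketMQ's `SendStatus` values.

```python
class Slave:
    def __init__(self):
        self.commit_log = []

    def replicate(self, message):
        # Step 3: append to the slave's own log and acknowledge with its length.
        self.commit_log.append(message)
        return len(self.commit_log)


class SyncMaster:
    def __init__(self, slave):
        self.commit_log = []
        self.slave = slave

    def send(self, message):
        # Step 1: write to the local log.
        self.commit_log.append(message)
        # Step 2: ship the entry and wait for the slave's acknowledgment.
        acked_offset = self.slave.replicate(message)
        # Step 4: report success only once the slave has caught up.
        if acked_offset >= len(self.commit_log):
            return "SEND_OK"
        return "FLUSH_SLAVE_TIMEOUT"


slave = Slave()
master = SyncMaster(slave)
status = master.send("order-1")
```

The point of the sketch is the ordering: the producer's response is produced only after the slave's acknowledgment has been checked.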

Asynchronous Replication

The Master stores the message and immediately acknowledges the producer, while synchronization to the Slave happens asynchronously. This mode offers lower latency and higher throughput but carries a risk of data loss if the Master fails before the data is replicated. The process is:

1. The Master writes to the local CommitLog and returns success to the producer.
2. The Slave periodically (default 50 ms) pulls incremental logs from the Master.
3. The Slave writes the pulled logs to its own CommitLog and updates its sync status.
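
A minimal sketch of the asynchronous flow, written to show where the data‑loss window comes from. The `AsyncMaster` and `PullingSlave` classes are hypothetical, and the periodic pull is triggered by hand rather than by a timer.

```python
class AsyncMaster:
    def __init__(self):
        self.commit_log = []

    def send(self, message):
        # Acknowledge as soon as the local write is done.
        self.commit_log.append(message)
        return "SEND_OK"

    def read_from(self, offset):
        return self.commit_log[offset:]


class PullingSlave:
    def __init__(self, master):
        self.master = master
        self.commit_log = []

    def sync_once(self):
        # Pull everything past the slave's current offset and append it.
        pulled = self.master.read_from(len(self.commit_log))
        self.commit_log.extend(pulled)
        return len(pulled)


master = AsyncMaster()
slave = PullingSlave(master)
master.send("m1")
slave.sync_once()
master.send("m2")  # acknowledged to the producer, not yet replicated
unreplicated = master.commit_log[len(slave.commit_log):]
```

If the Master were lost at this point, everything in `unreplicated` would exist only on the failed node, even though the producer already received a success response.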

Data Consistency Guarantees – Flush Policies

RocketMQ provides two flush modes, selected with the broker's flushDiskType setting: SYNC_FLUSH (synchronous) and ASYNC_FLUSH (asynchronous). SYNC_FLUSH persists each message to disk before acknowledging it, offering the highest reliability for workloads that demand strict consistency. ASYNC_FLUSH writes messages to the page cache first and flushes them to disk periodically, improving performance by about 50% but risking the loss of unflushed messages on power failure.
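
The difference between the two modes can be illustrated with ordinary file I/O. This is an analogy under stated assumptions rather than RocketMQ's implementation: the `append` and `flush_pending` helpers are hypothetical, and `os.fsync` stands in for whatever mechanism the broker uses to force data to disk.

```python
import os
import tempfile


def append(log_file, payload, sync_flush):
    """Append payload; with sync_flush, force it to disk before returning."""
    log_file.write(payload)
    log_file.flush()                  # hand the bytes to the OS page cache
    if sync_flush:
        os.fsync(log_file.fileno())   # block until the OS reports them on disk
    return len(payload)


def flush_pending(log_file):
    """What a periodic background flusher would do in the asynchronous mode."""
    os.fsync(log_file.fileno())


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "commitlog")
    with open(path, "ab") as f:
        written = append(f, b"msg-1", sync_flush=True)    # durable on return
        written += append(f, b"msg-2", sync_flush=False)  # durable after next flush
        flush_pending(f)
    size_on_disk = os.path.getsize(path)
```

In the asynchronous case the caller gets its acknowledgment before `fsync` runs, which is where both the speedup and the power‑failure risk come from.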

Recommendations

For core business scenarios, use synchronous flush combined with synchronous replication (synchronous double‑write); for non‑core workloads, use asynchronous flush together with asynchronous replication.
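
As a sketch, the recommendation maps onto two entries in the Master's broker.conf. `brokerRole` and `flushDiskType` are RocketMQ configuration keys; the `broker_settings` and `render` helpers are hypothetical and exist only to make the mapping explicit.

```python
def broker_settings(core_business):
    """Return broker.conf entries for the Master under the recommendation above."""
    if core_business:
        return {"brokerRole": "SYNC_MASTER", "flushDiskType": "SYNC_FLUSH"}
    return {"brokerRole": "ASYNC_MASTER", "flushDiskType": "ASYNC_FLUSH"}


def render(settings):
    # One key=value pair per line, as in a properties file.
    return "\n".join(f"{key}={value}" for key, value in settings.items())


core_conf = render(broker_settings(core_business=True))
non_core_conf = render(broker_settings(core_business=False))
```

A real deployment needs more than these two keys (broker name, broker ID, NameServer address, and so on); they are left out here because the recommendation does not concern them.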

Transactional Message Mechanism

Transactional messages ensure atomicity between a local transaction and message sending: if the local transaction succeeds, the message is committed and becomes visible to consumers; otherwise it is rolled back.
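
The commit‑or‑rollback decision can be sketched as follows. RocketMQ implements it in two phases with a "half" message that consumers cannot see until the outcome is known; the dictionary‑based `broker` and the `send_in_transaction` function below are hypothetical stand‑ins, and the broker's check‑back for undecided transactions is left out.

```python
def send_in_transaction(broker, message, local_transaction):
    # Phase 1: store a "half" message that consumers cannot see yet.
    broker["half"].append(message)
    # Phase 2: run the local transaction and decide the message's fate.
    try:
        local_transaction()
        outcome = "COMMIT_MESSAGE"
    except Exception:
        outcome = "ROLLBACK_MESSAGE"
    broker["half"].remove(message)
    if outcome == "COMMIT_MESSAGE":
        broker["visible"].append(message)
    return outcome


def failing_transaction():
    raise RuntimeError("local transaction failed")


broker = {"half": [], "visible": []}
committed = send_in_transaction(broker, "pay-1", lambda: None)
rolled_back = send_in_transaction(broker, "pay-2", failing_transaction)
```

Only `pay-1` ends up visible to consumers; `pay-2` is discarded because its local transaction raised.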

Failover Mechanism

Since RocketMQ 4.5, DLedger provides automatic master‑slave switching, using the Raft algorithm for leader election. When the Master node crashes, the remaining nodes automatically elect a new Master, provided a majority of the group is still available. This removes the need for manual failover and shortens downtime.
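
The majority requirement behind Raft leader election can be sketched in a few lines. This is a deliberately simplified model, not DLedger's code: the `elect_leader` function is hypothetical, terms and randomized election timeouts are omitted, and the most up‑to‑date surviving node is picked deterministically.

```python
def elect_leader(alive_nodes, cluster_size):
    """alive_nodes maps node id -> last log index; returns the new leader id or None."""
    if len(alive_nodes) <= cluster_size // 2:
        return None  # no majority reachable, so no leader can be elected
    # A voter grants its vote only to a candidate whose log is at least as
    # up to date as its own, so the most up-to-date live node can always win.
    return max(alive_nodes, key=lambda node: (alive_nodes[node], node))


# Three-node group, old Master lost: two survivors still form a majority.
new_leader = elect_leader({"n1": 10, "n2": 12}, cluster_size=3)
# Only one survivor: no majority, so the group cannot elect a Master.
no_leader = elect_leader({"n1": 10}, cluster_size=3)
```

This is why a group needs at least three nodes to survive the loss of one: with two nodes, a single failure leaves no majority.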

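Read‑Write Separation and Load Balancing

Producers send messages to the Master, while consumers can pull from either the Master or a Slave. Each pull response from the Master carries a suggestion for which broker to pull from next: when a consumer has fallen so far behind that its backlog can no longer be served from the Master's memory, the Master points it at a Slave. This moves slow, disk‑bound reads off the Master, balancing load and improving throughput.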
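
A sketch of how such a suggestion might be computed, assuming the decision is driven by how far the consumer lags relative to what the Master can serve from memory. The `suggest_broker` function, its parameters, and the byte figures are hypothetical; the one detail taken from RocketMQ is that broker ID 0 denotes the Master and IDs above 0 denote Slaves.

```python
MASTER_ID = 0  # RocketMQ uses brokerId 0 for the Master; slaves use ids > 0


def suggest_broker(consumer_lag_bytes, memory_budget_bytes,
                   slave_id=1, slave_read_enabled=True):
    """Suggest which broker the consumer should pull from next."""
    if slave_read_enabled and consumer_lag_bytes > memory_budget_bytes:
        return slave_id  # backlog would hit the Master's disk; offload to a slave
    return MASTER_ID


caught_up = suggest_broker(consumer_lag_bytes=1_000, memory_budget_bytes=1_000_000)
lagging = suggest_broker(consumer_lag_bytes=5_000_000, memory_budget_bytes=1_000_000)
```

A consumer that keeps up with production stays on the Master, so redirection affects only consumers reading old data.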

Conclusion

RocketMQ’s master‑slave architecture, combined with pull‑based synchronization, read‑write separation, and DLedger’s automatic failover, delivers a stable and efficient high‑availability messaging system that continues to evolve to meet enterprise‑level demands.
