
How RocketMQ Achieves High Availability with Master‑Slave Replication

This article explains RocketMQ's master‑slave replication mechanism, comparing synchronous and asynchronous modes, detailing metadata and message data copying processes, and showing how synchronous guarantees are implemented using CompletableFuture to ensure high availability.

Sanyou's Java Diary

RocketMQ master‑slave replication is one of its high‑availability mechanisms, allowing data to be copied from a master node to one or more slave nodes. This article explains the core concepts of master‑slave replication.

1. Synchronous vs Asynchronous Replication

In cluster mode, a broker is either a Master or a Slave; one Master can have multiple Slaves, but a Slave corresponds to a single Master. Masters handle client write requests and persist messages to disk, while Slaves replicate data from Masters.

1.1 Synchronous replication

Each Master is paired with a Slave, and HA uses synchronous double‑write: the client receives a success response only after both the Master and the Slave have written the message.

Advantages: no single point of failure for data or service; if the Master fails, messages experience no delay and both service and data availability remain high.

Disadvantages: performance is about 10% lower than asynchronous mode; the current version cannot automatically promote a Slave to Master after a Master failure.

1.2 Asynchronous replication

Each Master is paired with a Slave, and HA uses asynchronous replication: the Master acknowledges the client without waiting for the Slave, so the Slave lags the Master by a short (millisecond‑level) delay.

Advantages: even if the disk is damaged, message loss is minimal; real‑time delivery is unaffected; after Master failure, consumers can continue reading from the Slave transparently, with performance similar to multi‑Master mode.

Disadvantages: if the Master fails and its disk is damaged, a small number of messages that had not yet been replicated may be lost.

Replication consists of two parts: metadata replication and message data replication.

2. Metadata Replication

Slave brokers run a scheduled task every 10 seconds to synchronize metadata—including topics, consumer progress, delayed consumption progress, and consumer configuration—from the Master. The Slave sends an RPC request, receives the data, stores it in a local cache, then persists it to disk.
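The fetch, cache, and persist cycle described above can be sketched as follows. This is a minimal illustration, not RocketMQ's actual code: the class name `MetadataSyncSketch`, the `Supplier` standing in for the RPC call to the Master, and the `key=value` file format are all assumptions made for the example. Only the 10-second period comes from the text.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical sketch of a Slave-side metadata sync task (not RocketMQ's real classes).
class MetadataSyncSketch {
    // Stand-in for the RPC that asks the Master for its metadata (e.g. consumer progress).
    private final Supplier<Map<String, Long>> masterRpc;
    // Local file the Slave persists the metadata to (format is an assumption).
    private final Path storeFile;
    // In-memory copy; replaced as a whole on every sync round.
    private volatile Map<String, Long> cache = Collections.emptyMap();

    MetadataSyncSketch(Supplier<Map<String, Long>> masterRpc, Path storeFile) {
        this.masterRpc = masterRpc;
        this.storeFile = storeFile;
    }

    // One sync round: fetch from the Master, update the local cache, then persist to disk.
    void syncOnce() throws IOException {
        Map<String, Long> remote = masterRpc.get();
        cache = Collections.unmodifiableMap(new TreeMap<>(remote));
        StringBuilder sb = new StringBuilder();
        cache.forEach((k, v) -> sb.append(k).append('=').append(v).append('\n'));
        Files.write(storeFile, sb.toString().getBytes(StandardCharsets.UTF_8));
    }

    // Runs syncOnce every 10 seconds. Exceptions are caught inside the task because
    // an uncaught exception would cancel all later runs of a fixed-rate task.
    ScheduledExecutorService start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                syncOnce();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 0, 10, TimeUnit.SECONDS);
        return scheduler;
    }

    Map<String, Long> cacheView() {
        return cache;
    }
}
```

Calling `start()` returns the scheduler so the caller can shut it down. `syncOnce()` can also be called directly, which is convenient for testing a single round.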

3. Message Data Replication

The following steps describe the data flow:

Master starts and listens on a designated port, creating an AcceptSocketService to accept TCP connections.

HAConnection abstracts the connection; it launches a read thread (handling Slave requests) and a write thread (sending data to Slave).

Slave starts, creates a TCP connection to Master via HAClient, and reports its current offset (currentReportedOffset).

Slave reports the offset it wants to pull; the offset is a simple 8‑byte Long.

Master receives the request, triggers a SelectionKey.OP_READ event, and the ReadSocketService processes it, locating messages after the reported offset.

WriteSocketService retrieves the relevant messages from the commit log and sends them to the Slave.

Slave receives the data and appends it to its local commit log; HAClient.dispatchReadRequest parses the payload and writes it to storage.
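The offset handshake in the steps above can be illustrated with an in-memory sketch. It models the commit log as a plain byte array and leaves out sockets, selectors, and any wire framing, so it should be read as a simplification under those assumptions rather than as RocketMQ's implementation. The class and method names (`HaPullSketch`, `encodeReportedOffset`, `selectAfterOffset`, `append`) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical in-memory model of the pull-by-offset exchange between Slave and Master.
class HaPullSketch {
    // Slave side: the reported offset is written as a single 8-byte long.
    static ByteBuffer encodeReportedOffset(long offset) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES);
        buf.putLong(offset);
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    // Master side: decode the reported offset and return every commit-log byte after it.
    static byte[] selectAfterOffset(byte[] masterLog, ByteBuffer report) {
        long offset = report.getLong();
        if (offset < 0 || offset > masterLog.length) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        return Arrays.copyOfRange(masterLog, (int) offset, masterLog.length);
    }

    // Slave side: append the received chunk to the end of the local commit log.
    static byte[] append(byte[] slaveLog, byte[] chunk) {
        byte[] out = Arrays.copyOf(slaveLog, slaveLog.length + chunk.length);
        System.arraycopy(chunk, 0, out, slaveLog.length, chunk.length);
        return out;
    }
}
```

A round trip is: the Slave calls `encodeReportedOffset` with its current log length, the Master passes that buffer to `selectAfterOffset`, and the Slave feeds the result to `append`. After one round the Slave's log equals the Master's.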

4. How Synchronous Guarantees Are Implemented

Although the data copy itself is asynchronous, synchronous guarantees are achieved using CompletableFuture. After the Master’s commit log finishes appendMessage, it triggers a flush task and a synchronous replication task, both executed asynchronously. When the HAConnection read service receives the Slave’s progress feedback indicating successful data copy, it completes the future, allowing the Broker to assemble and return the response to the client.
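The pattern can be sketched as follows: each write registers a future tied to the offset the Slave must reach, and the Slave's progress report completes every future at or below that offset. This is a minimal sketch under stated assumptions, not the Broker's actual code. The names `SyncReplicationSketch`, `awaitSlave`, `onSlaveAck`, and `respond`, and the `"OK"` / `"FAILED"` result strings, are hypothetical, and timeouts are left out.

```java
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: turning asynchronous replication into a synchronous guarantee.
class SyncReplicationSketch {
    // A write that waits until the Slave has copied data up to requiredOffset.
    static final class PendingWrite {
        final long requiredOffset;
        final CompletableFuture<Boolean> future = new CompletableFuture<>();

        PendingWrite(long requiredOffset) {
            this.requiredOffset = requiredOffset;
        }
    }

    private final Queue<PendingWrite> pending = new ConcurrentLinkedQueue<>();

    // Called after the message is appended to the Master's commit log.
    CompletableFuture<Boolean> awaitSlave(long requiredOffset) {
        PendingWrite write = new PendingWrite(requiredOffset);
        pending.add(write);
        return write.future;
    }

    // Called when the Slave reports its progress; completes every write it now covers.
    void onSlaveAck(long slaveAckOffset) {
        Iterator<PendingWrite> it = pending.iterator();
        while (it.hasNext()) {
            PendingWrite write = it.next();
            if (write.requiredOffset <= slaveAckOffset) {
                write.future.complete(true);
                it.remove();
            }
        }
    }

    // The client response is assembled only once both the flush and the replication are done.
    static CompletableFuture<String> respond(CompletableFuture<Boolean> flushed,
                                             CompletableFuture<Boolean> replicated) {
        return flushed.thenCombine(replicated, (f, r) -> (f && r) ? "OK" : "FAILED");
    }
}
```

The thread handling the send request never blocks here. It returns the combined future, and the response is produced by whichever thread completes the last outstanding future, for example the one handling the Slave's progress report.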

5. Summary

RocketMQ’s master‑slave replication is straightforward: the Slave continuously pulls commit‑log data from the Master and builds the consume‑queue asynchronously. Key points:

Replication includes metadata and message data copying.

Metadata is synchronized every 10 seconds via RPC, cached, then persisted.

Message data replication can be condensed to five steps: Master listens, Slave connects, Slave reports offset, Master retrieves and sends messages, Slave appends them locally.

Synchronous guarantees are realized with a CompletableFuture that is completed when the Slave confirms a successful copy.
