Demystifying Raft: How Nacos Uses JRaft for Strong Consistency
This article explains the Raft consensus algorithm, covering the request lifecycle, leader election, and the snapshot mechanism, along with JRaft optimizations such as linearizable reads, Learner nodes, and Multi-Raft-Group, and shows how Nacos integrates these concepts to achieve reliable distributed consistency.
A Brief Introduction
The Raft protocol guarantees strong consistency (the CP side of the CAP theorem) across the service nodes of a cluster.
In Raft, each node can be in one of three states:
Leader: the sole node that handles all requests; it holds absolute authority in the cluster.
Follower: replicates the Leader's data and must obey the Leader.
Candidate: an intermediate state that eventually becomes a Leader or a Follower.
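The three roles and the legal transitions between them can be sketched as a tiny state machine. This is a simplified model for illustration, not JRaft's actual implementation:

```java
import java.util.EnumSet;

// Simplified model of Raft node roles and the transitions described above.
class RaftRole {
    enum Role { FOLLOWER, CANDIDATE, LEADER }

    // Returns the roles a node may move to from its current role.
    static EnumSet<Role> nextRoles(Role current) {
        switch (current) {
            case FOLLOWER:  // election timeout fires -> start campaigning
                return EnumSet.of(Role.CANDIDATE);
            case CANDIDATE: // wins a majority -> Leader; loses -> Follower
                return EnumSet.of(Role.LEADER, Role.FOLLOWER);
            case LEADER:    // discovers a higher term -> steps down
                return EnumSet.of(Role.FOLLOWER);
            default:
                throw new IllegalStateException("unknown role: " + current);
        }
    }
}
```

Note that a Candidate that neither wins nor loses (a split vote) simply starts a new round as a Candidate; the model above only shows the transitions mentioned in the text.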
When a node starts, it registers itself with the Leader through JRaft's CliService#addPeer API.
A Request’s Life Cycle
All requests must be processed by the Leader. If a node receives a request and is not the Leader, it forwards the request to the Leader.
The processing consists of two parts: Raft’s handling of the request to ensure consistency, and the application service’s actual business logic.
Raft first wraps the request data into LogEntry objects, assigns each an incremental index, and stores them on disk, similar to Redis’s AOF log.
These LogEntries are then replicated to Followers via the AppendEntries mechanism. Once a majority of nodes have successfully replicated a LogEntry, its index can become the CommitIndex, and all entries up to that index can be applied by the application service.
1. Raft Handling of Requests
1.1 Storing Request Logs
The Leader creates a LogEntry for each request, assigns a unique index, and persists it to disk.
1.2 Replicating Logs to Followers
The Leader creates a Replicator that continuously sends LogEntries to Followers, which also store them on disk.
1.3 Committing Logs After Majority Replication
After each batch replication, the Leader determines the highest LogEntry index replicated by a majority of nodes; this is the CommitIndex. Only entries up to this index are safe to apply.
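The majority rule above can be sketched as follows: given the highest log index each node (Leader included) has persisted, the CommitIndex is the largest index that at least a majority of nodes has reached. This is a hypothetical helper for illustration, not JRaft's code:

```java
import java.util.Arrays;

// Simplified calculation of the CommitIndex: sort each node's highest
// persisted log index and take the value a majority has reached.
class CommitIndexCalculator {
    static long commitIndex(long[] matchIndexes) {
        long[] sorted = matchIndexes.clone();
        Arrays.sort(sorted);
        // With n nodes, the node at position (n - 1) / 2 in ascending order
        // holds the highest index that at least a majority has replicated.
        return sorted[(sorted.length - 1) / 2];
    }
}
```

For example, in a 5-node cluster where nodes have persisted up to indexes 3, 5, 5, 7, and 9, the CommitIndex is 5: three nodes (a majority) have reached index 5, but only two have reached 7.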
2. Application Service Processing
The application must implement a Raft StateMachine interface to receive committed entries. Nacos implements this interface, allowing it to process requests such as permanent instance registration.
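In JRaft this callback is the StateMachine's onApply method, which hands the application an iterator of committed entries. A self-contained sketch of the key property, applying committed entries strictly in index order and exactly once, might look like this (illustrative only, not Nacos's actual implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified state machine: entries committed by Raft are applied
// in index order exactly once, mirroring what onApply must guarantee.
class SimpleStateMachine {
    private long lastAppliedIndex = 0;
    private final List<String> state = new ArrayList<>();

    // Called with a batch of committed entries (parallel arrays of
    // index and payload, a stand-in for JRaft's entry iterator).
    void onApply(long[] indexes, String[] payloads) {
        for (int i = 0; i < indexes.length; i++) {
            if (indexes[i] <= lastAppliedIndex) {
                continue; // already applied, e.g. replayed after recovery
            }
            state.add(payloads[i]);        // run the business logic
            lastAppliedIndex = indexes[i]; // record progress
        }
    }

    long lastAppliedIndex() { return lastAppliedIndex; }
    List<String> state() { return state; }
}
```

The duplicate check matters because after a restart the log may be replayed from an earlier point; the state machine must not register the same instance twice.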
Snapshots
Raft does not rewrite LogEntry files to shrink them. Instead, it periodically creates a snapshot of the in‑memory business data, which is stored on disk.
JRaft triggers snapshot creation periodically (hourly by default) by calling the StateMachine's onSnapshotSave method. Nacos implements this to persist instance data.
After a snapshot, all preceding LogEntries can be safely deleted, reducing disk usage and speeding up recovery.
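The snapshot-then-truncate flow can be sketched as follows (a simplified model with an in-memory log, not JRaft's storage code):

```java
import java.util.TreeMap;

// Simplified snapshot flow: capture the current state, record the last
// included index, then drop all log entries at or below that index.
class SnapshotLog {
    final TreeMap<Long, String> log = new TreeMap<>(); // index -> entry
    String snapshotData = "";
    long snapshotIndex = 0;

    void append(long index, String entry) {
        log.put(index, entry);
    }

    // Take a snapshot up to lastApplied and truncate the log prefix.
    void snapshot(long lastApplied, String stateDump) {
        snapshotData = stateDump;
        snapshotIndex = lastApplied;
        log.headMap(lastApplied, true).clear(); // delete covered entries
    }
}
```

After the snapshot, recovery only needs to load the snapshot and replay the few entries with indexes above snapshotIndex, which is why snapshots speed up restarts.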
Election Algorithm
1. Election Term and Timing
A term (election cycle) increments with each election round. Within a term, each node can vote once, and only one Leader can exist.
2. StepDown
If a node discovers a higher term in the cluster, it performs a StepDown: updates its term, reverts to Follower, and restarts its election timeout.
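The term check that triggers a StepDown can be sketched like this (a simplified model of the rule, not JRaft's NodeImpl):

```java
// Simplified StepDown check: any message carrying a higher term forces
// the node back to Follower and updates its local term.
class TermTracker {
    long currentTerm;
    boolean leader;

    TermTracker(long term, boolean leader) {
        this.currentTerm = term;
        this.leader = leader;
    }

    // Returns true if the node stepped down because of the observed term.
    boolean observeTerm(long remoteTerm) {
        if (remoteTerm > currentTerm) {
            currentTerm = remoteTerm; // adopt the newer term
            leader = false;           // revert to Follower
            return true;              // caller restarts the election timeout
        }
        return false;
    }
}
```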
3. Pre‑election
JRaft adds a pre-election (pre-vote) phase to reduce election conflicts: before starting a formal election, a node asks its peers whether it could win, without incrementing its term, so that an isolated node cannot disrupt the cluster with an inflated term when it rejoins.
4. Formal Election
During a formal election, a Candidate increments its term, votes for itself, and requests votes from other nodes. If a majority approves, it becomes the Leader.
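The voter's side of a formal election can be sketched as follows. This is a simplified model: real Raft compares both the last log term and index when judging whether a Candidate's log is up to date, while this sketch compares the index only:

```java
// Simplified voter logic: grant a vote only once per term, and only to a
// Candidate whose log is at least as up-to-date as the voter's own.
class Voter {
    long currentTerm = 0;
    long votedForTerm = -1; // term in which this node last voted

    boolean grantVote(long candidateTerm, long candidateLastLogIndex,
                      long ownLastLogIndex) {
        if (candidateTerm < currentTerm) {
            return false; // stale Candidate from an old term
        }
        if (candidateTerm > currentTerm) {
            currentTerm = candidateTerm; // adopt the newer term
        }
        if (votedForTerm == currentTerm) {
            return false; // already voted in this term
        }
        if (candidateLastLogIndex < ownLastLogIndex) {
            return false; // Candidate's log is behind ours
        }
        votedForTerm = currentTerm;
        return true;
    }
}
```

The one-vote-per-term rule is what guarantees that at most one Leader can win a given term.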
JRaft Optimizations
1. Linearizable Reads
Read requests can be served by Followers: a Follower obtains the Leader's current CommitIndex, waits until it has applied entries up to that index, and then answers the read locally, reducing load on the Leader while still returning up-to-date data.
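The safety condition behind this can be sketched in one check (a simplified model of the ReadIndex idea, not JRaft's ReadIndex implementation):

```java
// Simplified ReadIndex check: a Follower may serve a read locally once it
// has applied entries up to the CommitIndex the Leader reported at the
// time the read arrived. Before that point, its state may be stale.
class ReadIndexCheck {
    static boolean canServeRead(long leaderCommitIndex, long localAppliedIndex) {
        return localAppliedIndex >= leaderCommitIndex;
    }
}
```

If the check fails, the Follower simply waits for its apply loop to catch up and retries, rather than forwarding the read to the Leader.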
2. Learner Nodes
Learners are read‑only members that replicate data from the Leader but do not participate in elections or voting.
3. Multi‑Raft‑Group
JRaft supports multiple independent Raft groups (tenants), each with its own Leader, allowing write load to be distributed across machines.
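Routing requests to the right group can be sketched with a simple hash over hypothetical group IDs (the group names below are made up for illustration; they are not Nacos's actual group IDs):

```java
// Simplified Multi-Raft-Group routing: each key is hashed to one of
// several independent Raft groups, each of which elects its own Leader.
class GroupRouter {
    private final String[] groupIds;

    GroupRouter(String... groupIds) {
        this.groupIds = groupIds;
    }

    // Deterministically map a key to a group, spreading write load
    // across groups (and thus across the machines leading them).
    String route(String key) {
        int idx = Math.floorMod(key.hashCode(), groupIds.length);
        return groupIds[idx];
    }
}
```

Because each group has its own Leader, and Leaders of different groups can live on different machines, write traffic is no longer funneled through a single node.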
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!