Understanding the Raft Consensus Algorithm: Roles, Leader Election, and Fault Handling
This article explains the Raft consensus algorithm, detailing its roles, leader election process, term management, fault handling, and how it ensures consistency in both single‑node and multi‑node distributed systems for modern cloud‑native applications.
1. Raft Overview
The Raft algorithm is the preferred consensus algorithm for distributed system development, used in popular projects such as Etcd and Consul.
Mastering this algorithm makes it easier to handle most fault‑tolerance and consistency requirements, such as distributed configuration systems and distributed NoSQL storage, thereby overcoming single‑machine limitations.
Raft achieves consensus by following a leader‑centric approach, ensuring log consistency across all nodes.
2. Raft Roles
2.1 Roles
Follower : Acts as a regular node that passively receives messages from the leader; if the leader’s heartbeat times out, it may promote itself to candidate.
Candidate : Sends RequestVote RPCs to the other nodes; if it wins votes from a majority, it becomes the leader.
Leader : The "dominant" node that handles write requests, manages log replication, and continuously sends heartbeat messages to inform other nodes of its authority.
The diagram below illustrates the three roles.
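The three roles and the role changes mentioned in this article can be sketched as a small state machine. This is a minimal illustration, not code from any Raft library; the names `Role`, `TRANSITIONS`, `Node`, and `change_role` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


# Role changes described in this article (hypothetical table for illustration):
# a follower may become a candidate, a candidate may win or step down,
# and a leader may step down.
TRANSITIONS = {
    Role.FOLLOWER: {Role.CANDIDATE},
    Role.CANDIDATE: {Role.LEADER, Role.FOLLOWER},
    Role.LEADER: {Role.FOLLOWER},
}


@dataclass
class Node:
    name: str
    role: Role = Role.FOLLOWER  # every node starts as a follower

    def change_role(self, new_role: Role) -> None:
        """Move to new_role, refusing changes the table does not list."""
        if new_role not in TRANSITIONS[self.role]:
            raise ValueError(f"{self.role.value} -> {new_role.value} is not allowed")
        self.role = new_role


node = Node("A")
node.change_role(Role.CANDIDATE)  # heartbeat timed out
node.change_role(Role.LEADER)     # won a majority of votes
print(node.role.value)            # leader
```

Note that a follower never jumps straight to leader in this sketch: it has to pass through the candidate role and win an election first.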
3. Single‑Node System
3.1 Database Server
Imagine a single‑node system where the node acts as a database server storing a value X.
3.2 Client
The green solid circle on the left represents the client, and the blue solid circle on the right represents the server, node A. “Term” denotes the election term, which will be explained later.
3.3 Client Sends Data to Server
The client sends an update operation that sets the stored value to 8. In a single-node environment there is only one copy of the value, so any later read returns 8 and consistency is trivial.
3.4 How Does a Multi‑Node System Ensure Consistency?
When multiple server nodes exist (e.g., nodes A, B, C), they form a database cluster. The client sends its update to the leader, which replicates it to the other nodes, and Raft ensures that the stored values remain consistent across the cluster.
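In a multi-node cluster, Raft guarantees that there is at most one leader per term, even in the presence of node failures or network partitions. A stale leader from an earlier term may linger briefly, but it steps down as soon as it learns of a higher term.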
4. Leader Election Process
4.1 Initial State
Initially, all nodes are followers.
The diagram shows three nodes (A, B, C), each at term 0.
4.2 Becoming a Candidate
Each node uses a random election timeout. In the example, node A times out first (150 ms) and becomes a candidate, incrementing its term from 0 to 1 and voting for itself.
Node A: Term = 1, Vote Count = 1.
Node B: Term = 0.
Node C: Term = 0.
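The state change in this step can be sketched as follows. The `Node` class and its `on_election_timeout` method are hypothetical names used only for illustration; the sketch covers nothing beyond the term increment and the self-vote described above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    term: int = 0
    role: str = "follower"
    voted_for: Optional[str] = None
    votes: set = field(default_factory=set)

    def on_election_timeout(self) -> None:
        """No heartbeat arrived in time: become a candidate."""
        self.role = "candidate"
        self.term += 1              # 0 -> 1 in the example
        self.voted_for = self.name  # the candidate votes for itself
        self.votes = {self.name}


a, b, c = Node("A"), Node("B"), Node("C")
a.on_election_timeout()  # node A times out first
for n in (a, b, c):
    print(f"Node {n.name}: Term = {n.term}, Vote Count = {len(n.votes)}")
# Node A: Term = 1, Vote Count = 1
# Node B: Term = 0, Vote Count = 0
# Node C: Term = 0, Vote Count = 0
```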
4.3 Voting
Node A, now a candidate, sends RequestVote RPCs to the other nodes.
Step 1 : Candidate A requests votes.
Step 2 : Nodes B and C, having not voted in term 1, grant their votes to A and update their term.
Step 3 : A receives votes from a majority and becomes the new leader.
Step 4 : Leader A periodically sends heartbeat messages to B and C.
Step 5 : B and C acknowledge the heartbeat.
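Steps 1 to 3 can be sketched in a few lines. This is a simplified illustration with hypothetical names (`Node`, `handle_request_vote`, `run_election`); it runs in one process without real RPCs, and it leaves out the log comparison that a full Raft implementation also performs before granting a vote.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    term: int = 0
    voted_for: Optional[str] = None

    def handle_request_vote(self, candidate: str, candidate_term: int) -> bool:
        """Decide whether to grant a vote (log checks omitted in this sketch)."""
        if candidate_term < self.term:
            return False                 # stale candidate
        if candidate_term > self.term:
            self.term = candidate_term   # adopt the newer term
            self.voted_for = None        # no vote cast yet in this term
        if self.voted_for in (None, candidate):
            self.voted_for = candidate   # at most one vote per term
            return True
        return False


def run_election(candidate: Node, peers: list) -> bool:
    """Return True if the candidate collects a majority of the cluster."""
    candidate.term += 1
    candidate.voted_for = candidate.name
    votes = 1  # the candidate's own vote
    for peer in peers:
        if peer.handle_request_vote(candidate.name, candidate.term):
            votes += 1
    cluster_size = len(peers) + 1
    return votes >= cluster_size // 2 + 1


a, b, c = Node("A"), Node("B"), Node("C")
print(run_election(a, [b, c]))  # True
print(b.term, b.voted_for)      # 1 A
```

Once B and C have voted for A in term 1, a second request for the same term from a different candidate is refused, which is what keeps two leaders from being elected in one term.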
4.4 Terms
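A term is the period during which a leader serves. Each term starts with an election and is identified by a number that only ever increases. Nodes handle terms according to the following rules: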
Automatic Increment : Followers that time out become candidates and increment their term.
Update to Larger Value : If a node learns of a higher term, it updates its own term.
Revert to Follower : A candidate or leader that discovers a higher term steps down to follower.
Reject Stale Requests : Nodes reject RPCs with lower term numbers.
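The last three rules fit into one small function. This is a minimal sketch with hypothetical names (`Node`, `on_rpc`); it looks only at the term number carried by an incoming RPC and ignores everything else in the message.

```python
from dataclasses import dataclass


@dataclass
class Node:
    term: int = 0
    role: str = "follower"

    def on_rpc(self, sender_term: int) -> bool:
        """Apply the term rules to an incoming RPC; return True if accepted."""
        if sender_term < self.term:
            return False             # reject stale requests
        if sender_term > self.term:
            self.term = sender_term  # update to the larger value
            self.role = "follower"   # a candidate or leader steps down
        return True


leader = Node(term=3, role="leader")
print(leader.on_rpc(2), leader.term, leader.role)  # False 3 leader
print(leader.on_rpc(5), leader.term, leader.role)  # True 5 follower
```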
4.5 Election Rules
Once elected, a leader stays in office until it fails or a network problem (for example, a partition or delayed heartbeats) triggers a new election, which starts a new term.
Each server can cast at most one vote per term.
4.6 Majority
For a cluster of N nodes, a majority is at least ⌊N/2⌋ + 1 (e.g., 2 out of 3).
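As a quick check of the formula, the sketch below computes the majority for a few cluster sizes, along with the number of failed nodes the cluster can lose while a majority is still reachable. The function name `majority` is chosen here for illustration.

```python
def majority(n: int) -> int:
    """Smallest number of nodes that is more than half of n."""
    return n // 2 + 1


for n in (3, 4, 5):
    print(f"N={n}: majority={majority(n)}, tolerated failures={n - majority(n)}")
# N=3: majority=2, tolerated failures=1
# N=4: majority=3, tolerated failures=1
# N=5: majority=3, tolerated failures=2
```

Going from 3 to 4 nodes does not increase the number of tolerated failures, which is why odd cluster sizes are the usual choice.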
4.7 Heartbeat Timeout
Randomized election timeouts make it unlikely that several nodes start an election at the same moment. In most cases one node times out first, requests votes, and wins before any other node becomes a candidate.
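A minimal sketch of the idea: each node draws its own timeout from a range, and the node with the smallest value becomes the first candidate. The 150 to 300 ms range and the names `election_timeout_ms` and `first_candidate` are assumptions made for illustration; only the 150 ms figure appears in the example above.

```python
import random


def election_timeout_ms(rng: random.Random, low: float = 150.0, high: float = 300.0) -> float:
    """Draw one randomized election timeout (range is an assumed default)."""
    return rng.uniform(low, high)


def first_candidate(timeouts: dict) -> str:
    """The node whose timeout expires first starts the election."""
    return min(timeouts, key=timeouts.get)


rng = random.Random()
timeouts = {name: election_timeout_ms(rng) for name in ("A", "B", "C")}
for name, ms in timeouts.items():
    print(f"Node {name}: election timeout {ms:.0f} ms")
print("First candidate:", first_candidate(timeouts))
```

The output differs on every run, which is the point: because the timeouts are drawn independently, an exact tie between two nodes is rare.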
5. Leader Failure
If the leader fails, a new election is triggered. The diagram shows leader A failing, after which nodes B and C elect a new leader.
Step 1 : Leader A fails; B and C stop receiving heartbeats.
Step 2 : Node C times out first and becomes a candidate.
Step 3 : C requests votes from A and B.
Step 4 : C receives a vote from B (A cannot respond).
Step 5 : With its own vote plus B's, C holds 2 of 3 votes, which is a majority, and becomes leader.
Step 6 : C sends heartbeats to B (A remains silent).
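The failure scenario can be traced with a short simulation. The names (`Node`, `alive`, `request_vote`) are hypothetical, and a failed node is modeled by returning `None` instead of a reply. The sketch assumes that A was elected in term 1 and that the majority is still counted against the full three-node cluster, not only the nodes that are up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    term: int
    alive: bool = True
    voted_for: Optional[str] = None

    def request_vote(self, candidate: str, term: int) -> Optional[bool]:
        if not self.alive:
            return None  # failed node: no reply at all
        if term > self.term:
            self.term, self.voted_for = term, None  # adopt the newer term
        if term == self.term and self.voted_for in (None, candidate):
            self.voted_for = candidate
            return True
        return False


a = Node("A", term=1, alive=False, voted_for="A")  # former leader, now failed
b = Node("B", term=1, voted_for="A")
c = Node("C", term=1, voted_for="A")

# C times out first, moves to term 2, and votes for itself.
c.term += 1
c.voted_for = "C"
replies = {n.name: n.request_vote("C", c.term) for n in (a, b)}
votes = 1 + sum(1 for granted in replies.values() if granted)
is_leader = votes >= 3 // 2 + 1  # majority of the full 3-node cluster

print(replies)           # {'A': None, 'B': True}
print(votes, is_leader)  # 2 True
```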
6. Summary
Raft ensures a single leader per term and reduces election failures through:
Term management
Leader heartbeat messages
Randomized election timeouts
First‑come‑first‑served voting
Majority vote requirement
This article uses animated diagrams to make the Raft algorithm’s leader election process easier to understand and digest.
Wukong Talks Architecture
Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.