How Raft Guarantees Consistent Log Replication and Leader Election
This article explains the Raft consensus algorithm: how it decomposes consensus into leader election, log replication, and safety, and how server roles, term handling, election rules, log matching, snapshotting, and its two RPCs together implement a reliable replicated state machine.
Background
Consensus algorithms were introduced to build replicated state machines; Raft was proposed as an understandable way to manage a replicated log across multiple servers.
Leader Election
Raft divides the consensus problem into three sub-problems, the first being leader election. At any time the cluster has one Leader; the remaining servers are Followers. If a client contacts a follower, the follower redirects the request to the leader.
When a leader fails, followers become Candidates and start an election. Raft defines a term (a monotonically increasing number). Each term begins with an election; the winner serves as leader for the rest of the term.
Election rules:
If a server receives an RPC with a higher term than its own, it updates its term and reverts to follower; it grants its vote only if it has not yet voted in that term and the candidate's log is at least as up-to-date as its own.
Within a term, a server may vote for at most one candidate, on a first‑come‑first‑served basis.
To avoid split votes, Raft uses random election timeouts (e.g., 150‑300 ms). Each server picks a timeout in this range; the one whose timeout expires first becomes candidate and usually wins the election.
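The voting rules above can be sketched in a few lines. This is a simplified sketch, not the full algorithm: the function and field names (`handle_request_vote`, `current_term`, `voted_for`) are illustrative, and the log up-to-date check from the Safety section is omitted here.

```python
import random

# Timeout range taken from the text's example (150-300 ms)
ELECTION_TIMEOUT_MIN_MS = 150
ELECTION_TIMEOUT_MAX_MS = 300

def new_election_timeout():
    """Pick a fresh randomized timeout; staggering timeouts makes
    split votes rare, since one server usually times out first."""
    return random.uniform(ELECTION_TIMEOUT_MIN_MS, ELECTION_TIMEOUT_MAX_MS)

def handle_request_vote(state, candidate_term, candidate_id):
    """One vote per term, first come first served.

    `state` is a dict holding 'current_term' and 'voted_for'.
    Returns True if the vote is granted.
    """
    if candidate_term < state["current_term"]:
        return False  # stale candidate
    if candidate_term > state["current_term"]:
        # Newer term: adopt it and forget any vote cast in the old term
        state["current_term"] = candidate_term
        state["voted_for"] = None
    if state["voted_for"] in (None, candidate_id):
        state["voted_for"] = candidate_id
        return True
    return False
```

With this rule, a second candidate asking for a vote in the same term is refused, which is exactly what makes Election Safety (at most one leader per term) possible.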
Log Replication
Raft maintains a replicated log to which the leader appends entries. An entry is committed once it is stored on a majority of servers. The Log Matching Property states that if two entries in different logs have the same index and term, they store the same command, and all preceding entries are identical.
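The Log Matching Property is enforced by a consistency check that followers run on each AppendEntries RPC. A minimal sketch, assuming the log is a Python list of (term, command) tuples and indices are 1-based as in the Raft paper:

```python
def append_entries_check(log, prev_log_index, prev_log_term):
    """Follower-side consistency check behind the Log Matching Property.

    Accept new entries only if our own entry at prev_log_index carries
    prev_log_term; a matching (index, term) pair implies the entire
    prefixes of the two logs are identical.
    """
    if prev_log_index == 0:
        return True  # appending at the very start of the log
    if prev_log_index > len(log):
        return False  # we are missing entries; leader must back up
    return log[prev_log_index - 1][0] == prev_log_term
```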
Replication flow:
Client sends a request to the leader; the leader appends the entry to its log.
The leader concurrently sends AppendEntries RPCs to followers.
After the entry is safely replicated on a majority, the leader applies it to its state machine and replies to the client.
The leader then informs followers that the entry can be committed.
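The "safely replicated on a majority" step can be sketched as follows. This is a simplified illustration: `match_index` maps follower id to the highest log index known replicated on that follower (the leader counts itself), and the sketch ignores the current-term restriction Raft also applies before committing.

```python
def committed_index(match_index, leader_last_index, cluster_size):
    """Return the highest log index stored on a majority of servers.

    Scans from the leader's last index downward; the first index
    replicated on a strict majority is the commit point.
    """
    for n in range(leader_last_index, 0, -1):
        # The leader itself always holds entry n, hence the 1 +
        replicas = 1 + sum(1 for m in match_index.values() if m >= n)
        if replicas * 2 > cluster_size:
            return n
    return 0  # nothing committed yet
```

For example, in a 5-server cluster where only one follower has entry 2 but two followers have entry 1, only index 1 is committed.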
If the leader crashes before committing an entry, a new election occurs and the new leader continues replication. Raft's RPCs are idempotent, so retransmissions are harmless; to avoid executing a client command twice, servers additionally deduplicate client requests.
A network partition can temporarily leave two servers each believing it is leader (in different terms); only the one that can reach a majority can commit entries, and once the partition heals the stale leader steps down and the system converges to a single leader.
Safety Guarantees
Raft enforces several safety mechanisms:
Election Safety: at most one leader can be elected per term.
Leader Append-Only: leaders never overwrite or delete log entries; they only append.
Log Matching: identical index-term pairs guarantee identical logs up to that point.
Leader Completeness: any entry committed in a given term will appear in the logs of all later leaders.
State Machine Safety: once an entry is applied to a state machine at a given index, no other server will apply a different entry at that index.
When deciding whether to grant a vote, a server compares the candidate's last log index and term against its own; this election restriction prevents a candidate with an outdated log from winning and becoming a leader that lacks committed entries.
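The "at least as up-to-date" comparison is small but easy to get wrong: terms are compared first, and log length only breaks ties. A sketch with illustrative parameter names:

```python
def log_is_up_to_date(candidate_last_term, candidate_last_index,
                      my_last_term, my_last_index):
    """Raft's election restriction: the candidate's log must be at
    least as up-to-date as the voter's.

    A log with a higher last term wins outright; if the last terms
    are equal, the longer log wins.
    """
    if candidate_last_term != my_last_term:
        return candidate_last_term > my_last_term
    return candidate_last_index >= my_last_index
```

Note that a longer log with an older last term still loses: term, not length, is the primary criterion.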
Snapshots
To prevent unbounded log growth, Raft allows each server to take a snapshot of its state machine independently, discarding the log entries the snapshot covers. A snapshot records the index and term of the last included log entry together with the state machine's state. Excessive snapshotting wastes disk bandwidth, while infrequent snapshots let the log grow and slow recovery.
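Log compaction can be sketched as a single operation that persists the state plus the metadata of the last covered entry, then truncates the log prefix. The names here (`take_snapshot`, the dict keys) are illustrative, not Raft's actual data structures; the log is again a list of (term, command) tuples with 1-based indexing.

```python
def take_snapshot(state_machine_state, log, last_applied):
    """Compact the log up to last_applied.

    Returns the snapshot (with last included index/term, needed so the
    AppendEntries consistency check still works after truncation) and
    the remaining log suffix.
    """
    snapshot = {
        "last_included_index": last_applied,
        "last_included_term": log[last_applied - 1][0],
        "state": state_machine_state,
    }
    remaining = log[last_applied:]  # entries after the snapshot point
    return snapshot, remaining
```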
Key RPCs and Principles
RequestVote RPC – used by candidates to solicit votes.
AppendEntries RPC – used by leaders to replicate log entries and send heartbeats.
Role Behaviors
All servers :
If commitIndex > lastApplied, increment lastApplied and apply log[lastApplied] to the state machine.
If an RPC contains a term larger than currentTerm, update currentTerm and become a follower.
Followers :
Respond to RPCs from candidates and leaders.
If no heartbeat or vote request is received before the election timeout, become a candidate.
Candidates :
Start an election: increment currentTerm, vote for self, reset election timer, send RequestVote RPCs.
Become leader if receiving votes from a majority.
Revert to follower upon receiving a valid AppendEntries RPC from a new leader.
Restart election if timeout expires without winning.
Leaders :
Send empty AppendEntries RPCs (heartbeats) periodically to prevent elections.
On client request, append entry to local log, replicate it, apply to state machine once committed, and respond to client.
Maintain nextIndex[] (next log entry to send to each follower) and matchIndex[] (highest log index known to be replicated on each follower).
If AppendEntries fails due to log inconsistency, decrement nextIndex and retry.
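The leader's nextIndex/matchIndex bookkeeping above can be sketched as a single reply handler. This is a simplified sketch (the helper name and the decrement-by-one backoff follow the basic algorithm; real implementations often back off faster):

```python
def on_append_entries_reply(next_index, match_index, follower, ok, sent_up_to):
    """Leader-side handling of one AppendEntries reply.

    next_index / match_index are dicts keyed by follower id;
    sent_up_to is the index of the last entry included in the RPC.
    """
    if ok:
        # Follower accepted: record progress and advance the send point
        match_index[follower] = sent_up_to
        next_index[follower] = sent_up_to + 1
    else:
        # Log inconsistency: back up one entry and retry from there
        next_index[follower] = max(1, next_index[follower] - 1)
```

Each rejection moves nextIndex back until the leader reaches the point where the two logs agree, after which AppendEntries succeeds and the follower's log converges with the leader's.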
Images in the original article illustrate the leader election process, term structure, log replication flow, and safety diagrams.
This article has been distilled and summarized from source material, then republished for learning and reference.