Understanding Raft: A Beginner’s Guide to Distributed Consensus
An in‑depth overview of the Raft consensus algorithm explains its server states, RPC mechanisms, leader election process, log replication workflow, safety guarantees, and how the protocol handles failures, illustrated with diagrams and practical examples.
Overview
A server in a Raft cluster is in one of three states: Leader, Candidate, or Follower. Followers are passive: they issue no requests of their own and only respond to RPCs from leaders and candidates. The leader handles all client requests; a follower that receives a client request redirects it to the leader. When a follower goes a full election timeout without receiving a heartbeat from the leader or granting a vote to a candidate, it becomes a candidate and starts an election.
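The state changes described above can be sketched as a small transition function. This is only an illustration: the event names ("timeout", "won_election", "saw_leader", "saw_higher_term") are hypothetical labels chosen for this sketch, not identifiers from Raft itself.

```python
from enum import Enum


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


def next_role(role: Role, event: str) -> Role:
    """Return the role a server moves to when `event` occurs.

    Event names are hypothetical labels for this sketch:
      "timeout"         - election timeout elapsed without hearing from a leader
      "won_election"    - received votes from a majority of servers
      "saw_leader"      - received AppendEntries from a legitimate leader
      "saw_higher_term" - an RPC carried a term above our own
    """
    if event == "saw_higher_term":
        return Role.FOLLOWER          # every role steps down on a newer term
    if role is Role.FOLLOWER and event == "timeout":
        return Role.CANDIDATE         # start an election
    if role is Role.CANDIDATE:
        if event == "timeout":
            return Role.CANDIDATE     # no winner: start a new election
        if event == "won_election":
            return Role.LEADER
        if event == "saw_leader":
            return Role.FOLLOWER
    return role                       # any other combination changes nothing
```

For example, `next_role(Role.FOLLOWER, "timeout")` yields `Role.CANDIDATE`, and a leader that observes a higher term returns to `Role.FOLLOWER`.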
Raft uses two RPC types: RequestVote and AppendEntries.
RequestVote RPC : sent by a candidate during an election. A server grants its vote only if the candidate’s term is at least as large as its own current term, it has not already voted for a different candidate in that term, and the candidate’s log is at least as up‑to‑date as its own.
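A minimal sketch of the receiving side of RequestVote, assuming an in-memory server with no persistence or networking. The class name `Voter` and its method names are made up for this example; the log is simplified to a list of `(term, command)` pairs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Voter:
    current_term: int = 0
    voted_for: Optional[str] = None
    log: List[Tuple[int, str]] = field(default_factory=list)  # (term, command)

    def last_log(self) -> Tuple[int, int]:
        """Return (term, index) of the last entry, or (0, 0) for an empty log."""
        if not self.log:
            return (0, 0)
        return (self.log[-1][0], len(self.log))

    def handle_request_vote(self, term: int, candidate_id: str,
                            last_log_term: int, last_log_index: int) -> bool:
        if term < self.current_term:
            return False                      # stale candidate
        if term > self.current_term:
            self.current_term = term          # newer term: forget any old vote
            self.voted_for = None
        if self.voted_for not in (None, candidate_id):
            return False                      # already voted for someone else
        # "Up-to-date" compares the last entry's term first, then its index;
        # tuple comparison implements exactly that ordering.
        if (last_log_term, last_log_index) < self.last_log():
            return False
        self.voted_for = candidate_id
        return True
```

Note that a request with a higher term still updates `current_term` even when the vote is refused because the candidate's log is behind.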
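AppendEntries RPC : sent by the leader to replicate log entries; with no entries it serves as a heartbeat. The receiver applies several checks: it rejects the request if the leader’s term is older than its own, or if its log has no entry matching the previous index and term; otherwise it deletes any existing entries that conflict with the new ones, appends the entries it does not yet have, and advances its commit index to follow the leader’s. A newly elected leader immediately sends empty AppendEntries RPCs to assert its leadership and prevent new elections.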
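The receiver-side checks can be sketched as follows. This is a simplified in-memory model, not a complete implementation: the class name `Follower` is an assumption of this example, log indices start at 1 (so `log[0]` holds index 1), and persistence, role changes, and replies carrying the term are omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogEntry:
    term: int
    command: str


@dataclass
class Follower:
    current_term: int = 0
    commit_index: int = 0
    log: List[LogEntry] = field(default_factory=list)  # log[0] is index 1

    def handle_append_entries(self, term: int, prev_log_index: int,
                              prev_log_term: int, entries: List[LogEntry],
                              leader_commit: int) -> bool:
        # 1. Reject a leader whose term is older than ours.
        if term < self.current_term:
            return False
        self.current_term = term
        # 2. The entry at prev_log_index must exist and carry prev_log_term.
        if prev_log_index > 0:
            if len(self.log) < prev_log_index:
                return False
            if self.log[prev_log_index - 1].term != prev_log_term:
                return False
        # 3. Drop conflicting entries, append the ones we do not have yet.
        for offset, entry in enumerate(entries):
            index = prev_log_index + 1 + offset
            if index <= len(self.log):
                if self.log[index - 1].term == entry.term:
                    continue                  # already stored
                del self.log[index - 1:]      # conflict: drop it and the rest
            self.log.append(entry)
        # 4. Follow the leader's commit index, never past the last new entry
        #    and never backwards (guards against delayed, reordered RPCs).
        last_new_index = prev_log_index + len(entries)
        self.commit_index = max(self.commit_index,
                                min(leader_commit, last_new_index))
        return True
```

A heartbeat is simply a call with `entries=[]`: it still runs the consistency check and can still advance `commit_index`.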
Leader Election
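Raft elects a single leader to manage log replication. A follower whose election timeout expires becomes a candidate, increments its term, votes for itself, and requests votes from the other servers. An election ends in one of three ways: the candidate receives votes from a majority of servers and becomes leader; another server establishes itself as leader, and the candidate returns to the follower state; or the timeout expires with no winner, and the candidate starts a new election.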
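Terms are numbered with integers that increase monotonically, and each term begins with an election. Each server picks its election timeout at random from a fixed range (e.g., 150‑300 ms) so that servers rarely time out at the same moment, which reduces the chance of split votes.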
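As a small illustration of the two numeric rules involved in an election, the sketch below picks a randomized timeout and checks for a majority. The function names are hypothetical, and the 150‑300 ms default simply reuses the example range given above.

```python
import random
from typing import Optional


def pick_election_timeout_ms(low_ms: float = 150.0, high_ms: float = 300.0,
                             rng: Optional[random.Random] = None) -> float:
    """Draw a fresh timeout uniformly from [low_ms, high_ms]."""
    source = rng if rng is not None else random
    return source.uniform(low_ms, high_ms)


def has_won(votes: int, cluster_size: int) -> bool:
    """A candidate wins with votes from a strict majority of all servers."""
    return votes >= cluster_size // 2 + 1
```

In a five-server cluster a candidate needs three votes (its own included); in a four-server cluster it also needs three, which is why even-sized clusters add no extra fault tolerance.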
Log Replication
Raft follows a strong leader model: only the leader appends entries to its log and replicates them to followers via AppendEntries RPCs. A log entry (LogEntry) contains an index, term, and command.
Typical replication steps:
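1. The client sends a request to the leader.
2. The leader appends the command as a new LogEntry to its own log.
3. The leader sends AppendEntries RPCs in parallel; once the entry is stored on a majority of servers (the leader included), the leader commits it, applies it to its state machine, and replies to the client.
4. Subsequent AppendEntries RPCs, including heartbeats, carry the leader’s commit index, which tells the followers to commit the entry as well.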
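Key points: if two entries in different logs have the same index and term, they store the same command, and the logs are identical in all preceding entries; committing an entry also commits every entry before it; a leader never overwrites or deletes entries in its own log, it only appends; committed entries survive leader changes; the leader retries AppendEntries RPCs indefinitely until every follower has stored the entries, even after it has replied to the client; and when a follower’s log disagrees with the leader’s, the leader decrements that follower’s nextIndex until it finds the point where the two logs match, then overwrites the follower’s entries after that point.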
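The nextIndex backtracking can be sketched with logs reduced to lists of terms, where position `i` holds the term of the entry at index `i + 1`. The function names are invented for this example, and the loop stands in for a series of rejected AppendEntries RPCs rather than real network calls.

```python
from typing import List


def follower_accepts(follower_terms: List[int], prev_index: int,
                     prev_term: int) -> bool:
    """The follower's consistency check on (prev_index, prev_term)."""
    if prev_index == 0:
        return True                       # nothing precedes the first entry
    return (len(follower_terms) >= prev_index
            and follower_terms[prev_index - 1] == prev_term)


def find_next_index(leader_terms: List[int], follower_terms: List[int]) -> int:
    """Walk nextIndex backwards until the follower's check succeeds."""
    next_index = len(leader_terms) + 1    # optimistic start: just past the end
    while True:
        prev_index = next_index - 1
        prev_term = leader_terms[prev_index - 1] if prev_index > 0 else 0
        if follower_accepts(follower_terms, prev_index, prev_term):
            return next_index             # leader sends entries from here on
        next_index -= 1                   # rejected: try one entry earlier


leader = [1, 1, 1, 4, 4, 5, 5, 6, 6, 6]
follower = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3]
print(find_next_index(leader, follower))  # prints 4
```

In this example the logs agree on indices 1 to 3, so the leader resends everything from index 4 and the follower discards its conflicting entries from terms 2 and 3.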
Safety
Raft adds several restrictions to guarantee safety:
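Election restriction : a candidate receives a vote only if its log is at least as up‑to‑date as the voter’s; voters reject candidates with older logs. Logs are compared by the term of their last entry, and by length when those terms are equal. Because a winner needs votes from a majority, its log is guaranteed to contain every committed entry.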
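Entries from previous terms : a leader commits by counting replicas only for entries from its current term. An entry from an older term may be stored on a majority of servers and still be overwritten by a future leader, so it is considered committed only once a later entry from the leader’s current term has been committed, which commits everything before it.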
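A sketch of how a leader might advance its commit index under this rule. The function and parameter names are assumptions of this example; `match_index` maps each follower to the highest log index known to be stored on it, and the log is reduced to a list of terms.

```python
from typing import Dict, List


def advance_commit_index(log_terms: List[int], current_term: int,
                         commit_index: int, match_index: Dict[str, int],
                         cluster_size: int) -> int:
    """Return the new commit index for the leader (indices start at 1)."""
    majority = cluster_size // 2 + 1
    # Scan from the newest entry down to just above the current commit index.
    for n in range(len(log_terms), commit_index, -1):
        replicas = 1 + sum(1 for m in match_index.values() if m >= n)  # +1: leader
        # Counting replicas is only allowed for entries of the current term.
        if replicas >= majority and log_terms[n - 1] == current_term:
            return n
    return commit_index


# Leader in term 4 with log terms [1, 2, 4] in a five-server cluster.
log = [1, 2, 4]
stale = {"s2": 2, "s3": 2, "s4": 1, "s5": 1}
print(advance_commit_index(log, 4, 1, stale, 5))   # prints 1
fresh = {"s2": 3, "s3": 3, "s4": 1, "s5": 1}
print(advance_commit_index(log, 4, 1, fresh, 5))   # prints 3
```

In the first call index 2 is on three of five servers but belongs to term 2, so nothing is committed. Once the term‑4 entry at index 3 reaches a majority, the commit index jumps to 3 and the older entry at index 2 is committed along with it.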
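Follower/Candidate crashes : if a follower or candidate crashes, RPCs sent to it fail, and the sender retries them indefinitely; once the server restarts, the RPC completes. Because Raft RPCs are idempotent, receiving the same request twice (for example, after a crash that happened before the reply was sent) causes no harm.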
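Client interaction : each client assigns a unique serial number to every command. The state machine tracks the latest serial number processed for each client, so a command that is retried (for example, after a leader crash) is detected and not executed twice.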
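One way to picture the duplicate detection is a state machine that keeps a small session table per client. This is a sketch under stated assumptions: the class `DedupStateMachine`, the counter-increment command, and the rule that each client has at most one request outstanding are all choices made for this example.

```python
from typing import Dict, Tuple


class DedupStateMachine:
    """Counter store that executes each (client, serial) command at most once."""

    def __init__(self) -> None:
        self.data: Dict[str, int] = {}
        # client_id -> (latest serial executed, result returned for it)
        self.sessions: Dict[str, Tuple[int, int]] = {}

    def apply(self, client_id: str, serial: int, key: str, delta: int) -> int:
        last = self.sessions.get(client_id)
        if last is not None and serial <= last[0]:
            # Already executed (assumes one outstanding request per client):
            # answer from the cache instead of incrementing again.
            return last[1]
        result = self.data.get(key, 0) + delta
        self.data[key] = result
        self.sessions[client_id] = (serial, result)
        return result
```

An increment is used here because it is not idempotent: without the session table, a retried command would be counted twice.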
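Operational requirements : safety holds under all non‑Byzantine conditions (network delay, partition, packet loss, duplication, and reordering) and does not depend on timing. Availability requires that a majority of servers are running and can communicate, and that broadcast time << election timeout << mean time between failures (MTBF), so that a steady leader can be elected and maintained.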
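The timing requirement can be turned into a rough sanity check for a deployment's settings. This is a hypothetical helper: the function name is made up, and reading "<<" as "at least a factor of `margin` apart" (10 by default) is an assumption of this sketch, not a threshold Raft prescribes.

```python
def timing_supports_stable_leader(broadcast_ms: float,
                                  election_timeout_ms: float,
                                  mtbf_ms: float,
                                  margin: float = 10.0) -> bool:
    """Check broadcast time << election timeout << MTBF.

    `margin` is an assumed factor standing in for "much less than".
    """
    return (broadcast_ms * margin <= election_timeout_ms
            and election_timeout_ms * margin <= mtbf_ms)


one_month_ms = 30 * 24 * 3600 * 1000
print(timing_supports_stable_leader(10, 150, one_month_ms))    # prints True
print(timing_supports_stable_leader(100, 150, one_month_ms))   # prints False
```

The second call fails because a 100 ms broadcast time is too close to a 150 ms election timeout: followers could time out before heartbeats reliably arrive.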
Reference
Animation demo: http://thesecretlivesofdata.com/raft/
Original paper: https://ramcloud.atlassian.net/wiki/download/attachments/6586375/raft.pdf
Chinese translation: https://github.com/maemual/raft-zh
Implementations: https://raft.github.io/#implementations
Additional articles: https://toutiao.io/posts/hdufp0/preview, https://juejin.im/entry/5b74e1f0f265da283479709f
