Why Raft Beats Paxos and EPaxos: A Deep Dive into Distributed Consensus
This article explores the evolution of distributed consensus, from Paxos to Multi‑Paxos, Raft, and EPaxos. It examines their mechanisms, understandability, efficiency, availability, and suitable scenarios, and closes with a comparative analysis and thought‑provoking questions for practitioners in modern cloud systems.
Introduction
Distributed consensus is a cornerstone of distributed systems. As systems scale, the demand for high availability and strong consistency grows, which has led to widespread industrial adoption of consensus protocols. This article traces the evolution from the original Paxos to the popular Raft and the leaderless EPaxos, and compares them on technical grounds.
What is Distributed Consensus?
In simple terms, distributed consensus ensures that after one or more processes propose a value, all processes in the system agree on that same value.
Paxos
Paxos reaches a decision through two phases: Prepare and Accept.
Prepare: proposers compete for the right to propose; only after winning can they move to the Accept phase.
Accept: the proposal gains a majority, forming a decision that can be learned by all correct processes.
Basic Paxos requires at least two network round‑trips per decision and can suffer from livelock under high concurrency, when competing proposers repeatedly preempt each other's Prepare requests.
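The two phases can be sketched in a few lines of Python. This is a minimal, in-process illustration of a single decision, assuming integer proposal IDs and no network, failures, or persistence; the names Acceptor, on_prepare, on_accept, and propose are hypothetical and not taken from any library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Acceptor:
    promised_id: int = -1                 # highest proposal ID promised so far
    accepted_id: int = -1                 # ID of the last accepted proposal (-1 = none)
    accepted_value: Optional[str] = None  # value of the last accepted proposal

    def on_prepare(self, proposal_id: int):
        """Prepare phase: promise to ignore lower IDs and report any accepted value."""
        if proposal_id > self.promised_id:
            self.promised_id = proposal_id
            return True, self.accepted_id, self.accepted_value
        return False, self.accepted_id, self.accepted_value

    def on_accept(self, proposal_id: int, value: str) -> bool:
        """Accept phase: accept unless a higher ID has been promised in the meantime."""
        if proposal_id >= self.promised_id:
            self.promised_id = proposal_id
            self.accepted_id = proposal_id
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], proposal_id: int, value: str) -> Optional[str]:
    """Run both phases; return the chosen value, or None if this proposer lost."""
    majority = len(acceptors) // 2 + 1

    # Round-trip 1: Prepare. The proposer needs promises from a majority.
    replies = [a.on_prepare(proposal_id) for a in acceptors]
    promises = [(acc_id, acc_val) for ok, acc_id, acc_val in replies if ok]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, the proposer must adopt the
    # value with the highest accepted ID instead of its own.
    prev_id, prev_val = max(promises, key=lambda p: p[0])
    if prev_id >= 0:
        value = prev_val

    # Round-trip 2: Accept. The value is decided once a majority accepts it.
    acks = sum(a.on_accept(proposal_id, value) for a in acceptors)
    return value if acks >= majority else None
```

In this sketch a second proposer with a higher ID does not overwrite an earlier decision; it re-proposes the value it learns during Prepare. Livelock corresponds to two proposers alternately calling on_prepare with ever higher IDs so that neither on_accept succeeds.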
Multi‑Paxos
Multi‑Paxos introduces a stable leader to reduce the two‑phase overhead. Once the leader has completed the Prepare phase, it can skip that phase for subsequent proposals, so each of them needs only the Accept phase. Multi‑Paxos does not assume the leader is unique, however: multiple leaders may propose concurrently, and in the worst case the protocol falls back to Basic Paxos.
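The saving can be illustrated with a toy cost model. The sketch below only counts round-trips and exchanges no messages; the class MultiPaxosLeader, its fields, and the rule for picking a new ballot after preemption are assumptions made for illustration.

```python
class MultiPaxosLeader:
    """Toy model: counts round-trips a leader spends, not a real protocol."""

    def __init__(self, ballot: int):
        self.ballot = ballot
        self.prepared = False      # has Prepare succeeded for this ballot?
        self.next_instance = 0     # next log slot to fill
        self.round_trips = 0

    def propose(self, value: str):
        if not self.prepared:
            # One Prepare covers all future instances under this ballot.
            self.round_trips += 1
            self.prepared = True
        # Every proposal still needs its own Accept round-trip.
        self.round_trips += 1
        instance = self.next_instance
        self.next_instance += 1
        return instance, value

    def on_preempted(self, higher_ballot: int):
        # Another leader used a higher ballot: Prepare must be run again.
        # Choosing higher_ballot + 1 is a simplification for this sketch.
        self.ballot = higher_ballot + 1
        self.prepared = False


leader = MultiPaxosLeader(ballot=1)
for v in ["a", "b", "c"]:
    leader.propose(v)
# Three proposals cost 1 Prepare + 3 Accepts rather than 3 + 3.
```

If competing leaders preempt each other before every proposal, each proposal pays for Prepare again, which is the Basic Paxos worst case described above.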
Raft
Raft was designed from the viewpoint of a replicated state machine, using stronger assumptions to simplify the protocol and make it easier to understand and implement.
Similar concepts between Raft and Multi‑Paxos:
Raft’s Leader = Multi‑Paxos’s Proposer.
Raft’s Term = Multi‑Paxos’s Proposal ID.
Raft’s Log Entry = Multi‑Paxos’s Proposal.
Raft’s Log Index = Multi‑Paxos’s Instance ID.
Raft’s leader election = Multi‑Paxos’s Prepare phase.
Raft’s log replication = Multi‑Paxos’s Accept phase.
Key differences:
Raft is built around a single strong leader: at most one leader is elected per term, and only the leader may propose entries, which simplifies the safety argument.
In Multi‑Paxos the leader is only an efficiency optimization (a weak leader): multiple leaders may propose concurrently, and safety does not depend on there being only one.
Raft requires logs to be contiguous and tells followers what is committed through a commit index, while Multi‑Paxos logs may contain gaps and need an extra commit message.
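Log continuity and the commit index can be sketched as two small functions. This is a simplified illustration that assumes 1-based log indices, an in-memory list as the log, and no RPC layer; the names Entry, append_entries, and advance_commit_index are hypothetical and only loosely follow the shape of Raft's AppendEntries handling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    term: int
    command: str


def append_entries(log: List[Entry], prev_index: int, prev_term: int,
                   entries: List[Entry]) -> bool:
    """Follower side. prev_index is the 1-based index just before the new entries."""
    # Reject if accepting would leave a gap in the log.
    if prev_index > len(log):
        return False
    # Reject if the entry preceding the new ones does not match the leader's.
    if prev_index > 0 and log[prev_index - 1].term != prev_term:
        return False
    for offset, entry in enumerate(entries):
        pos = prev_index + offset          # 0-based position of this entry
        if pos < len(log):
            if log[pos].term != entry.term:
                del log[pos:]              # conflict: drop this entry and all after it
                log.append(entry)
        else:
            log.append(entry)
    return True


def advance_commit_index(log: List[Entry], match_index: List[int],
                         current_term: int, commit_index: int) -> int:
    """Leader side. match_index holds the highest replicated index per server,
    including the leader's own last index."""
    majority = len(match_index) // 2 + 1
    # The majority-th largest value is replicated on at least a majority.
    candidate = sorted(match_index, reverse=True)[majority - 1]
    # Only entries from the leader's current term are committed by counting.
    if candidate > commit_index and log[candidate - 1].term == current_term:
        return candidate
    return commit_index
```

Because a follower rejects anything that would create a gap, a single integer (the commit index) is enough to describe what is committed, which is why no per-entry commit message is needed in this model.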
EPaxos
EPaxos (Egalitarian Paxos) is a leaderless consensus algorithm introduced at SOSP’13. Any replica can commit a log entry, typically requiring one or two network round‑trips.
Advantages of EPaxos include:
No leader election overhead, leading to higher availability.
Balanced load across replicas, eliminating leader bottlenecks.
Clients can contact the nearest replica, reducing latency in cross‑AZ or cross‑region deployments.
EPaxos decides the order of instances dynamically at runtime, using a graph‑based approach: logs are graph nodes, ordering relations are edges, and a topological sort determines the final sequence. It introduces a Fast Path (PreAccept) when there is no conflict and a Slow Path (Accept) when conflicts exist.
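The graph-based ordering can be sketched with Python's standard graphlib module (Python 3.9+). This illustration assumes the dependency graph is acyclic; EPaxos itself can produce cyclic dependencies between conflicting commands and needs an extra tie-breaking step for them, which is omitted here. The instance names and the function execution_order are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def execution_order(deps: Dict[str, Set[str]]) -> List[str]:
    """deps maps each committed instance to the instances it depends on.

    Returns an order in which every instance appears after its dependencies.
    Raises graphlib.CycleError if the graph contains a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


# Hypothetical instances named "<replica>.<slot>".
deps = {
    "R1.0": set(),
    "R2.0": {"R1.0"},           # conflicts with R1.0, ordered after it
    "R3.0": {"R1.0", "R2.0"},   # conflicts with both
    "R2.1": set(),              # touches unrelated state: no ordering constraint
}
order = execution_order(deps)
```

Instances with no edge between them, such as R2.1 here, do not conflict, so their relative position in the result does not affect the final state.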
Comparative Analysis
Understandability: Paxos is notoriously hard to grasp; Raft was designed to be easy to understand and implement, and it was adopted rapidly. EPaxos, although published (SOSP’13) before the Raft paper (USENIX ATC’14), remains difficult to understand, which has limited its use in engineering practice.
Efficiency (load balancing, message complexity, pipeline, concurrency):
Load balancing – EPaxos distributes load evenly; Raft and Multi‑Paxos concentrate load on the leader.
Message complexity – Raft requires the fewest messages, Paxos is next, EPaxos can require more due to conflict handling.
Pipeline – Multi‑Paxos and EPaxos support out‑of‑order pipelines; Raft traditionally supports only in‑order pipelines unless additional mechanisms are added.
Concurrency – Paxos retries on conflict, Raft avoids conflict via a strong leader, EPaxos resolves conflicts directly, offering higher parallelism.
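Availability: EPaxos can serve from any replica, providing superior availability. Between Raft and Multi‑Paxos, the weak‑leader model lets Multi‑Paxos recover faster after a leader failure, because another replica can start proposing right away, giving it a slight edge over Raft, where followers must first wait out the election timeout (or leader lease) before a new strong leader can be elected.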
Applicable Scenarios: EPaxos shines in cross‑AZ/region deployments with stringent availability requirements and where leader bottlenecks are a concern. Multi‑Paxos and Raft suit typical intra‑datacenter high‑availability use cases.
Thought Questions
1) Does a Paxos Proposal ID need to be globally unique, and what happens if it isn’t?
2) What are the correctness implications of merging Max Proposal ID and Accepted Proposal ID in Paxos?
3) What role does Raft’s PreVote play, and is it always necessary?
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
