Understanding the Paxos Consensus Algorithm: Principles, Roles, and Derivation
This article provides a comprehensive introduction to the Paxos consensus algorithm, explaining its purpose, core concepts, roles of proposers, acceptors and learners, safety and liveness properties, and the step‑by‑step derivation of its two‑phase protocol for achieving fault‑tolerant distributed consistency.
Paxos What Is It?
Paxos is a message‑driven, highly fault‑tolerant consensus algorithm that is widely regarded as one of the most effective solutions for achieving distributed consistency.
Proposed by Leslie Lamport, Paxos earned him the 2013 Turing Award and has become synonymous with distributed consensus, being used in many Google systems such as Chubby, Megastore, Spanner, as well as open‑source projects like ZooKeeper and MySQL Group Replication.
However, Paxos is often criticized for being difficult to understand and complex to implement.
Background of the Problem
In typical distributed systems, failures such as node crashes or network anomalies (delays, loss, duplication, reordering, partitions) can occur. Paxos aims to ensure that, despite these failures, all processes agree on a single value and that the system’s consistency is never violated.
The "value" may be a log entry, a command, or any application‑specific datum; its exact meaning depends on the context.
Related Concepts
Paxos defines three roles:
Proposer (提案者)
Acceptor (人大代表)
Learner (广大群众)
A single process can simultaneously act as multiple roles.
Another key notion is the proposal , which consists of a proposal number and a value.
Initial Understanding
The Paxos process is analogous to a legislative procedure: a proposer submits a proposal, acceptors vote, and learners learn the chosen value.
Problem Description
Given a set of proposers that can propose values, the algorithm must guarantee that exactly one value is chosen and that all learners agree on that value, even in the presence of crashes and message delays.
Safety requirements: (1) only proposed values may be chosen; (2) at most one value is chosen; (3) any process that believes a value is chosen must be correct.
Derivation Process
Simplest Scheme – One Acceptor
If there is only one acceptor, the first proposal it accepts becomes the chosen value, but the system fails if that acceptor crashes, so multiple acceptors are required.
Multiple Acceptors
With several acceptors, we need additional constraints to ensure a single value is chosen. The first constraint (P1) – an acceptor must accept the first proposal it sees – leads to inconsistency when different proposers send different values to different acceptors.
To avoid this, we add a second rule: a proposal is chosen only if a majority of acceptors accept it.
P2: If a proposal with value v is chosen, any higher‑numbered chosen proposal must also have value v.
We refine P2 into P2a (for acceptors) and later into P2b and P2c, which enforce that once a value is chosen, all later proposals with higher numbers must carry the same value, and that a majority of acceptors either have not seen lower‑numbered proposals or have already accepted the same value.
Proposer Generates Proposals
Before proposing, a proposer performs a "Prepare" phase to learn any previously accepted value. The algorithm proceeds as follows:
Proposer picks a new proposal number N and sends a Prepare request to a majority of acceptors. Each acceptor promises not to accept proposals numbered less than N. If it has already accepted a proposal, it returns the highest‑numbered accepted proposal.
If the proposer receives responses from a majority, it chooses value V = the value of the highest‑numbered returned proposal (or any value if none were returned) and sends an Accept request for [N, V] to a (possibly different) majority of acceptors.
Acceptor Accepts Proposals
An acceptor may ignore any request without breaking safety. It can accept a proposal only if it has not already responded to a Prepare request with a higher number.
Thus each acceptor needs to remember two pieces of state: the highest‑numbered proposal it has accepted and the highest Prepare request number it has responded to.
Paxos Algorithm Description
The algorithm consists of two phases:
Phase 1 (Prepare): Proposer sends a Prepare(N) to a majority; each acceptor replies with its highest accepted proposal (if any) and promises not to accept proposals numbered < N.
Phase 2 (Accept): If the proposer receives a majority of replies, it sends Accept(N, V) where V is the value from the highest‑numbered reply (or a new value). Acceptors that have not responded to a higher Prepare will accept the proposal.
Learner Learning the Chosen Value
Three possible approaches:
Scheme 1
Every acceptor forwards the accepted proposal to all learners (communication cost M×N).
Scheme 2
Acceptors send the proposal to a designated primary learner, which then notifies the others (cost M+N‑1, but introduces a single point of failure).
Scheme 3
A group of learners disseminates the proposal among themselves, improving reliability at the expense of higher complexity.
Ensuring Liveness
By electing a distinguished "primary proposer" that is the only entity allowed to issue proposals, the algorithm guarantees progress as long as the primary and a majority of acceptors remain reachable.
Summary
The article has detailed what Paxos is, its characteristics, and the step‑by‑step derivation of the algorithm, showing why Paxos and its variants remain fundamental to modern distributed consistency solutions.
Final Note (Promotion)
The author invites readers to follow the "Code Monkey Technical Column" public account to obtain PDF collections of "Spring Cloud Advanced", "Spring Boot Advanced", and "MyBatis Advanced" by replying with the respective keywords.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.