Fundamentals 7 min read

How Paxos Guarantees Consistency in Distributed Systems – A Deep Dive

This article explains the Paxos consensus algorithm, detailing its roles, three-phase execution process, key properties such as consistency, availability and fault tolerance, and showcases its practical applications in distributed databases, file systems, and coordination services.

Ops Development & AI Practice
Ops Development & AI Practice
Ops Development & AI Practice
How Paxos Guarantees Consistency in Distributed Systems – A Deep Dive

Introduction

In distributed systems, achieving consensus among replicas is essential for consistency. Paxos, introduced by Leslie Lamport in the 1990s, provides a fault‑tolerant way for a majority of nodes to agree on a single value despite crashes or network partitions.

Roles

Proposer : generates a proposal with a unique, monotonically increasing proposal number and tries to obtain acceptance from a majority of acceptors.

Acceptor : promises not to accept proposals with lower numbers than the highest one it has promised, and may accept a proposal that matches its promise.

Learner : receives commit notifications and learns the final decided value.

Protocol Phases

1. Prepare Phase

The proposer selects a new proposal number (e.g., a timestamp combined with its node ID) and sends a Prepare request to a majority of acceptors.

Each acceptor compares the received number with any previously promised number. If the new number is higher, the acceptor replies with a Promise and includes the highest-numbered proposal it has already accepted, if any.

2. Accept Phase

After collecting a majority of promises, the proposer chooses a value:

If any acceptor reported an already‑accepted proposal, the proposer must adopt the value of the highest-numbered such proposal.

Otherwise it may propose its own value.

The proposer then sends an AcceptRequest (containing the proposal number and chosen value) to the same majority of acceptors.

An acceptor that receives an AcceptRequest whose number matches its most recent promise records the value as accepted and replies with an Accepted message.

3. Commit Phase

When the proposer receives Accepted responses from a majority, it considers the proposal committed.

The proposer broadcasts a Commit message to all learners (and optionally to all acceptors) indicating the final value.

Learners update their state with the committed value.

Key Properties

Safety (Consistency) : No two distinct values can be committed; all correct nodes eventually see the same value.

Liveness (Availability) : As long as a majority of nodes are reachable and can communicate, progress is possible.

Fault Tolerance : The algorithm tolerates up to ⌊(N‑1)/2⌋ crash failures in a system of N nodes.

Typical Deployments

Distributed Databases : Google Spanner uses Paxos to replicate data across data centers, ensuring strong consistency.

Distributed File Systems : Ceph’s metadata service employs a modified Paxos variant to keep file‑system metadata synchronized.

Coordination Services : Apache ZooKeeper implements the Zab protocol, which is derived from Paxos, for leader election and configuration management.

Paxos diagram
Paxos diagram

References

Lamport, L. (1998). The Part‑Time Parliament. ACM Transactions on Computer Systems , 16(2), 133‑169.

Chandra, T. D., Griesemer, R., & Redstone, J. (2007). Paxos Made Live – An Engineering Perspective. Proceedings of the 26th ACM Symposium on Principles of Distributed Computing.

Google Spanner Documentation. https://cloud.google.com/spanner/

Ceph Documentation. https://docs.ceph.com/docs/master/

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Distributed Systemsdatabasefault toleranceConsistencyConsensus AlgorithmPaxos
Ops Development & AI Practice
Written by

Ops Development & AI Practice

DevSecOps engineer sharing experiences and insights on AI, Web3, and Claude code development. Aims to help solve technical challenges, improve development efficiency, and grow through community interaction. Feel free to comment and discuss.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.