
Understanding Paxos: How Distributed Systems Reach Consensus

This article provides a vivid explanation of the Paxos algorithm, illustrating how it achieves reliable consensus among unreliable processors through a two‑phase prepare/promise and propose/accept process, using distributed auction analogies, message sequencing, and read/write operations to ensure consistency in distributed systems.

ITFLY8 Architecture Home

This is a vivid explanation and demonstration of the Paxos algorithm, which can achieve reliable consensus even under highly unreliable network conditions.

What is consensus?

In a distributed system, messages can be delayed or lost, which makes it hard for computers to agree on a single value. Consider three computers (X, Y, Z) that must agree on an attack time: if any message between them is delayed or dropped, they may each end up with a different time.

A machine that receives no reply cannot tell whether its peer has crashed or the message is merely delayed, so it never knows for certain which machines are available to coordinate with.

Paxos solution approach

Paxos solves the consensus problem and is used in systems such as Cassandra, Google Spanner, and Chubby. It provides strong consistency and keeps the system available as long as a majority of servers can communicate with each other.

Paxos completes a write operation in two rounds: prepare/promise and propose/accept.

First, the proposer (the server leading the write) sends a prepare request to all servers; if a majority reply with a promise, the cluster is ready to accept a value. Second, the proposer sends a propose request carrying the value; if a majority accept it, the write succeeds.
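Both rounds depend on a majority, and the reason is that any two majorities of the same cluster share at least one server. The following is a minimal sketch of that arithmetic; the function names `majority` and `majorities_always_overlap` are hypothetical helpers introduced here for illustration, not part of any Paxos library.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of servers that forms a majority of n."""
    return n // 2 + 1


def majorities_always_overlap(servers: list[str]) -> bool:
    """Check by brute force that every two majority-sized groups share a server."""
    q = majority(len(servers))
    quorums = [set(group) for group in combinations(servers, q)]
    return all(a & b for a in quorums for b in quorums)


if __name__ == "__main__":
    print(majority(3), majority(5))
    print(majorities_always_overlap(["X", "Y", "Z"]))
```

Because of this overlap, a server that took part in one successful round is always present in the next one, which is what lets later rounds notice earlier decisions.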

Key terms:

A process is a computer in the system.

A client is a participant that reads or updates the value stored by the system.

Paxos read operation

To read a value, a client requests the current value from all processes. If a majority return the same value, the read succeeds; otherwise it fails.

Unlike single‑node systems where a client reads directly, Paxos requires contacting a majority because there is no single authoritative storage node.
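The read rule described above can be sketched as a simple vote count. This is an illustrative sketch only: `quorum_read` is a hypothetical name, replies are modelled as a plain list, and `None` stands in for a process whose reply was lost or delayed.

```python
from collections import Counter
from typing import Optional


def quorum_read(replies: list[Optional[str]], cluster_size: int) -> Optional[str]:
    """Return the value reported by a majority of the cluster, else None.

    `replies` holds one entry per process; None marks a process that
    did not answer (lost or delayed message).
    """
    needed = cluster_size // 2 + 1
    counts = Counter(r for r in replies if r is not None)
    if not counts:
        return None  # nobody answered, so the read fails
    value, votes = counts.most_common(1)[0]
    return value if votes >= needed else None


if __name__ == "__main__":
    # X and Y agree, Z did not reply: two of three is a majority.
    print(quorum_read(["dawn", "dawn", None], cluster_size=3))
    # X and Y disagree and Z is silent: no majority, the read fails.
    print(quorum_read(["dawn", "noon", None], cluster_size=3))
```

Note that the majority is computed against the full cluster size, not the number of replies received, so a single surviving process can never satisfy a read on its own.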

Paxos write operation

When a client wants to write a new value, Paxos ensures the value is proposed to the cluster and accepted by a majority, preventing a single point of failure.

The write involves two phases: first, a prepare/promise phase where the proposer asks servers to promise not to accept lower‑numbered proposals; second, a propose/accept phase where the value is sent to servers that have promised.

Sequence number

Each proposal carries a unique sequence number generated by the proposer. Servers use this number to determine which proposal is newer and should be accepted.

Once a server sees a sequence number higher than any it has seen before, it stops accepting proposals that carry a lower number.
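The text does not say how proposers keep their sequence numbers unique. One common approach, shown here as an assumption rather than as the article's method, is to pair a local counter with a proposer ID and compare the pairs lexicographically. `SequenceGenerator` and its fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SequenceGenerator:
    """Produces (counter, proposer_id) pairs that are unique per proposer."""

    proposer_id: int
    counter: int = 0

    def next(self, highest_seen: tuple[int, int] = (0, 0)) -> tuple[int, int]:
        # Jump past the highest counter observed so the new number is newer.
        self.counter = max(self.counter, highest_seen[0]) + 1
        return (self.counter, self.proposer_id)


if __name__ == "__main__":
    p1, p2 = SequenceGenerator(proposer_id=1), SequenceGenerator(proposer_id=2)
    a, b = p1.next(), p2.next()
    print(a, b, a < b)            # same counter, the ID breaks the tie
    print(p1.next(highest_seen=b))  # p1 retries with a newer number
```

Python compares tuples element by element, so two proposers that pick the same counter still produce distinct, totally ordered numbers.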

Paxos first phase: Prepare/Promise

During the prepare phase, the proposer sends a prepare message with a sequence number to all servers. Servers that have not seen a higher number reply with a promise not to accept lower‑numbered proposals.

The proposer counts the promises; if a majority is received, it proceeds to the next phase. Otherwise, the proposal fails.
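The prepare/promise exchange can be sketched as follows. This follows the simplified description in the text and leaves out details of full Paxos, such as a promise reporting any value the server has already accepted. The `Server` class, the `prepare` function, and the `reachable` set (standing in for an unreliable network) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    highest_promised: int = 0  # highest sequence number promised so far

    def on_prepare(self, seq: int) -> bool:
        """Promise only if seq is higher than anything seen before."""
        if seq > self.highest_promised:
            self.highest_promised = seq
            return True
        return False


def prepare(servers: list[Server], seq: int, reachable: set[str]) -> bool:
    """Send prepare(seq) to the reachable servers; True if a majority promise."""
    promises = sum(
        1 for s in servers if s.name in reachable and s.on_prepare(seq)
    )
    return promises >= len(servers) // 2 + 1


if __name__ == "__main__":
    cluster = [Server("X"), Server("Y"), Server("Z")]
    # Z is unreachable, but X and Y still form a majority.
    print(prepare(cluster, seq=5, reachable={"X", "Y"}))
    # A stale proposer with a lower number only wins over Z.
    print(prepare(cluster, seq=3, reachable={"X", "Y", "Z"}))
```

In the second call X and Y refuse because they have already promised 5, so the stale proposal collects one promise and fails.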

Paxos second phase: Propose/Accept

After obtaining a majority of promises, the proposer sends an accept request carrying the sequence number and the value. Each server accepts the value unless it has, in the meantime, promised a higher sequence number to another proposer.

If a majority of servers accept, the value is committed; otherwise the proposer has to retry with a higher sequence number.
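The acceptance rule can be sketched in the same style. Again this is a simplified illustration under assumptions: `Server`, `on_accept`, and `propose` are hypothetical names, and the servers' `highest_promised` values are set by hand to stand in for an earlier prepare phase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Server:
    name: str
    highest_promised: int = 0
    accepted_value: Optional[str] = None

    def on_accept(self, seq: int, value: str) -> bool:
        """Accept unless a higher sequence number has been promised."""
        if seq >= self.highest_promised:
            self.highest_promised = seq
            self.accepted_value = value
            return True
        return False


def propose(servers: list[Server], seq: int, value: str) -> bool:
    """Send accept(seq, value) to every server; True if a majority accept."""
    accepts = sum(1 for s in servers if s.on_accept(seq, value))
    return accepts >= len(servers) // 2 + 1


if __name__ == "__main__":
    # X and Y promised 5; Z has since promised 7 to another proposer.
    cluster = [Server("X", 5), Server("Y", 5), Server("Z", 7)]
    print(propose(cluster, seq=5, value="dawn"))
    print([s.accepted_value for s in cluster])
```

Here Z rejects the request because it has promised 7, yet the write still succeeds because X and Y form a majority of the three servers.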

Summary

Paxos guarantees that a write succeeds only when a majority of nodes agree on it, prioritizing consistency over availability. In contrast, systems built on vector clocks favor availability but leave conflicting versions to be resolved afterwards, typically by the application or the client.

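Paxos was introduced by Leslie Lamport. Lamport also introduced logical clocks (Lamport timestamps), which vector clocks build on; vector clocks themselves were developed independently by Colin Fidge and Friedemann Mattern.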

Written by

ITFLY8 Architecture Home
