
Understanding Consistency Algorithms: Paxos, Raft, ZAB, and Gossip

This article explains why data consistency is essential in distributed systems, defines consistency, compares strong and eventual consistency, and details the design and operation of major algorithms such as Paxos, Multi‑Paxos, Raft, ZAB, and Gossip with illustrative examples.


Why Consistency Is Needed

To avoid a single point of failure, data must not reside on only one node (host).

Multiple nodes therefore need to hold identical copies of the data.

Consistency algorithms exist to solve these two problems.

Definition of Consistency Algorithm

Consistency means keeping data identical; in a distributed system it means the values stored on all nodes are the same.

Classification of Consistency

Strong Consistency

The system guarantees that once a change is committed, all nodes in the cluster immediately reflect the new state.

Paxos

Raft (derived from Multi‑Paxos)

ZAB (derived from Multi‑Paxos)

Weak Consistency (Eventual Consistency)

The system does not guarantee immediate state change after a commit, but the state will eventually become consistent over time.

DNS systems

Gossip protocol

Examples of Consistency Algorithm Implementations

Google's Chubby distributed lock service, which uses Paxos.

etcd distributed key‑value store, which uses Raft.

ZooKeeper coordination service, an open‑source counterpart of Google's Chubby, which uses ZAB.

Paxos Algorithm

Concept Introduction

Proposal: a modification request in the distributed system, represented as [proposal number N, value].

Client: the entity that submits proposals.

Proposer: receives requests from Clients and issues them as proposals.

Acceptor: votes on proposals; it rejects any proposal whose number is lower than the highest number it has already promised.

Learner: records proposals that have been accepted.

Basic Paxos Algorithm

Steps:

Prepare: the Proposer picks a proposal number N and sends a Prepare(N) request to a majority of Acceptors.

Promise: an Acceptor that has not seen a number higher than N promises to reject proposals numbered below N; if it has already accepted a proposal, it returns that proposal's value, and the Proposer must adopt that value instead of its own.

Accept: the Proposer sends the proposal [N, value] to the Acceptors; each Acceptor accepts it unless it has meanwhile promised a higher number. Once a majority accepts, the value is chosen.

Learn: Learners record the accepted proposal.
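The promise/accept rules above can be sketched in Python. This is a toy model of a single Acceptor; the class and method names are illustrative, not from any real implementation:

```python
# Minimal sketch of a Basic Paxos Acceptor (illustrative names).

class Acceptor:
    def __init__(self):
        self.promised_n = -1       # highest proposal number promised so far
        self.accepted_n = -1       # number of the last accepted proposal
        self.accepted_value = None

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised_n:
            self.promised_n = n
            # Return any previously accepted proposal so the Proposer
            # must re-propose its value instead of a new one.
            return True, self.accepted_n, self.accepted_value
        return False, None, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher number was promised since."""
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n = n
            self.accepted_value = value
            return True
        return False
```

A Proposer would run `prepare` against a majority of such Acceptors, adopt any returned value, and then issue `accept` with the same number.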

Node Failures

If the Proposer fails, another node can be elected as Proposer.

If an Acceptor fails, the algorithm still works as long as a majority can be reached.

Potential Issue – Live‑Lock

If multiple Proposers continuously send proposals before previous ones achieve a majority, Acceptors may keep abandoning current proposals for newer ones, preventing any proposal from being committed.

Multi‑Paxos Algorithm

Improvement over Basic Paxos: the system has a single Proposer, called the Leader.

Steps:

If there is no Leader, the cluster elects one and declares it the current Leader (term M).

Acceptors only vote on proposals issued by the latest Leader.

Other steps are the same as Basic Paxos.
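The rule that Acceptors only vote for the latest Leader can be sketched as a toy model (the class name and fields are illustrative):

```python
# Sketch of a Multi-Paxos Acceptor that only votes for proposals
# from the latest known Leader term M (illustrative names).

class TermAcceptor:
    def __init__(self):
        self.current_term = 0  # highest Leader term (M) seen so far
        self.log = []

    def on_proposal(self, leader_term, value):
        # Proposals from a stale Leader are rejected outright.
        if leader_term < self.current_term:
            return False
        self.current_term = leader_term
        self.log.append(value)
        return True
```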

Algorithm Optimization

Multi‑Paxos still distinguishes many roles. In a real cluster these can be consolidated: one node acts as the Leader (the sole Proposer), and every node serves as both Acceptor and Learner. This simplification leads to the Raft algorithm.

Raft Algorithm

Paxos is hard to implement; Raft simplifies and improves upon it.

Concept Introduction

Leader: the node that issues proposals.

Follower: nodes that agree with the Leader's proposals.

Candidate: nodes that compete to become Leader.

Raft splits the consistency problem into two sub‑problems: Leader election and state replication.

Leader Election

Each Follower runs a timer.

When the timer expires and no Leader exists, the Follower becomes a Candidate and starts an election, sending vote requests to other nodes.

Each of the other nodes votes for at most one Candidate per term, on a first‑come, first‑served basis.

The Candidate that receives a majority becomes the new Leader (term M).

The Leader continuously sends heartbeats to prove it is alive; Followers reset their timers upon receiving a heartbeat.

If the Leader fails, Followers become Candidates and start a new election.

If the vote splits so that no Candidate wins a majority, each Candidate waits a random delay before starting a new election, which makes a repeated tie unlikely.
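The voting rule and the randomized retry can be sketched as follows. This is a toy model, not a real Raft implementation; all names are illustrative:

```python
import random

# Sketch of Raft leader election (illustrative names).

def election_timeout(rng):
    # Raft restarts elections after a randomized timeout (commonly
    # ~150-300 ms) so that two Candidates rarely tie twice in a row.
    return rng.uniform(0.150, 0.300)

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.term = 0
        self.voted_for = None

    def request_vote(self, candidate_id, candidate_term):
        # Grant at most one vote per term, first-come first-served.
        if candidate_term > self.term:
            self.term = candidate_term
            self.voted_for = candidate_id
            return True
        if candidate_term == self.term and self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False

def run_election(candidate, peers, term):
    candidate.term = term
    candidate.voted_for = candidate.node_id  # a Candidate votes for itself
    votes = 1 + sum(p.request_vote(candidate.node_id, term) for p in peers)
    return votes > (len(peers) + 1) // 2     # a strict majority wins
```

Because each node grants only one vote per term, at most one Candidate can gather a majority in any given term.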

State Replication

The Leader receives proposals from Clients and records them as uncommitted entries.

The proposal is included in the Leader's next heartbeat.

Followers reply to the heartbeat.

After receiving replies from a majority of Followers, the Leader commits the proposal, writes it to its storage, and replies to the Client.

The Leader notifies Followers to commit the proposal, ensuring all nodes store identical data.
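The majority‑commit rule in the replication steps above can be sketched as a toy model (all names are illustrative):

```python
# Sketch of the Leader's majority-commit rule (illustrative names;
# a toy model, not a real Raft implementation).

class Follower:
    def __init__(self, reachable=True):
        self.reachable = reachable
        self.log = []

    def append(self, entry):
        # In Raft the entry arrives piggybacked on a heartbeat; the
        # reply is the acknowledgement the Leader counts.
        if self.reachable:
            self.log.append(entry)
            return True
        return False

def replicate(entry, followers):
    acks = 1  # the Leader counts its own copy
    for f in followers:
        if f.append(entry):
            acks += 1
    cluster_size = len(followers) + 1
    return acks > cluster_size // 2  # commit only with a strict majority
```

The same rule explains split‑brain recovery: a Leader stranded in a minority partition can never reach the ack threshold, so it never commits.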

If a network partition occurs, multiple Leaders may appear (split‑brain); the Leader in the minority partition cannot gather a majority, so it cannot commit proposals.

When the partition heals, only the latest term Leader remains active; older Leaders step down to Followers, restoring consistency.

ZAB Algorithm

ZAB is an improvement over Multi‑Paxos, similar to Raft, with two main differences:

Raft calls the Leader's term "term"; ZAB calls it "epoch".

In Raft, the Leader sends heartbeats to Followers; in ZAB, Followers send heartbeats to the Leader.

Gossip Algorithm

Gossip is a peer‑to‑peer algorithm where every node is equal. Each node propagates data changes to a few other nodes, quickly spreading updates throughout the cluster.

Cluster startup (example with 20 nodes).

A node receives a data change and forwards it to four other nodes.

Those nodes repeat the process until all nodes are updated.
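The spread of an update can be simulated with a toy model. The node count of 20 and fan‑out of four follow the example above; everything else is illustrative:

```python
import random

# Toy simulation of gossip dissemination: every informed node forwards
# the update to `fanout` random peers each round (illustrative names).

def gossip_rounds(num_nodes=20, fanout=4, seed=0):
    rng = random.Random(seed)
    informed = {0}  # node 0 receives the data change first
    rounds = 0
    while len(informed) < num_nodes:
        newly_informed = set()
        for node in informed:
            peers = rng.sample([n for n in range(num_nodes) if n != node], fanout)
            newly_informed.update(peers)
        informed |= newly_informed
        rounds += 1
    return rounds
```

Because the number of informed nodes grows multiplicatively each round, the whole cluster is typically reached in a handful of rounds, which is why gossip converges quickly even in large clusters.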

Tags: distributed systems, algorithm, consistency, Raft, Paxos
Written by

Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
