Mastering Distributed Consistency: Paxos, Raft, and ZAB Explained
This article examines high‑concurrency distributed consistency algorithms—explaining the CAP challenges, detailing Paxos, Raft, and ZAB’s core concepts, roles, and workflow, and discussing their practical applications and selection criteria for ensuring strong data consistency in critical systems.
This article focuses on distributed consistency algorithms in high‑concurrency scenarios.
In distributed environments, the CAP theorem identifies three core properties: consistency, availability, and partition tolerance.
Consistency: all nodes see the same data at the same time.
Availability: every request receives a response, successful or not.
Partition tolerance: the system continues to operate despite network partitions.
Ensuring data consistency under high load is crucial for core financial services such as payment, order placement, and inter‑bank transfers, where strong consistency is required to avoid monetary errors.
Distributed consistency algorithms are the key mechanisms that guarantee strong data consistency across multiple nodes.
Commonly used algorithms include:
Paxos
Raft
ZAB (ZooKeeper Atomic Broadcast)
3.1 Paxos Algorithm
Basic Concepts
Proposal: Consists of a proposal ID and a value (the command or log entry to be applied).
Roles:
Proposer – initiates proposals.
Acceptor – votes on proposals.
Learner – learns the chosen value.
The Proposer creates a proposal containing a unique ID and the value to be written.
A proposal's value is chosen only after a majority of Acceptors (N/2 + 1 of N) accept it.
The Learner does not participate in voting; it learns the chosen value after consensus is reached.
Algorithm Flow
Prepare Phase: the Proposer sends a Prepare request with a unique, increasing proposal number N to all Acceptors.
Promise Phase: an Acceptor responds only if N is higher than any proposal number it has previously seen, promising not to accept proposals numbered below N; its response includes any value it has already accepted.
Accept Phase: once a majority of Acceptors have promised, the Proposer sends an Accept request carrying N and a value — the highest‑numbered value returned in the promises, or its own value if none was returned.
Decision Phase: once a majority of Acceptors accept the proposal, its value is chosen and applied by all nodes.
Learn Phase: Learners retrieve the chosen value from the Acceptors.
If fewer than a majority of Acceptors respond successfully in either phase, the Proposer retries with a higher proposal number.
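The two‑phase flow above can be sketched as a minimal, in‑memory single‑decree Paxos round. The class and function names here are illustrative, not from any library, and real deployments add networking, persistence, and retries:

```python
# Minimal single-decree Paxos sketch (single-threaded, in-memory).

class Acceptor:
    def __init__(self):
        self.promised_n = -1       # highest proposal number promised
        self.accepted_n = -1       # highest proposal number accepted
        self.accepted_value = None

    def prepare(self, n):
        """Promise phase: promise to ignore proposals numbered below n."""
        if n > self.promised_n:
            self.promised_n = n
            return True, self.accepted_n, self.accepted_value
        return False, None, None

    def accept(self, n, value):
        """Accept phase: accept unless a higher-numbered promise was made."""
        if n >= self.promised_n:
            self.promised_n = self.accepted_n = n
            self.accepted_value = value
            return True
        return False


def propose(acceptors, n, value):
    """Run one Paxos round; return the chosen value, or None to retry."""
    majority = len(acceptors) // 2 + 1   # N/2 + 1

    # Prepare / Promise phase
    promises = [a.prepare(n) for a in acceptors]
    granted = [(an, av) for ok, an, av in promises if ok]
    if len(granted) < majority:
        return None  # retry with a higher proposal number

    # If any acceptor already accepted a value, we must propose that one
    prior_n, prior_value = max(granted, key=lambda p: p[0])
    if prior_value is not None:
        value = prior_value

    # Accept / Decision phase
    accepted = sum(a.accept(n, value) for a in acceptors)
    return value if accepted >= majority else None


acceptors = [Acceptor() for _ in range(5)]
print(propose(acceptors, n=1, value="SET x=42"))  # chosen: SET x=42
# A later proposer with a higher number learns the already-chosen value:
print(propose(acceptors, n=2, value="SET x=99"))  # still SET x=42
```

Note how the second proposer is forced to adopt the already‑chosen value; this is the mechanism that keeps Paxos safe even with competing proposers.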
Applications
Paxos is highly fault‑tolerant and underpins systems such as Google’s Chubby distributed lock service (via Multi‑Paxos); ZooKeeper uses the Paxos‑inspired ZAB protocol described below.
3.2 Raft Algorithm
Basic Concepts
Raft solves distributed consistency by providing a clear approach to leader election, log replication, and safety.
Leader Election and Timeouts
Servers can be in one of three states: Leader, Follower, or Candidate. Each Follower runs a randomized election timer; if the timer expires without a heartbeat from a Leader, the Follower becomes a Candidate, increments its term, and requests votes. A Candidate that receives votes from a majority of servers becomes the Leader and begins sending heartbeats.
Roles:
Leader – handles client requests, replicates logs, sends heartbeats.
Follower – passive, receives heartbeats and logs from the Leader.
Candidate – seeks election when no Leader is known.
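The state transitions above can be sketched as follows. This is an illustrative model of Raft's election timeout logic only (no log replication or RPCs), and the names and timeout range are assumptions, not from any Raft library:

```python
# Minimal sketch of Raft's leader-election state machine.
import random

FOLLOWER, CANDIDATE, LEADER = "follower", "candidate", "leader"

class RaftNode:
    def __init__(self, node_id, cluster_size):
        self.node_id = node_id
        self.cluster_size = cluster_size
        self.state = FOLLOWER
        self.current_term = 0
        self.voted_for = None
        # Randomized timeouts (e.g. 150-300 ms) make split votes unlikely
        self.election_timeout = random.uniform(0.150, 0.300)

    def on_election_timeout(self):
        """No heartbeat arrived in time: become a candidate, start an election."""
        self.state = CANDIDATE
        self.current_term += 1
        self.voted_for = self.node_id  # vote for self

    def on_votes_received(self, votes):
        """Become leader once a majority of votes (including our own) arrives."""
        if self.state == CANDIDATE and votes > self.cluster_size // 2:
            self.state = LEADER

    def on_heartbeat(self, leader_term):
        """A heartbeat from a current or newer leader keeps us a follower."""
        if leader_term >= self.current_term:
            self.current_term = leader_term
            self.state = FOLLOWER

node = RaftNode(node_id=1, cluster_size=5)
node.on_election_timeout()           # timer fired: follower -> candidate, term 1
node.on_votes_received(votes=3)      # 3 of 5 votes: candidate -> leader
print(node.state, node.current_term)  # leader 1
```

The randomized timeout is the key design choice: it staggers elections so that, in most rounds, only one Candidate asks for votes and split votes are rare.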
3.3 ZAB (ZooKeeper Atomic Broadcast) Algorithm
Basic Concepts
ZAB is the atomic broadcast protocol used by ZooKeeper to guarantee data consistency. It adapts ideas from Paxos but is tailored for ZooKeeper’s leader‑follower architecture, supporting crash recovery.
Broadcast Process
Clients send write requests to the Leader, which packages them into a proposal and broadcasts to Followers. If a majority of Followers acknowledge, the Leader commits the transaction and notifies all Followers.
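The broadcast process above can be sketched as a two‑step propose/commit exchange. This is a simplified model, not ZooKeeper's actual implementation: it omits recovery, epochs, and follower ordering guarantees, and all names are illustrative:

```python
# Minimal sketch of ZAB's broadcast phase: propose -> majority ACK -> commit.

class Follower:
    def __init__(self):
        self.log = []        # proposals acknowledged but not yet committed
        self.committed = []  # committed transactions, in zxid order

    def on_proposal(self, zxid, txn):
        self.log.append((zxid, txn))
        return True  # ACK back to the leader

    def on_commit(self, zxid):
        for entry in self.log:
            if entry[0] == zxid:
                self.committed.append(entry)

class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.zxid = 0  # monotonically increasing transaction id

    def broadcast(self, txn):
        """Propose to all followers; commit once a majority has ACKed."""
        self.zxid += 1
        acks = sum(f.on_proposal(self.zxid, txn) for f in self.followers)
        # The leader counts itself toward the quorum
        if acks + 1 > (len(self.followers) + 1) // 2:
            for f in self.followers:
                f.on_commit(self.zxid)
            return True
        return False

followers = [Follower() for _ in range(4)]   # 5-node ensemble incl. leader
leader = Leader(followers)
print(leader.broadcast("create /node data"))  # True: quorum reached, committed
```

The monotonically increasing zxid is what gives ZAB its total ordering of writes: every Follower commits transactions in zxid order, so all replicas converge on the same state.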
Summary
Distributed consistency algorithms ensure that multiple nodes produce the same result when reading or modifying shared data, which is essential for the reliability of distributed systems. The most common algorithms are Paxos, Raft, and ZAB, each with distinct characteristics and suitable scenarios. Selecting the appropriate algorithm depends on factors such as system scale, node count, communication overhead, consistency requirements, and fault tolerance.
Paxos: Message‑based consensus algorithm suitable for a wide range of distributed systems.
Raft: Easier‑to‑understand consensus algorithm that separates concerns into leader election, log replication, and safety.
ZAB: ZooKeeper‑specific atomic broadcast protocol designed for crash recovery and strong consistency.
Other algorithms like the Gossip protocol also exist for specific use cases.