Can Joint Consensus Member Changes Be Simplified to a Single Step?
This article examines the challenges of Raft’s two‑stage Joint Consensus member changes, explores single‑step alternatives, analyzes ZooKeeper’s approach, and proposes improvements that combine safety with reduced log overhead, offering practical insights for distributed system engineers seeking more efficient reconfiguration methods.
1 Introduction
In distributed systems, node failures are common, requiring dynamic addition, removal, and replacement of nodes. Member changes are essential for maintaining availability, especially in consistency protocols.
Raft’s two‑stage Joint Consensus is the industry standard for member changes, but it requires proposing two log entries per change, so every change costs two rounds of log replication. Raft also defines a single‑step method, but it can only add or remove one member at a time and is generally discouraged.
This article investigates whether Joint Consensus can be implemented using a single step.
2 Member Changes
Member changes modify the set of nodes participating in a consensus protocol (adding, removing, or replacing nodes) without affecting system availability.
They are a consistency problem because all nodes must agree on the new configuration, yet the voting set itself changes during the process.
Figure 1: At a certain moment, the old configuration Cold and the new configuration Cnew may each contain a disjoint majority, creating a dual‑quorum and breaking consistency.
Raft solves this with Joint Consensus, introducing a joint configuration Cold,new that overlaps with both Cold and Cnew, ensuring intersecting quorums.
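The dual‑quorum problem can be reproduced with a small brute‑force search. The sketch below is illustrative only: the node names and the helper functions `majorities` and `disjoint_majorities` are assumptions made for this example and are not part of Raft. It looks for a majority of Cold and a majority of Cnew that share no node.

```python
from itertools import combinations

def majorities(config):
    """Return every minimal majority (quorum) of a configuration."""
    size = len(config) // 2 + 1
    return [set(q) for q in combinations(sorted(config), size)]

def disjoint_majorities(c_old, c_new):
    """Return one pair of non-overlapping quorums, or None if all intersect."""
    for q_old in majorities(c_old):
        for q_new in majorities(c_new):
            if not (q_old & q_new):
                return q_old, q_new
    return None

# Hypothetical direct switch from three nodes to five nodes.
c_old = {"a", "b", "c"}
c_new = {"a", "b", "c", "d", "e"}
pair = disjoint_majorities(c_old, c_new)
print(pair)  # a quorum of Cold and a quorum of Cnew with no common node
```

Here `{a, b}` is a majority of Cold and `{c, d, e}` is a majority of Cnew, so two leaders could be elected or two conflicting entries committed if both configurations were in use at once.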
2.1 Joint‑Consensus Member Change
Joint Consensus uses a joint configuration Cold,new as a transition between Cold and Cnew. First the system switches from Cold to Cold,new; after Cold,new is committed, it switches to Cnew, preventing simultaneous use of Cold and Cnew and thus avoiding dual‑quorum.
Figure 2: Relationship among the quorum sets of Cold, Cold,new, and Cnew.
Joint Consensus requires two log entries. The first is a Cold,new entry, which is committed only when it is acknowledged by a majority of Cold and a majority of Cnew. The second is a Cnew entry, which only needs a majority of Cnew. After the Cnew entry is committed, the change is complete and nodes not in Cnew are removed.
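The two commit rules can be written down as a minimal sketch. The function names and the example node sets are assumptions for illustration; a real implementation would track acknowledgements per log index.

```python
def has_majority(acks, config):
    """True if the acknowledging nodes include a majority of config."""
    return len(acks & config) > len(config) // 2

def joint_entry_committed(acks, c_old, c_new):
    """Cold,new entry: needs a majority of Cold AND a majority of Cnew."""
    return has_majority(acks, c_old) and has_majority(acks, c_new)

def new_entry_committed(acks, c_new):
    """Cnew entry: needs a majority of Cnew only."""
    return has_majority(acks, c_new)

# Hypothetical change that replaces nodes a and b with d and e.
c_old = {"a", "b", "c"}
c_new = {"c", "d", "e"}

print(joint_entry_committed({"a", "b"}, c_old, c_new))            # old majority only
print(joint_entry_committed({"a", "b", "d", "e"}, c_old, c_new))  # both majorities
print(new_entry_committed({"d", "e"}, c_new))                     # new majority
```

Because the joint rule demands both majorities, any decision made under Cold,new is visible to every quorum of Cold and every quorum of Cnew.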
Figure 3: Joint Consensus member‑change process.
If a failover occurs during the change, the new leader may or may not have the Cold,new entry. If it lacks the entry, the change rolls back to Cold; if it has the entry, the process continues.
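The failover rule above amounts to a check on the new leader's log. The following sketch assumes a simplified log of `Entry` records; the `Entry` type and the `leader_plan` function are hypothetical names introduced here, not Raft APIs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    kind: str                      # "data" or "joint" (a Cold,new entry)
    old: frozenset = frozenset()
    new: frozenset = frozenset()

def leader_plan(log, c_old, c_new):
    """Decide what a newly elected leader does about an in-flight change."""
    joint = Entry("joint", c_old, c_new)
    if joint in log:
        # The leader holds Cold,new: keep going and eventually propose Cnew.
        return "continue", (c_old, c_new)
    # The leader never received Cold,new: the cluster stays on Cold.
    return "rollback", (c_old,)

c_old = frozenset("abc")
c_new = frozenset("cde")
log_with_joint = [Entry("data"), Entry("joint", c_old, c_new)]
log_without_joint = [Entry("data")]

print(leader_plan(log_with_joint, c_old, c_new)[0])
print(leader_plan(log_without_joint, c_old, c_new)[0])
```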
2.2 Single‑Step Member Change
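Joint Consensus needs two phases because it makes no assumptions about how Cold and Cnew overlap, yet it must still avoid dual‑quorum. If we can guarantee that every majority of Cold intersects every majority of Cnew, the change can be reduced to a single phase.
Restricting each change to adding or removing exactly one member provides that guarantee. The argument is a counting one: if Cold has n nodes and Cnew has n+1, a majority of Cold and a majority of Cnew together contain at least n+2 members, which is more than the n+1 distinct nodes available, so the two majorities must share a node. The same argument applies when one member is removed. With only one member added or removed, Cold and Cnew therefore cannot form disjoint majorities.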
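The overlap claim can also be checked exhaustively for small clusters. This is a brute‑force sketch, not a proof; the helper names and the size bound of 7 are arbitrary choices for the example.

```python
from itertools import combinations

def majorities(config):
    """Return every minimal majority (quorum) of a configuration."""
    size = len(config) // 2 + 1
    return [set(q) for q in combinations(sorted(config), size)]

def always_intersect(c_old, c_new):
    """True if every quorum of c_old overlaps every quorum of c_new."""
    return all(q1 & q2 for q1 in majorities(c_old) for q2 in majorities(c_new))

def single_member_changes_are_safe(max_size):
    """Check adding and removing one node for cluster sizes 1..max_size."""
    for n in range(1, max_size + 1):
        c_old = set(range(n))
        if not always_intersect(c_old, c_old | {n}):        # add one node
            return False
        if n > 1 and not always_intersect(c_old, c_old - {n - 1}):  # remove one
            return False
    return True

print(single_member_changes_are_safe(7))
# Adding two nodes at once is not safe in general:
print(always_intersect({0, 1, 2}, {0, 1, 2, 3, 4}))
```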
Figure 4: Quorum relationship when a single member is added or removed.
Thus, by allowing only one‑member changes, we can transition directly from Cold to Cnew without a joint configuration, achieving a single‑step change. Multiple single‑step changes can be composed to replace several members.
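Composing single‑step changes can be sketched as a planner that moves from Cold to Cnew one member at a time. The function `plan_single_steps` is a hypothetical helper, and adding nodes before removing them is an assumption made for this example (it keeps the cluster from shrinking during the transition); the source does not prescribe an order.

```python
def plan_single_steps(c_old, c_new):
    """Return the sequence of configurations, each differing by one member."""
    current = set(c_old)
    steps = []
    for node in sorted(set(c_new) - set(c_old)):   # additions first (assumed order)
        current = current | {node}
        steps.append(frozenset(current))
    for node in sorted(set(c_old) - set(c_new)):   # then removals
        current = current - {node}
        steps.append(frozenset(current))
    return steps

steps = plan_single_steps({"a", "b", "c"}, {"a", "d", "e"})
for config in steps:
    print(sorted(config))
```

Replacing two members this way takes four separate changes, each of which must commit before the next begins, which is the cost that motivates looking for a general single‑step method.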
3 Single‑Step Implementation of Two‑Stage Member Change
The two‑stage Joint Consensus is general but requires two log entries; the single‑step method is simpler but limited to one‑member changes. This section explores whether the two‑stage process can be realized in a single step.
After the Cold,new entry is committed, majorities of both Cold and Cnew have accepted the new configuration. The subsequent Cnew entry only informs nodes to switch to Cnew and takes nodes out of the old configuration, so waiting for it is potentially redundant. If the Cnew entry is made asynchronous, the change can be considered complete as soon as Cold,new commits.
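The difference between the two completion criteria can be sketched with a small state holder. The class `Reconfiguration` and its method names are hypothetical; the sketch assumes the leader keeps using the joint quorum rule until the asynchronous Cnew entry commits.

```python
class Reconfiguration:
    """Tracks one member change from Cold to Cnew (illustrative only)."""

    def __init__(self, c_old, c_new):
        self.c_old = frozenset(c_old)
        self.c_new = frozenset(c_new)
        self.joint_committed = False   # Cold,new entry committed
        self.new_committed = False     # Cnew entry committed

    def done_two_stage(self):
        """Classic Joint Consensus: the caller waits for the Cnew entry."""
        return self.new_committed

    def done_single_step(self):
        """Asynchronous Cnew: the caller is released once Cold,new commits."""
        return self.joint_committed

    def active_quorum_rule(self):
        """Quorum rule in force once the change has been proposed."""
        return "Cnew" if self.new_committed else "Cold,new"

change = Reconfiguration({"a", "b", "c"}, {"c", "d", "e"})
change.joint_committed = True
print(change.done_single_step(), change.done_two_stage(), change.active_quorum_rule())
change.new_committed = True    # happens later, in the background
print(change.done_two_stage(), change.active_quorum_rule())
```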
3.1 ZooKeeper Member Change
Since version 3.5.0, ZooKeeper supports dynamic reconfiguration on top of Zab. Because ZooKeeper must preserve the Primary Order property, it cannot use Raft’s two‑log Joint Consensus directly. Instead, ZooKeeper’s protocol (described in “Dynamic Reconfiguration of Primary/Backup Clusters”) introduces a COP log carrying both old (S) and new (S′) configurations, followed by an ACTIVATE message.
Figure 5: ZooKeeper member‑change protocol.
During reconfiguration, new nodes first sync state from the leader without voting, then the leader broadcasts the COP log to all nodes (old and new). Once a majority of the old configuration commits the COP log, consensus on the new configuration is reached. After the ACTIVATE message, logs are committed only by the new configuration, and a no‑op log ensures that any new leader resides in the new configuration.
3.2 Improved Single‑Step Implementation
ZooKeeper’s ACTIVATE message plays the same role as the Cnew log in Joint Consensus: it tells nodes to switch to the new configuration. Joint Consensus can therefore be simplified by making the Cnew log asynchronous and treating the change as complete once Cold,new commits. The ACTIVATE message can be retained while the no‑op log is omitted, provided that elections give priority to nodes holding the latest configuration, so that a safe leader is still selected.
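One way to express such an election priority is as an extra condition on vote granting. The sketch below is an assumption about how this could look, not the rule used by Raft or ZooKeeper: the `NodeState` fields, including a monotonically increasing `config_version`, are hypothetical, and the configuration check is combined with the usual log up‑to‑date comparison rather than replacing it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NodeState:
    config_version: int   # hypothetical: increases with every configuration
    last_term: int        # term of the last log entry
    last_index: int       # index of the last log entry

def grants_vote(voter, candidate):
    """Vote only for candidates whose configuration and log are not stale."""
    if candidate.config_version < voter.config_version:
        return False   # candidate has not seen the latest configuration
    candidate_log = (candidate.last_term, candidate.last_index)
    voter_log = (voter.last_term, voter.last_index)
    return candidate_log >= voter_log

voter = NodeState(config_version=2, last_term=3, last_index=10)
print(grants_vote(voter, NodeState(1, 3, 12)))  # stale configuration
print(grants_vote(voter, NodeState(2, 3, 10)))  # same configuration and log
print(grants_vote(voter, NodeState(3, 2, 50)))  # newer configuration, older log
```

Keeping the log comparison in place is a deliberately conservative choice: a candidate with a newer configuration but an older log is still rejected, so the priority rule cannot cause committed entries to be lost.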
4 Summary
Joint Consensus greatly advanced practical member‑change engineering with its elegant, generic two‑stage design, but it requires two log entries per change. This article explored ways to achieve a single‑step implementation of Joint Consensus, presented improvements, and offered additional options for engineers.
5 Thoughts
Why does the Cnew log inevitably get committed after the Cold,new log?
After an ACTIVATE message switches nodes to the new configuration, how is the new configuration preserved across node restarts?
Are there other methods to achieve a single‑step implementation of two‑stage member changes?
Alibaba Cloud Developer