How X‑Paxos Transforms Distributed Consensus for High‑Performance Databases

X‑Paxos is Alibaba’s high‑performance, independently designed Paxos library that extends the classic consensus algorithm with multi‑threaded architecture, pluggable logging, adaptive batching and pipelining, and flexible node roles, delivering strong consistency, high availability, and low latency for global distributed databases and services.

Alibaba Cloud Developer

Paxos is a foundational distributed consensus algorithm, widely regarded as the de‑facto protocol for achieving strong consistency and high availability in distributed systems. X‑Paxos is Alibaba’s independent implementation of a high‑performance Paxos library, built to meet the demands of global deployment, high throughput, and the specific characteristics of Alibaba’s services.

Background – Although Paxos has been studied for over 17 years, mature, open‑source, standalone libraries remain scarce. Existing solutions such as Google’s internal implementations, Facebook’s undisclosed systems, and Apache ZooKeeper either lack high‑throughput state‑machine replication or do not provide a standalone library for rapid integration.

Vision – X‑Paxos aims to provide a production‑tested, highly reliable independent Paxos library that can be easily integrated into backend services to obtain strong consistency, high availability, and automatic disaster recovery, making the traditionally complex Paxos algorithm approachable for a wide range of applications.

Architecture

The overall architecture consists of four layers: network layer, service layer, algorithm module, and log module.

Network Layer – Built on Alibaba’s mature libeasy library, providing asynchronous networking and a customized reconnection mechanism suitable for distributed protocols.

Service Layer – A C++11‑based multithreaded asynchronous framework that offers event‑driven execution, timer callbacks, and a flexible worker model, eliminating the CPU bottleneck of single‑threaded designs.

Algorithm Module – Implements a unique‑proposer multi‑Paxos design, offering better performance than basic Paxos and supporting extensive functional and performance enhancements tailored to Alibaba’s workloads.

Log Module – Decoupled from the algorithm to allow pluggable high‑performance logging implementations; users can integrate existing WAL systems to avoid redundant storage and improve throughput.
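To make the decoupling concrete, here is a minimal sketch of what a pluggable log interface could look like. Every name in it (PaxosLogStore, LogEntry, InMemoryLogStore, and the method names) is hypothetical and chosen for illustration; none of it is taken from the X‑Paxos API. The in‑memory class stands in for an application's existing WAL.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical types for illustration only; not the X-Paxos API.
struct LogEntry {
    uint64_t term;
    uint64_t index;
    std::string payload;
};

// The consensus algorithm would talk to the log only through this interface,
// so any storage engine that implements it can be plugged in.
class PaxosLogStore {
public:
    virtual ~PaxosLogStore() = default;
    // Appends an entry and returns the index assigned to it (indices start at 1).
    virtual uint64_t append(uint64_t term, const std::string& payload) = 0;
    // Copies the entry at `index` into `out`; returns false if it is absent.
    virtual bool get(uint64_t index, LogEntry& out) const = 0;
    // Returns the index of the last entry, or 0 if the log is empty.
    virtual uint64_t lastIndex() const = 0;
    // Drops every entry whose index is >= `from`.
    virtual void truncateFrom(uint64_t from) = 0;
};

// In-memory stand-in for an application's existing write-ahead log.
class InMemoryLogStore : public PaxosLogStore {
public:
    uint64_t append(uint64_t term, const std::string& payload) override {
        const uint64_t index = lastIndex() + 1;
        entries_.emplace(index, LogEntry{term, index, payload});
        return index;
    }
    bool get(uint64_t index, LogEntry& out) const override {
        auto it = entries_.find(index);
        if (it == entries_.end()) return false;
        out = it->second;
        return true;
    }
    uint64_t lastIndex() const override {
        return entries_.empty() ? 0 : entries_.rbegin()->first;
    }
    void truncateFrom(uint64_t from) override {
        assert(from >= 1);  // indices start at 1
        entries_.erase(entries_.lower_bound(from), entries_.end());
    }

private:
    std::map<uint64_t, LogEntry> entries_;  // ordered by log index
};
```

A service that already maintains a WAL would implement the same interface on top of it, so each log entry is stored once rather than once by the service and once by the consensus library.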

Feature Enhancements

Online node addition/removal and leader transfer.

Strategy‑based majority and weighted leader election, enabling user‑defined rules for disaster recovery.

Customizable node roles (Proposer/Acceptor/Learner) allowing trimmed‑down nodes for specific use cases.

Witness SDK that abstracts the Learner role as a data‑stream subscriber, facilitating downstream log consumption, backup, and configuration push.
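The source does not describe how the election and majority strategies are implemented, so the following is only a sketch of the general idea under stated assumptions. NodeInfo, pickLeaderCandidate, and quorumSatisfied are hypothetical names, node ids are assumed to be nonzero, and the "must acknowledge" set is one possible reading of a user‑defined majority rule.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical node description; field names are illustrative only.
struct NodeInfo {
    uint64_t id;              // assumed nonzero
    uint32_t electionWeight;  // higher weight is preferred as leader
    bool canPropose;          // false for trimmed-down nodes without the Proposer role
    uint64_t lastLogIndex;
};

// Returns the id of the preferred leader candidate, or 0 if none qualifies.
// Only nodes that can propose and whose log reaches `minLogIndex` are
// considered; ties on weight go to the node with the longer log.
uint64_t pickLeaderCandidate(const std::vector<NodeInfo>& nodes, uint64_t minLogIndex) {
    const NodeInfo* best = nullptr;
    for (const NodeInfo& n : nodes) {
        if (!n.canPropose || n.lastLogIndex < minLogIndex) continue;
        if (best == nullptr ||
            n.electionWeight > best->electionWeight ||
            (n.electionWeight == best->electionWeight &&
             n.lastLogIndex > best->lastLogIndex)) {
            best = &n;
        }
    }
    return best ? best->id : 0;
}

// Strategy-based majority (assumed rule): the acknowledgements must form a
// strict majority of the voters AND include every node listed in `mustAck`.
bool quorumSatisfied(const std::set<uint64_t>& acks,
                     std::size_t voterCount,
                     const std::set<uint64_t>& mustAck) {
    if (acks.size() * 2 <= voterCount) return false;
    for (uint64_t id : mustAck) {
        if (acks.count(id) == 0) return false;
    }
    return true;
}
```

With a rule like this, an operator could require that a replica in a second region acknowledges every entry, or give nodes in the primary data center a higher weight so that leadership returns there after a failover.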

Performance Optimizations

Adaptive batching and pipelining to maximize throughput over high‑latency networks. The relationship (M / R) × P = D guides the choice of batch size (M) and pipeline depth (P) for a given bandwidth (R) and propagation delay (D).

Multi‑threaded implementation that removes the single‑thread limitation of many Paxos libraries, achieving significantly higher per‑partition performance.

Locality‑aware content distribution that reduces load on the primary node and minimizes cross‑region bandwidth usage.
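The batching relationship (M / R) × P = D can be read as follows: M / R is the time needed to put one batch on the wire, so P batches in flight keep the link busy for one propagation delay D. The sketch below solves that equation for P or for M using integer arithmetic. The function names and the choice of units (bytes, bytes per second, microseconds) are assumptions made for this example, and it does not model how X‑Paxos adapts these values at runtime.

```cpp
#include <cassert>
#include <cstdint>

// Bytes that must be in flight to keep the link busy for one propagation
// delay: R * D, the bandwidth-delay product.
uint64_t bytesInFlight(uint64_t bandwidthBytesPerSec, uint64_t delayMicros) {
    return bandwidthBytesPerSec * delayMicros / 1000000;
}

// Solves (M / R) * P = D for P, rounded up, with a minimum depth of 1.
uint64_t pipelineDepth(uint64_t batchBytes,
                       uint64_t bandwidthBytesPerSec,
                       uint64_t delayMicros) {
    assert(batchBytes > 0);
    const uint64_t target = bytesInFlight(bandwidthBytesPerSec, delayMicros);
    const uint64_t depth = (target + batchBytes - 1) / batchBytes;  // ceiling division
    return depth == 0 ? 1 : depth;
}

// Solves (M / R) * P = D for M, given a fixed pipeline depth P.
uint64_t batchBytesFor(uint64_t depth,
                       uint64_t bandwidthBytesPerSec,
                       uint64_t delayMicros) {
    assert(depth > 0);
    return bytesInFlight(bandwidthBytesPerSec, delayMicros) / depth;
}
```

For example, on a 1 Gbit/s link (125,000,000 bytes per second) with a 30 ms delay, the bandwidth‑delay product is 3,750,000 bytes, so 250,000‑byte batches call for a pipeline depth of 15. On a low‑latency local link the same batch size needs a depth of only 1, which is why a fixed setting cannot suit both deployments.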

// Arm a one-shot timer that runs Paxos::checkLeaderTransfer after electionTimeout_.
new ThreadTimer(srv_->getThreadTimerService(), srv_, electionTimeout_,
                ThreadTimer::Oneshot, &Paxos::checkLeaderTransfer, this,
                targetId, currentTerm_.load(), log_->getLastLogIndex());

Correctness Verification

Integration with Jepsen to validate behavior under network partitions and failures.

Formal modeling with TLA+ to prove safety properties.

Automated random fault injection system and regression test suite for continuous reliability checks.

Competitor Analysis

Compared with XCOM (MySQL Group Replication) and phxpaxos, X‑Paxos demonstrates superior performance: over 100× higher throughput within a region and only a 3.5% throughput drop in cross‑region scenarios, whereas phxpaxos struggles under high latency.

Current Status and Future Work

X‑Paxos Phase 1 is already deployed in Alibaba’s AliSQL X‑Cluster and other internal services. Future directions include multi‑partition support built on a deeply shared asynchronous framework, strongly consistent reads across multiple nodes, and continued performance tuning for global deployments.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
