
How PhxSQL Achieves Strong Consistency and High Availability for MySQL

This article explains the design and implementation of PhxSQL, a MySQL-compatible high-availability solution. It uses Paxos-based reliable log storage, proxy request forwarding, and automatic master election, among other mechanisms, to overcome the flaws of native MySQL replication and to provide strong data consistency together with fault tolerance.

WeChat Backend Team

Design Background

Internet applications, especially account and financial systems, require strong consistency and high availability. Traditional MySQL master‑slave setups cannot guarantee both when machines fail, networks partition, or manual/automatic failover occurs. PhxSQL builds a MySQL cluster on top of a robust Paxos‑based log, ensuring data consistency across MySQL instances and overall cluster high availability.

Native MySQL Disaster‑Recovery Defects

MySQL Replication Schemes

MySQL provides asynchronous and semi-synchronous replication. In asynchronous mode, the master commits locally and replicates to the slave later, so transactions that have not yet reached the slave can be lost if the master fails (see Figure 1). Semi-synchronous replication waits for the slave to acknowledge before committing, which improves consistency (see Figure 2), but it still has shortcomings during master restarts and failovers.
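The difference between the two modes can be illustrated with a toy model. This is only a sketch: `ToyNode`, `commit_async`, and `commit_semi_sync` are hypothetical names invented here, and the model ignores everything about real MySQL except the order in which the master and the slave learn about a transaction.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNode:
    """Toy stand-in for a MySQL instance: an ordered list of committed transactions."""
    committed: list = field(default_factory=list)


def commit_async(master, slave_queue, txn):
    """Asynchronous mode: commit on the master first, ship to the slave later."""
    master.committed.append(txn)
    slave_queue.append(txn)  # queued only, not yet applied on the slave


def commit_semi_sync(master, slave, txn):
    """Semi-synchronous mode: the slave acknowledges before the commit completes."""
    slave.committed.append(txn)   # slave has received the transaction
    master.committed.append(txn)  # only now does the master commit


# Asynchronous: the master fails before the queue is drained.
master, slave, queue = ToyNode(), ToyNode(), []
commit_async(master, queue, "txn-1")
lost_async = [t for t in master.committed if t not in slave.committed]

# Semi-synchronous: the same failure point loses no committed transaction.
master2, slave2 = ToyNode(), ToyNode()
commit_semi_sync(master2, slave2, "txn-1")
lost_semi = [t for t in master2.committed if t not in slave2.committed]
```

In this model `lost_async` contains the transaction that only the failed master knew about, while `lost_semi` is empty. The restart and failover problems described next are outside what this sketch captures.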

Master Restart Issues

When a master restarts, pending binlog entries (written to the binlog file but not yet replicated) may be committed directly, causing divergence between old and new masters (Figure 3). This can produce data inconsistency, phantom reads for clients, and split‑brain scenarios (Figures 4‑6). MySQL also lacks an automatic master election mechanism (Figure 7).

PhxSQL Design Idea

Reliable Log Storage

PhxSQL introduces a reliable log storage cluster (BinlogSvr) based on Paxos. The master sends its binlog to BinlogSvr; slaves pull binlog from BinlogSvr for replication. During master restart, BinlogSvr is consulted to decide whether a pending binlog should be kept or discarded, guaranteeing consistency (Figure 8).
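The keep-or-discard decision on restart can be sketched as a comparison between the local binlog and the log that the reliable store has accepted. The function name `reconcile_on_restart`, the entry IDs, and the prefix-matching rule are assumptions made for illustration; the article does not describe how PhxSQL represents or compares entries internally.

```python
def reconcile_on_restart(local_binlog, chosen_log):
    """Split the local binlog into entries to keep and entries to discard.

    local_binlog: entry IDs found in the local binlog file, in order.
    chosen_log:   entry IDs the reliable log store has accepted, in order.
    An entry is kept only while the local log matches the chosen log
    position by position; everything after the first mismatch is discarded.
    """
    kept = []
    for position, entry in enumerate(local_binlog):
        if position < len(chosen_log) and chosen_log[position] == entry:
            kept.append(entry)
        else:
            break
    discarded = local_binlog[len(kept):]
    return kept, discarded


# The pending entry "g3" never reached the store, so it is discarded.
kept_a, discarded_a = reconcile_on_restart(["g1", "g2", "g3"], ["g1", "g2"])

# Here the store did accept "g3" before the crash, so it is kept.
kept_b, discarded_b = reconcile_on_restart(["g1", "g2", "g3"], ["g1", "g2", "g3"])
```

The point of the sketch is that the reliable store, not the restarting MySQL instance, is the authority on whether a pending entry counts as committed.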

Request Forwarding

A proxy layer (PhxSQLProxy) sits between clients and MySQL. It forwards client requests to the current master, preventing client split‑brain during master switches. Two forwarding modes are supported: read/write port forwarding and read‑only port forwarding (Figure 12). The proxy uses a coroutine model (Libco) for high performance and maintains a 1:1 connection model to preserve MySQL transaction semantics (Figure 13). It also forwards the real client IP via a reserved MySQL protocol field to keep permission checks correct (Figure 14).
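A minimal sketch of the routing decision follows. The port numbers, the `Route` type, and `choose_backend` are hypothetical, and the choice to serve the read-only port from the local MySQL instance is an assumption of this sketch rather than something the text above states.

```python
from dataclasses import dataclass

# Hypothetical port numbers chosen for this sketch only.
READ_WRITE_PORT = 6000
READ_ONLY_PORT = 6001


@dataclass(frozen=True)
class Route:
    backend: str    # address of the MySQL instance to forward to
    client_ip: str  # original client address, carried along for permission checks


def choose_backend(listen_port, client_ip, local_mysql, current_master):
    """Pick the MySQL instance for a new client connection."""
    if listen_port == READ_WRITE_PORT:
        # Read/write traffic always goes to the current master,
        # whichever node the client happened to connect to.
        return Route(backend=current_master, client_ip=client_ip)
    if listen_port == READ_ONLY_PORT:
        # Assumption: read-only traffic is served by this node's own MySQL.
        return Route(backend=local_mysql, client_ip=client_ip)
    raise ValueError(f"unexpected listen port: {listen_port}")


rw_route = choose_backend(READ_WRITE_PORT, "10.0.0.9", "node-b:3306", "node-a:3306")
ro_route = choose_backend(READ_ONLY_PORT, "10.0.0.9", "node-b:3306", "node-a:3306")
```

Because each client connection maps to exactly one backend connection, a routing decision like this would be made once per connection, which is what keeps a transaction on a single MySQL session.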

Automatic Master Election

Each node runs an Agent that monitors MySQL health. Healthy masters periodically renew a lease in the reliable store; non‑masters check the lease and, if expired, initiate a Paxos‑based election to become the new master (Figure 10).
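The lease logic can be sketched with an in-memory store. `ToyLeaseStore` and its methods are hypothetical; in PhxSQL the lease lives in the Paxos-backed store, so each change would go through consensus rather than a local assignment. Times are passed in explicitly so the example does not depend on a real clock.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: Optional[str] = None
    expires_at: float = 0.0


class ToyLeaseStore:
    """In-memory stand-in for the reliable store that holds the master lease."""

    def __init__(self):
        self.lease = Lease()

    def renew(self, node, now, ttl):
        """Master path: extend the lease, but only if this node still holds it."""
        if self.lease.holder == node and now < self.lease.expires_at:
            self.lease.expires_at = now + ttl
            return True
        return False

    def try_acquire(self, node, now, ttl):
        """Non-master path: take over only after the current lease has expired."""
        if self.lease.holder is None or now >= self.lease.expires_at:
            self.lease = Lease(holder=node, expires_at=now + ttl)
            return True
        return False


store = ToyLeaseStore()
first = store.try_acquire("node-a", now=0.0, ttl=10.0)      # node-a becomes master
early = store.try_acquire("node-b", now=5.0, ttl=10.0)      # lease still valid
renewed = store.renew("node-a", now=8.0, ttl=10.0)          # lease now runs to 18.0
takeover = store.try_acquire("node-b", now=20.0, ttl=10.0)  # node-a stopped renewing
```

A healthy master keeps calling `renew`, so other nodes never see an expired lease; once the master stops renewing, the first node to notice the expiry can take over.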

PhxSQL Architecture and Implementation

Each node hosts three components: PhxSQLProxy, MySQL, and PhxBinlogSvr. Together, the PhxBinlogSvr instances form the reliable store for the log and the master information, and they also act as the Agent. PhxSync, analogous to MySQL’s semi-sync plugin, commits binlog entries to BinlogSvr and calibrates the binlog state on restart (Figure 9).

PhxBinlogSvr

BinlogSvr stores binlog data and master information, achieving consensus via the open‑source PhxPaxos library. It supports MySQL’s native replication protocol, rejects writes from non‑master nodes, and uses optimistic locking to prevent erroneous master submissions (Figures 15‑16). It also provides automatic master election through Paxos (Figure 17).
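The two write-side checks can be sketched as follows. `ToyBinlogStore`, the two exception types, and the use of the expected log length as the optimistic-lock token are all assumptions made for this example; the article does not say what value PhxBinlogSvr actually compares.

```python
class NotMasterError(Exception):
    """Raised when a node other than the current master tries to append."""


class StaleWriteError(Exception):
    """Raised when the writer's view of the log is out of date."""


class ToyBinlogStore:
    """In-memory stand-in for the binlog store; a real one would replicate via Paxos."""

    def __init__(self, master):
        self.master = master
        self.entries = []

    def append(self, sender, expected_length, entry):
        """Append an entry if the sender is the master and its view is current."""
        if sender != self.master:
            raise NotMasterError(f"{sender} is not the current master")
        if expected_length != len(self.entries):
            # Optimistic lock: the writer assumed a log state that no longer holds.
            raise StaleWriteError(
                f"expected length {expected_length}, actual {len(self.entries)}"
            )
        self.entries.append(entry)
        return len(self.entries)


binlog_store = ToyBinlogStore(master="node-a")
new_length = binlog_store.append("node-a", expected_length=0, entry="g1")
```

With this shape, a second append from `node-a` that still passes `expected_length=0` raises `StaleWriteError`, and any append from `node-b` raises `NotMasterError`, so neither a stale master nor a non-master can extend the log.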

PhxSQL Effects

Data Consistency

Comparisons of binlog, Paxos state, and BinlogSvr data across three nodes show full consistency (Figure 18).

Automatic Master Switchover

During a master failure, traffic shifts smoothly to the new master, confirming successful failover (Figure 19).

Performance

Benchmarks using sysbench on Percona 5.6.31‑77.0 demonstrate that PhxSQL’s write performance exceeds MySQL semi‑sync, while read performance is slightly lower due to the proxy layer. Overall, PhxSQL delivers strong consistency, high availability, and competitive performance (Figure 20).

MySQL asynchronous replication flow
MySQL semi‑synchronous replication flow
MySQL restart pending binlog
MySQL restart inconsistency
MySQL restart phantom read
MySQL client split
MySQL missing automatic master election
PhxSQL basic architecture
Proxy request forwarding flow
Proxy 1:1 transaction connection model
Proxy IP forwarding for permissions
BinlogSvr reject non‑master submissions
BinlogSvr optimistic lock
BinlogSvr native MySQL replication protocol
BinlogSvr master info
PhxSQL data consistency comparison
Master failover traffic shift
PhxSQL vs MySQL performance
Performance comparison chart
