
Can WeChat’s Open‑Source PhxSQL Deliver True High‑Availability MySQL?

The article reviews WeChat’s newly open‑sourced PhxSQL, a MySQL clustering solution that promises high availability and strong consistency, examines its complex architecture with added Phxbinlogsvr and Phxsqlproxy components, and discusses practical concerns such as deployment difficulty, replication latency, lack of multi‑master writes, semi‑synchronous behavior, and monitoring challenges.

Efficient Ops

Editor's Note

When the editor saw “[Breaking] WeChat open‑source PhxSQL: High‑availability, strong‑consistent MySQL cluster”, he was genuinely impressed; this contribution benefits DBAs and operations engineers, and the openness of Tencent and WeChat is more than just words.

The article is written by an experienced DB professional, focusing on technical logic and encouraging discussion. He wishes PhxSQL continued success and hopes the world sees the power of Chinese open‑source.

Lead

WeChat recently announced on its Moments a high‑availability, strongly consistent MySQL cluster solution, which quickly went viral. Some people bookmarked it, others compared it to Galera or group replication, and many praised the open‑source effort.

The author reflects on why the announcement is so popular, citing WeChat’s massive influence and the pressing need for reliable MySQL HA solutions that surpass Galera.

Why the buzz?

WeChat’s open‑source move itself carries huge weight in China.

The MySQL solution addresses a critical demand for robust high‑availability and consistency, where few alternatives exist.

Complex architecture and deployment overhead

Figure 1 (illustrated below) shows that PhxSQL adds two new services—Phxbinlogsvr and Phxsqlproxy—to a traditional three‑node MySQL cluster, increasing module count and consequently the failure rate, hardware pressure, and operational complexity.

More modules mean a higher probability that some part of the cluster fails, and therefore lower reliability. How long does it take to deploy such a cluster?

The added components raise the entry barrier, increase machine load and cost, and give DBAs more objects to manage.
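The reliability concern can be made concrete with a rough back-of-the-envelope model. The sketch below assumes, purely for illustration, that modules fail independently and that a node is usable only when every module on it is up. The 99.9% availability figure is invented, not measured from PhxSQL.

```python
def serial_availability(module_availabilities):
    """Availability of a node that needs every listed module to be up.

    Assumes independent failures, which is a simplification.
    """
    result = 1.0
    for availability in module_availabilities:
        result *= availability
    return result


# Hypothetical per-module availability of 99.9%.
mysql_only = serial_availability([0.999])
# mysqld plus Phxbinlogsvr plus Phxsqlproxy on the same node.
with_phx_modules = serial_availability([0.999, 0.999, 0.999])

print(f"MySQL alone:          {mysql_only:.4%}")
print(f"MySQL + two services: {with_phx_modules:.4%}")
```

Under these assumptions, per-node availability drops from 99.9% to roughly 99.7%. Whether the cluster as a whole ends up less available depends on how much redundancy the extra layer provides, which this simple model ignores.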

Fundamental reliance on master‑slave replication

Although some compare PhxSQL to Galera or group replication, it remains fundamentally a master‑slave architecture. The original binlog is now replicated from the master to Phxbinlogsvr, and MySQL slaves pull from Phxbinlogsvr.

This extra layer introduces additional replication latency, raising questions about seamless failover and consistency during master‑slave lag.

In this situation one must sacrifice either consistency or availability, making it hard to achieve Galera‑like dual guarantees.
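The trade-off described above can be sketched as a failover decision. This is a toy model, not PhxSQL code: the `Replica` type, the integer binlog positions, and the `wait_for_catch_up` flag are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    applied_pos: int  # last binlog position applied (hypothetical integer position)


def failover_choice(committed_pos, replica, wait_for_catch_up):
    """Return (action, missing_events) for a candidate replica.

    committed_pos is the last position the failed master had committed.
    """
    missing = max(0, committed_pos - replica.applied_pos)
    if missing == 0:
        return ("promote", 0)          # no lag: nothing has to be given up
    if wait_for_catch_up:
        return ("wait", missing)       # stays consistent, but writes are unavailable meanwhile
    return ("promote_stale", missing)  # stays available, but the new master lacks `missing` events


lagging = Replica(name="replica-1", applied_pos=970)
print(failover_choice(1000, lagging, wait_for_catch_up=True))
print(failover_choice(1000, lagging, wait_for_catch_up=False))
```

Only a replica with no lag avoids the choice, which is why any latency added between the master and its replicas matters for failover.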

Single‑point writes, no multi‑master capability

Unlike Galera or group replication, PhxSQL does not support true multi‑master writes. The “multi‑point write” shown in the architecture merely forwards writes to the current master via Phxsqlproxy, requiring manual promotion of a new master and potentially impacting business.
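To illustrate what "forwarding writes to the current master" means in practice, here is a minimal sketch of such a router. It is not Phxsqlproxy's implementation: the class name, the node names, and the keyword-based statement classification are assumptions made for the example.

```python
WRITE_PREFIXES = ("INSERT", "UPDATE", "DELETE", "REPLACE")


class WriteForwardingProxy:
    """Toy model of a proxy that runs on every node but sends writes to one master."""

    def __init__(self, master):
        self.master = master

    def route(self, statement, received_on):
        # Naive classification by leading keyword; a real proxy would need more care.
        is_write = statement.lstrip().upper().startswith(WRITE_PREFIXES)
        return self.master if is_write else received_on


proxy = WriteForwardingProxy(master="node-a")
print(proxy.route("UPDATE t SET x = 1", received_on="node-c"))  # node-a
print(proxy.route("SELECT * FROM t", received_on="node-c"))     # node-c
```

A client can connect to any node, but every write still executes on a single master, so write capacity does not grow with the number of nodes as it would under true multi-master replication.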

Semi‑synchronous behavior with Paxos

Figure 2.1 (standard semi‑sync) and Figure 2.2 (PhxSQL) illustrate that PhxSQL behaves like a semi‑synchronous system, waiting for ACKs from Phxbinlogsvr nodes. With three nodes, the master must receive ACKs from a majority, that is, at least two of the three, before proceeding.

Thus its performance is comparable to a three‑node semi‑sync setup.
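The majority rule and its effect on commit latency can be sketched as follows. The latency values are made up, and the model assumes that the master's local node counts as one of the three acknowledging nodes.

```python
def majority(node_count):
    """Smallest number of nodes that forms a majority."""
    return node_count // 2 + 1


def commit_latency(ack_latencies_ms):
    """Time until a majority has acknowledged: the k-th fastest ACK."""
    k = majority(len(ack_latencies_ms))
    return sorted(ack_latencies_ms)[k - 1]


# Hypothetical ACK times: local node, a nearby node, and one slow node.
acks = [0.2, 1.5, 40.0]
print(majority(len(acks)))   # 2
print(commit_latency(acks))  # 1.5
```

With three nodes the commit waits only for the second-fastest ACK, so a single slow node does not hold up the write. This matches the behavior of a semi-synchronous master that waits for one remote acknowledgment.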

PhxSQL also claims that replica nodes never roll back, which the author supports; the master may still roll back, but slaves only need to catch up.

The author questions the cost‑benefit ratio of using such a complex architecture merely to avoid replica rollbacks.

Parallel binlog application after Phxbinlogsvr

MySQL 5.6 and 5.7 both support parallel replication: 5.6 can apply transactions from different schemas in parallel, and 5.7 adds parallel apply within a single schema. The article asks how PhxSQL enables parallel application of binlogs from Phxbinlogsvr to MySQL: whether it treats Phxbinlogsvr as a primary source using native replication or implements a custom solution.

Monitoring responsibilities

Given the complexity, each module’s health must be monitored. The author wonders whether a public monitoring system exists or if modules monitor each other, emphasizing that monitoring is essential for cluster robustness.

Comparison with a middle‑layer plus semi‑sync approach

If one were to forgo the rollback‑free guarantee, how does PhxSQL compare to a simpler architecture that adds a middle layer with semi‑synchronous replication? The article seeks to identify any advantages.

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
