
MySQL High Availability Solutions: Selection Guide

This article reviews various MySQL high‑availability architectures—including master‑slave replication with keepalived or MHA, Galera‑based clusters, and other approaches—detailing their deployment methods, advantages, limitations, and practical considerations for selecting an appropriate HA solution.


Available MySQL High Availability Solutions

MySQL high‑availability (HA) can be built on several foundations: master‑slave replication, Galera protocol, NDB engine, middleware/proxy, shared storage, and host HA.

HA based on master‑slave replication

Two‑node master‑slave + keepalived/heartbeat

For small‑to‑medium deployments, a simple one‑master‑one‑slave or dual‑master setup within the same VLAN can use keepalived or heartbeat to fail over quickly when the master goes down. Key considerations:

- Set both keepalived nodes to BACKUP mode (with nopreempt) so the VIP does not flap back when the failed master recovers.
- In a dual‑master setup, configure different auto_increment_increment and auto_increment_offset values so the two masters never generate conflicting auto‑increment IDs.
- Ensure the slave's hardware is no weaker than the master's, and use MariaDB or MySQL 5.7 multi‑threaded replication to reduce replication lag.
- Optionally use semi‑synchronous replication, or PXC, for near‑zero lag.
- Enhance keepalived health checks beyond simple process and port monitoring (e.g., run a test query).
- Handle split‑brain scenarios with additional scripts.
- Evaluate slave lag before allowing failover.
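To make the first two points concrete, here is a minimal sketch of the relevant configuration. The interface name, VIP, priorities, and script name are placeholders, not values from the original article:

```ini
# /etc/keepalived/keepalived.conf -- use "state BACKUP" on BOTH nodes,
# so a recovered node does not preempt the VIP and cause a second switchover.
vrrp_instance VI_MYSQL {
    state BACKUP            # BACKUP on both nodes, not MASTER/BACKUP
    nopreempt               # keep the VIP where it is when the old master returns
    interface eth0          # placeholder NIC name
    virtual_router_id 51
    priority 100            # give one node a slightly higher priority (e.g., 100 vs 90)
    virtual_ipaddress {
        192.168.1.100       # placeholder VIP the application connects to
    }
    track_script {
        chk_mysql           # custom health-check script beyond port/process checks
    }
}
```

```ini
# my.cnf fragment on node 1 of a dual-master pair; node 2 would use offset 2.
[mysqld]
auto_increment_increment = 2   # step equals the number of masters
auto_increment_offset    = 1   # unique per master, so generated IDs never collide
```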

[Diagram: two‑node master‑slave + keepalived architecture]

Multi‑node master‑slave + MHA/MMM

Multi‑node clusters use one‑master‑many‑slaves or dual‑master‑many‑slaves topologies, typically managed by MHA (MMM is an older alternative). MHA is open‑source, Perl‑based, and mature; it performs strict consistency checks during failover and supports a dedicated binlog server for more efficient log transfer. On the downside, it requires passwordless SSH trust between all nodes, and its failover scripts usually need some customization.
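For orientation, a sketch of what an MHA application config looks like; all hostnames, paths, and credentials below are placeholders:

```ini
# /etc/mha/app1.cnf -- illustrative MHA manager configuration
[server default]
manager_workdir=/var/log/mha/app1
manager_log=/var/log/mha/app1/manager.log
ssh_user=root              # MHA requires passwordless SSH trust between all nodes
repl_user=repl             # replication account used after failover
repl_password=repl_pass
user=mha                   # MySQL account the manager uses for health checks
password=mha_pass

[server1]
hostname=db-master
candidate_master=1         # eligible for promotion on failover

[server2]
hostname=db-slave1
candidate_master=1

[server3]
hostname=db-slave2
no_master=1                # never promote this node (e.g., a backup node)
```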

Multi‑node master‑slave + etcd/zookeeper

In large‑scale environments, keepalived or MHA become cumbersome; a configuration service such as etcd or Zookeeper can centrally manage the cluster, simplify detection and failover, and avoid chaotic manual operations.

HA based on Galera protocol

Galera provides multi‑master, (virtually) synchronous replication. The common implementations are MariaDB Galera Cluster and Percona XtraDB Cluster (PXC). PXC offers near‑zero replication lag, automatic node provisioning, strict data consistency, and full MySQL compatibility. Its limitations: it works only with InnoDB tables, every table needs a primary key, explicit table locks and XA transactions are not supported, concurrent writes on multiple nodes can cause certification (lock) conflicts, and cluster throughput is capped by the slowest node. Adding a new node triggers a costly full state transfer (SST), and large‑scale deployments benefit from low‑latency networking such as InfiniBand.
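A minimal my.cnf sketch for one PXC node, reflecting the constraints above; the node names, addresses, and library path are placeholders and version‑dependent:

```ini
# my.cnf fragment for a PXC node -- addresses and paths are illustrative
[mysqld]
binlog_format            = ROW            # Galera requires row-based replication
default_storage_engine   = InnoDB         # only InnoDB tables are replicated
innodb_autoinc_lock_mode = 2              # interleaved lock mode required by Galera
wsrep_provider           = /usr/lib64/galera4/libgalera_smm.so
wsrep_cluster_name       = pxc-cluster
wsrep_cluster_address    = gcomm://10.0.0.1,10.0.0.2,10.0.0.3
wsrep_node_name          = node1
wsrep_node_address       = 10.0.0.1
wsrep_sst_method         = xtrabackup-v2  # SST copies a full dataset to joining nodes
pxc_strict_mode          = ENFORCING      # rejects unsupported features (MyISAM, tables without a PK, ...)
```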

Other HA options

Based on NDB Cluster – not recommended for general production use due to many limitations (e.g., largely memory‑resident data and poor performance on complex joins).

Based on shared storage – requires high‑performance storage and can become a single point of failure unless using distributed storage.

Based on middleware/proxy – few mature, reliable proxies exist, so custom development is often required.

Based on host HA – builds a high‑availability OS cluster (e.g., RHCS) before deploying MySQL, but is rarely used in practice.

Readers are invited to suggest topics for future sharing.

Tags: Database · High Availability · MySQL · Replication · Galera · MHA
Written by Architect

Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
