Mastering Database High Availability: From Basic Replication to Seamless Scaling
This article examines the evolution of database high‑availability architectures in large‑scale internet environments, covering basic direct‑connect setups, scale‑up/scale‑out sharding, master‑slave/master‑master with Keepalived, and advanced solutions such as MHA, Percona XtraDB Cluster, and MySQL Group Replication, plus smooth scaling steps.
1 Background
In large‑scale internet scenarios, database high availability is crucial: the data layer must keep serving requests when individual instances fail, so redundancy has to be built into the architecture.
2 Evolution of High‑Availability Architectures
2.1 Basic Database Architecture
Each service typically connects directly to a single database instance via IP+Port – the classic direct‑connect model.
Service instances point to the database address for access.
2.2 Scale‑Up + Scale‑Out
As traffic and data grow, databases are scaled vertically (Scale‑Up) and horizontally (Scale‑Out). After sharding, data may reside in different instances or data centers, reducing load and improving performance. See the referenced article for details.
Sharding routes queries to different IPs based on conditions such as user role, business type, or hash/modulo values (e.g., value % 2 == 0 → condition1, value % 2 == 1 → condition2).
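A minimal sketch of modulo routing, assuming two shards. The addresses in `SHARDS` and the function names are hypothetical and only illustrate the idea; a real deployment would load the mapping from configuration.

```python
import zlib

# Hypothetical shard map: shard index -> "ip:port" of the instance holding it.
SHARDS = {
    0: "10.0.0.1:3306",
    1: "10.0.0.2:3306",
}


def shard_for(key):
    """Return the shard index for a key using modulo routing."""
    if isinstance(key, int):
        value = key
    else:
        # Python's built-in hash() is salted per process for strings,
        # so use a stable checksum for non-integer keys instead.
        value = zlib.crc32(str(key).encode("utf-8"))
    return value % len(SHARDS)


def address_for(key):
    """Return the database address that should serve this key."""
    return SHARDS[shard_for(key)]
```

With this map, `address_for(10)` resolves to the shard 0 instance and `address_for(7)` to the shard 1 instance. Routing by user role or business type would replace the modulo step with a lookup on that attribute.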
2.3 Master‑Slave or Master‑Master + Keepalived
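Scale‑up and scale‑out address capacity but not availability: each instance is still a single point of failure. Pairing every instance with a replica (master‑slave) or a second master (master‑master) and putting a Keepalived‑managed virtual IP (VIP) in front of the pair fixes this. If one instance fails, the VIP moves to the other node, and services keep connecting to the same address.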
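The failover rule can be modelled in a few lines. Keepalived implements VRRP, in which the healthy node with the highest priority holds the VIP. The sketch below only simulates that election; it is not Keepalived configuration, and the node names and priorities are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    priority: int        # higher priority wins the election, as in VRRP
    healthy: bool = True


def vip_holder(nodes):
    """Return the node that should hold the virtual IP, or None if all are down."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda n: n.priority)


# Hypothetical two-node pair behind one VIP.
db1 = Node("db1", priority=100)
db2 = Node("db2", priority=90)
cluster = [db1, db2]
```

While both nodes are healthy, `vip_holder(cluster)` returns `db1`; after `db1.healthy = False` it returns `db2`. Whether the VIP moves back once `db1` recovers depends on the preemption settings, which this model ignores.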
2.4 High Availability in Sharding Scenarios
The same design extends to sharded environments: each shard gets its own master‑slave or master‑master pair behind a VIP, and capacity grows by adding more shards.
2.5 Other Common HA Solutions
2.5.1 MHA
MHA (Master High Availability) monitors the master and, when it fails, promotes the most up‑to‑date slave to master and then re‑points the remaining slaves at the new master.
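The promotion rule can be sketched as follows. This is a simplified model, not MHA's actual code: the `Replica` type and its fields are hypothetical, and "most up‑to‑date" is taken to mean the furthest binlog file and position read from the old master. It also skips the step of bringing the lagging slaves up to date before they are re‑pointed.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    log_file: str   # master binlog file the replica has read up to
    log_pos: int    # position within that file
    source: str     # server this replica currently replicates from


def binlog_index(log_file):
    # Binlog files are named like "mysql-bin.000012"; compare the numeric suffix.
    return int(log_file.rsplit(".", 1)[1])


def promote_most_advanced(replicas):
    """Pick the replica furthest ahead and re-point the others at it."""
    new_master = max(replicas, key=lambda r: (binlog_index(r.log_file), r.log_pos))
    for r in replicas:
        if r is not new_master:
            r.source = new_master.name
    new_master.source = ""  # the promoted node no longer replicates from anyone
    return new_master


# Hypothetical state at the moment master "m1" fails.
replicas = [
    Replica("s1", "mysql-bin.000012", 400, "m1"),
    Replica("s2", "mysql-bin.000012", 900, "m1"),
    Replica("s3", "mysql-bin.000011", 9999, "m1"),
]
new_master = promote_most_advanced(replicas)
```

Here `s2` is promoted because it has read furthest into the newest binlog file, and `s1` and `s3` are re‑pointed to it.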
2.5.2 PXC
Percona XtraDB Cluster (PXC) integrates Percona Server, XtraBackup, and Galera to provide multi‑master MySQL clustering. It supports only InnoDB, and its replication adds performance overhead and write latency. There is also a risk of data divergence, and because every node must apply each write, throughput is limited by the slowest node.
2.5.3 MGR / InnoDB Cluster
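MySQL Group Replication (MGR) replicates data across a group of nodes with strong consistency. It runs in single‑primary mode by default, or in multi‑primary mode, where every node accepts writes. Its group communication system (GCS) uses a Paxos‑based protocol to deliver messages atomically and in the same order to all members. InnoDB Cluster packages Group Replication with MySQL Router and MySQL Shell to deliver a complete high‑availability solution.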
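Group Replication needs a majority of members to agree before the group can make progress, so group size determines how many failures it can survive. A small sketch of that arithmetic, with hypothetical function names:

```python
def majority(group_size):
    """Smallest number of members that forms a majority."""
    return group_size // 2 + 1


def tolerated_failures(group_size):
    """How many members can fail while a majority still remains."""
    return group_size - majority(group_size)


def has_quorum(group_size, reachable):
    """True if the reachable members can still make progress."""
    return reachable >= majority(group_size)
```

A group of 3 tolerates 1 failure and a group of 5 tolerates 2. A group of 4 still tolerates only 1, which is why odd group sizes are the usual choice.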
2.6 Smooth Scaling under HA
In high‑traffic scenarios the storage layer becomes the bottleneck, so it should be possible to add capacity without losing data and without services noticing the change.
Steps:
1. Add a new shard, initialize its architecture, and sync data to it.
2. Update the master‑slave configuration to include the new shard, mapping old shards to new ones.
3. Reload the service configuration.
4. Remove redundant data and, optionally, shrink the old shards.
After the change, the service routes requests to the appropriate instance based on the routing conditions.
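One way to picture the old‑to‑new mapping is the case where the shard count doubles under modulo routing, for example from 2 to 4. The doubling is an assumption made for illustration, since the steps above do not prescribe a particular mapping, and the function names are hypothetical. Under that assumption, every key on old shard i lands on new shard i or i + n, so a synced copy of an old shard already holds all the data its new shard needs. The cleanup step then only deletes keys that now belong elsewhere.

```python
def shard_of(key, shard_count):
    """Modulo routing for an integer key."""
    return key % shard_count


def split_targets(old_shard, old_count):
    """New shard indexes that an old shard's keys map to after doubling."""
    return [old_shard, old_shard + old_count]


def redundant_keys(keys_on_instance, new_shard, new_count):
    """Keys copied to this instance that now belong to a different shard."""
    return [k for k in keys_on_instance if shard_of(k, new_count) != new_shard]
```

For example, old shard 0 of 2 holds keys 0, 2, 4, and 6. After doubling, new shard 0 keeps 0 and 4 and can delete 2 and 6, while new shard 2 keeps 2 and 6 and can delete 0 and 4.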
3 Summary
Database high availability can be achieved through basic master‑slave/master‑master + Keepalived, as well as solutions like MHA, Percona XtraDB Cluster, and MySQL Group Replication. Proper HA enables seamless migration, scaling, and business adjustments without service disruption.
Architecture & Thinking