
From Single MySQL to Cluster: Master‑Slave Replication, High Availability, and Scaling Strategies

This article explains why growing MySQL workloads require moving from a single instance to a clustered architecture, details the mechanics of master‑slave replication in its asynchronous, semi‑synchronous, and group replication forms, and evaluates high‑availability solutions and read‑write splitting techniques.

JavaEdge

Motivation for Moving from a Single MySQL Instance to a Cluster

Increasing data volume, higher read/write concurrency, stricter availability requirements, connection‑limit caps, and consistency challenges expose the limits of a single MySQL server.

Capacity: a single instance can only scale up so far; beyond that, sharding or database partitioning is required.

Read‑write pressure: high QPS, especially from analytical queries, overloads a single node; a multi‑node cluster with master‑slave replication distributes the load.

High availability: a single node is a single point of failure; failover tools such as MHA, MySQL Group Replication (MGR), or Orchestrator are needed.

Connection limits: peak traffic can exceed the maximum connections of one server.

Consistency: once data spans multiple nodes, distributed transactions (XA or flexible transactions) become necessary.

Master‑Slave Replication Overview

Master‑slave replication creates multiple identical copies of data. The master writes changes to the binary log (binlog). Each slave runs an I/O thread that copies the binlog to a local relay log, then a SQL thread re‑executes the events from the relay log, keeping the slave data identical to the master.
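The flow described above can be sketched as a small in-memory simulation. This is only an illustration of the binlog, relay log, and apply stages; the class and method names (Master, Slave, io_thread_step, sql_thread_step) are hypothetical and do not correspond to any MySQL API.

```python
from dataclasses import dataclass, field


@dataclass
class Master:
    data: dict = field(default_factory=dict)
    binlog: list = field(default_factory=list)  # ordered change events

    def write(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))  # every committed change is logged


@dataclass
class Slave:
    data: dict = field(default_factory=dict)
    relay_log: list = field(default_factory=list)
    fetched: int = 0   # binlog events the I/O thread has copied so far
    applied: int = 0   # relay-log events the SQL thread has replayed so far

    def io_thread_step(self, master):
        """Copy new binlog events into the local relay log."""
        new_events = master.binlog[self.fetched:]
        self.relay_log.extend(new_events)
        self.fetched += len(new_events)

    def sql_thread_step(self):
        """Replay relay-log events that have not been applied yet."""
        for key, value in self.relay_log[self.applied:]:
            self.data[key] = value
        self.applied = len(self.relay_log)


master, slave = Master(), Slave()
master.write("user:1", "alice")
master.write("user:1", "alice2")
slave.io_thread_step(master)
lagging = dict(slave.data)   # events fetched but not applied: slave still empty
slave.sql_thread_step()      # after replay the slave matches the master
```

The gap between the two steps is where replication lag comes from: events can sit in the relay log before the SQL thread replays them.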

Historical Milestones

2000 – MySQL 3.23.15 introduced replication.

2002 – MySQL 4.0.2 split I/O and SQL threads and added relay logs.

2010 – MySQL 5.5 added semi‑synchronous replication.

2016 – MySQL 5.7.17 introduced MySQL Group Replication.

Core Mechanics

Master writes events to the binlog.

Slave stores the received events in a relay log.

Relay log diagram

Binlog Formats

ROW – records each row change; most detailed but larger.

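STATEMENT – records only the SQL statement; smaller, but statements that are nondeterministic (for example, ones calling UUID(), or using LIMIT without ORDER BY) can produce different data on the slave.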

MIXED – combines ROW and STATEMENT based on safety.
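The difference between STATEMENT and ROW can be illustrated with a toy example. The sketch below uses Python's uuid module as a stand-in for a nondeterministic SQL function such as UUID(); the function and variable names are hypothetical and nothing here talks to a real server.

```python
import uuid


def run_statement(table):
    """Stand-in for `INSERT INTO t VALUES (UUID())`: evaluated wherever it runs."""
    value = str(uuid.uuid4())
    table.append(value)
    return value


master_table, stmt_slave, row_slave = [], [], []

# The master executes the statement once and produces a concrete row value.
written = run_statement(master_table)

# STATEMENT format: the slave re-executes the SQL text and gets a new value.
run_statement(stmt_slave)

# ROW format: the slave receives the row image itself and copies it.
row_slave.append(written)

statement_matches = stmt_slave == master_table   # copies have diverged
row_matches = row_slave == master_table          # copies are identical
```

This is the kind of statement MIXED is meant to catch: it logs the statement text when replaying it is safe and falls back to row images when it is not.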

# View a binlog file
mysqlbinlog -vv mysql-bin.000005

Asynchronous Replication

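Classic primary‑secondary replication where the master commits transactions and later sends them asynchronously to slaves. Advantages: simple implementation, and commits are never delayed by the slaves. Disadvantages: because the master does not wait for any slave, a master crash can lose transactions that no slave has received yet, and network or node problems show up as data inconsistency and replication lag.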

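MySQL 5.6 introduced parallel replication: multiple SQL (applier) threads replay events concurrently, but replay on a given database is still serialized, so a busy single database can still produce lag. MySQL 5.7 added the LOGICAL_CLOCK mode, which also allows transactions within the same database to be applied in parallel.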

Slave reads the binlog from the master.

Slave writes the events to a relay log and applies them locally.

Since MySQL 5.6, the SQL (applier) side can be multithreaded, which speeds up replay of the relay log; the I/O thread that transfers the binlog remains a single thread per replication channel.
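Why per-database parallel replay still serializes work on one database can be seen in a small sketch. It assumes a simple rule, hashing the database name to pick a worker; this is an illustration only and not how MySQL's coordinator is actually implemented. The function name and the event tuples are hypothetical.

```python
from collections import defaultdict
from zlib import crc32


def assign_to_workers(events, worker_count):
    """Distribute (database, statement) events so each database maps to one worker."""
    queues = defaultdict(list)
    for database, statement in events:
        worker = crc32(database.encode()) % worker_count
        queues[worker].append((database, statement))
    return dict(queues)


events = [
    ("shop", "INSERT order 1"),
    ("blog", "INSERT post 1"),
    ("shop", "UPDATE order 1"),
    ("shop", "DELETE order 1"),
]
queues = assign_to_workers(events, worker_count=4)

# Every event for "shop" sits in a single queue, in its original order.
shop_workers = {w for w, q in queues.items() if any(db == "shop" for db, _ in q)}
```

If most writes hit one database, most events end up in one queue and the extra workers stay idle, which is the limitation that finer-grained scheduling tries to remove.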

Semi‑Synchronous Replication

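Introduced in MySQL 5.5 (2010). After the master writes a transaction to the binlog, it waits for at least one slave to acknowledge receipt before the commit is reported to the client, reducing the risk of data loss. If no acknowledgment arrives within the configured timeout (rpl_semi_sync_master_timeout), the master falls back to asynchronous replication.

Master writes the transaction to the binlog and sends it to the slaves immediately.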

Slave writes the events to its relay log and sends an ACK to the master.

Master proceeds only after receiving the ACK from at least one slave.
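The wait-for-ACK step can be sketched with a thread and an event. This is a simplified model under stated assumptions: one slave, a fallback to asynchronous mode when the wait times out, and hypothetical function names. It does not use any MySQL client library.

```python
import threading


def commit_with_semi_sync(send_to_slave, timeout_seconds):
    """Return 'semi-sync' if a slave ACKs in time, else 'async-fallback'."""
    ack = threading.Event()
    # The slave side runs concurrently and calls ack.set() once the
    # event is safely in its relay log.
    worker = threading.Thread(target=send_to_slave, args=(ack,), daemon=True)
    worker.start()
    if ack.wait(timeout_seconds):
        return "semi-sync"
    return "async-fallback"


def healthy_slave(ack):
    ack.set()   # relay log written, acknowledge immediately


def unreachable_slave(ack):
    pass        # never acknowledges


ok = commit_with_semi_sync(healthy_slave, timeout_seconds=1.0)
degraded = commit_with_semi_sync(unreachable_slave, timeout_seconds=0.05)
```

The timeout is the trade-off knob: a long wait protects data but stalls commits when slaves are slow, while a short one keeps latency low but degrades to asynchronous behavior more often.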

Semi‑synchronous replication diagram

Group Replication (MySQL 5.7+)

Group replication is built on a Paxos‑based consensus protocol. Every member holds a full copy of the data (no shared storage). Transactions are atomically broadcast; all members receive them in the same order and a majority of members must agree, so every member applies the same sequence of changes.

Key features:

Supports single‑primary and multi‑primary modes.

Automatic failover: if the primary fails, a new primary is elected without manual intervention.

Conflict detection and resolution: when concurrent transactions modify the same rows, the one ordered first commits and later conflicting transactions are aborted.
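The "first-ordered transaction wins" rule can be sketched as a certification pass over transactions in their agreed order. This is a minimal model, assuming each transaction carries the version it read from and the set of rows it writes; the function name and tuple layout are hypothetical and much simpler than the real certification process.

```python
def certify(transactions):
    """Process transactions in their agreed total order.

    Each transaction is (name, snapshot_version, write_set). A transaction is
    aborted if any row it writes was changed by a transaction that committed
    after its snapshot was taken.
    """
    last_writer_version = {}   # row key -> version that last modified it
    version = 0
    outcome = {}
    for name, snapshot_version, write_set in transactions:
        conflict = any(
            last_writer_version.get(row, 0) > snapshot_version
            for row in write_set
        )
        if conflict:
            outcome[name] = "abort"
            continue
        version += 1
        for row in write_set:
            last_writer_version[row] = version
        outcome[name] = "commit"
    return outcome


# T1 and T2 both started from version 0 and update the same row.
result = certify([
    ("T1", 0, {"accounts:42"}),
    ("T2", 0, {"accounts:42"}),
    ("T3", 0, {"accounts:7"}),
])
```

Because every member sees the same order and applies the same rule, each one reaches the same commit or abort decision without further coordination.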

Group replication protocol diagram

Drawbacks of Traditional Master‑Slave and Mitigation Strategies

Replication Lag

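Lag occurs when slaves apply changes more slowly than the master produces them, typically because the master executes transactions concurrently while a slave replays them sequentially. Mitigation approaches include: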

Sharding the dataset to reduce per‑master load.

Enabling MySQL parallel replication (multiple SQL applier threads).

Refactoring application logic to avoid read‑after‑write patterns that depend on immediate consistency.
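When a read-after-write pattern cannot be removed, one common workaround (not described in the original article) is to send a session's reads to the master for a short window after that session writes. The sketch below assumes a fixed pinning window and hypothetical names (SessionRouter, pin_seconds); a real window would need to reflect the lag actually observed.

```python
import time


class SessionRouter:
    """Send a session's reads to the master for a short window after it writes."""

    def __init__(self, pin_seconds, clock=time.monotonic):
        self.pin_seconds = pin_seconds
        self.clock = clock
        self.last_write = {}   # session id -> time of most recent write

    def record_write(self, session_id):
        self.last_write[session_id] = self.clock()

    def target_for_read(self, session_id):
        wrote_at = self.last_write.get(session_id)
        if wrote_at is not None and self.clock() - wrote_at < self.pin_seconds:
            return "master"
        return "slave"


now = [100.0]   # fake clock so the example is deterministic
session_router = SessionRouter(pin_seconds=2.0, clock=lambda: now[0])
session_router.record_write("session-a")
right_after = session_router.target_for_read("session-a")    # pinned to master
other_session = session_router.target_for_read("session-b")  # never wrote
now[0] += 5.0
later = session_router.target_for_read("session-a")          # window expired
```

Only the session that just wrote pays the cost of reading from the master; all other reads still go to slaves.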

Application‑Side Read‑Write Splitting

Typical solutions:

Spring / Spring Boot: configure separate data sources for master and slaves and route reads to slaves.

ShardingSphere‑JDBC: parses SQL and automatically directs reads/writes, eliminating manual routing code.

ShardingSphere‑Proxy / MyCat: deploy a MySQL‑compatible proxy that performs read‑write splitting without code changes.
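Whichever layer does the routing, the core decision is similar. The sketch below is a deliberately naive version, assuming routing by statement prefix with round-robin across slaves; the class name and host strings are hypothetical, and real tools such as those listed above parse SQL properly rather than inspecting text.

```python
import itertools


class ReadWriteRouter:
    """Route statements: plain SELECTs go to a slave, everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves)   # simple round-robin

    def route(self, sql, in_transaction=False):
        is_read = sql.lstrip().upper().startswith("SELECT")
        locking = sql.rstrip().rstrip(";").upper().endswith("FOR UPDATE")
        # Reads inside a transaction and locking reads must see the master's data.
        if is_read and not in_transaction and not locking:
            return next(self._slaves)
        return self.master


rw_router = ReadWriteRouter("master:3306", ["slave1:3306", "slave2:3306"])
targets = [
    rw_router.route("SELECT * FROM orders WHERE id = 1"),
    rw_router.route("select name from users"),
    rw_router.route("UPDATE orders SET status = 'paid' WHERE id = 1"),
    rw_router.route("SELECT * FROM orders WHERE id = 1 FOR UPDATE"),
    rw_router.route("SELECT 1", in_transaction=True),
]
```

Keeping transactional and locking reads on the master avoids a transaction observing stale rows from a lagging slave.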

High‑Availability Solutions

Manual master promotion: when the primary fails, promote a replica and reconfigure clients. Drawbacks: possible data inconsistency and manual effort.

MHA (Master High Availability): Perl‑based tool that copies binlogs via SSH and switches masters within ~30 seconds. Requires SSH configuration and at least three nodes.

MySQL Group Replication (MGR): native MySQL solution that automatically elects a new primary, providing strong consistency and fault tolerance.

MySQL InnoDB Cluster: combines Group Replication, MySQL Router (lightweight load‑balancer/failover), and MySQL Shell (management client) for a complete HA stack.

Orchestrator: Go‑based topology manager with automatic discovery, web UI, and hooks for custom failover scripts. Supports both raft‑based and database‑based consistency modes.
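A step shared by manual promotion and automated failover is choosing which replica to promote. The sketch below assumes the simplest possible rule, picking the reachable replica that has received the most binlog data. The dictionary keys and function name are hypothetical, and real tools also weigh factors such as GTID sets and configured candidate priorities.

```python
def pick_new_master(replicas):
    """Choose the reachable replica that has received the most binlog data.

    Each replica is a dict with hypothetical keys:
    name, reachable, binlog_file, binlog_pos.
    """
    candidates = [r for r in replicas if r["reachable"]]
    if not candidates:
        raise RuntimeError("no reachable replica to promote")

    # Binlog file names end in a zero-padded sequence number
    # (e.g. mysql-bin.000005), so compare that number, then the offset.
    def progress(replica):
        sequence = int(replica["binlog_file"].rsplit(".", 1)[1])
        return (sequence, replica["binlog_pos"])

    return max(candidates, key=progress)["name"]


replicas = [
    {"name": "slave1", "reachable": True, "binlog_file": "mysql-bin.000005", "binlog_pos": 1200},
    {"name": "slave2", "reachable": True, "binlog_file": "mysql-bin.000005", "binlog_pos": 4800},
    {"name": "slave3", "reachable": False, "binlog_file": "mysql-bin.000006", "binlog_pos": 100},
]
new_master = pick_new_master(replicas)
```

Promoting the most advanced replica minimizes the transactions that have to be recovered or are lost, which is the data-inconsistency risk noted for manual promotion.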

Reference

https://dev.mysql.com/doc/refman/5.7/en/group-replication-primary-secondary-replication.html