How MySQL Master‑Slave Replication and Read‑Write Splitting Turn a Single Server into a High‑Availability Architecture
This article walks through why a single MySQL instance often fails under load, explains the fundamentals of asynchronous master‑slave replication and read‑write splitting, provides step‑by‑step configuration scripts, highlights common pitfalls with solutions, and shows advanced optimization and monitoring techniques for building a scalable, high‑availability MySQL architecture.
How MySQL Master‑Slave Replication and Read‑Write Splitting Turn a Single Server into a High‑Availability Architecture
Introduction: A costly outage
At 3 am the monitoring alarm rang: MySQL master CPU hit 100 % and connections were exhausted, causing a complete service outage for a million‑daily‑active‑user e‑commerce platform. The failure could have been avoided with master‑slave replication and read‑write separation.
1. Why does MySQL often break?
1.1 The fatal bottlenecks of a single MySQL instance
Single‑MySQL performance ceiling:
- QPS limit: ~3000‑5000 on a normal config
- Connection limit: default 151, a few thousand after tuning
- Single point of failure: crash = service down
- Scaling difficulty: vertical scaling cost grows exponentiallyReal‑world case: a social app grew from 100 k to 1 M users in three months, and MySQL degraded from "steady as a rock" to "slow as a snail", eventually crashing on a Friday night.
1.2 The harsh truth about read‑write ratios
Internet app read‑write ratios:
- E‑commerce: 8:2
- Social: 9:1
- Content platform: 95:5
- Finance: 7:3Over 80 % of database load comes from reads, yet a single MySQL instance handles both reads and writes, wasting resources.
2. Master‑Slave Replication: The art of data "cloning"
2.1 Core principle of replication
MySQL master‑slave replication is an asynchronous data sync mechanism . All changes on the master are recorded in the binary log and replayed on the slave.
Master (Master) Slave (Slave)
│ │
├──write data──► │
│ ↓ │
│ Binary Log │
│ │ │
│ └────Transmission──► Relay Log
│ │
│ ├── SQL Thread──► Execute SQL
│ │
└───────────────────────────────┴─ Data consistency2.2 Step‑by‑step setup
Step 1: Master configuration (my.cnf)
# /etc/my.cnf
[mysqld]
server-id = 1 # unique ID
log-bin = mysql-bin # enable binary log
binlog-format = ROW # recommended format
binlog-do-db = your_database
expire_logs_days = 7
slave-skip-errors = 1062,1053Step 2: Create replication user
-- On master
CREATE USER 'repl'@'%' IDENTIFIED BY 'your_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
FLUSH PRIVILEGES;
SHOW MASTER STATUS\GStep 3: Slave configuration (my.cnf)
# /etc/my.cnf
[mysqld]
server-id = 2
log-bin = mysql-bin
relay-log = mysql-relay-bin
read_only = 1
super_read_only = 1Step 4: Start replication
-- On slave
CHANGE MASTER TO MASTER_HOST='master_ip',
MASTER_PORT=3306,
MASTER_USER='repl',
MASTER_PASSWORD='your_password',
MASTER_LOG_FILE='mysql-bin.000001',
MASTER_LOG_POS=154;
START SLAVE;
SHOW SLAVE STATUS\G2.3 Common replication pitfalls and fixes
Pitfall 1: Replication lag
Problem: data written to master appears on the slave after several seconds.
# Enable semi‑sync on master
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
# Enable semi‑sync on slave
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
# Optimize slave parallelism
[mysqld]
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK
innodb_flush_log_at_trx_commit = 2
sync_binlog = 0Pitfall 2: Data inconsistency
Problem: slave data diverges from master.
# Check consistency with pt‑table‑checksum
pt-table-checksum \
--host=master_ip \
--user=root \
--password=xxx \
--databases=your_db \
--recursion-method=processlist
# Fix with pt‑table‑sync
pt-table-sync \
--execute \
--sync-to-master \
--host=slave_ip \
--user=root \
--password=xxx3. Read‑Write Separation: Boosting performance
3.1 Architecture design
Application Layer
│
▼
┌───────────────┐
│ Proxy / Middleware │
└───────────────┘
│
┌───────┴───────┐
│ │
Write request Read request
│ │
┌─────┐ ┌───────────────┐
│Master│ │ Slave Pool │
└─────┘ │ ┌─────┐ ┌─────┐ │
│ │Slave1│ │Slave2│ │
│ └─────┘ └─────┘ │
└─────────────────┘3.2 ProxySQL implementation (production‑grade)
Install ProxySQL
# CentOS/RHEL
yum install -y proxysql
systemctl start proxysql
systemctl enable proxysqlConfigure ProxySQL
-- Connect to ProxySQL admin interface
mysql -uadmin -padmin -h127.0.0.1 -P6032
-- Add MySQL servers
INSERT INTO mysql_servers(hostgroup_id,hostname,port,weight) VALUES
(1,'192.168.1.10',3306,1000), -- master (write)
(2,'192.168.1.11',3306,900), -- slave 1 (read)
(2,'192.168.1.12',3306,900); -- slave 2 (read)
-- Add monitoring account
INSERT INTO mysql_users(username,password,default_hostgroup) VALUES
('proxysql_monitor','monitor_pass',1);
-- Add application account
INSERT INTO mysql_users(username,password,default_hostgroup,transaction_persistent) VALUES
('app_user','app_pass',1,1);
-- Query routing rules
INSERT INTO mysql_query_rules(rule_id,match_pattern,destination_hostgroup,apply) VALUES
(1,'^SELECT.*',2,1), -- SELECT → read group
(2,'^SELECT.*FOR UPDATE',1,1), -- SELECT … FOR UPDATE → write group
(3,'^INSERT|UPDATE|DELETE|REPLACE',1,1);
-- Load and persist configuration
LOAD MYSQL SERVERS TO RUNTIME;
LOAD MYSQL USERS TO RUNTIME;
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
SAVE MYSQL USERS TO DISK;
SAVE MYSQL QUERY RULES TO DISK;3.3 Application‑level read‑write separation (Spring Boot + MyBatis)
// Data source configuration
@Configuration
public class DataSourceConfig {
@Bean @ConfigurationProperties("spring.datasource.master")
public DataSource masterDataSource() { return DataSourceBuilder.create().build(); }
@Bean @ConfigurationProperties("spring.datasource.slave")
public DataSource slaveDataSource() { return DataSourceBuilder.create().build(); }
@Bean
public DataSource routingDataSource(@Qualifier("masterDataSource") DataSource master,
@Qualifier("slaveDataSource") DataSource slave) {
Map<Object,Object> ds = new HashMap<>();
ds.put(DataSourceType.MASTER, master);
ds.put(DataSourceType.SLAVE, slave);
RoutingDataSource routing = new RoutingDataSource();
routing.setDefaultTargetDataSource(master);
routing.setTargetDataSources(ds);
return routing;
}
}
// Annotation for read‑only methods
@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
public @interface ReadOnly {}
// AOP to switch datasource
@Aspect @Component
public class DataSourceAspect {
@Around("@annotation(readOnly)")
public Object setReadDataSource(ProceedingJoinPoint p, ReadOnly readOnly) throws Throwable {
DataSourceContextHolder.setDataSourceType(DataSourceType.SLAVE);
try { return p.proceed(); } finally { DataSourceContextHolder.clearDataSourceType(); }
}
}
// Service example
@Service
public class UserService {
@Autowired private UserMapper userMapper;
public void createUser(User u) { userMapper.insert(u); } // writes go to master
@ReadOnly public User getUser(Long id) { return userMapper.selectById(id); } // reads go to slave
}4. Advanced optimizations: Squeezing every drop of performance
4.1 Delayed‑read strategy
For scenarios tolerant of latency, read directly from a slave.
@Service
public class OrderService {
// Real‑time: read from master
public Order getOrderForPayment(Long id) {
DataSourceContextHolder.setDataSourceType(DataSourceType.MASTER);
return orderMapper.selectById(id);
}
// Tolerant: read from slave
@ReadOnly
public List<Order> getOrderHistory(Long userId) {
return orderMapper.selectByUserId(userId);
}
}4.2 Smart routing based on slave lag
@Component
public class SmartRoutingStrategy {
@Autowired private SlaveMonitor slaveMonitor;
public DataSource selectDataSource(boolean allowDelay) {
if (!allowDelay) return masterDataSource;
SlaveInfo best = slaveMonitor.getSlaves().stream()
.filter(s -> s.getDelaySeconds() < 3)
.min(Comparator.comparing(SlaveInfo::getDelaySeconds))
.orElse(null);
return best != null ? best.getDataSource() : masterDataSource;
}
}4.3 Cache‑assisted reads
@Service
public class CachedUserService {
@Autowired private RedisTemplate<String,User> redisTemplate;
@ReadOnly @Cacheable(value="users",key="#id")
public User getUser(Long id) {
// 1. Try Redis cache
// 2. Miss → read from slave
// 3. Populate cache
return userMapper.selectById(id);
}
@CacheEvict(value="users",key="#user.id")
public void updateUser(User user) {
userMapper.update(user); // write to master
}
}5. Monitoring and Operations: Ensuring stability
5.1 Key replication metrics
Master‑Slave Monitoring:
- Seconds_Behind_Master (replication lag)
- Slave_IO_Running (IO thread status)
- Slave_SQL_Running (SQL thread status)
- Last_Error (last error message)
Performance Monitoring:
- QPS/TPS
- Connections
- Slow_queries
- Buffer_pool_hit_rate5.2 Automated monitoring script (bash)
#!/bin/bash
MYSQL_USER="monitor"
MYSQL_PASS="your_password"
SLAVE_IPS=("192.168.1.11" "192.168.1.12")
for ip in "${SLAVE_IPS[@]}"; do
status=$(mysql -h$ip -u$MYSQL_USER -p$MYSQL_PASS -e "SHOW SLAVE STATUS\G" 2>/dev/null)
io=$(echo "$status" | grep "Slave_IO_Running" | awk '{print $2}')
sql=$(echo "$status" | grep "Slave_SQL_Running" | awk '{print $2}')
delay=$(echo "$status" | grep "Seconds_Behind_Master" | awk '{print $2}')
if [ "$io" != "Yes" ] || [ "$sql" != "Yes" ]; then
echo "CRITICAL: Slave $ip replication broken!"
elif [ "$delay" -gt 10 ]; then
echo "WARNING: Slave $ip delay ${delay}s"
fi
done5.3 Automatic failover with MHA
# MHA configuration
[server default]
manager_workdir=/var/log/mha
manager_log=/var/log/mha/manager.log
user=mha
password=mha_password
ssh_user=root
repl_user=repl
repl_password=repl_password
[server1]
hostname=192.168.1.10
candidate_master=1
[server2]
hostname=192.168.1.11
candidate_master=1
[server3]
hostname=192.168.1.12
no_master=16. Real‑world evolution roadmap
Stage 1 – Single‑node (QPS < 1000): optimize indexes, SQL, config.
Stage 2 – Master‑slave (QPS 1000‑5000): add ProxySQL for read‑write split, cache hot data in Redis.
Stage 3 – One master, multiple slaves (QPS 5000‑20000): intelligent routing, prepare sharding.
Stage 4 – Distributed clusters (QPS > 20000): multiple master‑slave groups, ShardingSphere, NewSQL solutions like TiDB.
Conclusion
Master‑slave replication and read‑write separation are not silver bullets, but for the vast majority of small‑to‑medium projects they provide a cost‑effective path from a single MySQL instance to a robust, high‑traffic architecture capable of supporting hundreds of millions of users.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Ops Community
A leading IT operations community where professionals share and grow together.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
