Implementing MySQL Master‑Master High Availability with Keepalived: A Step‑by‑Step Guide
This article is a step‑by‑step tutorial on building MySQL master‑master high availability with Keepalived. It covers architecture design, Docker‑based MySQL deployment, replication configuration, Keepalived installation, virtual IP setup, failover testing, and the pitfalls encountered along the way together with their solutions.
Introduction
MySQL serves as a crucial storage medium for many business systems; when it crashes, both read and write operations are severely affected. This article shares a practical implementation of MySQL master‑master high availability using Keepalived, including architecture, configuration steps, and common pitfalls.
Solution Overview
The solution consists of two MySQL instances in master‑master mode monitored by Keepalived, which provides a virtual IP (VIP) and automatic failover or service restart.
1. Architecture
Two MySQL servers synchronize data via bidirectional replication. Keepalived runs on each host, detects MySQL health, restarts the service if needed, and moves the VIP to the healthy node.
2. Master‑Master Replication Principle
Each server acts as both master and slave, so changes replicate in both directions. In each direction the process is the same: the master writes changes to its binary log (binlog); the slave's I/O thread connects to the master and requests those events; the master's dump thread sends them; the I/O thread stores them in the slave's relay log; and the slave's SQL thread reads the relay log and applies the events.
3. Setting Up the MySQL Environment
Two Ubuntu virtual machines with Docker are used. The MySQL image is saved, transferred, and loaded on both hosts:
sudo docker save -o mysql.tar hcr:5000/hschub/hscmysql:0.0.2
sudo chmod 777 mysql.tar
sudo docker load -i mysql.tar
Containers are started with volume mappings:
sudo docker run -p 3306:3306 --name mysql \
-v /home/hss/mysql/data:/var/lib/mysql \
-v /home/hss/mysql/etc/mysql:/etc/mysql \
-e MYSQL_ROOT_PASSWORD='123456' -d 46b
4. Configuring MySQL Master‑Slave (Base for Master‑Master)
On the primary node (node1, IP 192.168.56.11) and the secondary node (node2, IP 192.168.56.12), the my.cnf files are edited to set a unique server_id, enable binary logging, and configure relay logs. The log directories are created with 777 permissions, and the containers are restarted. The settings below use server_id 11, which corresponds to node1; node2 needs a different server_id.
server_id = 11
log_bin = /var/lib/mysql/log/mysql-bin
binlog_format = mixed
log_slave_updates = 1
relay_log = /var/lib/mysql/log/relay-bin
...
Replication accounts are created on the master:
CREATE USER 'vagrant'@'192.168.56.12' IDENTIFIED BY 'vagrant';
GRANT REPLICATION SLAVE ON *.* TO 'vagrant'@'192.168.56.12';
FLUSH PRIVILEGES;
After locking tables and noting the binary log file and position, a full dump is taken and restored on the slave.
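Once the dump has been restored, the slave still has to be told where to start reading the master's binlog. The following is a minimal sketch of that step, not the article's exact commands: the helper name change_master_sql is hypothetical, and the binlog file name and position in the usage comment are placeholders for the values noted while the tables were locked.

```shell
#!/bin/bash
# Print the CHANGE MASTER TO statement that points a slave at its master.
# Args: master_host repl_user repl_password binlog_file binlog_pos
# (helper name and argument order are illustrative, not from the article)
change_master_sql() {
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%d;\n" \
        "$1" "$2" "$3" "$4" "$5"
}

# Example use on node2 (file name and position are placeholders):
#   change_master_sql 192.168.56.11 vagrant vagrant mysql-bin.000001 154 \
#       | sudo docker exec -i mysql mysql -uroot -p123456
```

Generating the statement from the recorded coordinates avoids retyping them by hand, which is where a wrong file name or position usually slips in.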
5. Converting to Master‑Master
Both nodes are configured as masters by swapping the master‑info and relay‑log settings, so that each node also replicates from the other, and then starting the replication threads with START SLAVE;. Replication is verified with SHOW SLAVE STATUS\G and by inserting data on one node and confirming that it appears on the other.
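The SHOW SLAVE STATUS\G check can be scripted so that it is easy to repeat on both nodes. This is a sketch under the assumption that the output uses the usual "Field: value" layout; the function name replication_healthy is hypothetical.

```shell
#!/bin/bash
# Read `SHOW SLAVE STATUS\G` output on stdin and succeed only if both
# replication threads report "Yes".
replication_healthy() {
    awk -F': *' '
        /^ *Slave_IO_Running:/  { io  = $2 }
        /^ *Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example use on either node:
#   sudo docker exec mysql mysql -uroot -p123456 -e 'SHOW SLAVE STATUS\G' \
#       | replication_healthy && echo "replication OK"
```

The patterns require the colon directly after the field name, so the separate Slave_SQL_Running_State line is not mistaken for the thread status.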
6. Installing and Configuring Keepalived
Keepalived 2.2.2 is compiled from source on Ubuntu after the required dependencies are installed with apt‑get install. The build runs ./configure, followed by make && make install.
The configuration file /etc/keepalived/keepalived.conf defines a virtual IP (192.168.56.88), a VRRP instance, and a script /usr/local/keepalived/restart_mysql.sh that checks MySQL health, attempts a restart, and disables Keepalived if the restart fails.
#!/bin/bash
# Health check run by Keepalived: restart MySQL if it is down, and stop
# Keepalived (releasing the VIP) if the restart does not bring it back.
START_MYSQL="docker restart mysql"
STOP_MYSQL="docker stop mysql"
LOG_FILE="/usr/local/keepalived/logs/mysql-check.log"
# Count the mysqld processes visible on the host
HAPS=$(ps -C mysqld --no-header | wc -l)
if [ "$HAPS" -eq 0 ]; then
    echo "$START_MYSQL" >> "$LOG_FILE"
    $START_MYSQL >> "$LOG_FILE" 2>&1
    sleep 3
    # Still no mysqld after the restart attempt: give up the VIP
    if [ "$(ps -C mysqld --no-header | wc -l)" -eq 0 ]; then
        echo "start mysql failed, killall keepalived" >> "$LOG_FILE"
        killall keepalived
    fi
fi
Permissions are set on the script and log directory, and Keepalived is started on both nodes with systemctl start keepalived. The process can be verified with ps -ef | grep keepalived and by inspecting the logs.
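The article describes /etc/keepalived/keepalived.conf without reproducing it. The sketch below shows one plausible shape for that file, generated from a small shell function so that both nodes share the same template. Only the VIP (192.168.56.88) and the script path come from the article; the function name, router_id, virtual_router_id, check interval, auth_pass, and the BACKUP plus nopreempt arrangement are assumptions. Note that enable_script_security makes Keepalived refuse scripts that non‑root users can write to, so the script permissions need to be tightened accordingly.

```shell
#!/bin/bash
# Print a keepalived.conf for one node.
# Args: interface priority
# Everything except the VIP and the script path is an illustrative assumption.
render_keepalived_conf() {
    cat <<EOF
global_defs {
    router_id MYSQL_HA
    script_user root
    enable_script_security
}

vrrp_script chk_mysql {
    script "/usr/local/keepalived/restart_mysql.sh"
    interval 2
}

vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface $1
    virtual_router_id 51
    priority $2
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.56.88
    }
    track_script {
        chk_mysql
    }
}
EOF
}

# Example use (eth1 and the priorities are placeholders):
#   node1: render_keepalived_conf eth1 100 | sudo tee /etc/keepalived/keepalived.conf
#   node2: render_keepalived_conf eth1 90  | sudo tee /etc/keepalived/keepalived.conf
```

Starting both nodes as BACKUP with nopreempt is one common choice for databases, because a recovered node then does not pull the VIP back and cause a second switch.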
7. Testing Failover
Stopping the MySQL container on one node triggers Keepalived to restart it within a few seconds. If the restart fails, Keepalived stops, and the VIP automatically points to the other MySQL instance, which can be confirmed by querying SHOW VARIABLES LIKE '%hostname%' from a client.
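To watch the switch happen during the test, a client can poll the VIP and report which host answers. This is a rough sketch: the function names are hypothetical, it uses SELECT @@hostname as a shorter equivalent of the SHOW VARIABLES query, and it assumes the account used is allowed to connect through the VIP from the client machine.

```shell
#!/bin/bash
# Ask the MySQL instance behind the VIP for its hostname.
# Args: vip user password
vip_hostname() {
    mysql -h "$1" -u "$2" -p"$3" -N -s -e 'SELECT @@hostname'
}

# Poll once per second and print a line whenever the serving host changes.
# Args: vip user password [iterations, default 60]
watch_vip() {
    last=""
    i=0
    while [ "$i" -lt "${4:-60}" ]; do
        current=$(vip_hostname "$1" "$2" "$3" 2>/dev/null) || current="unreachable"
        if [ "$current" != "$last" ]; then
            echo "$(date +%T) VIP served by: $current"
            last="$current"
        fi
        sleep 1
        i=$((i + 1))
    done
}

# Example use from a client while stopping MySQL on one node:
#   watch_vip 192.168.56.88 root 123456 120
```

A short "unreachable" gap between the two hostnames gives a rough measure of how long clients lose the database during failover.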
8. Common Pitfalls and Solutions
Incorrect MySQL passwords – use skip-grant-tables to reset.
Missing volume mappings – ensure host directories are correctly mounted.
Permission issues on data directories – apply chmod 777 recursively.
Dependency installation failures – update apt sources or downgrade conflicting packages.
Keepalived service masked – unmask with systemctl unmask keepalived and create missing /etc/rc.d/init.d/functions link.
Script not executable – set chmod +x and enable script_security if needed.
References to external articles and blogs are listed at the end of the original source.
Wukong Talks Architecture
Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.