Understanding MySQL High Availability, Master‑Slave Replication Delay, and Switch Strategies

This article explains MySQL high availability concepts, how master‑slave replication works, how to measure and mitigate replication lag, and compares reliable‑first and availability‑first failover strategies with practical experiments on binlog formats.

IT Services Circle

MySQL is widely used, and its transaction features, isolation levels, and indexes are well known; this article dives deeper into MySQL high availability (HA) and the mechanisms that keep a database running without interruption.

According to Wikipedia, high availability means a system can perform its functions continuously, which is usually achieved by improving fault tolerance. In MySQL, HA is built on master‑slave replication: the master records every change in its binary log (binlog), and the slave receives those binlog entries and replays them to keep its own copy of the data up to date.

A typical replication and failover flow looks like this:

Clients perform read/write operations on the master.

The master streams binlog entries to the slave.

If the master fails (e.g., disk crash), an automatic master‑slave switch occurs, promoting the slave to a new master.

Clients then read/write from the new master.

Because replication is asynchronous, some master‑slave delay is unavoidable. Let t1 be the time the master writes a transaction to its binlog, t2 the time the slave receives it, and t3 the time the slave finishes applying it. The replication delay for that transaction is t3 - t1. Running SHOW SLAVE STATUS on the slave reports this delay as Seconds_Behind_Master, in whole seconds.

The value remains usable even if the master and slave clocks differ. When the slave connects, it runs SELECT UNIX_TIMESTAMP() on the master to learn the clock difference, and it subtracts that difference whenever it computes Seconds_Behind_Master.
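The arithmetic behind the reported delay can be sketched in a few lines of Python. This is only an illustration of t3 - t1 with a clock correction; the function name, the parameter names, and the clamping of negative results to zero are assumptions of this sketch, not MySQL's actual implementation.

```python
def seconds_behind_master(t1_master, t3_replica, clock_diff=0):
    """Estimate replication delay in whole seconds.

    t1_master:  time the master wrote the event to its binlog (master clock)
    t3_replica: time the slave applied the event (slave clock)
    clock_diff: slave clock minus master clock, sampled once at connect time
    """
    # Convert t3 to the master's clock before subtracting t1.
    delay = t3_replica - clock_diff - t1_master
    # Assumption of this sketch: never report a negative delay.
    return max(0, delay)


# The slave's clock runs 30 s ahead of the master's.
print(seconds_behind_master(t1_master=1000, t3_replica=1035, clock_diff=30))  # 5
```

Without the correction, the same event would appear to be 35 seconds late, although the slave applied it 5 seconds after the master wrote it.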

Common causes of replication lag include:

Significant hardware performance gaps between master and slave.

Heavy read workloads or other background jobs on the slave.

Large transactions that take long to replay on the slave.

The corresponding mitigations are to give the slave hardware comparable to the master's, to move non‑essential workloads off the slave, and to break large transactions into smaller batches (e.g., deleting rows a batch at a time instead of in a single statement).
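The batching idea can be sketched as a small Python loop. Everything named here is hypothetical: execute stands for any callable that runs one SQL statement in its own transaction and returns the number of affected rows, and the event_log table and its columns are invented for the example.

```python
def delete_in_batches(execute, batch_size=1000, max_batches=10_000):
    """Delete matching rows in small transactions instead of one large one.

    execute: hypothetical callable; runs one statement in its own
             transaction and returns the number of rows it affected.
    """
    # Hypothetical table and predicate; ORDER BY keeps each batch deterministic.
    sql = (
        "DELETE FROM event_log WHERE created_at < '2020-01-01' "
        f"ORDER BY id LIMIT {int(batch_size)}"
    )
    total = 0
    for _ in range(max_batches):
        affected = execute(sql)
        total += affected
        if affected < batch_size:
            break  # fewer rows than the limit means nothing is left
    return total


def make_fake_executor(row_count):
    """Test double that pretends a table holds row_count matching rows."""
    remaining = {"rows": row_count}

    def execute(sql):
        limit = int(sql.rsplit("LIMIT", 1)[1])
        deleted = min(limit, remaining["rows"])
        remaining["rows"] -= deleted
        return deleted

    return execute


print(delete_in_batches(make_fake_executor(2500), batch_size=1000))  # 2500
```

Each batch becomes a short transaction in the binlog, so the slave replays many small events rather than stalling on one long one.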

When the master needs to be replaced, there are two switch strategies to choose from:

1. Reliable‑first

The switch starts only once Seconds_Behind_Master on the slave is below a threshold (e.g., 4 s). The master is then set to read‑only, the slave is given time to catch up until the delay reaches 0, and only then is the slave made read/write. No committed write is lost, but neither node accepts writes between the moment the master becomes read‑only and the moment the slave becomes writable.
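The order of operations matters more than the mechanics, so here is a minimal Python sketch of the sequence. FakeNode is a hypothetical stand-in for a server connection (it is not a MySQL client API), and its lag simply shrinks by one second on every poll so the example terminates.

```python
import time


class FakeNode:
    """Hypothetical stand-in for a MySQL server; not a real client API."""

    def __init__(self, read_only, lag=0):
        self.read_only = read_only
        self.lag = lag  # what Seconds_Behind_Master would report

    def seconds_behind_master(self):
        current = self.lag
        # Toy behaviour: the node catches up by one second per poll.
        self.lag = max(0, self.lag - 1)
        return current


def reliable_first_switch(master, slave, threshold=4, poll_interval=0.0):
    # Step 1: do not start while the slave is too far behind.
    if slave.seconds_behind_master() >= threshold:
        return False
    # Step 2: stop writes on the master; the write outage starts here.
    master.read_only = True
    # Step 3: wait until the slave has applied everything it received.
    while slave.seconds_behind_master() > 0:
        time.sleep(poll_interval)
    # Step 4: open the slave for writes; the write outage ends here.
    slave.read_only = False
    return True


master = FakeNode(read_only=False)
slave = FakeNode(read_only=True, lag=3)
print(reliable_first_switch(master, slave))  # True
```

The threshold check in step 1 exists to keep step 3 short: the smaller the lag when the master goes read‑only, the shorter the period in which nobody can write.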

2. Availability‑first

Traffic is switched to the slave immediately, even if the slave is still lagging. This keeps the service online, but it can leave the two nodes inconsistent, because the slave starts accepting new writes before it has applied the remaining binlog entries from the old master.

Two experiments illustrate the impact of binlog formats:

CREATE TABLE `person` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(32),
  PRIMARY KEY (`id`)
) ENGINE=InnoDB;

Insert two rows:

INSERT INTO person(name) VALUES ('tom');
INSERT INTO person(name) VALUES ('jerry');

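Experiment 1 sets binlog_format=row. The binlog then records the full row data, including the primary key, so conflicting inserts show up as primary‑key conflicts during synchronization and the problem is visible. Experiment 2 switches to statement or mixed mode. Both nodes end up with the same number of rows, but the primary keys may be out of order: the same name can be stored under different ids on the two nodes, and nothing reports the mismatch.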
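The two outcomes can be imitated with a toy model in Python. This is not a model of MySQL internals: it assumes, as the experiments describe, that a replayed statement is given the next local auto‑increment id, while a replayed row event carries its original id. The Node class and the names spike and tyke are invented for the illustration.

```python
class Node:
    """Toy table with an auto-increment primary key (not a MySQL model)."""

    def __init__(self, rows):
        self.rows = dict(rows)  # id -> name

    def insert(self, name):
        new_id = max(self.rows, default=0) + 1
        self.rows[new_id] = name
        return new_id

    def apply_statement(self, name):
        # Assumption: replaying a statement assigns a fresh local id.
        self.insert(name)

    def apply_row(self, row_id, name):
        # A row event carries the id, so a clash is detected.
        if row_id in self.rows:
            raise ValueError(f"duplicate primary key {row_id}")
        self.rows[row_id] = name


def availability_first(binlog_format):
    seed = {1: "tom", 2: "jerry"}
    a, b = Node(seed), Node(seed)
    id_a = a.insert("spike")  # written on old master A, not yet applied on B
    id_b = b.insert("tyke")   # traffic already switched: written on B
    errors = []
    if binlog_format == "statement":
        b.apply_statement("spike")
        a.apply_statement("tyke")
    else:  # "row"
        for node, row_id, name in ((b, id_a, "spike"), (a, id_b, "tyke")):
            try:
                node.apply_row(row_id, name)
            except ValueError as exc:
                errors.append(str(exc))
    return a.rows, b.rows, errors


for fmt in ("row", "statement"):
    rows_a, rows_b, errors = availability_first(fmt)
    print(fmt, rows_a, rows_b, errors)
```

In this toy model the row case stops with two duplicate‑key errors, which makes the divergence visible, while the statement case finishes without errors even though id 3 and id 4 hold different names on the two nodes.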

In conclusion, reliable‑first is the recommended approach when data correctness matters. It also makes low replication delay essential: the smaller the delay, the sooner the switch can complete and the sooner the service recovers.
