Understanding Data Consistency in MySQL Semi‑Synchronous Replication and HA Failover

This article explains the principles of MySQL semi‑synchronous replication, analyzes how data consistency is maintained during high‑availability failover, walks through the transaction flow step by step, discusses the scenarios that cause GTID divergence, and offers testing methods and remediation techniques for DBAs.

Aikesheng Open Source Community

Aug 25, 2022

Understanding MySQL Semi‑Synchronous Replication

In MySQL 5.7, semi‑synchronous replication is provided by a plugin that must be installed and enabled; once it is enabled, the default wait point is AFTER_SYNC. During a transaction commit, the master writes the binlog and must receive an ACK from at least one slave before proceeding. If no ACK arrives within the timeout (rpl_semi_sync_master_timeout), the master falls back to asynchronous replication.

Configuration for Reliable Semi‑Sync

sync_binlog=1
innodb_flush_log_at_trx_commit=1
...(etc.)

The author asserts that these settings provide the most reliable semi‑sync configuration.

Key Terminology

Terms such as lossless semi‑sync, enhanced semi‑sync, and the parameter rpl_semi_sync_master_wait_point=AFTER_SYNC all refer to the same mode, which avoids data loss after a high‑availability switch.

Potential Inconsistency Scenarios

Two main cases can cause data divergence after a master‑slave switch:

Old master retains more GTIDs than the new master (the typical case discussed).

New master may have more GTIDs if sync_binlog is not set to 1, due to unflushed binlog entries or timing differences.

The article breaks down the replication process into phases A and B, with sub‑phases 2aa and 2ab, illustrating how GTID gaps arise.
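The first case can be illustrated with a small model of a single commit. This is a simplified sketch, assuming AFTER_SYNC with sync_binlog=1; the step names and the function state_after_crash are illustrative and do not correspond to the article's phase labels (A, B, 2aa, 2ab). It relies on two known behaviors: crash recovery commits a prepared transaction whose events are already in the binlog, and the slave sends its ACK only after it has received the events.

```python
# Simplified order of one commit on the master under AFTER_SYNC
# (assumed model, sync_binlog=1; step names are illustrative).
STEPS = [
    "binlog_synced",     # binlog written and synced to disk on the master
    "slave_acked",       # slave has received the events and sent its ACK
    "engine_committed",  # master commits the transaction in the storage engine
    "client_notified",   # master returns OK to the client
]


def state_after_crash(last_completed_step=None):
    """Describe both nodes if the master crashes right after the given step.

    Passing None means the crash happened before the binlog was synced.
    """
    if last_completed_step is None:
        done = []
    else:
        done = STEPS[: STEPS.index(last_completed_step) + 1]
    return {
        # After restart, recovery commits any transaction found in the binlog.
        "old_master_has_gtid": "binlog_synced" in done,
        "slave_has_gtid": "slave_acked" in done,
        "client_saw_commit": "client_notified" in done,
    }


for step in [None] + STEPS:
    print(step, state_after_crash(step))
```

In this model, a crash right after binlog_synced is the only point at which the old master holds a GTID that the slave lacks, and the client never saw that transaction succeed. That is why the divergence is invisible at the business level yet still present in the GTID sets.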

Testing Methodology

Set up a one‑master‑one‑slave semi‑sync cluster.

Run sysbench to generate load (up to 800 TPS after tuning).

Kill the master's mysqld process with kill -9 (kill takes a process ID, for example kill -9 $(pidof mysqld)).

Prevent automatic restart, then compare GTID sets on master and slave.
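The final comparison step can be sketched in Python as a set difference over two GTID set strings. This is a minimal sketch, not the author's tooling: the helper names parse_gtid_set and gtid_subtract are hypothetical, the UUID in the example is a placeholder, and ranges are expanded into plain integer sets for simplicity. On a live server, the built-in SQL function GTID_SUBTRACT() performs the same calculation.

```python
def parse_gtid_set(text):
    """Parse a GTID set such as 'uuid:1-5:8,uuid2:3' into {uuid: set of ints}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        # A UUID contains hyphens but no colons, so ':' separates the intervals.
        uuid, *intervals = part.split(":")
        numbers = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            numbers.update(range(int(start), int(end or start) + 1))
    return result


def gtid_subtract(a, b):
    """Return the GTIDs present in set a but missing from set b."""
    parsed_a, parsed_b = parse_gtid_set(a), parse_gtid_set(b)
    extra = {}
    for uuid, numbers in parsed_a.items():
        missing = sorted(numbers - parsed_b.get(uuid, set()))
        if missing:
            extra[uuid] = missing
    return extra


# Placeholder UUID and small ranges, for illustration only.
UUID = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
old_master = f"{UUID}:1-10"
slave = f"{UUID}:1-7"
print(gtid_subtract(old_master, slave))  # extras on the old master: 8, 9, 10
print(gtid_subtract(slave, old_master))  # {} (the slave has nothing extra)
```

Running the subtraction in both directions distinguishes the two divergence cases: a non-empty first result means the old master is ahead, and a non-empty second result means the slave is ahead.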

The test eventually reproduced a situation where the old master had three extra GTIDs.

# mysql -uadmin -pGta@2019 -S /database/mysql/data/3306/mysqld.sock -e "show slave status\G" | grep "ffc43852-1d82-11ed-a65f-000c29375703"
Master_UUID: ffc43852-1d82-11ed-a65f-000c29375703
Retrieved_Gtid_Set: ffc43852-1d82-11ed-a65f-000c29375703:210837-283030
Executed_Gtid_Set: ffc43852-1d82-11ed-a65f-000c29375703:1-283030

Parsing the master binlog revealed three additional GTIDs (283031 through 283033) beyond the slave's last executed GTID, 283030.

# cat 16.txt | grep GTID | grep "ffc43852-1d82-11ed-a65f-000c29375703:28303"
SET @@SESSION.GTID_NEXT='ffc43852-1d82-11ed-a65f-000c29375703:283030';
SET @@SESSION.GTID_NEXT='ffc43852-1d82-11ed-a65f-000c29375703:283031';
SET @@SESSION.GTID_NEXT='ffc43852-1d82-11ed-a65f-000c29375703:283032';
SET @@SESSION.GTID_NEXT='ffc43852-1d82-11ed-a65f-000c29375703:283033';
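The same filtering can be done programmatically instead of by eye. The following is a minimal sketch, assuming the decoded binlog text contains SET @@SESSION.GTID_NEXT statements like the ones above; the function name gtids_beyond is hypothetical, and the sample text simply repeats the four lines from the output.

```python
import re

# Matches GTID_NEXT assignments that carry a UUID:number pair;
# lines such as GTID_NEXT='AUTOMATIC' are ignored.
GTID_NEXT_RE = re.compile(r"GTID_NEXT\s*=\s*'([0-9a-fA-F-]{36}):(\d+)'")


def gtids_beyond(binlog_text, server_uuid, last_executed):
    """Return the transaction numbers for server_uuid above last_executed."""
    found = set()
    for match in GTID_NEXT_RE.finditer(binlog_text):
        uuid, number = match.group(1).lower(), int(match.group(2))
        if uuid == server_uuid.lower() and number > last_executed:
            found.add(number)
    return sorted(found)


MASTER_UUID = "ffc43852-1d82-11ed-a65f-000c29375703"
sample = "\n".join(
    f"SET @@SESSION.GTID_NEXT='{MASTER_UUID}:{n}';"
    for n in (283030, 283031, 283032, 283033)
)
# 283030 is the upper bound of the slave's Executed_Gtid_Set shown earlier.
print(gtids_beyond(sample, MASTER_UUID, 283030))  # [283031, 283032, 283033]
```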

Why the Scenario Is Hard to Simulate

The 2aa window is very short; high TPS and slow I/O (e.g., using a deliberately slow disk for the binlog) increase the chance of reproducing the issue.

Repair Strategies

Restart before failover: If the old master restarts quickly and no switch takes place, the slave reconnects and fetches the remaining binlog events, so the GTID sets converge without any repair.

Catch‑up after failover: Use MHA or similar tools to let the new master pull missing binlogs from the old master.

Flashback the old master: Roll back extra GTIDs on the old master before re‑adding it as a replica.

Both “slave catch‑up” and “master rollback” are valid; the choice depends on the operational context.

Conclusion

Under lossless semi‑synchronous replication, business‑level data appears consistent after a high‑availability switch, but underlying binlog/GTID differences can exist. DBAs must understand the replication phases, be able to reproduce the edge cases, and apply appropriate remediation to ensure true data consistency.
