Why MySQL Can Lose Data: Engine Layer Risks and Replication Inconsistencies Explained
This article examines the various scenarios in which MySQL can lose data, covering InnoDB and MyISAM engine behaviors, flush policies, master‑slave replication pitfalls, XA transaction handling, and practical solutions to improve consistency and availability.
1 Engine‑Level Data Loss Scenarios
MySQL stores data using different storage engines, each with its own durability guarantees. When the engine’s write‑ahead log (WAL) is not flushed to disk promptly, data can be lost after a crash.
1.1 InnoDB Data‑Loss Analysis
InnoDB uses redo and undo logs. A transaction modifies pages in memory (dirty pages) and records the changes in the redo log. Dirty pages are flushed to disk by the checkpoint mechanism, while the redo log is flushed according to the innodb_flush_log_at_trx_commit setting:
innodb_flush_log_at_trx_commit = 0: The log buffer is written to the OS cache and flushed to disk about once per second, not at commit. Any crash, including a crash of the mysqld process alone, can lose roughly the last second of committed transactions.
innodb_flush_log_at_trx_commit = 1: Safest, and the default. Each commit forces a write and a flush to disk, at the cost of lower throughput.
innodb_flush_log_at_trx_commit = 2: The log is written to the OS cache on each commit and flushed to disk every innodb_flush_log_at_timeout seconds (default 1 s). A mysqld crash alone loses nothing, but an OS crash or power failure can lose the transactions committed within the timeout window. DML throughput is typically much higher than with 1; the ~10× figure sometimes quoted depends on workload and storage.
In high‑throughput environments, many operators set the parameter to 2, accepting the risk of losing roughly the last second of committed data after an OS crash or power failure in exchange for a significant performance gain.
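The effect of the three settings depends on what kind of crash occurs, which is easy to lose track of. The following is a minimal Python sketch of that relationship, not anything MySQL exposes: max_loss_window and its crash argument are hypothetical names, and the model assumes the storage device honors flush requests and that the background flush runs on schedule.

```python
def max_loss_window(flush_log_at_trx_commit, crash, flush_timeout=1.0):
    """Approximate worst-case window, in seconds, of committed work at risk.

    crash is "mysqld" (only the server process dies) or "os"
    (operating-system crash or power loss).
    flush_timeout stands in for innodb_flush_log_at_timeout.
    Hypothetical helper for illustration; not a MySQL API.
    """
    if flush_log_at_trx_commit not in (0, 1, 2):
        raise ValueError("setting must be 0, 1 or 2")
    if crash not in ("mysqld", "os"):
        raise ValueError("crash must be 'mysqld' or 'os'")

    if flush_log_at_trx_commit == 1:
        # Redo is written and flushed at every commit.
        return 0.0
    if flush_log_at_trx_commit == 2 and crash == "mysqld":
        # Redo already sits in the OS cache, which survives a process crash.
        return 0.0
    # Setting 0 (any crash) or setting 2 (OS crash): the periodic flush decides.
    return flush_timeout


for setting in (0, 1, 2):
    for crash in ("mysqld", "os"):
        print(setting, crash, max_loss_window(setting, crash))
```

The only difference between 0 and 2 in this model is the mysqld-only crash, which is why 2 is usually preferred over 0 when 1 is too slow.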
1.2 MyISAM Data‑Loss Analysis
MyISAM does not support transactions and lacks a data cache; all writes go directly to the OS cache. If the server crashes, any data not yet flushed by the OS is lost, making MyISAM unsuitable for critical data.
2 Replication‑Induced Data Inconsistency
MySQL replication relies on binary logs (binlog) written on the master and applied on the slave. The sync_binlog parameter controls how often the binlog is flushed to disk:
sync_binlog = 0: No explicit flushing; the OS decides.
sync_binlog = N: Flush after every N transactions; 1 is safest but incurs high I/O.
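Some production deployments use a larger value such as sync_binlog = 100 to trade safety for performance; an OS crash or power failure can then lose up to about 100 of the most recent transactions from the binlog. Since MySQL 5.7.7 the default is 1.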
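To make the exposure concrete, here is a small Python sketch that counts how many committed transactions are guaranteed to be in the on-disk binlog after an OS crash. It is a simplified model with hypothetical function names: it assumes exactly one flush after every sync_binlog commits, and it treats sync_binlog = 0 as guaranteeing nothing, since the OS alone decides when to write.

```python
def durable_binlog_transactions(committed, sync_binlog):
    """Number of committed transactions guaranteed to be on disk.

    Simplified model: one flush after every `sync_binlog` commits.
    With sync_binlog = 0 no flush is ever forced, so nothing is guaranteed.
    """
    if committed < 0 or sync_binlog < 0:
        raise ValueError("arguments must be non-negative")
    if sync_binlog == 0:
        return 0
    # Only complete groups of `sync_binlog` commits have been flushed.
    return (committed // sync_binlog) * sync_binlog


def at_risk(committed, sync_binlog):
    """Committed transactions that an OS crash could remove from the binlog."""
    return committed - durable_binlog_transactions(committed, sync_binlog)


print(at_risk(250, 100))  # transactions since the last forced flush
print(at_risk(250, 1))    # every commit was flushed
```

In practice the OS usually writes cached data back well before a crash, so this is a worst-case bound rather than an expected loss.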
2.2 Internal XA Transaction Mechanism
When a transaction involves both InnoDB and binlog, MySQL performs an XA‑style two‑phase commit:
1. InnoDB writes a PREPARE record (XID) to its redo log.
2. Binlog is written.
3. InnoDB writes the final commit record to redo log.
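If step 1 or step 2 fails, the transaction is rolled back. If the server crashes before step 3 completes, crash recovery compares the prepared XIDs in the redo log with the XIDs recorded in the binlog: a prepared transaction whose XID appears in the binlog is committed, and one whose XID does not is rolled back. This keeps the redo log and the binlog consistent.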
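The recovery rule for transactions caught between PREPARE and the final commit can be sketched in a few lines of Python. This is an illustration of the decision only, assuming the XIDs have already been extracted from the redo log and the binlog; resolve_prepared is a hypothetical name, not a MySQL function.

```python
def resolve_prepared(prepared_xids, binlog_xids):
    """Decide the fate of transactions left in PREPARE state after a crash.

    prepared_xids: XIDs with a PREPARE record but no commit record in redo.
    binlog_xids:   XIDs found in the binlog that survived the crash.
    """
    in_binlog = set(binlog_xids)
    decisions = {}
    for xid in prepared_xids:
        # In the binlog: a slave may already have it, so the master commits.
        # Not in the binlog: no slave can have seen it, so roll back.
        decisions[xid] = "commit" if xid in in_binlog else "rollback"
    return decisions


print(resolve_prepared(prepared_xids=[101, 102], binlog_xids=[100, 101]))
```

The binlog acts as the coordinator's decision record here: its content, not the redo log's, determines the outcome.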
2.3 Inconsistent Data Due to Non‑Real‑Time Flushes
When redo logs or binlog are not flushed in real time, two main inconsistency cases arise after a crash:
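The redo PREPARE record is not persisted but the binlog event is: the master loses the transaction after recovery, so the slave may contain more rows than the master.
The redo PREPARE and commit records are both persisted but the binlog event is not: the master keeps the transaction but it never reaches the slave, so the slave may contain fewer rows than the master.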
Resolving this requires flushing both logs synchronously, that is, innodb_flush_log_at_trx_commit = 1 and sync_binlog = 1, often at the cost of I/O performance; SSDs can mitigate the impact.
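The two divergence cases can be derived mechanically from what survived the crash. The Python sketch below does this for a single in-flight transaction; it is a simplified model with hypothetical names, and it assumes the slave ends up with exactly the binlog events that were persisted on the master.

```python
def classify(redo_state, binlog_durable):
    """Compare master and slave for one transaction in flight at crash time.

    redo_state: "none", "prepared" or "committed", describing which redo
                records for the transaction reached the master's disk.
    binlog_durable: True if the binlog event reached disk.
    """
    if redo_state not in ("none", "prepared", "committed"):
        raise ValueError("unknown redo_state")

    # A prepared transaction is committed during recovery only if its
    # XID is found in the binlog; otherwise it is rolled back.
    master_has = redo_state == "committed" or (
        redo_state == "prepared" and binlog_durable
    )
    slave_has = binlog_durable  # assumption: slave receives persisted events

    if master_has == slave_has:
        return "consistent"
    return "slave ahead" if slave_has else "master ahead"


for redo_state in ("none", "prepared", "committed"):
    for binlog_durable in (False, True):
        print(redo_state, binlog_durable, classify(redo_state, binlog_durable))
```

With both logs flushed at every commit, the "none with binlog" and "committed without binlog" combinations cannot occur, which is why the synchronous configuration removes both cases.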
2.4 Slave‑Side Flush Settings
The slave stores three files after reading the master’s binlog: relay log, relay log info, and master info. If any of these is not persisted before a system crash, data divergence can occur.
From MySQL 5.6.2 onward, these metadata can be stored in InnoDB tables instead of files, controlled by:
master-info-repository = TABLE
relay-log-info-repository = TABLE
Flushing is governed by three parameters:
sync_relay_log (default 10000, 0 = OS‑controlled)
sync_master_info (default 10000, behavior depends on repository type)
sync_relay_log_info (default 10000, similar behavior)
Setting all three to 1 forces an fsync on every event, which can become a severe I/O bottleneck when combined with real‑time binlog flushing.
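When reviewing a slave's configuration it helps to see at a glance which of these settings are looser than the strictest combination discussed above. The Python sketch below compares a dictionary of settings against that combination. The helper name and the input dictionary are hypothetical (the values might come from SHOW VARIABLES), and "strictest" is not a recommendation, since the text above notes its I/O cost.

```python
# Strictest values discussed in this section (system-variable spelling).
STRICTEST = {
    "master_info_repository": "TABLE",
    "relay_log_info_repository": "TABLE",
    "sync_relay_log": 1,
    "sync_master_info": 1,
    "sync_relay_log_info": 1,
}


def find_relaxed(settings):
    """Return the settings that differ from the strictest combination.

    settings maps variable names to their current values; a missing
    name is reported with the value None.
    """
    relaxed = {}
    for name, strict in STRICTEST.items():
        actual = settings.get(name)
        # Compare as upper-case strings so "table" and 1 vs "1" both match.
        if str(actual).upper() != str(strict):
            relaxed[name] = actual
    return relaxed


current = {
    "master_info_repository": "FILE",
    "relay_log_info_repository": "TABLE",
    "sync_relay_log": 10000,
    "sync_master_info": 1,
    "sync_relay_log_info": 1,
}
print(find_relaxed(current))
```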
2.5 Master Failure and Recovery Delays
If the master crashes and binlog updates have not yet reached the slaves, or if network issues cause inconsistent binlog delivery, several problems arise:
Slaves that are behind cannot serve up‑to‑date reads, so applications must keep sending reads as well as writes to the master, which undermines read‑write splitting.
Promoting a slave to master (e.g., via MHA) may lose binlog entries that never arrived on the promoted slave, leading to data gaps.
When the original master recovers, its binlog may contain entries that the new master omitted, causing further divergence.
Common mitigation strategies include:
Semi‑synchronous replication: The master waits until at least one slave acknowledges receipt of the transaction's binlog events before the commit is reported to the client. This improves consistency but adds latency, and the master falls back to asynchronous replication if no acknowledgment arrives within the configured timeout.
Dual‑write binlog: Replicate the binlog at the OS layer to a standby server or shared storage. This introduces complexity and potential split‑brain issues.
Asynchronous message queues: Write to the database first, then replicate changes asynchronously via a queue, allowing eventual consistency with less impact on transaction latency.
MHA: When occasional data loss is acceptable, MHA automates master failover by promoting the most up‑to‑date slave.
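The idea of promoting the most up‑to‑date slave, and the data gap that can remain, can be sketched as follows. This is a Python illustration of the selection logic only, not how MHA is implemented; the host names, positions, and function names are hypothetical, and positions are compared as (binlog file name, offset) pairs on the assumption that file names share a prefix and a zero‑padded sequence number.

```python
from typing import Dict, Tuple

Position = Tuple[str, int]  # (master binlog file name, byte offset)


def pick_candidate(received: Dict[str, Position]) -> str:
    """Return the slave that has received the most of the master's binlog."""
    if not received:
        raise ValueError("no slaves to choose from")
    # Tuples compare by file name first, then by offset within the file.
    return max(received, key=lambda name: received[name])


def has_gap(candidate_pos: Position, master_pos: Position) -> bool:
    """True if the master wrote binlog the candidate never received."""
    return candidate_pos < master_pos


slaves = {
    "s1": ("mysql-bin.000012", 4096),
    "s2": ("mysql-bin.000012", 8192),
    "s3": ("mysql-bin.000011", 99999),
}
best = pick_candidate(slaves)
print(best, has_gap(slaves[best], ("mysql-bin.000012", 9000)))
```

If has_gap is true for the best candidate, the missing events exist only on the failed master, which is the data‑gap scenario described above.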
3 Summary
MySQL data‑loss scenarios span engine‑level flush policies, replication lag, and recovery procedures. InnoDB’s innodb_flush_log_at_trx_commit setting balances safety and performance, while MyISAM offers no durability guarantees. Replication consistency hinges on real‑time flushing of both redo logs and binlogs, proper XA transaction handling, and careful configuration of slave metadata sync parameters. Solutions range from semi‑synchronous replication to dual‑write mechanisms and message‑queue‑based designs, each trading off latency, complexity, and fault tolerance.
According to the CAP theorem, a distributed MySQL deployment cannot guarantee consistency, availability, and partition tolerance all at once. Semi‑synchronous replication can provide strong consistency together with high availability while the network is healthy, but during a partition the deployment must sacrifice one of the two.
ITPUB
