Master MySQL Redo, Binlog & Undo Logs: Crash Recovery and Two‑Phase Commit
This article explains MySQL's redo log, binlog, and undo log, covering their roles in crash recovery, durability, performance optimization, log formats, two‑phase commit handling, and how they together ensure data consistency and atomicity in InnoDB.
Preface
A recent discussion in a Geek Planet group raised questions about MySQL's logs, which mainly include the error log, query log, slow query log, transaction logs, and binary log. Of these, the binary log (binlog) and the transaction logs (redo log and undo log) are especially important.
Redo Log
The redo log is exclusive to the InnoDB storage engine and gives MySQL its crash‑recovery capability. When a MySQL instance crashes and restarts, InnoDB uses the redo log to restore data, ensuring durability and integrity.
Data is read from disk into the Buffer Pool page by page; subsequent queries first look in the Buffer Pool, reducing I/O and improving performance. Updates modify pages in the Buffer Pool and record the changes in the redo‑log buffer, which is later flushed to the redo log file.
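The write path described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not InnoDB's actual design: the class ToyEngine and its methods are hypothetical names, pages are plain byte strings, and dirty pages are assumed never to reach disk before the crash.

```python
class ToyEngine:
    """Toy model of a buffer pool plus redo-log buffer (not real InnoDB code)."""

    def __init__(self, disk_pages):
        self.disk = dict(disk_pages)  # page_no -> bytes stored on "disk"
        self.buffer_pool = {}         # pages cached in memory
        self.redo_buffer = []         # redo records still only in memory
        self.redo_file = []           # redo records persisted to the log file

    def read(self, page_no):
        # Serve from the buffer pool; load the page from disk on a miss.
        if page_no not in self.buffer_pool:
            self.buffer_pool[page_no] = bytearray(self.disk[page_no])
        return self.buffer_pool[page_no]

    def update(self, page_no, offset, data):
        # Modify the cached page and record the change in the redo-log buffer.
        page = self.read(page_no)
        page[offset:offset + len(data)] = data
        self.redo_buffer.append((page_no, offset, bytes(data)))

    def flush_redo(self):
        # Move buffered redo records into the persistent redo log file.
        self.redo_file.extend(self.redo_buffer)
        self.redo_buffer.clear()

    def crash_and_recover(self):
        # A crash wipes everything held in memory...
        self.buffer_pool.clear()
        self.redo_buffer.clear()
        # ...and recovery replays the persisted redo records onto disk pages.
        for page_no, offset, data in self.redo_file:
            page = bytearray(self.disk[page_no])
            page[offset:offset + len(data)] = data
            self.disk[page_no] = bytes(page)


engine = ToyEngine({1: b"aaaa"})
engine.update(1, 0, b"zz")   # change whose redo record gets flushed
engine.flush_redo()
engine.update(1, 2, b"yy")   # change whose redo record is lost in the crash
engine.crash_and_recover()
```

In this sketch only the first update survives the crash, because only its redo record reached the log file.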
Flushing timing is controlled by the innodb_flush_log_at_trx_commit parameter, which supports three strategies:
0 : No flush on transaction commit.
1 : Flush on every commit (default).
2 : Write the redo‑log buffer to the page cache on commit.
InnoDB also runs a background thread every second that writes the redo‑log buffer to the page cache and then calls fsync. The buffer may also be flushed when it reaches half of innodb_log_buffer_size. In addition, when the buffer is full, the thread forces a flush.
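The background flush triggers just described can be written down as a small decision function. This is a hedged sketch of the rules as stated in the text, not InnoDB's scheduling code; flush_reason and its parameters are hypothetical names, and buffer_size stands in for innodb_log_buffer_size.

```python
def flush_reason(seconds_since_flush, used_bytes, buffer_size):
    """Return why the toy background thread would flush now, or None."""
    if used_bytes >= buffer_size:
        return "buffer full"
    if used_bytes * 2 >= buffer_size:
        return "buffer half full"
    if seconds_since_flush >= 1.0:
        return "one-second timer"
    return None
```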
Tip: each redo record consists of “tablespace ID + page number + offset + length + modified data”.
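To make the record layout in the tip concrete, here is a minimal sketch that packs those five fields into bytes. The fixed-width layout (4-byte tablespace ID, 4-byte page number, 2-byte offset, 2-byte length) is an assumption chosen for illustration; InnoDB's real on-disk encoding differs, and RedoRecord is a hypothetical name.

```python
import struct
from dataclasses import dataclass

# Assumed header layout, for illustration only:
# tablespace ID (4 bytes), page number (4), offset (2), length (2), big-endian.
HEADER = struct.Struct(">IIHH")


@dataclass(frozen=True)
class RedoRecord:
    space_id: int
    page_no: int
    offset: int
    data: bytes

    def encode(self) -> bytes:
        # Header first, then the modified bytes themselves.
        header = HEADER.pack(self.space_id, self.page_no, self.offset, len(self.data))
        return header + self.data

    @classmethod
    def decode(cls, raw: bytes) -> "RedoRecord":
        space_id, page_no, offset, length = HEADER.unpack_from(raw)
        data = raw[HEADER.size:HEADER.size + length]
        return cls(space_id, page_no, offset, data)


record = RedoRecord(space_id=5, page_no=42, offset=128, data=b"\x00\x01")
raw = record.encode()
```

Because a record describes a small physical change to one page, it is far more compact than writing the whole page.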
Flushing Timing
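In more detail, the three values of innodb_flush_log_at_trx_commit behave as follows:
0 : Nothing is written or fsynced on commit; the redo record stays in the redo‑log buffer until the background thread flushes it, so a MySQL crash can lose up to about one second of transactions.
1 : The redo log is written and fsynced on every commit, guaranteeing that the redo log of a committed transaction is persisted.
2 : The redo log is written to the page cache on commit and fsynced later, improving performance. A MySQL crash alone loses nothing, but an operating system crash or power loss can lose up to about one second of transactions.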
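One way to reason about the three settings is to ask where a just-committed transaction's redo record sits at the moment of a crash. The sketch below is a toy model rather than InnoDB code: commit_location, LOST_ON, and survives are hypothetical names, and it assumes the once-per-second background flush has not yet run for that transaction.

```python
def commit_location(policy):
    """Where the redo of a just-committed transaction sits for each setting."""
    return {0: "log_buffer", 1: "disk", 2: "page_cache"}[policy]


# Which storage layers are wiped by each kind of failure (toy assumption).
LOST_ON = {
    "mysqld_crash": {"log_buffer"},             # process memory only
    "os_crash": {"log_buffer", "page_cache"},   # process memory and OS cache
}


def survives(policy, crash):
    """True if the committed transaction's redo record outlives the crash."""
    return commit_location(policy) not in LOST_ON[crash]
```

For example, survives(2, "mysqld_crash") is true while survives(2, "os_crash") is false, which is why setting 2 is often seen as a middle ground between 0 and 1.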
Binlog
The binlog is a logical log that records what changed at the SQL level, for example the original statement update T set c=1 where id=2. It is essential for backup, master‑slave replication, and keeping data consistent across the servers of a MySQL cluster.
Binlog can be written in three formats, controlled by the binlog_format variable:
statement : Stores the SQL text.
row : Stores the actual row changes, so a replica applies the same values even when the statement calls a non‑deterministic function such as uuid() or sysdate().
mixed : Uses row format when needed, otherwise statement format.
When using row format, the binlog records include the original values of each column (e.g., @1, @2, @3) and the new values, which can be parsed with mysqlbinlog. Although more space‑intensive, row format avoids inconsistencies caused by non‑deterministic functions.
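The difference between replaying a statement and applying a row image can be shown with a small simulation. It is a sketch under stated assumptions, not how MySQL replication is implemented: tables are Python dicts, Python's uuid.uuid4() stands in for a non-deterministic SQL function, and apply_statement and apply_row_event are hypothetical names.

```python
import uuid


def apply_statement(table, row_id):
    # Statement format: the SQL text is re-executed, so the
    # non-deterministic function is evaluated again on each server.
    table[row_id] = str(uuid.uuid4())


def apply_row_event(table, row_id, value):
    # Row format: the value computed on the primary is shipped and applied as-is.
    table[row_id] = value


primary = {}
apply_statement(primary, 2)                  # executed once on the primary

replica_stmt = {}
apply_statement(replica_stmt, 2)             # replica re-runs the statement

replica_row = {}
apply_row_event(replica_row, 2, primary[2])  # replica applies the row image
```

Here replica_row ends up identical to primary, while replica_stmt holds a different value for the same row.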
Two‑Phase Commit
MySQL uses a two‑phase commit to keep the redo log and binlog consistent: the redo log write is split into a prepare step and a commit step, with the binlog write in between. If writing the binlog fails after the redo log has been prepared, recovery finds a prepared redo record with no matching binlog entry and rolls the transaction back, preventing the two logs from diverging.
If the failure occurs later, during the redo log's commit step, the transaction is not automatically rolled back even though its redo record is still in the prepare state. MySQL uses the transaction ID to look for the corresponding binlog entry: if a complete entry exists, the transaction is committed; if it is missing, the transaction is treated as incomplete and rolled back.
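The recovery rule for two‑phase commit can be summarized as a small decision function. This is a simplified sketch of the logic described here, not MySQL's recovery code; recover_transaction, the state strings, and the set of binlog transaction IDs are hypothetical stand-ins.

```python
def recover_transaction(redo_state, xid, binlog_xids):
    """Decide the fate of one transaction found in the redo log after a crash.

    redo_state  -- "prepare" or "commit", the last state recorded in the redo log
    xid         -- the transaction ID
    binlog_xids -- IDs of transactions with a complete binlog entry
    """
    if redo_state == "commit":
        # Both logs are complete: the transaction stays committed.
        return "commit"
    if redo_state == "prepare":
        # Prepared only: the binlog decides the outcome.
        return "commit" if xid in binlog_xids else "rollback"
    raise ValueError(f"unknown redo state: {redo_state!r}")
```

Letting the binlog decide means the data InnoDB recovers always matches what a replica or a backup restored from the binlog would contain.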
Undo Log
The undo log guarantees atomicity by recording the before‑image of each change, so a transaction's modifications can be reversed. It works together with MVCC, hidden fields, and read views to provide consistent snapshots and enable rollback when a transaction aborts.
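Rollback from before‑images can be sketched in a few lines. This toy model covers only the rollback side, not MVCC or read views, and is not InnoDB's implementation; ToyTransaction and _MISSING are hypothetical names, and the table is a plain dict.

```python
_MISSING = object()  # marks "the key did not exist before the change"


class ToyTransaction:
    """Toy transaction that keeps before-images so it can undo its changes."""

    def __init__(self, table):
        self.table = table
        self.undo_log = []  # (key, before_image) entries, oldest first

    def update(self, key, value):
        # Save the old value before overwriting it.
        self.undo_log.append((key, self.table.get(key, _MISSING)))
        self.table[key] = value

    def rollback(self):
        # Undo the changes in reverse order, newest first.
        while self.undo_log:
            key, before = self.undo_log.pop()
            if before is _MISSING:
                self.table.pop(key, None)
            else:
                self.table[key] = before


table = {1: "a"}
txn = ToyTransaction(table)
txn.update(1, "b")
txn.update(1, "c")
txn.update(2, "x")
txn.rollback()
```

After the rollback the table is back to its original contents, which is the all-or-nothing behavior that atomicity requires.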
Summary
MySQL InnoDB relies on redo log for durability, undo log for atomicity, and binlog for replication and backup, ensuring data consistency and crash‑recovery capabilities.
MaGe Linux Operations
