Understanding ACID, Redo & Undo Logs: How InnoDB Guarantees Transaction Safety
This article explains the fundamentals of database transactions, the ACID properties, and how InnoDB uses redo and undo logs together with MVCC to ensure atomicity, consistency, isolation, durability, and reliable crash recovery.
1. Transaction
A transaction consists of one or more database operations that must either all succeed or all fail, requiring rollback on error and persistence on success.
Transactions have four ACID properties.
2. ACID Properties
To illustrate ACID, consider three accounts A (10), B (15) and C (8). A transfers 5 to B (T1) and C transfers 4 to B (T2).
T1: read(A), A=A-5, write(A), read(B), B=B+5, write(B) T2: read(C), C=C-4, write(C), read(B), B=B+4, write(B)2.1 Atomicity
A transaction is an indivisible unit; all its operations must complete or none at all. If a power loss causes only part of T1 to execute, the system must roll back to the original state.
2.2 Consistency
The database must transition from one consistent state to another; the total balance of A, B, and C remains 33 regardless of transfers.
2.3 Isolation
Concurrent transactions must not interfere with each other. Even if T1 and T2 interleave, each transaction's effects remain isolated, preserving correct final balances.
2.4 Durability
Committed changes must be permanently stored. InnoDB writes to a redo log to persist minimal data efficiently instead of flushing entire pages to disk.
3. Redo Log
Redo logs record modifications before the transaction commits, solving two problems: slow random I/O and the overhead of flushing whole pages.
3.1 Redo Log Format
Each redo record contains type, space id, page number, and data.
3.2 Redo Log Buffer
Redo entries are first placed in a memory buffer and flushed to disk at appropriate times.
3.3 Redo Log Files
Redo files are organized into 512‑byte blocks with a 2048‑byte header storing management information such as checkpoint LSN.
Flushing occurs when the buffer is full, on transaction commit, periodically by a background thread, or during normal shutdown.
3.3.3 Crash Recovery
After a crash, InnoDB reads the checkpoint LSN and replays redo logs to restore the database to its pre‑crash state.
4. Undo Log
Undo logs record the original state of modified rows so that a transaction can be rolled back if an error occurs.
4.1 Undo Log Format
InnoDB adds hidden columns trx_id and roll_pointer to clustered index rows; roll_pointer links to undo records forming a version chain.
4.1.2 Insert Operations
Undo records for inserts store only the primary key, enabling deletion during rollback.
4.1.3 Delete Operations
Deletes set a delete‑mark flag; the actual removal occurs after commit, with undo records preserving the old trx_id and roll_pointer.
4.2.4 Update Operations
Updates are handled differently depending on whether the primary key changes; undo logs capture the necessary before‑image information.
4.2 Undo Log Pages
Undo pages store a single type of undo record, either TRX_UNDO_INSERT or TRX_UNDO_UPDATE, and up to four undo page chains may exist per transaction.
4.3 Rollback
During crash recovery, InnoDB uses the rollback segment header to locate undo pages and revert uncommitted changes, preserving atomicity.
5. Transaction Isolation Levels and MVCC
5.1 Common Consistency Issues
Dirty write, dirty read, non‑repeatable read, and phantom read are explained with examples.
5.2 Isolation Levels
SQL defines Read Uncommitted, Read Committed, Repeatable Read, and Serializable, each preventing a subset of the above anomalies.
5.3 MVCC
5.3.1 Version Chain
Each row version links to the previous one via roll_pointer, forming a chain that MVCC traverses.
5.3.2 MVCC and ReadView
ReadView captures creator_trx_id, a list of active transaction IDs (m_ids), min_trx_id, and max_trx_id. Visibility rules determine whether a version is visible to the current transaction.
In Read Committed, a new ReadView is created for each SELECT; in Repeatable Read, a single ReadView is reused for the transaction’s lifetime.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
MaGe Linux Operations
Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
