How MySQL Crash Recovery Works: Inside Redo Logs, Doublewrite, and Undo Tablespaces
This article explains MySQL's crash recovery process, covering how the doublewrite buffer repairs corrupted pages, how redo logs are applied to restore data, and how undo tablespaces and unfinished transactions are handled during server startup.
Overview
MySQL crash recovery is essentially a forced shutdown; during a normal shutdown InnoDB performs a series of cleanup tasks (e.g., undo log cleanup, change buffer merge) that are controlled by the innodb_fast_shutdown system variable. When the server crashes these tasks are postponed to the next startup, and the recovery process described below is executed.
Reading Doublewrite Pages
When a crash occurs, the Redo log must restore dirty pages to their state just before the crash. This requires that the pages be intact; if a page is partially written, the Redo log alone cannot recover it. MySQL uses the doublewrite mechanism (officially called double write ) to protect against such partial writes. Doublewrite consists of an in‑memory buffer and a .dblwr file. Before flushing a dirty page to disk, InnoDB first writes the page to the memory buffer and then to the .dblwr file; only after both writes succeed is the page flushed to the tablespace.
The doublewrite feature is controlled by the innodb_doublewrite variable, which is ON by default.
ls -l | grep dblwr
-rw-r----- 1 csch staff 192K Aug 27 12:04 #ib_16384_0.dblwr
-rw-r----- 1 csch staff 8.2M Aug 1 16:29 #ib_16384_1.dblwrDuring startup, MySQL loads all pages from the .dblwr files into the doublewrite memory buffer, uses them to repair any corrupted pages, and then applies the Redo log.
Recovering Data Pages
Applying the Redo log to data pages (step 4) requires first reading the Redo log (step 3). The starting point for reading the Redo log is the last_checkpoint_lsn , which is stored in the second or fourth block of the first Redo log file, depending on the checkpoint number's parity.
Each Redo‑log block is 512 bytes: 12 bytes header, 4 bytes trailer, and 496 bytes of actual log data. InnoDB reads the log in 64 KB chunks (4 × innodb_page_size) into a log buffer, then copies each block to a 2 MB parsing buffer for integrity checking.
The parsing step extracts four fields from each Redo record: log type, tablespace ID, page number, and the log payload. These fields are stored in a nested hash table where the first level key is the tablespace ID and the second level key is the page number. The value is a linked list of Redo records ordered by their generation time.
When applying the Redo log, InnoDB iterates over the hash table, loads the required pages into the buffer pool (using asynchronous pre‑read of up to 32 pages at a time), and for each page checks the FILE_PAGE_LSN from the page header. Only Redo records with a start LSN greater than or equal to FILE_PAGE_LSN are applied; older records are skipped because the page already contains those changes.
Deleting Undo Tablespaces
During normal operation large transactions can cause undo tablespaces to grow. In older MySQL versions undo tablespaces could not shrink, but modern versions use the innodb_undo_log_truncate variable (default ON ) together with innodb_max_undo_log_size (default 1 GB) to truncate undo tablespaces back to the initial 16 MiB size.
If a truncation is in progress when the server crashes, a marker file named undo_<em>n</em>_trunc.log is left behind. The presence of this file indicates an incomplete truncation. The file contains a magic number (76845412) when truncation finishes; if the number is missing, the undo tablespace must be deleted and rebuilt on the next startup.
ls -l | grep undo
-rw-r----- 1 csch staff 16M Aug 27 12:04 undo_001
-rw-r----- 1 csch staff 16M Aug 27 12:04 undo_002
-rw-r--r-- 1 csch staff 16K Jun 22 12:36 undo_1_trunc.logManual truncation can also be triggered with:
ALTER UNDO TABLESPACE tablespace_name SET INACTIVE;Initializing Transaction Subsystem
After undo tablespaces are deleted (or truncated), InnoDB initializes the transaction subsystem. It reads undo tablespaces, each of which may contain up to 128 rollback segments (controlled by innodb_rollback_segments , default 2). For each rollback segment, InnoDB reads the segment header page (page 3) and then iterates over the 1024 undo slots on the segment‑head page.
If a slot contains FIL_NULL , it is empty; otherwise it points to an undo segment that holds information about an unfinished transaction. The transaction can be in one of three states:
TRX_STATE_ACTIVE – the transaction never reached the prepare stage.
TRX_STATE_PREPARED – the transaction completed the first phase of a two‑phase commit and is waiting for the final commit.
TRX_STATE_COMMITTED_IN_MEMORY – the transaction finished both phases but some cleanup work is still pending.
These states determine how the transaction will be processed later.
Rebuilding Undo Tablespaces
When a truncation was interrupted, InnoDB recreates the undo tablespace. The steps are:
Create a new trunc.log file to mark the rebuild operation.
Create the undo tablespace file with an initial size of 16 MiB.
Initialize the tablespace header (page 0) with the tablespace ID and linked‑list structures, then allocate a new data page and create the first rollback segment.
Iterate over all rollback segments and set each of their 1024 undo slots to FIL_NULL .
Write the magic number 76845412 to the trunc.log file to indicate successful rebuild, then delete the file.
If multiple undo tablespaces need rebuilding, the process repeats for each one.
Processing Transactions
The transaction‑processing phase consists of four sub‑steps:
Clean committed transactions (state TRX_STATE_COMMITTED_IN_MEMORY ). Insert‑undo segments that can be cached are added to the insert_undo_cached list; otherwise they are released. The transaction is removed from the read‑write transaction list and its state is set to TRX_STATE_NOT_STARTED .
Rollback uncommitted DDL transactions (state TRX_STATE_ACTIVE but originating from DDL). Although MySQL does not allow users to roll back DDL, the engine can roll back the internal changes.
Rollback uncommitted DML transactions (also TRX_STATE_ACTIVE ). The rollback logic is complex and is omitted here.
Handle PREPARE transactions (state TRX_STATE_PREPARED ). Whether a PREPARE transaction is committed or rolled back depends on whether its XID appears in the last binlog file. InnoDB scans the last binlog, collects all XID_EVENT entries, and builds a set of XIDs. For each PREPARE transaction, if its XID is in the set, the transaction is committed; otherwise it is rolled back. The binlog’s first event ( FORMAT_DESCRIPTION_EVENT ) contains the LOG_EVENT_BINLOG_IN_USE_F flag, which is cleared on normal shutdown. If the flag is still set after a crash, InnoDB knows that the last binlog may contain unflushed XID events and proceeds with the PREPARE‑handling steps.
Summary
The core work of MySQL crash recovery consists of two parts:
Repairing corrupted pages using the doublewrite buffer and then applying Redo log records to bring all dirty pages to the state they had just before the crash.
Cleaning up unfinished transactions: committing or rolling back based on their state and the presence of XID events in the binlog, and finalising the two‑phase‑commit transactions.
These steps ensure data consistency and allow the server to resume normal operation after an unexpected shutdown.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
