How MySQL’s Double Write Buffer Prevents Partial Write Failures
This article explains why MySQL data pages can suffer partial write failures during crashes, how the mismatch between InnoDB and OS page sizes contributes to the problem, and how the Double Write Buffer mechanism safeguards data integrity by providing a recoverable copy of each page.
What is Write Failure?
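A write failure, more commonly called a partial page write or torn page, occurs when a MySQL data page is only partially written to disk because of a system crash or other interruption. The resulting corruption cannot be repaired by the redo log alone: redo records describe changes to apply to a page, not the complete page image, so replaying them requires an intact copy of the page as a starting point.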
Why does it happen?
InnoDB’s page size (16 KB by default) is larger than the unit that the operating system and storage device typically write atomically (commonly a 4 KB file system block). A single 16 KB page write is therefore carried out as several smaller writes, for example four 4 KB blocks. If a crash occurs partway through, only some of those blocks reach the disk, leaving the page partially written.
For example, if the crash hits after the first two 4 KB blocks have been persisted, the page on disk holds 8 KB of new data followed by 8 KB of old data, and its checksum no longer matches its contents.
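A torn page can be simulated in a few lines of Python. This is only a toy sketch: the page layout (a 4-byte CRC32 followed by the payload) and the helper names `make_page`, `checksum_ok`, and `torn_write` are assumptions made for illustration, not InnoDB's real page format or API.

```python
import zlib

PAGE_SIZE = 16 * 1024   # InnoDB default page size
BLOCK_SIZE = 4 * 1024   # assumed atomic write unit of the OS or device

def make_page(fill: bytes) -> bytes:
    """Build a toy page: a 4-byte CRC32 followed by the payload it covers."""
    payload = fill * (PAGE_SIZE - 4)
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def checksum_ok(page: bytes) -> bool:
    """Return True if the stored CRC32 matches the payload."""
    return int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])

def torn_write(old: bytes, new: bytes, blocks_written: int) -> bytes:
    """Page as found on disk if a crash hits after `blocks_written` blocks."""
    cut = blocks_written * BLOCK_SIZE
    return new[:cut] + old[cut:]

old_page = make_page(b"A")
new_page = make_page(b"B")
torn_page = torn_write(old_page, new_page, blocks_written=2)

print(checksum_ok(old_page), checksum_ok(new_page), checksum_ok(torn_page))
# prints: True True False
```

The old and new versions both verify, but the half-written mix of the two does not, which is how a torn page can be detected after a restart.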
How MySQL solves write failure
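To address this issue, InnoDB, MySQL's default storage engine, uses the Double Write Buffer. In the classic layout described here, the system tablespace contains a dedicated doublewrite area where pages are first written before being flushed to their final locations. (As of MySQL 8.0.20, this area is stored in separate doublewrite files rather than in the system tablespace.)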
The Double Write Buffer consists of 128 pages (2 MB total). In memory it appears as a buffer of 128 pages; on disk it occupies two 1 MB contiguous regions within the system tablespace.
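The sizes quoted above follow directly from the default 16 KB page size. The short calculation below spells this out; the constant names are chosen for this example only.

```python
PAGE_SIZE_KB = 16    # InnoDB default page size
DBLWR_PAGES = 128    # pages in the Double Write Buffer
REGIONS = 2          # contiguous on-disk regions

total_mb = DBLWR_PAGES * PAGE_SIZE_KB // 1024
pages_per_region = DBLWR_PAGES // REGIONS
region_mb = pages_per_region * PAGE_SIZE_KB // 1024

print(total_mb, pages_per_region, region_mb)
# prints: 2 64 1
```

So each 1 MB region holds 64 pages, and the two regions together hold the full 128.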
When dirty pages need to be flushed, they are first copied into the in-memory Double Write Buffer, which is then written sequentially, in 1 MB chunks, to the doublewrite area in the system tablespace and synced to disk. Because these writes are contiguous, they are relatively cheap.
Only after the copies are safely stored in the doublewrite area does InnoDB write the dirty pages to their final locations in the tablespace files, which may be non-contiguous.
This two-step process, first to the doublewrite area and then to the actual tablespace, is called "double write".
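The ordering of the two steps can be sketched with ordinary files. This is a simplified illustration, not InnoDB's implementation: the function `flush_with_doublewrite`, the file names, and the 4-byte page-number prefix stored with each copy are all hypothetical choices made for this example.

```python
import os
import tempfile

PAGE_SIZE = 16 * 1024

def flush_with_doublewrite(dirty_pages, dblwr_path, data_path):
    """Flush dirty_pages (page number -> PAGE_SIZE bytes) in two steps."""
    # Step 1: write every page back to back into the doublewrite file and
    # fsync, so a complete copy is durable before any data page is touched.
    with open(dblwr_path, "r+b") as dw:
        for page_no, data in sorted(dirty_pages.items()):
            dw.write(page_no.to_bytes(4, "big") + data)
        dw.flush()
        os.fsync(dw.fileno())

    # Step 2: write each page to its own offset in the data file. These
    # offsets are scattered, so the writes are generally not contiguous.
    with open(data_path, "r+b") as df:
        for page_no, data in dirty_pages.items():
            df.seek(page_no * PAGE_SIZE)
            df.write(data)
        df.flush()
        os.fsync(df.fileno())

def demo():
    """Flush two toy pages and check that page 5 landed in the data file."""
    with tempfile.TemporaryDirectory() as tmp:
        dblwr_path = os.path.join(tmp, "doublewrite.bin")
        data_path = os.path.join(tmp, "table.bin")
        for path in (dblwr_path, data_path):
            with open(path, "wb") as f:
                f.write(b"\0" * (8 * PAGE_SIZE))   # preallocate 8 pages

        dirty = {5: b"x" * PAGE_SIZE, 2: b"y" * PAGE_SIZE}
        flush_with_doublewrite(dirty, dblwr_path, data_path)

        with open(data_path, "rb") as f:
            f.seek(5 * PAGE_SIZE)
            return f.read(PAGE_SIZE) == dirty[5]

print(demo())  # prints: True
```

The important design point is the `fsync` between the two steps: a crash during step 2 can tear a data page, but by then an intact copy of that page is already on disk.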
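If a crash occurs while flushing dirty pages, the recovery process uses the Double Write Buffer. InnoDB checks page checksums to detect torn pages, locates the intact copy of each corrupted page in the system tablespace’s doublewrite area, and restores it, after which the redo log is applied to finish crash recovery. If the crash instead happened while the doublewrite area itself was being written, the page in the data file has not yet been touched and is still intact, so the redo log can be applied to it directly.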
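The restore step can be sketched as follows. This is a toy model under stated assumptions: pages use a made-up layout (4-byte CRC32, 4-byte page number, payload), the data file is a plain dictionary, and `make_page`, `is_intact`, and `recover` are hypothetical helpers rather than InnoDB functions.

```python
import zlib

PAGE_SIZE = 16 * 1024

def make_page(page_no: int, fill: bytes) -> bytes:
    """Toy page layout: 4-byte CRC32, 4-byte page number, then payload."""
    body = page_no.to_bytes(4, "big") + fill * (PAGE_SIZE - 8)
    return zlib.crc32(body).to_bytes(4, "big") + body

def is_intact(page: bytes) -> bool:
    """Return True if the stored CRC32 matches the rest of the page."""
    return int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])

def recover(data_file: dict, doublewrite_area: list) -> list:
    """Replace torn pages in data_file with intact doublewrite copies."""
    restored = []
    for copy in doublewrite_area:
        if not is_intact(copy):
            # The crash hit while writing the doublewrite area, so the
            # page in the data file was never overwritten; skip this copy.
            continue
        page_no = int.from_bytes(copy[4:8], "big")
        if not is_intact(data_file[page_no]):
            data_file[page_no] = copy
            restored.append(page_no)
    return restored

old = make_page(3, b"A")
new = make_page(3, b"B")
half = PAGE_SIZE // 2

# Page 3 was torn mid-write; page 7 was not being flushed and is fine.
data_file = {3: new[:half] + old[half:], 7: make_page(7, b"C")}
doublewrite_area = [new]

restored = recover(data_file, doublewrite_area)
print(restored, data_file[3] == new)  # prints: [3] True
```

After this step every page is internally consistent again, which is the precondition for replaying the redo log on top of it.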
In summary:
Write failure is a partial page write caused by a crash, leaving a corrupted page that the redo log alone cannot repair.
The Double Write Buffer writes pages to a temporary system‑tablespace area before the final write, enabling recovery of corrupted pages.
The buffer is 2 MB (128 pages of 16 KB) and exists both in memory and on disk, where it occupies two contiguous 1 MB regions.
It introduces modest performance overhead, but the added safety is usually worth it.
InnoDB’s redo log and Double Write Buffer work together to ensure durability and crash recovery.
Lobster Programming
