InnoDB Storage Architecture and Transaction Persistence Mechanisms
The article explains MySQL InnoDB’s internal storage hierarchy—from tablespaces, segments, and extents to pages and rows—its physical components such as buffer pool, change buffer, adaptive hash index, redo and binary logs, persistence techniques like double‑write buffering and log‑flushing policies, and how two‑phase commit and SQL execution interact, helping developers write correct, high‑performance applications.
Background and Goal MySQL is widely used in the Internet industry. Understanding InnoDB's internal storage model, persistence strategy, and transaction implementation helps developers write correct and efficient code.
InnoDB Logical Storage Structure The hierarchy is tablespace → segment → extent → page → row. A tablespace contains multiple segments; each segment consists of extents (default 64 pages = 1 MiB), and each page is 16 KB.
InnoDB Physical Storage Structure Includes memory structures (buffer pool, change buffer, adaptive hash index, etc.) and disk structures (tablespaces, pages, file headers/trailers). Pages contain a file header, page header, system records (infimum & supremum), user records, page directory, and file trailer.
Tablespace Types System tablespace, file‑per‑table tablespace, undo tablespace, general tablespace, and temporary tablespace. The InnoDB_file_per_table variable controls whether each table gets its own .ibd file.
Segments and Extents Segments are logical collections of pages (data, index, rollback). Extents ensure page contiguity on disk.
Rows and Row Formats Rows are limited to 64 KB. InnoDB supports REDUNDANT, COMPACT, DYNAMIC, and COMPRESSED row formats, each with different storage characteristics. The current format can be queried with show variables like "InnoDB_default_row_format"; .
Buffer Pool Caches index pages, data pages, undo pages, insert buffer, adaptive hash index, lock information, and data dictionary. It maintains three lists: free list, flush list, and LRU list. Pages are classified as free, clean, or dirty.
Change Buffer Stores modifications for pages not currently in the buffer pool, reducing random I/O. It occupies up to 25 % of the buffer pool (max 50 %).
Adaptive Hash Index (AHI) Automatically creates hash indexes for hot data based on access patterns, improving read/write performance.
Log Buffer and Redo Log The log buffer holds redo log records in memory; redo logs are persisted to ib_logfile0 , ib_logfile1 , etc. Redo logs follow the Write‑Ahead Logging (WAL) principle: logs are flushed before dirty pages are written to disk.
Binlog A logical binary log at the MySQL server layer. It records statements in ROW, STATEMENT, or MIXED format and is used for replication and point‑in‑time recovery.
Persistence Strategies InnoDB uses double‑write buffering to protect against partial page writes. The buffer pool is flushed under several conditions (shutdown, periodic flush, low free list, etc.). Log flushing is controlled by innodb_flush_log_at_trx_commit (values 0, 1, 2).
Two‑Phase Commit (XA) MySQL implements XA for distributed transactions. The prepare phase forces all involved resources to flush their logs; the commit phase finalizes the transaction. Binlog acts as the transaction coordinator, while InnoDB is the transaction manager.
SQL Execution Flow An example UPDATE statement shows how the server parses the query, the storage engine fetches the row, writes undo and redo logs, updates the buffer pool, and finally commits through the prepare and commit phases.
Conclusion Proper understanding of InnoDB's storage structures, buffer management, and logging mechanisms is essential for building reliable, high‑performance MySQL applications.
DeWu Technology
A platform for sharing and discussing tech knowledge, guiding you toward the cloud of technology.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.