How InnoDB Stores Data: Inside MySQL’s Data Pages and Hidden Columns
This article explains how MySQL’s InnoDB engine stores data on disk and in memory, detailing the structure of data pages, the role of hidden columns, record linking, page headers, footers, and directory mechanisms that ensure efficient access and durability.
Preface
If you have used MySQL, you are familiar with its storage engine innodb. Since MySQL 5 the default engine is InnoDB, which is the preferred engine for creating tables.
The article addresses three questions: how InnoDB stores data at the low level, what hidden columns exist in a table, and how user records are linked together.
1. Disk or Memory?
1.1 Disk
Critical data such as IDs, phone numbers, bank accounts, etc., must be persisted. The answer is to store data on disk, with backups or replication to protect against disk failure.
Reading or writing data on disk requires at least two I/O operations, which are time‑consuming and can affect performance.
1.2 Memory
Registers are the fastest but can hold only a tiny amount of data, so they are unsuitable for user data. Memory is slower than registers but still fast enough for most read/write operations, though it is more expensive than disk.
Storing all user data in memory is impractical because of capacity limits and the risk of data loss on crashes. The challenge is to combine durability with speed.
2. Data Page
When writing, data is first placed in a memory batch and then flushed to disk as a whole. When reading, a batch is loaded from disk into memory.
The unit of this batch is the data page. In InnoDB, a data page is typically 16 KB, configurable via innodb_page_size.
A data page stores table records and is linked to other pages by a doubly‑linked list.
A data page consists of the following parts:
File header
Page header
Maximum and minimum records
User records
Free space
Page directory
File trailer
3. User Records
When a new page is allocated, user records are empty. Inserting data allocates part of the free space to user records.
InnoDB supports four row formats; the article uses the compact format as an example.
A user record contains three sections:
Extra information (variable‑length fields, null‑value list, record header)
Hidden columns (row ID, transaction ID, rollback pointer)
Actual data columns
3.1 Extra Information
Extra information assists storage but is not user data.
3.1.1 Variable‑length field list
For columns like VARCHAR or TEXT, InnoDB records the length of each field so that space can be allocated on demand.
3.1.2 Null‑value list
Instead of storing a null value for each column, InnoDB keeps a bitmap where 1 indicates the column may be null, saving space.
3.1.3 Record header
The header stores flags and counters such as deleted_flag, min_rec_flag, n_owned, heap_no, record_type, and next_record.
3.2 Hidden Columns
InnoDB automatically creates three hidden columns: db_row_id (row ID), db_trx_id (transaction ID), and db_roll_ptr (rollback pointer). The row ID is derived from the primary key, a unique non‑null key, or generated automatically if none exist.
3.3 Real Data Columns
These columns store the actual user data and can contain many fields.
3.4 How User Records Are Linked
Records are linked via the next_record pointer stored in the record header, forming a singly‑linked list.
4. Maximum and Minimum Records
Each page contains two special records: Supremum (maximum) and Infimum (minimum). They anchor the singly‑linked list of user records.
5. Page Directory
The page directory speeds up lookups by dividing a page’s records into groups; each group’s maximum record is stored in a slot. Binary search on the slots quickly narrows the target record.
6. File Header and Footer
6.1 File Header
The file header stores critical metadata such as page number, previous page number, next page number, and page type, enabling navigation between pages via a doubly‑linked list.
6.2 File Footer
The footer records a checksum for the page. When a crash occurs during a flush, the checksum mismatch between header and footer indicates incomplete data.
7. Page Header
The page header stores runtime statistics such as the number of deleted bytes, the position of the last inserted record, the maximum transaction ID, index ID, and index level, allowing the engine to quickly locate records without scanning the whole page.
Conclusion
Multiple data pages are linked together by page numbers, forming a doubly‑linked list, while records within a page are linked by the next_record pointer, forming a singly‑linked list. This architecture enables efficient navigation and durability in InnoDB.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
