Databases 24 min read

Inside InnoDB: How MySQL Stores Data, Row Formats, and Indexes Explained

This article breaks down MySQL's InnoDB storage engine, detailing where data files reside, the different row formats (compact, redundant, dynamic, compressed), the internal 16 KB page layout, record header fields, overflow handling, and how B‑tree indexes (clustered and secondary) are built and searched.

ITPUB
ITPUB
ITPUB
Inside InnoDB: How MySQL Stores Data, Row Formats, and Indexes Explained

InnoDB Overview

MySQL stores all data on disk, but the exact files and layout depend on the storage engine. InnoDB is the default engine and the focus of this article.

Row Formats

InnoDB can store rows in four formats: compact , redundant , dynamic and compressed . Each format defines how variable‑length columns, NULL values and the record header are stored.

The row format can be specified when creating or altering a table:

-- Create a table with an explicit row format
CREATE TABLE table_name (col1 INT, col2 VARCHAR(100)) ROW_FORMAT=compact;

-- Change the row format of an existing table
ALTER TABLE table_name ROW_FORMAT=dynamic;

-- Show the row format of a table
SHOW TABLE STATUS LIKE 'table_name';

Defaults have evolved: redundant before MySQL 5.0, compact from 5.0, and dynamic from 5.7.

Compact Format

Stores a list of lengths for all variable‑length columns, a NULL‑value bitmap, and a fixed‑size record header. The length list contains only non‑NULL columns.

Variable‑length field list – stores the byte length of each non‑NULL variable column.

NULL‑value bitmap – a bit per nullable column; 1 means NULL.

Record header – includes flags such as delete_mask, min_rec_mask, n_owned, heap_no, record_type, next_record.

Redundant Format

Older format (pre‑5.0) that omits the variable‑length list and NULL bitmap, using an offset table for the actual column data. It is rarely used today.

Dynamic Format

Same as compact but handles overflow pages differently: large column values (> 768 bytes) are stored on separate overflow pages, and the main record holds a 20‑byte pointer to those pages.

Compressed Format

Builds on dynamic format by compressing overflow pages with zlib. It saves space for large BLOB/TEXT columns but incurs extra CPU overhead and is not recommended for most workloads.

Data Page Structure

InnoDB reads and writes data in 16 KB pages. A page is divided into several logical parts:

File Header (38 bytes) – common information for all page types.

Page Header (56 bytes) – page‑specific metadata.

Infimum + Supremum (26 bytes) – virtual records that mark the start and end of the user records.

User Records – the actual rows stored by the table.

Free Space – unused area within the page.

Page Directory – indexes the groups of records for fast binary search.

File Trailer (8 bytes) – checksum for page integrity.

When inserting rows, InnoDB first allocates space from the free area, then places the record in User Records. Once the free space is exhausted, a new page is allocated.

Record Header Fields (Selected)

delete_mask – marks a record as deleted; deleted rows stay on disk until reclaimed.

min_rec_mask – bitmap for NULL columns; omitted if no nullable columns exist.

n_owned – number of records owned by the record that ends a group (used by the page directory).

heap_no – position of the record within the page heap (0 and 1 are pseudo‑records for infimum and supremum).

record_type – 0 = ordinary, 1 = B+‑tree internal, 2 = minimum, 3 = maximum.

next_record – byte offset from the current record to the next record in the same group.

Overflow Handling

If a column value exceeds 768 bytes, InnoDB stores the first 768 bytes in the main record and the remainder on an overflow page. The main record holds a 20‑byte pointer to that overflow page. The overflow threshold is derived from the page size: a page must contain at least two rows, and each row needs 27 bytes of overhead, leading to a practical limit of about 8 KB for a column before overflow occurs.

Indexing and B+‑Tree Structure

All pages form a doubly‑linked list. Within a page, records are linked by primary key order, forming a singly‑linked list. The Page Directory groups records and enables binary search on the primary key.

Primary‑key look‑up proceeds in two steps:

Binary search the page directory to locate the slot (group) containing the target key.

Traverse the next_record chain inside that group to find the exact row.

Secondary (non‑primary) indexes are also B+‑trees but differ:

They sort by the indexed column (e.g., a2) instead of the primary key.

Leaf nodes store only the indexed column value plus the primary key.

Directory entries contain the indexed column value and the page number.

Page Splits and Multi‑Level Directories

When a page becomes full, InnoDB splits it: the existing records are divided into two groups (typically 4 + 5 records), a new slot is added to the directory, and the groups are linked accordingly. For tables with many pages, InnoDB builds higher‑level directory pages, forming a multi‑level B+‑tree that maps key ranges to leaf pages.

Cost of Indexes

Space cost : each index creates a full B+‑tree; every node occupies a 16 KB page.

Time cost : INSERT/UPDATE/DELETE operations must maintain all indexes, causing page splits, record moves, and additional writes.

Conclusion

By dissecting InnoDB’s storage logic—row formats, page layout, record headers, overflow handling, and B+‑tree indexing—developers gain a clear picture of how MySQL persists data and can make more informed decisions when designing schemas and indexes.

InnoDB storage overview
InnoDB storage overview
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

IndexingInnoDBMySQLB+TreeDatabase Storagerow formatPage Layout
ITPUB
Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.