Unlocking InnoDB: Page Size, Row Formats, and Varchar Limits Explained
This article explains what InnoDB does, how its page‑based I/O causes the first query to be slower, the default Dynamic row format, how varchar length is stored, the practical limits of varchar(M) under utf8mb4, and the impact of metadata and internal fragmentation on row space.
What InnoDB Is
InnoDB is the storage engine that writes table data to disk for MySQL. It stores rows in fixed‑size pages (default 16 KB) and moves data between memory and disk page‑by‑page.
How InnoDB Reads and Writes Data
When a query needs data, InnoDB loads the relevant pages from disk into an in-memory cache, the buffer pool. A write modifies the page in memory first, and the modified page is flushed back to disk later. Because disk I/O is far slower than memory access, the first request for a page pays a noticeable latency, while subsequent requests hit the cached page and return much faster. As a rough illustration, a cold query might take 300-400 ms and the same query 30-40 ms once its pages are cached; actual figures depend heavily on hardware and workload.
Each page is the basic unit of interaction; a 16 KB page is read or written as a whole, even if the query only needs a few rows.
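The page-at-a-time behavior can be illustrated with a toy model. The sketch below is not InnoDB's buffer pool; the class name BufferPool and its methods are hypothetical, and it only counts how many whole pages would have to be fetched for a series of byte offsets.

```python
PAGE_SIZE = 16 * 1024  # default InnoDB page size in bytes


class BufferPool:
    """Toy page cache (hypothetical): counts simulated disk reads."""

    def __init__(self):
        self.cached_pages = set()
        self.disk_reads = 0

    def read(self, byte_offset):
        """Return the page number holding byte_offset, loading it if needed."""
        page_no = byte_offset // PAGE_SIZE
        if page_no not in self.cached_pages:
            # First touch: the whole 16 KB page is read, not just the row.
            self.disk_reads += 1
            self.cached_pages.add(page_no)
        return page_no


pool = BufferPool()
pool.read(100)      # miss: page 0 is loaded
pool.read(5000)     # hit: same page, already cached
pool.read(20000)    # miss: page 1 is loaded
print(pool.disk_reads)  # 2
```

Three lookups cost only two simulated disk reads because the first two offsets fall in the same page.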
Varchar and InnoDB Row Formats
Rows are stored according to a row format. Since MySQL 5.7.9 the default is DYNAMIC (controlled by innodb_default_row_format). Besides a fixed-size record header, a row consists of three logical parts:
Actual column data.
Variable‑length field length list (stores the byte length of each varchar, varbinary, text, blob column).
NULL‑value list (a bitmap indicating which columns are NULL).
The length list and NULL list are metadata; they occupy space in the row and affect the usable payload.
How InnoDB Determines Varchar Length
For a varchar column, InnoDB records the real byte length L of the stored value alongside the data, using one or two bytes. Two bytes can represent at most 2¹⁶-1 = 65 535, which is also MySQL's limit on the total size of a row, so a varchar can hold at most 65 535 bytes. With the utf8mb4 character set (up to 4 bytes per character), the theoretical maximum number of characters is ⌊65 535 / 4⌋ = 16 383.
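The arithmetic can be checked with a few lines of Python. This is a sketch of the theoretical ceiling only, before any metadata is subtracted; the function name max_chars is made up for illustration, and the per-character widths are the maximum byte widths of each character set.

```python
MAX_LENGTH_BYTES = 2**16 - 1  # largest value two bytes can represent: 65,535


def max_chars(bytes_per_char, byte_limit=MAX_LENGTH_BYTES):
    """Theoretical character ceiling when every character takes the maximum width."""
    return byte_limit // bytes_per_char


for charset, width in [("latin1", 1), ("utf8mb3", 3), ("utf8mb4", 4)]:
    print(charset, max_chars(width))
```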
Practical Example
CREATE TABLE test (
  c1 VARCHAR(10),
  c2 VARCHAR(10) NOT NULL,
  c3 CHAR(10),
  c4 VARCHAR(10)
) CHARSET=utf8mb4;
Inserting two rows:
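INSERT INTO test (c1, c2, c3, c4) VALUES
  ('aaaa', '你好啊', 'cc', 'd'),
  ('eeee', 'fff', NULL, NULL);
The first row's variable-length list stores three lengths, for c1, c2, and c4: 4 bytes, 9 bytes (three Chinese characters at 3 bytes each in utf8mb4), and 1 byte. Each fits in a single byte, so the list occupies 3 bytes. In the second row c4 is NULL and has no entry, so only c1 and c2 are stored and the list occupies 2 bytes. This count covers the VARCHAR columns only; with a variable-width character set such as utf8mb4, InnoDB also handles CHAR columns as variable-length internally, so a non-NULL c3 may add one more entry.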
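The byte lengths for the two example rows can be reproduced in Python, since Python's UTF-8 encoding yields the same byte counts as utf8mb4. The sketch follows the simplification used in this example, counting only the VARCHAR columns; the helper names are hypothetical and the on-disk ordering of the list is not modeled.

```python
# Assumption: only the VARCHAR columns of the example table contribute entries.
VARCHAR_COLUMNS = ["c1", "c2", "c4"]


def length_list(row):
    """Byte lengths of the non-NULL VARCHAR values (NULLs get no entry)."""
    return [
        len(row[col].encode("utf-8"))
        for col in VARCHAR_COLUMNS
        if row[col] is not None
    ]


row1 = {"c1": "aaaa", "c2": "你好啊", "c3": "cc", "c4": "d"}
row2 = {"c1": "eeee", "c2": "fff", "c3": None, "c4": None}

print(length_list(row1))  # [4, 9, 1]
print(length_list(row2))  # [4, 3]
```

Every length here is below 128, so each entry needs a single byte, giving 3 bytes for the first row and 2 for the second.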
Why Varchar(16383) Cannot Store 16 383 Characters
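A value of 16 383 characters can need up to 65 532 bytes (16 383 × 4), which leaves only 3 bytes of the 65 535-byte row budget. That budget is shared: the column's own 2-byte length prefix, the NULL flag, and every other column in the table count against it. A varchar(16383) column therefore fits only when the row contains almost nothing else, and adding any other column forces a smaller M. On top of this, each stored row carries InnoDB metadata (record header, NULL list, length list), and a value this large is far bigger than a 16 KB page, so it cannot be stored inline with the rest of the row. In practice the column cannot be used at its theoretical limit alongside other data.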
Internal Fragmentation
InnoDB pages are fixed at 16 KB. If the data stored in a page does not completely fill the page, the remaining space becomes internal fragmentation and cannot be used by other rows. Fragmentation can arise from updates that shrink column values, from reserved space for future growth, or from page‑level overhead.
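A back-of-the-envelope calculation shows how leftover space appears. The sketch below assumes every row has the same size and that the whole 16 KB is available for rows, which ignores the page header, trailer, and directory that a real page also holds; page_fill is a hypothetical helper.

```python
PAGE_SIZE = 16 * 1024  # bytes


def page_fill(row_bytes, page_bytes=PAGE_SIZE):
    """Return (rows that fit, bytes left unused) for fixed-size rows."""
    rows = page_bytes // row_bytes
    unused = page_bytes - rows * row_bytes
    return rows, unused


print(page_fill(5000))  # (3, 1384): a fourth 5000-byte row does not fit
print(page_fill(1000))  # (16, 384)
```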
Metadata Impact
Record header: 5 bytes per row in the COMPACT and DYNAMIC formats (6 bytes in REDUNDANT).
NULL-value list: 1 bit per nullable column, rounded up to whole bytes.
Variable-length field list: per column, 1 byte if the column's maximum byte length ≤ 255; otherwise 1 byte when the stored value is ≤ 127 bytes and 2 bytes when it is longer.
These overheads reduce the effective payload per row.
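The two list rules above translate directly into code. This is a minimal sketch with hypothetical function names; it covers only the length list and the NULL list and leaves the record header out.

```python
def length_prefix_bytes(max_column_bytes, actual_bytes):
    """Bytes used to record one variable-length value's length."""
    if max_column_bytes <= 255:
        return 1
    return 1 if actual_bytes <= 127 else 2


def null_bitmap_bytes(nullable_columns):
    """One bit per nullable column, rounded up to whole bytes."""
    return (nullable_columns + 7) // 8


# VARCHAR(10) in utf8mb4: at most 40 bytes, so the length always takes 1 byte.
print(length_prefix_bytes(40, 9))     # 1
# VARCHAR(100) in utf8mb4: at most 400 bytes, so long values need 2 bytes.
print(length_prefix_bytes(400, 100))  # 1
print(length_prefix_bytes(400, 200))  # 2
print(null_bitmap_bytes(3))           # 1
```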
Overflow Columns (Dynamic Row Format)
When a column’s data exceeds the space available in the page, InnoDB stores the column off‑page (overflow). The row then contains a 20‑byte pointer to the overflow page. Example:
CREATE TABLE big_data (
  id INT AUTO_INCREMENT PRIMARY KEY,
  data LONGBLOB
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=DYNAMIC;
INSERT INTO big_data (data) VALUES (REPEAT('a', 17000));
The 17 000-byte value is larger than a 16 KB page, so it is stored off-page; the row itself keeps only the small pointer, which keeps the in-page record compact.
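The effect on the in-page record can be sketched as a greedy rule: while the row is too big, replace the longest column with a 20-byte pointer. This is a simplified model, not InnoDB's actual algorithm. The function place_columns is hypothetical, and the in-page budget of 8126 bytes (roughly half a 16 KB page) is an assumption used only for illustration.

```python
POINTER_BYTES = 20        # off-page pointer size stated above
ASSUMED_ROW_LIMIT = 8126  # assumption: about half of a 16 KB page


def place_columns(column_bytes, row_limit=ASSUMED_ROW_LIMIT):
    """Move the longest columns off-page until the in-page row fits.

    Returns (in_page_bytes, names of columns stored off-page).
    """
    in_page = dict(column_bytes)
    off_page = []
    while sum(in_page.values()) > row_limit:
        # Only columns still inline and bigger than a pointer are worth moving.
        candidates = [
            name for name, size in in_page.items()
            if name not in off_page and size > POINTER_BYTES
        ]
        if not candidates:
            break  # nothing left to move; the row simply does not fit
        longest = max(candidates, key=in_page.get)
        in_page[longest] = POINTER_BYTES
        off_page.append(longest)
    return sum(in_page.values()), off_page


# The big_data example: a 4-byte INT id plus a 17,000-byte value.
print(place_columns({"id": 4, "data": 17000}))  # (24, ['data'])
print(place_columns({"id": 4, "data": 100}))    # (104, [])
```

Under these assumptions the big_data row shrinks from 17 004 bytes of column data to 24 bytes in the page, while a small value stays inline.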
Key Takeaways
InnoDB reads/writes whole 16 KB pages, making the first access slower than subsequent cached accesses.
Varchar length is stored in a variable‑length list; up to two bytes are used, limiting the maximum byte count to 65 535.
With utf8mb4, the theoretical character limit is 16 383, but metadata, other columns, and fragmentation further reduce the usable space.
Dynamic row format stores large column values off-page, leaving only a 20-byte pointer in the row.
dbaplus Community
