Databases 20 min read

Why MySQL Single‑Table Data Should Stay Below 10 Million Rows – B+Tree, 16KB Pages & Performance Secrets

This article explains how MySQL stores data in 16KB pages, uses B+‑tree indexes to locate rows, derives the commonly cited 20‑million‑row limit from page fan‑out and row size calculations, compares B‑tree and B+‑tree structures, and offers practical optimization tips.

Sanyou's Java Diary

Sep 15, 2025

Why MySQL Single‑Table Data Should Stay Below 10 Million Rows – B+Tree, 16KB Pages & Performance Secrets

Introduction

The core question is why the industry often advises that a MySQL single table should not exceed the ten‑million‑row range.

How MySQL Stores Data

Assumptions: InnoDB storage engine, default B+‑tree index, ordinary tables with a moderate number of columns and typical field types.

Data Page Structure

InnoDB’s smallest storage unit is a fixed 16KB page that holds table rows, indexes, and metadata. Pages are loaded into memory as whole units to reduce disk I/O.

Each table is stored in an .ibd file (tablespace). Although rows appear contiguous in the logical table, physically they are split across many 16KB pages.

Every page contains a header with a page number and forward/backward pointers, a checksum at the tail, and a page directory that enables binary search, turning linear scans (O(n)) into O(log n) lookups.

From Page to B+Tree Index

Pages are linked together forming a double‑linked list. By selecting the smallest primary‑key record on each page and storing its key and page number, a new upper‑level page is created, adding a page level field. This creates a hierarchical structure that resembles an inverted tree – the B+‑tree.

Leaf level (page level 0) stores full rows; non‑leaf levels store only primary‑key values and pointers to child pages. The fan‑out of a typical 16KB page is about 1,280 pointers (12 bytes per entry, ~15 KB usable space).

Storing Data Example

CREATE TABLE `user` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT COMMENT '主键',
  `name` varchar(100) NOT NULL DEFAULT '' COMMENT '名字',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT '年龄',
  PRIMARY KEY (`id`),
  KEY `idx_age` (`age`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

When inserting a new row, MySQL loads the relevant page, checks the key range, and writes the record into the appropriate page if space permits.

How MySQL Queries Data

To find a row (e.g., SELECT * FROM user WHERE id = 5), MySQL starts at the root page, follows pointers using binary search, and reaches the leaf page containing the target row. Typically only 1–3 pages are accessed, resulting in 1–3 disk I/Os.

Query Steps Summary

Load root page (often cached).

Binary‑search non‑leaf pages to locate the child page range.

Repeat until leaf page is reached.

Binary‑search within leaf page to find the exact row.

I/O count: tree height = 3 → up to 3 disk reads; height = 4 → 4 reads.

Calculating the 20 Million Row Limit

The formula for maximum rows in a B+‑tree is (fanout ^ (height‑1)) × rows_per_leaf_page. With a fan‑out of ~1,280 and ~15 rows per leaf (1 KB per row), a two‑level tree holds ~2 × 10⁴ rows, while a three‑level tree holds ~2.5 × 10⁵ rows, explaining the “2 kw” (≈20 million) recommendation.

If rows are smaller (≈250 bytes), a leaf can hold ~60 rows, and a three‑level tree can store about 1 billion rows without exceeding the three‑I/O limit.

Fan‑out Calculation

Each non‑leaf entry stores an 8‑byte BIGINT primary key plus a 4‑byte page number (~12 bytes). After subtracting ~1 KB for headers and directory, ~15 KB remain, yielding roughly 1,280 pointers per page.

Row Capacity per Leaf

Assuming 1 KB per row, a leaf page holds about 15 rows; with 250 bytes per row, it holds about 60 rows.

B‑Tree vs B+‑Tree

B‑tree stores rows in both leaf and non‑leaf nodes, drastically reducing fan‑out (≈15 pointers per page) and increasing tree height. To store ~20 million rows, a B‑tree would need ≥6 levels, causing up to 6 disk I/Os per query, whereas a B+‑tree needs only 2–3 levels.

Optimization Suggestions

Keep row size small; avoid large TEXT/BLOB columns.

Consider sharding or partitioning when a table approaches the ten‑million‑row range.

Archive cold data to separate tables.

Extended Questions

Why 16KB Page Size?

16KB balances memory usage, disk I/O, and index depth: larger pages waste memory and increase I/O; smaller pages increase tree height.

How to Index Strings?

MySQL can index VARCHAR columns using B+‑tree with lexicographic ordering. For long strings, use prefix indexes or full‑text indexes. For Chinese strings, either use a pinyin parser plugin or the utf8mb4_unicode_ci collation.

INSTALL PLUGIN pinyin SONAME 'ha_pinyin.so';

CREATE INDEX idx_name_pinyin ON mytable(name) USING BTREE WITH PARSER pinyin;

CREATE INDEX idx_name_unicode ON mytable(name) USING BTREE;

Is There a Limit on Index Length?

In InnoDB the maximum index length is 767 bytes (prefix indexes exempt). For utf8mb4 each character can take up to 4 bytes, so a VARCHAR(255) may exceed the limit unless a prefix is used.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Performance indexing MySQL B+Tree Database Storage

Written by

Sanyou's Java Diary

Passionate about technology, though not great at solving problems; eager to share, never tire of learning!

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.

Introduction

How MySQL Stores Data

Data Page Structure

From Page to B+Tree Index

Storing Data Example

How MySQL Queries Data

Query Steps Summary

Calculating the 20 Million Row Limit

Fan‑out Calculation

Row Capacity per Leaf

B‑Tree vs B+‑Tree

Optimization Suggestions

Extended Questions

Why 16KB Page Size?

How to Index Strings?

Is There a Limit on Index Length?

Sanyou's Java Diary

How this landed with the community

Was this worth your time?

0 Comments

Calculating the 20 Million Row Limit