Databases 20 min read

Why MySQL Single-Table Data Should Stay Under 20 Million Rows

This article explains how InnoDB stores data in 16 KB pages, how B+Tree indexes work, derives the formula for the maximum number of rows a single MySQL table can hold, and why the practical limit of about twenty million rows exists, along with performance implications and optimization tips.

DeWu Technology
DeWu Technology
DeWu Technology
Why MySQL Single-Table Data Should Stay Under 20 Million Rows

Introduction

The industry often says “MySQL single‑table data should not exceed ten million rows.” This guideline assumes the InnoDB storage engine, the default B+Tree index structure, and a normal table with a moderate number of columns and typical field types.

How MySQL Stores Data

InnoDB stores data in fixed‑size 16 KB pages. Each page contains a header, a footer, a page directory, and the actual records. The records are kept in the table’s .ibd file as a sequence of pages.

Usable space per page is roughly 15 KB after accounting for header, footer and directory.

Each row is called a record .

Pages are linked by a page number and previous/next pointers, allowing sequential traversal.

Page structure diagram
Page structure diagram

From Page to Index – B+Tree Index

Leaf pages store the smallest primary‑key id of each page; non‑leaf pages store the key together with a pointer to the child page, forming a B+Tree. The tree height is usually two or three levels for tables up to tens of millions of rows.

Clustered index : data are stored in the leaf nodes.

Secondary index : leaf nodes store only the key, requiring a lookup to the clustered index (a “back‑lookup”).

Row format : e.g., COMPACT stores transaction ID, rollback pointer, column values, etc.

How MySQL Queries Data

Example query: SELECT * FROM user WHERE id = 5; The engine starts at the root page, binary‑searches non‑leaf pages to locate the leaf page that contains id = 5, then reads that leaf page to fetch the row. The I/O cost equals the tree height (typically 1–3 disk reads).

Query steps summary:

Load root page (often cached in the buffer pool).

Iteratively locate the child page by binary search on non‑leaf pages.

When the leaf page is reached, perform a binary search on the records to find the exact row.

I/O count equals the number of levels traversed.

Why the 20 Million Row Upper Bound?

The maximum rows per table can be approximated by the formula: (fanout ^ (height‑1)) × rows_per_leaf_page For a 16 KB page:

Fanout ≈ 1280 (each index entry is about 12 B; 15 KB usable → 15 000 / 12 ≈ 1280).

Rows per leaf page ≈ 15 when each row is about 1 KB.

Two‑level tree: (1280 ^ 1) × 15 ≈ 20 k rows (≈ 20 million). Three‑level tree: (1280 ^ 2) × 15 ≈ 2.5 M rows, which is the commonly cited “2 kw” (≈ 20 million) limit. When the row count exceeds this, the tree height grows to four, increasing I/O from three reads to four and causing a noticeable performance cliff.

Core Principle Summary

All data are organized by a B+Tree; pages store records; non‑leaf pages provide fast navigation; leaf pages store the actual rows. High fan‑out keeps the tree shallow, limiting the number of disk I/O operations required for a lookup.

Extended Questions

Why 16 KB Page Size?

16 KB balances memory consumption, disk I/O, and index density. Larger pages would reduce the number of pages that fit in memory; smaller pages would increase the tree height and the number of I/O operations.

How to Index Strings?

MySQL uses B+Tree for string columns, sorting by lexicographic order. For long strings, consider prefix indexes or full‑text indexes. For Chinese strings, you can use a pinyin plugin or Unicode collation.

Is There a Limit on Index Length?

In InnoDB the maximum index length is 767 bytes (unless a prefix index is used). With utf8mb4, each character may occupy up to 4 bytes, so a VARCHAR(255) can exceed the limit unless a shorter prefix is specified.

Optimization Suggestions

Keep row size small; avoid storing large TEXT / BLOB columns in the main table.

Plan sharding or partitioning when approaching the 10 M‑20 M row range.

Archive cold data to separate tables or storage.

Data insertion example
Data insertion example
B+Tree page levels
B+Tree page levels
B+Tree leaf search
B+Tree leaf search

Conclusion

MySQL’s 16 KB page size and B+Tree index design strike a balance between disk I/O and memory usage. When a table grows beyond the practical 20 million‑row threshold, the B+Tree height increases, leading to extra I/O and a performance cliff. Proper row‑size control, sharding, and archiving can keep the system performant.

PerformanceInnoDBMySQLB+Treedatabase indexingPage Size
DeWu Technology
Written by

DeWu Technology

A platform for sharing and discussing tech knowledge, guiding you toward the cloud of technology.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.