Why MySQL Single Tables Should Stay Below 20 Million Rows: Theory & Practice
This article explains the technical reasons why a MySQL single table should not exceed about 20 million rows, covering limits from auto‑increment primary keys, data‑page structures, B+‑tree storage calculations, and practical partitioning recommendations for large‑scale applications.
01 Auto‑increment Primary Key Perspective
When the primary key is an int, its maximum value is 2³²‑1 (about 2.1 billion). If the primary key is a bigint, the theoretical limit is 2⁶⁴‑1, which is far beyond what disk can hold. A tinyint caps at 255, so inserting an ID of 256 would fail.
02 Data Page Perspective
In InnoDB, a table is stored in an .ibd file that is divided into 16 KB data pages. Each page contains a page number, forward/backward pointers, a checksum, a page directory, and the actual row data.
The pages are linked together using a B+‑tree. Leaf nodes store the actual rows, while non‑leaf nodes store index entries. Each level of the B+‑tree corresponds to one disk I/O.
To estimate capacity, assume a data page can store roughly 15 KB of row data after accounting for metadata. With a bigint primary key (8 bytes) and a 4‑byte page number, each page can hold about 1 280 pointers (X = 15 KB / (8 + 4)). If each row occupies 1 KB, a page can store 15 rows (Y = 15).
- For a B+‑tree of height N, the number of leaf nodes is X^(N‑1).
- Total rows M = X^(N‑1) * Y.Applying the formula:
- Height N=2 (two‑level tree): M ≈ 1 280^(1) * 15 ≈ 19 200 rows.
- Height N=3: M ≈ 1 280^(2) * 15 ≈ 2.5 million rows.
- Height N=4: M ≈ 1 280^(3) * 15 ≈ 3 billion rows.In practice, with typical row sizes around 1 KB, a safe guideline is to keep a single MySQL table under 20 million rows to avoid performance degradation; this figure is a recommendation, not a strict rule.
03 Thought Exercise
Question: How many rows can a four‑level B+‑tree store if the primary key is bigint and each row is 1 KB?
Answer: Using X = 1 280 and Y = 15, the total rows M = 1 280^(4‑1) * 15 = 1 280³ * 15, which yields roughly 3 billion rows.
— End of article —
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
