Managing Large MySQL Tables: Evaluation, Root Causes, and Optimization Strategies
This article explains how to assess MySQL table size, why massive tables degrade performance due to B+‑tree depth, and presents three practical solutions—table partitioning, database sharding, and hot‑cold data archiving—along with guidance on selecting the appropriate approach.
When a MySQL table grows to tens of millions or even billions of rows, insert and query latencies increase, schema changes become costly, and often only a subset of the data (e.g., recent time windows) is actually needed.
Evaluating Table Size – Consider table capacity, disk usage, and instance capacity. For OLTP tables, a common rule of thumb is to keep rows under 20 million and total size under 15 GB, with read/write throughput below 1,600 ops/s. Row count and table status can be checked with:

```sql
select count(*) from table_name;
show table status like 'table_name'\G
```

Disk usage can be inspected via information_schema.tables:
```sql
select table_schema as 'Database',
       table_name as 'Table',
       table_rows as 'Rows',
       truncate(data_length/1024/1024, 2) as 'Data_MB',
       truncate(index_length/1024/1024, 2) as 'Index_MB'
from information_schema.tables
order by data_length desc, index_length desc;
```

The recommendation is to keep disk usage below 70% and to consider archiving fast-growing data to slower storage.
Root Cause of Slow Queries – As a table grows, the height of its B+‑tree index increases, and each additional level costs one more page read per lookup. InnoDB pages are 16 KB by default: a leaf page holds about 16 rows of 1 KB each, while an internal page holds index entries of roughly 14 B each (an 8‑byte bigint key plus a 6‑byte page pointer), i.e., about 1,170 entries per page. A height‑2 B+‑tree can therefore hold about 1,170 × 16 ≈ 18 k rows, and a height‑3 tree about 1,170² × 16 ≈ 22 million rows, which is why tables in the tens of millions of rows sit at the limit of what a height‑3 tree can serve.
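The capacity arithmetic above can be sketched as follows (assuming 16 KB pages, 1 KB average rows, and 14 B per internal index entry, as in the estimate):

```python
PAGE_SIZE = 16 * 1024   # InnoDB default page size in bytes
ROW_SIZE = 1024         # assumed average row size
ENTRY_SIZE = 14         # 8-byte bigint key + 6-byte page pointer

rows_per_leaf = PAGE_SIZE // ROW_SIZE            # 16 rows per leaf page
entries_per_internal = PAGE_SIZE // ENTRY_SIZE   # 1170 entries per internal page

def btree_capacity(height: int) -> int:
    """Approximate maximum rows held by a B+-tree of the given height."""
    return entries_per_internal ** (height - 1) * rows_per_leaf

print(btree_capacity(2))  # 18720 rows (~18 k)
print(btree_capacity(3))  # 21902400 rows (~22 million)
```

Once a table outgrows the height‑3 capacity, the tree needs a fourth level, adding another disk read to every index lookup.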
Optimization Solutions
Solution 1: Table Partitioning
Partitioning splits a logical table into multiple physical files based on a range, list, or hash of a partition key, reducing the scan range and improving index hit rate. Benefits include easy bulk deletion (dropping a whole partition), better parallel aggregation, and higher I/O throughput. Limitations: a table supports at most 8,192 partitions (1,024 before MySQL 5.6.7), every primary and unique key must include the partition key columns, foreign keys are not supported, and partitioning applies to both data and indexes.
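As a sketch, a range-partitioned log table might look like this (table and column names are illustrative; note that the primary key includes the partition column, as MySQL requires):

```sql
-- Hypothetical example: monthly range partitions on created_at.
-- Every primary/unique key must include the partition column.
create table access_log (
    id bigint not null auto_increment,
    created_at datetime not null,
    detail varchar(255),
    primary key (id, created_at)
) engine = innodb
partition by range columns (created_at) (
    partition p202401 values less than ('2024-02-01'),
    partition p202402 values less than ('2024-03-01'),
    partition pmax    values less than (maxvalue)
);
```

Expiring a month of data then becomes `alter table access_log drop partition p202401;`, which is far cheaper than a bulk DELETE.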
Check partition support:

```sql
show variables like '%partition%';
```

(In MySQL 8.0 this variable was removed; partitioning is built into InnoDB there, and support can be confirmed via SHOW PLUGINS.)

Solution 2: Database Sharding (Horizontal/Vertical)
Sharding reduces per‑table row count, lowering B+‑tree height and I/O. Horizontal sharding distributes rows across multiple tables or databases (e.g., modulo 4). Vertical sharding separates rarely used columns into another table. Common strategies:
Modulo: user_id % 4 determines target table.
Range: rows within a numeric range go to a specific table.
Combined hash‑plus‑range for balanced load and future scalability.
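The modulo and range strategies above can be sketched in application code (the table prefix `order_tab`, the shard count of 4, and the range boundaries are assumptions for illustration):

```python
SHARD_COUNT = 4  # assumed number of physical tables

def route_table(user_id: int, base_name: str = "order_tab") -> str:
    """Modulo sharding: map user_id to a physical table by user_id % SHARD_COUNT."""
    return f"{base_name}_{user_id % SHARD_COUNT}"

def route_range(order_id: int,
                ranges=((0, "order_tab_0"), (10_000_000, "order_tab_1"))) -> str:
    """Range sharding: each entry is (lower_bound, table); pick the last bound <= id."""
    table = ranges[0][1]
    for lower, name in ranges:
        if order_id >= lower:
            table = name
    return table

print(route_table(10))          # order_tab_2
print(route_range(20_000_000))  # order_tab_1
```

The combined hash-plus-range approach layers the two: ranges decide the database, and the hash decides the table within it, so new ranges can be added without rehashing existing data.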
Solution 3: Hot‑Cold Data Archiving
Identify hot data (e.g., last week or month) and keep it in the primary table, while moving older (cold) data to archive tables or separate databases, thus keeping the active table small and fast.
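A minimal archiving sketch in SQL, assuming an `orders` table with a `created_at` column and an identically structured `orders_archive` table (all names illustrative); moving rows in small batches avoids long locks and oversized transactions:

```sql
-- Copy a batch of cold rows (older than 3 months) into the archive table.
insert into orders_archive
select * from orders
where created_at < date_sub(now(), interval 3 month)
limit 1000;

-- Delete the same batch from the hot table; repeat until no cold rows remain.
delete from orders
where created_at < date_sub(now(), interval 3 month)
limit 1000;
```

In production the copy and delete would typically run in a transaction or be driven by a tool such as Percona's pt-archiver, which handles batching and throttling.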
Choosing a Strategy – Use partitioning for simple range queries, sharding when write concurrency is high, and archiving when data naturally separates by time. Often a combination yields the best results.
Code Ape Tech Column
Former Ant Group P8 engineer and pure technologist, sharing full-stack Java content, interview preparation, and career advice. Site: java-family.cn