Understanding MySQL Memory Management and When to Shard Tables
The article explains how MySQL uses different memory areas such as Thread Memory, Sharing, and InnoDB Buffer Pool, describes the impact of large user tables on query performance, and shows that when a table size exceeds the InnoDB buffer pool capacity, sharding becomes necessary.
Introduction
Using a user table of a social platform as an example, when the table reaches tens of millions of rows, queries that should use indexes become very slow, prompting the need to consider table sharding before performance bottlenecks appear.
Sharding Example
Querying 100 female users aged 18‑24 on a single user table is simple:
SELECT * FROM user WHERE age >= 18 AND age <= 24 AND sex = 0 LIMIT 100After sharding the table into user_1, user_2, user_3, the query process becomes:
Iterate over user_1 to user_3.
Execute the same SQL on each shard.
Merge the results.
Filter the first 100 records.
This larger process forces code changes and can cause significant business‑logic modifications if many similar queries exist, so planning sharding early is essential.
Why MySQL Needs to Reduce Disk I/O
MySQL adds an in‑memory layer between disk and the application to cache most data, allowing subsequent queries to be served from memory and avoiding heavy disk I/O.
Memory Management
MySQL’s memory is divided into three parts:
Thread Memory : private to each connection thread, dynamic.
Sharing : shared among all connection threads.
InnoDB Buffer Pool : shared memory used by the InnoDB engine, relatively static.
Thread Memory and Sharing belong to the MySQL Server layer, while the InnoDB Buffer Pool belongs to the InnoDB layer.
Thread Memory Components
Thread stack
sort_buffer
join_buffer
read_buffer
read_rnd_buffer
net_buffer
bulk_insert_buffer
tmp_table
Sharing Components
Key Buffer (MyISAM index cache)
Thread Cache
InnoDB Log Buffer
Query Cache
Table Cache
BinLog Buffer
Table Definition Cache
InnoDB Additional Memory Pool
InnoDB Buffer Pool Structures
Index Page / Data Page
Lock
Dictionary
AHI
Change Buffer
LRU List / Free List / Flush List
Linux Memory Structure
On Linux, MySQL’s process memory resides in user space, which consists of stack, file‑mapped area, heap, data segment, and read‑only segment. The article shows 32‑bit and 64‑bit address layouts.
Memory Allocation in MySQL
MySQL uses the C library malloc() to allocate heap memory and mmap() for file‑mapped memory. The Server layer calls malloc, which invokes a memory allocator that ultimately uses brk for small allocations (< MMAP_THRESHOLD ≈ 128 KB) and mmap for larger ones.
When brk is used, freed memory is not returned to the kernel immediately, leading to fragmentation. mmap returns memory to the kernel promptly, avoiding OOM but causing page‑fault overhead.
Memory Allocator
The allocator manages a memory pool, reducing fragmentation. Common allocators include ptmalloc, tcmalloc, and jemalloc.
Page Faults
If a virtual page is not present, the MMU triggers a page‑fault exception; the kernel swaps in a page from disk, updates the page table, and resumes execution.
Server Layer Allocation Process
MySQL calls malloc to request memory.
The allocator calls brk or mmap based on size.
When a page fault occurs, the kernel’s virtual‑memory manager handles it.
InnoDB Layer Allocation Process
The InnoDB Buffer Pool is allocated directly with mmap because it stores large structures (index trees, change buffers) that should not permanently occupy memory.
Conclusion
Server‑layer memory (Thread Memory and Sharing) is allocated with malloc; InnoDB‑layer memory (Buffer Pool) is allocated with mmap and managed via Free, LRU, and Flush lists. The decision to shard a table should be based on whether the table size exceeds the configured innodb_buffer_pool_size, because only data that fits in the buffer pool can be served efficiently from memory.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
