Can MySQL Handle a 100 GB Full Table Scan Without Crashing?

This article explains why a MySQL query that scans a 100‑gigabyte table and returns millions of rows does not exhaust server memory, describing the net_buffer mechanism, socket send buffer behavior, InnoDB buffer‑pool management, and the improved LRU algorithm used to keep large scans from degrading overall performance.

JavaEdge

Can MySQL select tens of millions of rows from a 100 GB table without crashing?

Assume we run a full‑table scan on a 100 GB table t and redirect the result to a client file:

mysql -h $host -P $port -u $user -p$pwd -e "select * from t" > $target_file

Where does the result set live?

MySQL reads rows and writes them into an internal net_buffer, whose size is controlled by the net_buffer_length variable (default 16 KB). The process repeats:

Read a row, write it to net_buffer.

When the buffer is full, send it to the client.

After a successful send, clear the buffer and continue with the next row.

If the socket send buffer is full (EAGAIN/WSAEWOULDBLOCK), MySQL pauses until the network stack can accept more data.

The server never holds the entire result set in memory. The memory a query needs for its outgoing rows is bounded by the net_buffer, whose size is set by net_buffer_length. Likewise, the socket send buffer cannot hold 100 GB, so a slow client merely slows down query execution.
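The loop described above can be sketched in a few lines of Python. This is only an illustrative model, not MySQL's actual implementation: stream_rows, fake_rows, and the send callback are hypothetical names, and the rows are synthetic 100-byte strings.

```python
NET_BUFFER_LENGTH = 16 * 1024  # default net_buffer_length, per the text


def stream_rows(rows, send, buffer_size=NET_BUFFER_LENGTH):
    """Copy rows into a bounded buffer and flush it whenever it fills up.

    Returns the largest number of bytes ever held in the buffer.
    """
    buffer = bytearray()
    peak = 0
    for row in rows:
        # Flush first if this row would not fit in the remaining space.
        if buffer and len(buffer) + len(row) > buffer_size:
            send(bytes(buffer))  # a real server may block here on a full socket
            buffer.clear()
        buffer += row
        peak = max(peak, len(buffer))
    if buffer:
        send(bytes(buffer))  # flush the final, partially filled buffer
    return peak


def fake_rows(count, width=100):
    """Hypothetical row source: yields `count` rows of `width` bytes each."""
    for i in range(count):
        yield str(i).encode().ljust(width, b".")


sent = []
peak = stream_rows(fake_rows(100_000), sent.append)
total = sum(len(chunk) for chunk in sent)
print(f"streamed {total} bytes in {len(sent)} sends, peak buffer {peak} bytes")
```

However many rows the generator produces, the peak stays below the buffer size, because each chunk is handed to send before the next rows are read. If send is slow, the loop simply waits, which is the "slow client slows the query" effect.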

Consequently, a full‑table scan does not fill MySQL’s memory, but a slow client can make the transaction run for a long time.

How InnoDB Handles Full Table Scans

Data pages are cached in the InnoDB Buffer Pool (BP), which speeds up queries. Right after a transaction commits, the page on disk may still be the old version while the page in memory is already up to date, so a query can read the in-memory page directly instead of reading the stale page from disk and applying the redo log to it.

The effectiveness of the Buffer Pool depends on its hit rate. You can check the current hit rate in the "Buffer pool hit rate" line of the output of show engine innodb status. For production systems the BP hit rate should stay above 99 %.
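As a rough illustration, the hit rate can be pulled out of the status text with a small Python helper. The function name parse_hit_rate is hypothetical, and the sample text is made up to resemble the usual "Buffer pool hit rate N / 1000" line; real output contains many more lines.

```python
import re


def parse_hit_rate(status_text):
    """Return the buffer pool hit rate as a fraction in [0, 1], or None."""
    match = re.search(r"Buffer pool hit rate\s+(\d+)\s*/\s*(\d+)", status_text)
    if match is None:
        return None  # line not present in this output
    hits, total = int(match.group(1)), int(match.group(2))
    return hits / total if total else None


# Made-up excerpt in the style of `show engine innodb status` output.
sample = """
Pages read 1203, created 87, written 412
Buffer pool hit rate 996 / 1000, young-making rate 2 / 1000 not 15 / 1000
"""

rate = parse_hit_rate(sample)
if rate is not None:
    print(f"buffer pool hit rate: {rate:.1%}")
    if rate < 0.99:
        print("below the 99 % target mentioned above")
```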

The Buffer Pool size is set by innodb_buffer_pool_size and is typically configured to 60‑80 % of the available physical memory.
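As a quick worked example of that rule of thumb, the following Python snippet computes the 60‑80 % range for a given amount of RAM. The helper name and the 64 GiB machine are assumptions chosen only for illustration.

```python
GIB = 1024 ** 3


def buffer_pool_range(physical_memory_bytes, low=0.60, high=0.80):
    """Return the (low, high) byte range suggested by the 60-80 % guideline."""
    return int(physical_memory_bytes * low), int(physical_memory_bytes * high)


# Hypothetical server with 64 GiB of physical memory.
low, high = buffer_pool_range(64 * GIB)
print(f"innodb_buffer_pool_size: {low / GIB:.1f} GiB to {high / GIB:.1f} GiB")
```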

InnoDB Memory Management and LRU

InnoDB uses a Least Recently Used (LRU) algorithm to evict pages that have not been accessed recently. A naïve LRU would evict pages from the Buffer Pool during a large historical‑data scan, harming the hit rate for active workloads.
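The problem with a naïve LRU is easy to reproduce in a toy model. The sketch below is not InnoDB code: NaiveLRU, the capacity of 100 pages, and the page names are all invented for the demonstration.

```python
from collections import OrderedDict


class NaiveLRU:
    """Textbook LRU cache: every access moves the page to the head."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # most recently used page is last

    def access(self, page_id):
        """Touch a page; return True on a cache hit."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        self.pages[page_id] = True
        return False


cache = NaiveLRU(capacity=100)
hot_pages = [f"hot-{i}" for i in range(50)]
for page in hot_pages:
    cache.access(page)  # warm the cache with the active working set

for i in range(1000):
    cache.access(f"scan-{i}")  # one-off pages read by a large scan

surviving = sum(page in cache.pages for page in hot_pages)
print(f"hot pages still cached after the scan: {surviving} of {len(hot_pages)}")
```

In this run none of the hot pages survive: the scan reads ten times more pages than the cache can hold, so every hot page is pushed out and the next business query has to go back to disk.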

To avoid this, InnoDB employs a modified LRU that splits the list into a "New" region and an "Old" region with a 5:3 ratio. The algorithm works as follows:

When a page in the New region is accessed, it is moved to the head of the list (standard LRU behavior).

When a page that is not in the list is accessed, the page at the tail of the list is evicted to make room, and the new page is inserted at the head of the Old region rather than at the head of the whole list.

Pages in the Old region are examined on each access:

If the page has been in the LRU list for more than 1 second (the default of innodb_old_blocks_time, 1000 ms), it is moved to the head of the list.

If it has been in the list for less than 1 second, its position remains unchanged.

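This strategy ensures that during a massive scan, newly read pages are placed in the Old region. A scan reads the rows of a page one after another, so all accesses to that page fall well within the 1‑second window and the page is never promoted. It is soon evicted from the Old region, leaving the New region (which serves hot data) untouched. As a result, the Buffer Pool continues to serve regular business queries with a high hit rate while the scan proceeds.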
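The rules above can be modeled in Python as two ordered maps, one per region. This is a simplified sketch under stated assumptions, not InnoDB's implementation: MidpointLRU and its methods are hypothetical names, time is passed in explicitly so the example needs no real waiting, and the 5:3 split is taken from the text.

```python
from collections import OrderedDict


class MidpointLRU:
    """Toy model of an LRU list split into New and Old regions (5:3)."""

    def __init__(self, capacity, old_blocks_time=1.0):
        self.old_capacity = capacity * 3 // 8
        self.new_capacity = capacity - self.old_capacity
        self.old_blocks_time = old_blocks_time  # seconds
        self.new = OrderedDict()  # hot region, most recently used last
        self.old = OrderedDict()  # cold region, most recently inserted last
        self.loaded_at = {}       # page -> time it first entered the list

    def _insert_old(self, page_id):
        """Put a page at the head of Old, evicting the list tail if full."""
        if len(self.old) >= self.old_capacity:
            victim, _ = self.old.popitem(last=False)
            del self.loaded_at[victim]
        self.old[page_id] = True

    def access(self, page_id, now):
        """Touch a page at time `now`; return True on a cache hit."""
        if page_id in self.new:
            self.new.move_to_end(page_id)  # standard LRU inside New
            return True
        if page_id in self.old:
            if now - self.loaded_at[page_id] > self.old_blocks_time:
                # Old enough: promote to the head of the whole list.
                del self.old[page_id]
                self.new[page_id] = True
                if len(self.new) > self.new_capacity:
                    demoted, _ = self.new.popitem(last=False)
                    self._insert_old(demoted)
            return True  # too young: position unchanged
        self.loaded_at[page_id] = now  # miss: load into Old, not New
        self._insert_old(page_id)
        return False


cache = MidpointLRU(capacity=100)
hot_pages = [f"hot-{i}" for i in range(30)]
for page in hot_pages:
    cache.access(page, now=0.0)  # first touch loads into Old
for page in hot_pages:
    cache.access(page, now=2.0)  # touched again after >1 s: promoted to New

now = 3.0
for i in range(1000):
    cache.access(f"scan-{i}", now)  # scan loads the page into Old
    cache.access(f"scan-{i}", now)  # next rows on the same page, within 1 s
    now += 0.001

surviving = sum(page in cache.new for page in hot_pages)
print(f"hot pages still in New after the scan: {surviving} of {len(hot_pages)}")
```

Here all 30 hot pages remain in the New region after the scan, because the scanned pages only ever cycle through the Old region. Compare this with a plain LRU, where the same scan would flush the whole working set.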

Figure: MySQL net buffer flow

Figure: Improved LRU diagram
