Optimizing MySQL Pagination Queries with Subqueries and Buffer‑Pool Analysis
The article explains why large‑offset LIMIT queries on a MySQL table with millions of rows are slow, demonstrates how rewriting the query to use a subquery that first selects primary‑key IDs dramatically reduces execution time, and validates the improvement by measuring buffer‑pool page usage.
This article starts with a real‑world example where a financial ledger table containing 9.5 million rows is queried with SELECT * FROM test WHERE val=4 LIMIT 300000,5 , taking over 16 seconds because MySQL must scan 300,005 index entries and corresponding clustered rows before discarding the first 300,000 rows.
The author proposes an optimization: move the LIMIT clause into a subquery that only selects the primary‑key IDs, then join the original table on those IDs. The rewritten query SELECT a.* FROM test a INNER JOIN (SELECT id FROM test WHERE val=4 LIMIT 300000,5) b ON a.id=b.id reduces execution time to under 0.5 seconds.
The underlying principle is to avoid the costly "row‑by‑row" random I/O caused by large offsets. By first locating the needed IDs, MySQL performs far fewer index lookups and clustered‑index accesses.
To prove the hypothesis, the author runs experiments measuring InnoDB buffer‑pool page usage. After executing the original slow query, the buffer pool contains thousands of data pages (e.g., 4,098) and hundreds of index pages. After executing the optimized subquery‑join, only five data pages are loaded, confirming the reduction in unnecessary I/O.
The article also discusses how large‑offset queries can pollute the buffer pool, reducing cache efficiency for hot data. It recommends disabling innodb_buffer_pool_dump_at_shutdown and innodb_buffer_pool_load_at_startup to ensure a clean buffer pool on each MySQL restart.
Finally, the author provides a summary of the experiment steps, including restarting MySQL, clearing the buffer pool, and comparing the buffer‑pool statistics before and after each query.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.