Optimizing Deep Pagination in MySQL: Data Generation, Index Behavior, and Fast Query Strategies
This article demonstrates how to generate two million MySQL rows, examines the performance impact of large OFFSET values in LIMIT clauses, explains why non‑clustered indexes cause back‑table lookups, and presents two optimization strategies—using sub‑queries to fetch IDs and employing key‑based pagination—to dramatically speed up deep page queries.
Without further ado, let's generate some data on the spot, creating 2 million rows.
Prepare a table:
Create a function to generate data:
<code style="padding: 16px; color: #abb2bf; display: -webkit-box; font-family: Operator Mono, Consolas, Monaco, Menlo, monospace; font-size: 12px"><span style="color: #c678dd; line-height: 26px">CREATE</span> DEFINER=`mytoor`@`%` <span style="color: #c678dd; line-height: 26px">FUNCTION</span> `JcTestData`() <span style="color: #c678dd; line-height: 26px">RETURNS</span> int(11)
<span style="color: #c678dd; line-height: 26px">BEGIN</span>
<span style="color: #c678dd; line-height: 26px">DECLARE</span> num INT <span style="color: #c678dd; line-height: 26px">DEFAULT</span> 2000000;
<span style="color: #c678dd; line-height: 26px">DECLARE</span> i INT <span style="color: #c678dd; line-height: 26px">DEFAULT</span> 0;
WHILE i < num <span style="color: #c678dd; line-height: 26px">DO</span>
INSERT INTO test_order(`platform_sn`,`third_sn`,`type`,`create_time`)
VALUES(CONCAT('SN',i),UUID(),1,now());
SET i = i + 1;
END WHILE;
RETURN i;
<span style="color: #c678dd; line-height: 26px">END</span>
</code>A brief note: those who understand can ignore this (learning is always after the fact).
Click to run:
It completes in 29 seconds, which is acceptable.
Data is ready, let's proceed:
Start the demonstration:
First, add an index on the type column to simulate a real query scenario:
Then modify the type values of a few rows:
Normal pagination query with limit 0, 50:
<code style="padding: 16px; color: #abb2bf; display: -webkit-box; font-family: Operator Mono, Consolas, Monaco, Menlo, monospace; font-size: 12px"><span style="color: #c678dd; line-height: 26px">select</span> * <span style="color: #c678dd; line-height: 26px">FROM</span> test_order <span style="color: #c678dd; line-height: 26px">where</span> type=1 <span style="color: #c678dd; line-height: 26px">limit</span> 0,50;</code>The query with limit 0,50 is fast, taking 0.022 seconds.
Now simulate a deep‑page query with limit 1200000,50:
<code style="padding: 16px; color: #abb2bf; display: -webkit-box; font-family: Operator Mono, Consolas, Monaco, Menlo, monospace; font-size: 12px"><span style="color: #c678dd; line-height: 26px">select</span> * <span style="color: #c678dd; line-height: 26px">FROM</span> test_order <span style="color: #c678dd; line-height: 26px">where</span> type=1 <span style="color: #c678dd; line-height: 26px">limit</span> 1200000,50;</code>This takes 3.765 seconds.
Explain output shows the index index_type is used, yet the query remains slow. Why?
Reason ①
① The index index_type is a non‑clustered index, while the query selects * , pulling additional columns. Using the non‑clustered index only returns type and id . To retrieve other columns, MySQL must first fetch the primary‑key id from the index, then look up the full row—a back‑table operation.
Reason ②
The LIMIT clause works by first calculating the total number of rows to scan (offset + row‑count), then scanning from the start up to the offset, discarding those rows, and finally returning the requested count.
Limit example explanation
For limit 0,50, MySQL scans 0 + 50 = 50 rows, returns the first 50 matching rows.
For limit 1200000,50, MySQL scans 1 200 000 + 50 = 1 200 050 rows, discards the first 1 200 000 rows, then returns the next 50 rows, which explains the higher cost.
Solution 1: Optimize back‑table lookups
Fetch the primary‑key IDs first, then retrieve the full rows using those IDs:
<code style="padding: 16px; color: #abb2bf; display: -webkit-box; font-family: Operator Mono, Consolas, Monaco, Menlo, monospace; font-size: 12px"><span style="color: #c678dd; line-height: 26px">select</span> * <span style="color: #c678dd; line-height: 26px">FROM</span> test_order <span style="color: #c678dd; line-height: 26px">where</span> id <span style="color: #c678dd; line-height: 26px">in</span>(
<span style="color: #c678dd; line-height: 26px">select</span> id <span style="color: #c678dd; line-height: 26px">from</span> (
<span style="color: #c678dd; line-height: 26px">select</span> id <span style="color: #c678dd; line-height: 26px">FROM</span> test_order <span style="color: #c678dd; line-height: 26px">where</span> type=1 <span style="color: #c678dd; line-height: 26px">limit</span> 1200000,50
) child
);</code>Execution time drops from 3.765 seconds to about 1.56 seconds.
Solution 2: Use minimum‑value condition (key‑set pagination)
Avoid large offsets by using the last id from the previous page as the starting point:
<code style="padding: 16px; color: #abb2bf; display: -webkit-box; font-family: Operator Mono, Consolas, Monaco, Menlo, monospace; font-size: 12px"><span style="color: #c678dd; line-height: 26px">select</span> * <span style="color: #c678dd; line-height: 26px">FROM</span> test_order <span style="color: #c678dd; line-height: 26px">where</span> type=1 <span style="color: #c678dd; line-height: 26px">and</span> id >= 1200008 <span style="color: #c678dd; line-height: 26px">limit</span> 50;</code>This runs in 0.022 seconds, the same speed as the first page.
Explain shows the primary‑key index PRIMARY is used.
Server‑side usage of Solution 2
When processing large batches, retrieve the last row’s id from the current batch and use it as the id >= condition for the next batch, effectively halving the query time.
Front‑end usage of Solution 2
In infinite‑scroll or waterfall‑style UI, pass the smallest id of the previous page as a parameter to load the next page, eliminating costly offset scans.
Thanks for reading, hope this helps :)
Source: blog.csdn.net/qq_35387940/article/details/125816667
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Selected Java Interview Questions
A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
