How to Speed Up Large Pagination Queries in MySQL: Delayed Join and Bookmark Techniques
This article explains why traditional ORDER BY … LIMIT pagination becomes slow on tens of millions of rows, analyzes the underlying index scan cost, and presents two practical optimizations—delayed join using covering indexes and a bookmark‑based approach—showing how they can cut execution time to a third or even a tenth of the original.
Background
Developers and DBAs are familiar with simple pagination, but when tables contain tens of millions or billions of rows, retrieving full result sets quickly becomes a challenge, such as pulling millions of orders for financial reporting or pushing messages to millions of followers.
Analysis of the Common Mistake
A typical query uses ORDER BY col LIMIT N,M. MySQL must scan the first N rows before returning the next M rows, causing the scan cost to grow linearly with N . The example below demonstrates this pattern:
SELECT * FROM table WHERE kid=1342 AND type=1 ORDER BY id ASC LIMIT 149420,20;Because the secondary index (kid, type) stores primary keys that are not physically ordered with the data rows, MySQL performs many random I/O operations when N is large, leading to high latency.
Practical Optimizations
3.1 Delayed Join (Covering Index)
Instead of fetching full rows via the secondary index, first retrieve only the primary keys using a covering index, then join back to the base table for the required columns. This reduces the amount of data scanned.
EXPLAIN SELECT a.* FROM relation a, (SELECT id FROM relation WHERE biz_type='0' AND end_time>='2014-05-29' ORDER BY id ASC LIMIT 149420,20) b WHERE a.id = b.id;The execution plan shows the derived table using the primary key, eliminating the costly filesort.
After applying the delayed join, the query time dropped to roughly one‑third of the original.
3.2 Bookmark Method
First obtain the minimum and maximum id that satisfy the filter conditions, then iterate forward or backward from those bookmarks, avoiding the need to scan irrelevant rows.
SELECT MAX(id) AS maxid, MIN(id) AS minid FROM t WHERE kid=2333 AND type=1; SELECT col1, col2 FROM t WHERE kid=2333 AND type=1 AND id >= minid ORDER BY id ASC LIMIT 100; SELECT col1, col2 FROM t WHERE kid=2333 AND type=1 AND id <= maxid ORDER BY id DESC LIMIT 100;In a real‑world case, the delayed join took about 510 ms, while the bookmark approach reduced the time to under 10 ms, a dramatic improvement.
Conclusion
Fetching data by directly locating primary‑key positions (either via delayed join or bookmark pagination) is generally faster than scanning large secondary‑index ranges. However, no single technique solves every scenario; the choice depends on data distribution, index design, and query patterns. Additional tricks such as using ICP (Index Condition Pushdown) with full‑covering indexes can further accelerate large‑page queries.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
