How to Speed Up Large‑Scale MySQL Pagination: Practical Tips & Benchmarks
This article explains why traditional MySQL pagination (ORDER BY … LIMIT N,M) becomes painfully slow on tens of millions of rows, analyzes the root causes, and presents concrete optimizations such as covering‑index delayed joins, bookmark‑based scans, and query rewrites that can cut execution time to a third of the original.
Background
Developers and DBAs are familiar with pagination for UI and search, but when faced with tens of millions or billions of rows, fetching the whole dataset efficiently becomes a challenge. Typical scenarios include large merchants extracting millions of orders for financial reporting or mass messaging to millions of followers.
Analysis
A common mistake is using a query like:
SELECT * FROM table WHERE kid=1342 AND type=1 ORDER BY id ASC LIMIT 149420,20;This pattern forces MySQL to scan the first 149,420 rows before returning the next 20, which incurs heavy I/O as the offset grows. The underlying InnoDB secondary index stores primary keys separately from data pages, so large offsets cause many random reads.
The core idea of a faster approach is to avoid scanning irrelevant rows by using a covering index to fetch only the needed primary keys and then join back to the table – a technique often called a "delayed join".
Practical Optimizations
Three main strategies are discussed:
Front‑end caching or search services to reduce database hits.
SQL rewrite to locate rows directly via indexes.
Bookmark method: record the maximum/minimum id of the filtered set and paginate by id ranges.
3.1 Delayed Join
Before optimization, an EXPLAIN shows a costly filesort and large row count. After applying a covering‑index delayed join, the plan uses a simple range scan on the primary key, dramatically reducing rows examined.
EXPLAIN SELECT a.* FROM relation a
JOIN (SELECT id FROM relation WHERE biz_type='0' AND end_time>='2014-05-29' ORDER BY id ASC LIMIT 149420,20) b
ON a.id=b.id;Execution time dropped to about one‑third of the original.
3.2 Bookmark Method
First obtain the min and max id for the filtered condition:
SELECT MAX(id) AS maxid, MIN(id) AS minid FROM t WHERE kid=2333 AND type=1;
Then paginate by id without offset:
SELECT * FROM t WHERE kid=2333 AND type=1 AND id>=minid ORDER BY id ASC LIMIT 100; SELECT * FROM t WHERE kid=2333 AND type=1 AND id<=maxid ORDER BY id DESC LIMIT 100;
This approach can reduce query latency from hundreds of milliseconds to under 10 ms in production.
Conclusion
Using primary‑key‑based scans (bookmark) or covering‑index delayed joins is generally faster than traditional offset pagination because they avoid scanning large index ranges. However, there is no universal silver bullet; the best method depends on data distribution and query patterns. Other techniques such as index condition pushdown (ICP) may also help large‑scale pagination.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
