Optimizing Large-Scale Pagination Queries in MySQL
When paginating millions of rows in MySQL, avoid large OFFSET scans by using delayed joins or bookmark techniques that first fetch primary‑key values via covering indexes, then retrieve the needed rows, dramatically reducing I/O and query latency.
Most developers and DBAs are familiar with pagination queries for page navigation or product search. When the data volume reaches tens of millions or billions, fetching the whole set efficiently becomes a challenge. This article shares practical experience on optimizing large‑scale pagination in MySQL.
1. Background
Typical scenarios include merchants pulling millions of orders each month for financial statistics or public accounts pushing messages to millions of followers. The goal is to retrieve the required rows quickly without scanning the entire table.
2. Analysis
A common mistake is using a plain
SELECT * FROM table WHERE kid=1342 AND type=1 ORDER BY id ASC LIMIT 149420,20;pattern. MySQL must scan the first 149,420 rows before it can return the next 20, which leads to heavy I/O and long latency when the offset (N) is large.
order by col limit N, M
For InnoDB, the secondary index (e.g., idx_kid_type(kid,type)) contains the primary key values. When N is large, MySQL traverses many index pages and then fetches the corresponding data rows, causing random I/O.
To avoid scanning “useless” rows, we can directly locate the desired rows via the index and then fetch the data – the core idea of delayed join (covering index + primary‑key lookup) .
3. Practical Solutions
Front‑end caching or search services to reduce database hits (e.g., using a search engine and infinite scroll).
SQL‑level optimizations that locate the target rows quickly.
Bookmark method: store the last maximum (or minimum) primary‑key value and query the next batch based on that value. This method often uses delayed join.
3.1 Delayed Join Example (Before Optimization)
root@xxx:12:33:48> explain SELECT id, cu_id, name, info, biz_type, gmt_create, gmt_modified, start_time, end_time, market_type, back_leaf_category, item_status, picuture_url FROM relation WHERE biz_type='0' AND end_time>='2014-05-29' ORDER BY id ASC LIMIT 149420,20;The execution plan shows a full range scan with Using where; Using filesort and processes 349,622 rows, taking several seconds.
3.2 Delayed Join Example (After Optimization)
root@xxx:12:33:43> explain SELECT a.* FROM relation a, (SELECT id FROM relation WHERE biz_type='0' AND end_time>='2014-05-29' ORDER BY id ASC LIMIT 149420,20) b WHERE a.id=b.id;The optimized plan uses a derived table to fetch only the needed primary keys, then performs an eq_ref lookup, reducing the rows examined to a few and cutting execution time to about one‑third of the original.
3.3 Bookmark Method
First obtain the maximum and minimum id for the filtered set:
SELECT MAX(id) AS maxid, MIN(id) AS minid FROM t WHERE kid=2333 AND type=1;
Then paginate by comparing id against the stored bookmark:
SELECT xx, xx FROM t WHERE kid=2333 AND type=1 AND id>=min_id ORDER BY id ASC LIMIT 100; SELECT xx, xx FROM t WHERE kid=2333 AND type=1 AND id<=max_id ORDER BY id DESC LIMIT 100;
When delayed join still does not meet performance requirements, the bookmark approach can reduce query time from ~510 ms to under 10 ms.
4. Summary
Locating rows directly via the primary key (or a covering index) is generally faster than scanning large offsets. However, there is no universal “silver bullet”; the best method depends on the specific schema and workload. Experiments show that ORDER BY id DESC can be ~70 ms faster than ORDER BY id ASC, and in production a LIMIT 100 query may differ by 1.3 s.
Other techniques, such as using ICP (Index Condition Pushdown) with full covering indexes, can also accelerate large pagination queries.
The author encourages readers to share their own optimization experiences.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Youzan Coder
Official Youzan tech channel, delivering technical insights and occasional daily updates from the Youzan tech team.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
