How to Solve MySQL Deep Pagination Performance Issues
This article analyzes why large OFFSET values cause severe MySQL performance degradation and presents multiple optimization techniques—including range queries, subqueries, delayed joins, covering indexes, sharding, caching, and search engine integration—along with their advantages, limitations, and practical recommendations.
Understanding Deep Pagination in MySQL
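Deep pagination occurs when the OFFSET value in a LIMIT clause becomes very large. MySQL cannot jump straight to the requested page: it reads and discards every skipped row first, so a query that returns 10 rows may scan millions, which can lead to full‑table scans and high I/O and CPU consumption.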
Root Causes
Traditional pagination bottlenecks
Typical query:
SELECT * FROM t_order ORDER BY id LIMIT 1000000, 10;
Full table scan : optimizer may abandon the index.
High I/O : skipping many rows consumes disk I/O.
High CPU : sorting and filtering many rows.
Why large offsets are costly
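The optimizer chooses the plan it estimates to be cheapest. Walking a secondary index and discarding millions of entries, each of which may still need a back‑table lookup into the clustered index, can be estimated as more expensive than a full table scan, so the index may be abandoned altogether.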
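To put numbers on this, here is a rough cost model in Python. It assumes the server walks offset + page_size rows in order and keeps only the last page_size of them; the function name and the returned fields are illustrative, not anything MySQL exposes.

```python
def offset_cost(page_number: int, page_size: int) -> dict:
    """Estimate the work behind LIMIT offset, page_size for a 1-based page number."""
    offset = (page_number - 1) * page_size
    rows_read = offset + page_size  # rows walked before the page can be returned
    return {
        "offset": offset,
        "rows_read": rows_read,
        "rows_returned": page_size,
        "discarded_fraction": offset / rows_read,
    }


# Page 100001 with 10 rows per page corresponds to LIMIT 1000000, 10.
cost = offset_cost(page_number=100_001, page_size=10)
print(cost)
```

Under this model almost all of the rows read for a deep page are thrown away, which is the waste the strategies below try to avoid.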
Optimization Strategies
1. Range query (ID continuity)
Principle : Record the last id of the previous page and query the next page by a range condition.
SELECT * FROM t_order WHERE id > 100000 AND id <= 100010 ORDER BY id;
Advantages
Direct index lookup, minimal rows scanned.
No OFFSET calculation.
Limitations
The fixed‑width range assumes gap‑free, monotonically increasing IDs; deletions or rolled‑back inserts leave gaps, so a page can return fewer rows than expected.
Only works when ordering by the primary key.
Concurrent inserts can cause duplicates or gaps.
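A common variant of the range query replaces the upper bound with LIMIT, which tolerates gaps in the ID sequence. The sketch below is a minimal illustration of that variant, not the exact query above. It uses Python's built-in sqlite3 only so that it runs anywhere; the SQL shape is what would carry over to MySQL. The helper name fetch_page_after and the code column are assumptions.

```python
import sqlite3


def fetch_page_after(conn, last_id, page_size):
    """Return up to page_size rows whose id is greater than last_id."""
    cur = conn.execute(
        "SELECT id, code FROM t_order WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_order (id INTEGER PRIMARY KEY, code TEXT)")
# IDs 1..30 with every third one missing, to simulate deleted rows.
conn.executemany(
    "INSERT INTO t_order (id, code) VALUES (?, ?)",
    [(i, f"C{i:03d}") for i in range(1, 31) if i % 3 != 0],
)

pages, last_id = [], 0
while True:
    page = fetch_page_after(conn, last_id, page_size=5)
    if not page:
        break
    pages.append(page)
    last_id = page[-1][0]  # the last id seen becomes the cursor for the next page

print(len(pages), "pages")
```

The client has to carry last_id between requests, so this style suits "next page" navigation rather than jumping to an arbitrary page number.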
2. Subquery optimization
Principle : First fetch the starting primary‑key value with a subquery, then filter the main query.
SELECT * FROM t_order
WHERE id >= (SELECT id FROM t_order WHERE id > 1000000 ORDER BY id LIMIT 1)
ORDER BY id
LIMIT 10;
Workflow
Subquery quickly finds the first ID (e.g., 1000001).
Main query uses that ID to limit the scan range.
Advantages
Uses primary‑key index for fast location.
Avoids full table scan.
Limitations
Subquery may create a temporary table.
Only suitable for ascending ID order.
Complex multi‑condition pagination may not apply.
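When the page is addressed by an offset rather than a known ID, a widely used variant moves the OFFSET into the subquery so that only the narrow primary-key index is walked. The sketch below checks that this variant returns the same page as a plain OFFSET query. It runs on Python's built-in sqlite3 as a stand-in for MySQL, and the helper names and sample data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_order (id INTEGER PRIMARY KEY, code TEXT)")
# Even IDs 2..400, so row position and id value differ.
conn.executemany(
    "INSERT INTO t_order (id, code) VALUES (?, ?)",
    [(i * 2, f"C{i}") for i in range(1, 201)],
)


def page_by_offset(conn, offset, size):
    """Plain OFFSET pagination, used here as the reference result."""
    return conn.execute(
        "SELECT id, code FROM t_order ORDER BY id LIMIT ? OFFSET ?",
        (size, offset),
    ).fetchall()


def page_by_subquery(conn, offset, size):
    """Locate the first id of the page in a subquery, then read forward from it."""
    return conn.execute(
        "SELECT id, code FROM t_order "
        "WHERE id >= (SELECT id FROM t_order ORDER BY id LIMIT 1 OFFSET ?) "
        "ORDER BY id LIMIT ?",
        (offset, size),
    ).fetchall()


reference = page_by_offset(conn, 150, 10)
optimized = page_by_subquery(conn, 150, 10)
print(optimized[0])
```

If the offset lies beyond the end of the table, the subquery yields NULL and the outer query returns no rows, which matches the plain OFFSET behavior.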
3. Delayed join (INNER JOIN)
Principle : Perform pagination in a subquery that returns only IDs, then join back to the main table.
SELECT t1.* FROM t_order t1
INNER JOIN (SELECT id FROM t_order WHERE id > 1000000 ORDER BY id LIMIT 10) t2
ON t1.id = t2.id
ORDER BY t1.id;
Workflow
Subquery fetches the 10 target IDs using the primary‑key index.
INNER JOIN retrieves the full rows.
Advantages
The derived table holds only the 10 target IDs, so materializing it is cheap.
Efficient index join reduces back‑table lookups.
Usually faster than a plain subquery.
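The delayed join is often applied when sorting by a secondary column, so that the skipped rows are read from a narrow index and full rows are fetched only for the final page. The following sketch compares it against a plain OFFSET query. It uses Python's built-in sqlite3 as a stand-in for MySQL; the created_at and payload columns, the index name, and the helper names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_order (id INTEGER PRIMARY KEY, created_at INTEGER, payload TEXT)"
)
conn.execute("CREATE INDEX idx_order_created ON t_order (created_at)")
conn.executemany(
    "INSERT INTO t_order (id, created_at, payload) VALUES (?, ?, ?)",
    [(i, 1000 + i // 3, "x" * 50) for i in range(1, 101)],
)


def page_plain(conn, offset, size):
    """Reference: OFFSET applied directly to the wide rows."""
    return conn.execute(
        "SELECT id, created_at, payload FROM t_order "
        "ORDER BY created_at, id LIMIT ? OFFSET ?",
        (size, offset),
    ).fetchall()


def page_deferred_join(conn, offset, size):
    """Paginate over ids only, then join back to fetch the full rows."""
    return conn.execute(
        "SELECT t1.id, t1.created_at, t1.payload FROM t_order t1 "
        "INNER JOIN (SELECT id FROM t_order "
        "            ORDER BY created_at, id LIMIT ? OFFSET ?) t2 "
        "ON t1.id = t2.id "
        "ORDER BY t1.created_at, t1.id",
        (size, offset),
    ).fetchall()


plain_page = page_plain(conn, 40, 10)
joined_page = page_deferred_join(conn, 40, 10)
print(joined_page[0][:2])
```

The outer ORDER BY is repeated on purpose, since a join does not guarantee that rows come back in the inner query's order.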
4. Covering index
Principle : Create an index that contains all columns needed by the query, eliminating the need for a back‑table lookup.
SELECT id, code, type FROM t_order
ORDER BY code
LIMIT 1000000, 10;
Benefits
All required fields are in the index, so no extra row fetch.
Sequential I/O on the index reduces random I/O.
Applicable scenarios
Result set contains a small, fixed set of columns.
Sorting column matches the index.
Limitations
Large result sets may cause MySQL to ignore the index.
Additional storage is required for the covering index.
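One way to check whether a query is actually served from the index is to inspect its plan. In MySQL that means running EXPLAIN and looking for "Using index" in the Extra column. The sketch below does the analogous check with SQLite's EXPLAIN QUERY PLAN, using Python's built-in sqlite3 as a stand-in. The index name idx_order_code_type and the note column are assumptions, and the plan text depends on the SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_order (id INTEGER PRIMARY KEY, code TEXT, type TEXT, note TEXT)"
)
# The index holds code and type; the primary key travels with every index entry,
# so id, code and type can all be read without touching the table rows.
conn.execute("CREATE INDEX idx_order_code_type ON t_order (code, type)")
conn.executemany(
    "INSERT INTO t_order (id, code, type, note) VALUES (?, ?, ?, ?)",
    [(i, f"C{499 - i:05d}", "normal", "long text " * 5) for i in range(500)],
)

query = "SELECT id, code, type FROM t_order ORDER BY code LIMIT 10 OFFSET 100"

# Each plan row ends with a human-readable detail string.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query))
rows = conn.execute(query).fetchall()

print(plan)
print(rows[0])
```

Adding a column such as note to the select list would force a row lookup again, which is why this technique fits queries with a small, fixed column set.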
Comparison and Recommendation
Range queries are simplest and fastest but need continuous IDs. Subqueries reduce the scan range but may generate temporary tables. Delayed joins materialize only a page‑sized set of IDs and generally outperform plain subqueries, at the cost of added query complexity. Covering indexes give the best performance when applicable but require pre‑built indexes and may be ignored by the optimizer on large result sets. Choose the method that matches data distribution and workload.
Practical Considerations
In high‑concurrency environments, handle possible duplicate or missing rows when using ID‑based pagination (e.g., also record a timestamp).
Design indexes carefully, balancing storage cost against query speed.
Use FORCE INDEX only when you fully understand its impact.
For complex pagination, consider redesigning the logic (e.g., time‑range paging).
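Time-based paging needs a tie-breaker, because several rows can share one timestamp. A minimal sketch, assuming a created_at column and using the primary key as the tie-breaker, is shown below. It runs on Python's built-in sqlite3 as a stand-in for MySQL; the column, index, and helper names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_order (id INTEGER PRIMARY KEY, created_at INTEGER)")
conn.execute("CREATE INDEX idx_order_created_id ON t_order (created_at, id)")
# Only five distinct timestamps across 25 rows, so ties are common.
conn.executemany(
    "INSERT INTO t_order (id, created_at) VALUES (?, ?)",
    [(i, 100 + i % 5) for i in range(1, 26)],
)


def fetch_after(conn, last_created_at, last_id, size):
    """Return the next rows strictly after the (created_at, id) cursor."""
    return conn.execute(
        "SELECT id, created_at FROM t_order "
        "WHERE created_at > ? OR (created_at = ? AND id > ?) "
        "ORDER BY created_at, id LIMIT ?",
        (last_created_at, last_created_at, last_id, size),
    ).fetchall()


seen = []
cursor = (-1, 0)  # (created_at, id) placed before the first row
while True:
    page = fetch_after(conn, cursor[0], cursor[1], 4)
    if not page:
        break
    seen.extend(page)
    cursor = (page[-1][1], page[-1][0])

print(len(seen), "rows visited")
```

Because the cursor combines both columns, rows with equal timestamps are neither skipped nor repeated across page boundaries.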
Advanced Strategies
5.1 Sharding / Partitioning
Horizontal sharding splits a large table into multiple physical tables or databases, reducing the per‑shard row count.
Implementation options
ID hash sharding.
Time‑range sharding (e.g., order creation date).
Consistent hashing to avoid hot spots during data migration.
Pros
Each shard holds fewer rows, so queries confined to a single shard scan less data.
Supports horizontal scaling.
Cons
Increases system complexity; requires cross‑shard query handling.
Needs routing logic and data‑sync mechanisms.
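The routing and cross-shard costs can be made concrete with a small in-memory sketch. It assumes modulo-based ID sharding over four shards, each represented by a sorted Python list; SHARD_COUNT, shard_for, and cross_shard_page are hypothetical names, and a real deployment would query separate databases instead.

```python
import heapq
from itertools import islice

SHARD_COUNT = 4  # hypothetical number of shards


def shard_for(order_id: int) -> int:
    """Route an order to a shard by simple modulo hashing."""
    return order_id % SHARD_COUNT


def cross_shard_page(shards, offset, size):
    """Build a globally ordered page from per-shard sorted id lists.

    Every shard has to contribute up to offset + size rows, because any of
    them could own the rows that fall inside the requested page.
    """
    per_shard = [rows[: offset + size] for rows in shards]
    merged = heapq.merge(*per_shard)
    return list(islice(merged, offset, offset + size))


shards = [[] for _ in range(SHARD_COUNT)]
for order_id in range(1, 101):
    shards[shard_for(order_id)].append(order_id)

page = cross_shard_page(shards, offset=50, size=10)
print(page)
```

The per-shard fetch still grows with the offset, so sharding alone does not remove the deep pagination cost for globally sorted pages.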
5.2 Caching intermediate results
Cache frequently accessed page data or offset IDs in Redis or Memcached to reduce database load.
Pros
Significantly lowers query frequency.
Improves response time.
Cons
Not suitable for real‑time data requirements.
Cache expiration must be designed to avoid stampedes.
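As an illustration of the expiration concern, the sketch below caches page results with a time-to-live plus random jitter, so that many keys written together do not all expire at the same moment. It uses an in-process dictionary as a stand-in for Redis or Memcached; the PageCache class, the key format, and the loader function are assumptions.

```python
import random
import time


class PageCache:
    """In-process stand-in for an external cache: key -> (expires_at, value)."""

    def __init__(self, ttl_seconds=60.0, jitter_seconds=10.0, clock=time.monotonic):
        self._store = {}
        self._ttl = ttl_seconds
        self._jitter = jitter_seconds
        self._clock = clock

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh, skip the database
        value = loader()
        # Random jitter spreads expirations of keys that were cached together.
        expires_at = now + self._ttl + random.uniform(0, self._jitter)
        self._store[key] = (expires_at, value)
        return value


calls = []


def load_page_1000():
    """Hypothetical loader that would normally run the paginated query."""
    calls.append(1)
    return [10001, 10002, 10003]


fake_now = [0.0]  # controllable clock so the example does not need to sleep
cache = PageCache(ttl_seconds=60, jitter_seconds=10, clock=lambda: fake_now[0])

first = cache.get_or_load("orders:page:1000", load_page_1000)
second = cache.get_or_load("orders:page:1000", load_page_1000)  # served from cache
fake_now[0] = 100.0  # later than any possible expiry (at most 70 seconds)
third = cache.get_or_load("orders:page:1000", load_page_1000)  # reloaded

print(len(calls), "loader calls")
```

Jitter only spreads out expirations; protecting a single hot key from concurrent reloads would need an additional lock or single-flight mechanism.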
5.3 Using a search engine
Offload pagination to a search engine such as Elasticsearch, which handles complex filters well and supports deep pagination through cursor‑style mechanisms such as search_after rather than large from/size offsets.
Implementation steps
Synchronize MySQL data to the search engine via scheduled jobs or message queues.
Forward pagination requests to the search engine.
Pros
Handles complex queries and sorting effortlessly.
Cursor‑based paging scales well to very large data sets.
Cons
Adds system complexity and maintenance cost.
Data synchronization latency must be considered for real‑time needs.
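In Elasticsearch, deep paging is usually done with the search_after parameter, which takes the sort values of the last hit from the previous page. The sketch below only builds the request body as a Python dictionary and sends nothing; the field names created_at and order_id and the helper name are assumptions about how the orders might be indexed.

```python
def build_search_after_request(page_size, last_sort_values=None):
    """Build a search body that continues after the previous page's last hit."""
    body = {
        "size": page_size,
        # A unique tie-breaker field keeps the ordering stable across pages.
        "sort": [{"created_at": "asc"}, {"order_id": "asc"}],
    }
    if last_sort_values is not None:
        # Sort values copied from the "sort" array of the last hit already seen.
        body["search_after"] = list(last_sort_values)
    return body


first_request = build_search_after_request(10)
next_request = build_search_after_request(10, last_sort_values=(1700000000000, 1000010))

print(next_request)
```

As with ID-based paging in MySQL, the client moves forward from a cursor instead of asking for an absolute page number.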
Conclusion
Deep pagination degrades performance because a large OFFSET forces MySQL to scan and discard many rows, incurring high I/O and CPU costs. By redesigning pagination (range query, subquery, delayed join, covering index) and, when necessary, applying architectural optimizations (sharding, caching, search engines), you can achieve substantial performance gains. Select the technique that aligns with your workload and be aware of its trade‑offs.
Cognitive Technology Team
