How to Solve MySQL Deep Pagination Performance Issues

This article analyzes why large OFFSET values cause severe MySQL performance degradation and presents multiple optimization techniques—including range queries, subqueries, delayed joins, covering indexes, sharding, caching, and search engine integration—along with their advantages, limitations, and practical recommendations.

Cognitive Technology Team

Understanding Deep Pagination in MySQL

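Deep pagination occurs when the OFFSET value in a LIMIT clause becomes very large. MySQL cannot jump straight to the offset: it reads offset + N rows and throws away the first offset rows. For a deep page this means scanning and skipping millions of rows, sometimes the whole table, with high I/O and CPU consumption for a handful of results.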

Root Causes

Traditional pagination bottlenecks

Typical query:

SELECT * FROM t_order ORDER BY id LIMIT 1000000, 10;

Full table scan: the optimizer may abandon the index.

High I/O: reading the skipped rows consumes disk I/O.

High CPU: sorting and filtering many rows.

Why large offsets are costly

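The optimizer chooses the plan it estimates to be cheapest. When the sort uses a secondary index and the query selects columns outside that index, every scanned entry needs a back-table lookup on the primary key, including the entries that are later discarded. With millions of entries to skip, the optimizer may judge a full table scan to be cheaper than the index scan.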
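One way to observe this on your own data is to compare the plans MySQL picks for a primary-key sort and for a sort on another column. The sketch below reuses the article's t_order table and assumes that code is a column with a secondary index. Plan output varies by version and data, so none is shown here.

```sql
-- Sort on the primary key: InnoDB walks the clustered index and still reads
-- the first 1,000,010 rows before returning the last 10.
EXPLAIN SELECT * FROM t_order ORDER BY id LIMIT 1000000, 10;

-- Sort on a secondary column (assumes an index on `code` exists).
-- Check the `type`, `key`, `rows`, and `Extra` columns: `type: ALL` together
-- with `Using filesort` means the optimizer preferred a full scan to the index.
EXPLAIN SELECT * FROM t_order ORDER BY code LIMIT 1000000, 10;
```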

Optimization Strategies

1. Range query (ID continuity)

Principle: Record the last id of the previous page and query the next page by a range condition.

SELECT * FROM t_order WHERE id > 100000 AND id <= 100010 ORDER BY id;

Advantages

Direct index lookup, minimal rows scanned.

No OFFSET calculation.

Limitations

The fixed upper bound assumes gap-free, monotonically increasing IDs; deletions or rolled-back inserts leave gaps, so a page may return fewer than 10 rows.

Only works when ordering by the primary key.

Concurrent inserts can cause duplicates or gaps.
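If IDs have gaps, the same idea still works when the upper bound is replaced by LIMIT. A minimal sketch, assuming the application passes in the last id it displayed (the variable @last_id and its value are illustrative):

```sql
-- Keyset ("seek") pagination: no OFFSET, and no assumption that IDs are gap-free.
-- @last_id is the id of the final row on the previous page (illustrative value).
SET @last_id = 100010;

SELECT *
FROM t_order
WHERE id > @last_id   -- seek directly into the primary-key index
ORDER BY id
LIMIT 10;             -- returns up to 10 rows even when IDs are missing
```

The trade-off is that this form supports next/previous navigation only; it cannot jump to an arbitrary page number.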

2. Subquery optimization

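Principle: First fetch the starting primary‑key value with a subquery, then filter the main query. Both levels need ORDER BY id, because LIMIT without an explicit order does not guarantee which rows are returned.

SELECT * FROM t_order
WHERE id >= (SELECT id FROM t_order WHERE id > 1000000 ORDER BY id LIMIT 1)
ORDER BY id
LIMIT 10;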

Workflow

Subquery quickly finds the first ID (e.g., 1000001).

Main query uses that ID to limit the scan range.

Advantages

Uses primary‑key index for fast location.

Avoids full table scan.

Limitations

Subquery may create a temporary table.

Only suitable for ascending ID order.

Complex multi‑condition pagination may not apply.
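The query above assumes that the row at offset 1,000,000 has an id close to 1000000. When that does not hold, a common variant keeps the OFFSET but confines it to an id-only subquery. This is a sketch of that variant, not the article's exact query:

```sql
-- The subquery still walks past 1,000,000 entries, but it selects only `id`
-- and returns a single value: the id of the first row on the requested page.
SELECT *
FROM t_order
WHERE id >= (
    SELECT id FROM t_order ORDER BY id LIMIT 1000000, 1
)
ORDER BY id
LIMIT 10;
```

If the table has fewer than 1,000,001 rows, the subquery yields NULL and the outer query returns no rows, which is the expected result for a page past the end.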

3. Delayed join (INNER JOIN)

Principle: Perform pagination in a subquery that returns only IDs, then join back to the main table.

SELECT t1.* FROM t_order t1
INNER JOIN (SELECT id FROM t_order WHERE id > 1000000 ORDER BY id LIMIT 10) t2
ON t1.id = t2.id
ORDER BY t1.id;

Workflow

Subquery fetches the 10 target IDs using the primary‑key index.

INNER JOIN retrieves the full rows.

Advantages

The derived table holds only the page's IDs, so even when MySQL materializes it as a temporary table, that table is tiny.

Efficient index join reduces back‑table lookups.

Usually faster than a plain subquery.
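Delayed joins tend to pay off most when the sort column is not the primary key. The sketch below assumes a secondary index on (code, id); the index name idx_code_id is hypothetical, and code is the column used in the covering-index example.

```sql
-- Hypothetical index. It contains both `code` and `id`, so the derived table
-- below can be answered from the index without touching full rows.
CREATE INDEX idx_code_id ON t_order (code, id);

SELECT t1.*
FROM t_order t1
INNER JOIN (
    -- Skip 1,000,000 narrow index entries instead of 1,000,000 full rows.
    SELECT id
    FROM t_order
    ORDER BY code, id
    LIMIT 1000000, 10
) t2 ON t1.id = t2.id
ORDER BY t1.code, t1.id;   -- re-apply the order; the join does not guarantee it
```

Only the 10 surviving IDs trigger a lookup of the full row, which is where the saving comes from.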

4. Covering index

Principle: Create an index that contains all columns needed by the query, eliminating the need for a back‑table lookup.

SELECT id, code, type FROM t_order
ORDER BY code
LIMIT 1000000, 10;

Benefits

All required fields are in the index, so no extra row fetch.

Sequential I/O on the index reduces random I/O.

Applicable scenarios

Result set contains a small, fixed set of columns.

Sorting column matches the index.

Limitations

Large result sets may cause MySQL to ignore the index.

Additional storage is required for the covering index.
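The example query is only covered if a suitable index exists. A sketch of one, assuming InnoDB and the column names from the example (the index name idx_code_type is hypothetical):

```sql
-- InnoDB secondary indexes store the primary key with every entry,
-- so an index on (code, type) also covers `id`.
CREATE INDEX idx_code_type ON t_order (code, type);

-- If the index covers the query, the Extra column of the EXPLAIN output
-- reports "Using index".
EXPLAIN
SELECT id, code, type
FROM t_order
ORDER BY code
LIMIT 1000000, 10;
```

MySQL still walks past the first 1,000,000 index entries; the gain is that it never visits the base rows.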

Comparison and Recommendation

Range queries are simplest and fastest but need continuous IDs. Subqueries reduce the scan range but may generate temporary tables. Delayed joins keep the paginating subquery narrow (IDs only) and generally outperform subqueries, at the cost of added query complexity. Covering indexes give the best performance when applicable but require pre‑built indexes and may fail on large result sets. Choose the method that matches data distribution and workload.

Practical Considerations

In high‑concurrency environments, handle possible duplicate or missing rows when using ID‑based pagination (e.g., also record a timestamp).

Design indexes carefully, balancing storage cost against query speed.

Use FORCE INDEX only when you fully understand its impact.

For complex pagination, consider redesigning the logic (e.g., time‑range paging).
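Time-range paging with an id tie-breaker can be sketched as follows. The created_at column, the index on (created_at, id), and the variable values are assumptions for illustration; they do not appear in the original schema.

```sql
-- Assumes t_order has a created_at column and an index on (created_at, id).
-- The pair (created_at, id) is unique, so rows that share a timestamp are
-- neither skipped nor repeated between pages.
SET @last_created_at = '2024-05-01 12:00:00';  -- from the previous page's last row
SET @last_id = 100010;                         -- id of that same row

SELECT id, code, type, created_at
FROM t_order
WHERE created_at > @last_created_at
   OR (created_at = @last_created_at AND id > @last_id)
ORDER BY created_at, id
LIMIT 10;
```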

Advanced Strategies

5.1 Sharding / Partitioning

Horizontal sharding splits a large table into multiple physical tables or databases, reducing the per‑shard row count.

Implementation options

ID hash sharding.

Time‑range sharding (e.g., order creation date).

Consistent hashing, which limits how much data has to move when shards are added or removed.

Pros

Smaller per-shard scans because each shard holds fewer rows; the gain approaches linear only when a query can be routed to a single shard.

Supports horizontal scaling.

Cons

Increases system complexity; requires cross‑shard query handling.

Needs routing logic and data‑sync mechanisms.
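Within a single MySQL instance, time-range splitting can be approximated with native partitioning. The sketch below uses a hypothetical table name and columns; true sharding across servers additionally needs routing logic outside the database.

```sql
-- Hypothetical table. MySQL requires every unique key, including the primary
-- key, to contain the partitioning column, hence PRIMARY KEY (id, created_at).
CREATE TABLE t_order_by_year (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    code       VARCHAR(32)     NOT NULL,
    type       TINYINT         NOT NULL,
    created_at DATETIME        NOT NULL,
    PRIMARY KEY (id, created_at)
)
PARTITION BY RANGE COLUMNS (created_at) (
    PARTITION p2023 VALUES LESS THAN ('2024-01-01'),
    PARTITION p2024 VALUES LESS THAN ('2025-01-01'),
    PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);

-- The created_at range lets MySQL prune to partition p2024,
-- so pagination only scans that year's rows.
SELECT id, code, type, created_at
FROM t_order_by_year
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'
  AND id > 100010
ORDER BY id
LIMIT 10;
```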

5.2 Caching intermediate results

Cache frequently accessed page data or offset IDs in Redis or Memcached to reduce database load.

Pros

Significantly lowers query frequency.

Improves response time.

Cons

Not suitable for real‑time data requirements.

Cache expiration must be designed to avoid stampedes.

5.3 Using a search engine

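Offload pagination to a search engine such as Elasticsearch, which handles complex filters well and supports cursor-style deep pagination through search_after. Plain from/size paging is not a fix by itself: Elasticsearch caps it with index.max_result_window, which defaults to 10,000 hits.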

Implementation steps

Synchronize MySQL data to the search engine via scheduled jobs or message queues.

Forward pagination requests to the search engine.

Pros

Handles complex queries and sorting effortlessly.

Deep pagination scales to massive data sets when cursor-style paging (search_after) is used.

Cons

Adds system complexity and maintenance cost.

Data synchronization latency must be considered for real‑time needs.

Conclusion

Deep pagination degrades performance because a large OFFSET forces MySQL to scan and discard many rows, incurring high I/O and CPU costs. By redesigning pagination (range query, subquery, delayed join, covering index) and, when necessary, applying architectural optimizations (sharding, caching, search engines), you can achieve substantial performance gains. Select the technique that aligns with your workload and be aware of its trade‑offs.
