Mastering Pagination Performance: Single‑DB and Cross‑DB Strategies
This guide explores common pagination bottlenecks and presents practical solutions for single‑database and sharded environments, covering keyset pagination, bidirectional paging, join‑based paging, index design, middleware rewriting, approximate paging, and two‑phase query techniques.
During years of work as an architect, the author has repeatedly encountered pagination as the most frequent performance hotspot. The article consolidates practical solutions for both single‑database and cross‑database pagination scenarios.
1. Single‑Database Pagination
Scenario 1 – Next‑Page Only
Implementation point: Use a unique, ordered column (e.g., an auto‑increment primary key, or a timestamp paired with the primary key as a tie‑breaker) as the pagination cursor. The last value on the previous page becomes the starting point for the next query.
SQL example:
SELECT * FROM table WHERE id > ? ORDER BY id ASC LIMIT n
Index strategy: Create a composite index on the WHERE filter fields and the ORDER BY fields (e.g., created_at, id). If all needed columns are covered, a covering index avoids table look‑ups.
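The next‑page query can be sketched with Python's built‑in sqlite3 module. This is a minimal illustration rather than a production implementation: the articles table, its columns, and the fetch_next_page helper are hypothetical names chosen for the demo.

```python
import sqlite3

def fetch_next_page(conn, last_id, page_size):
    """Return up to page_size rows whose id is greater than last_id."""
    cur = conn.execute(
        "SELECT id, title FROM articles WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_id, page_size),
    )
    return cur.fetchall()

# Hypothetical demo data: an in-memory table with ids 1..10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles (id, title) VALUES (?, ?)",
    [(i, f"title-{i}") for i in range(1, 11)],
)

page1 = fetch_next_page(conn, 0, 3)
# The cursor for the next request is the last id of the page just served.
page2 = fetch_next_page(conn, page1[-1][0], 3)
```

The cost of each request depends on the page size rather than on how deep the reader has scrolled, because the index seek starts directly at the cursor value.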
Scenario 2 – Next and Previous Page
Implementation point: Record both the first and last ordered values of the current page. Use the first value for reverse queries (previous page) and the last value for forward queries (next page).
SQL example: SELECT * FROM table WHERE id < ? ORDER BY id DESC LIMIT n for the previous page, and SELECT * FROM table WHERE id > ? ORDER BY id ASC LIMIT n for the next page. The previous‑page query returns rows in descending order, so reverse them before display.
Index strategy: Same composite index as Scenario 1, but ensure it supports both ascending and descending scans.
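A small sketch of bidirectional paging, again using sqlite3. The items table and the next_page and prev_page helpers are assumptions made for illustration; the one detail worth noticing is that the previous‑page query scans backwards, so its rows are reversed before being returned.

```python
import sqlite3

def next_page(conn, last_id, n):
    """Forward paging: rows after the last id of the current page."""
    rows = conn.execute(
        "SELECT id FROM items WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_id, n),
    ).fetchall()
    return [r[0] for r in rows]

def prev_page(conn, first_id, n):
    """Backward paging: rows before the first id of the current page."""
    rows = conn.execute(
        "SELECT id FROM items WHERE id < ? ORDER BY id DESC LIMIT ?",
        (first_id, n),
    ).fetchall()
    # The query returns the nearest rows first; reverse to restore display order.
    return [r[0] for r in reversed(rows)]

# Hypothetical demo data: ids 1..10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 11)])

current = next_page(conn, 3, 3)
forward = next_page(conn, current[-1], 3)
backward = prev_page(conn, current[0], 3)
```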
Scenario 3 – Jump to Arbitrary Page
Implementation point: Avoid large OFFSET by first locating the target page's primary key via a sub‑query with a covering index, then join back to fetch full rows (delayed join).
SQL example:
SELECT id FROM table ORDER BY created_at DESC, id DESC LIMIT size OFFSET (page‑1)*size; followed by SELECT * FROM table WHERE id IN (…) ORDER BY created_at DESC, id DESC.
Index strategy: The sub‑query must be covered by the same composite index; otherwise performance gains are limited.
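The delayed join can also be written as a single statement, with the id sub‑query as a derived table. The sketch below assumes a hypothetical posts table with an index on (created_at, id); the sub‑query walks only the index, and full rows are fetched just for the ids on the requested page.

```python
import sqlite3

def fetch_page(conn, page, size):
    """Jump to an arbitrary page using an id-only sub-query joined back."""
    offset = (page - 1) * size
    sql = """
        SELECT p.id, p.created_at, p.body
        FROM posts AS p
        JOIN (
            SELECT id FROM posts
            ORDER BY created_at DESC, id DESC
            LIMIT ? OFFSET ?
        ) AS ids ON ids.id = p.id
        ORDER BY p.created_at DESC, p.id DESC
    """
    return conn.execute(sql, (size, offset)).fetchall()

# Hypothetical demo data: 20 posts, two posts sharing most timestamps.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, created_at INTEGER, body TEXT)"
)
conn.execute("CREATE INDEX idx_posts_created_id ON posts (created_at, id)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [(i, 1000 + i // 2, f"body-{i}") for i in range(1, 21)],
)

page3 = fetch_page(conn, 3, 5)
```

The outer ORDER BY is repeated on purpose: a join or an IN list does not guarantee that rows come back in the order of the sub‑query.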
Scenario 4 – Join‑Based Pagination
If only fields from the main table are needed, query the main table first and then fetch related data. When join conditions are required, consider the following alternatives:
Option 1 – Data Redundancy: Duplicate frequently filtered fields into the main table to avoid joins.
Option 2 – Primary‑Key Sub‑Query + Join: Retrieve primary keys that satisfy the join condition, then join back.
Option 3 – Materialized View: Create a view that pre‑joins and indexes the necessary columns (adds write‑side complexity).
Option 4 – Search Engine Integration: Store filter and sort fields in Elasticsearch, query IDs, then fetch rows from the DB (adds external storage overhead).
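Option 2 can be sketched as follows. The users and orders tables, the country filter, and the orders_for_country helper are hypothetical; the point is that the inner query selects only primary keys under the join condition, and the wider row data is joined only for the keys on the page.

```python
import sqlite3

def orders_for_country(conn, country, last_order_id, n):
    """Page through orders filtered by a column that lives on the joined table."""
    sql = """
        SELECT o.id, o.amount, u.name
        FROM orders AS o
        JOIN users AS u ON u.id = o.user_id
        JOIN (
            -- Step 1: primary keys only, filtered through the join condition.
            SELECT o2.id
            FROM orders AS o2
            JOIN users AS u2 ON u2.id = o2.user_id
            WHERE u2.country = ? AND o2.id > ?
            ORDER BY o2.id ASC
            LIMIT ?
        ) AS page_ids ON page_ids.id = o.id
        ORDER BY o.id ASC
    """
    return conn.execute(sql, (country, last_order_id, n)).fetchall()

# Hypothetical demo data: odd orders belong to ann (CN), even orders to bob (US).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER)"
)
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [(1, "ann", "CN"), (2, "bob", "US")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, 1 if i % 2 else 2, i * 10) for i in range(1, 11)],
)

rows = orders_for_country(conn, "US", 2, 2)
```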
Scenario 5 – Choosing the Right Approach
Prefer Scenarios 1 and 2 whenever business logic permits, as they avoid costly OFFSET.
For deep jump‑page requirements, delayed join (Scenario 3) is effective but still degrades with very deep pages; consider limiting maximum page depth.
Index design is critical: always build a composite index on filter and sort columns.
Denormalization (redundant fields) can simplify join pagination when updates are infrequent.
Other optimizations: fetch one extra row to detect the existence of a next page, avoid COUNT(*) on large tables, or use estimated counts.
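The "fetch one extra row" trick from the last point can be sketched like this. The events table and the page_with_has_next helper are hypothetical; the query asks for n + 1 rows, returns at most n, and uses the presence of the extra row as the has‑next flag, so no COUNT(*) is needed.

```python
import sqlite3

def page_with_has_next(conn, last_id, n):
    """Return (ids on this page, whether another page exists)."""
    rows = conn.execute(
        "SELECT id FROM events WHERE id > ? ORDER BY id ASC LIMIT ?",
        (last_id, n + 1),  # one row more than the page size
    ).fetchall()
    has_next = len(rows) > n
    return [r[0] for r in rows[:n]], has_next

# Hypothetical demo data: ids 1..7.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO events (id) VALUES (?)", [(i,) for i in range(1, 8)])

first = page_with_has_next(conn, 0, 3)
last = page_with_has_next(conn, 6, 3)
```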
2. Cross‑Database (Sharding) Pagination
1. Global‑View Middleware (e.g., ShardingSphere)
The middleware rewrites a standard SQL statement into multiple shard‑specific queries, merges the results, and then applies the final OFFSET/LIMIT. Because the target rows could sit on any shard, each shard must return its first OFFSET + LIMIT rows, so the data transferred and sorted grows with page depth and deep pages become slow.
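The merge step can be illustrated without a real middleware. In the sketch below, plain Python lists stand in for shards and global_page is a hypothetical helper; it is not ShardingSphere's actual implementation, only the general idea of asking every shard for its first offset + limit rows and applying the real offset after a sorted merge.

```python
import heapq
from itertools import islice

def global_page(shards, offset, limit):
    """Exact global page: over-fetch from every shard, merge, then skip."""
    # Stand-in for the rewritten per-shard query:
    #   SELECT ... ORDER BY key LIMIT offset + limit
    per_shard = [sorted(rows)[: offset + limit] for rows in shards]
    # Merge the already-sorted shard results into one sorted stream.
    merged = heapq.merge(*per_shard)
    # Only now is the real OFFSET/LIMIT applied.
    return list(islice(merged, offset, offset + limit))

# Hypothetical demo data: two shards holding the keys 1..12.
shards = [[1, 4, 5, 9, 12], [2, 3, 6, 7, 8, 10, 11]]
page = global_page(shards, offset=4, limit=3)
```

With N shards, up to N * (offset + limit) rows cross the network for a single page, which is why this approach suits shallow pages best.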
2. Business Trade‑off – Disallow Deep Jump Pages
Replace deep OFFSET with a range condition (e.g., WHERE id > last_seen_id) to keep queries fast. This aligns with the keyset pagination approach discussed earlier.
3. Approximate Paging via Even Distribution
Assuming uniform data distribution across shards, split the global offset evenly. For example, to fetch page 100 (offset 9900, limit 100) across two shards, query each shard with OFFSET 4950 LIMIT 50. The merged result approximates the desired page, acceptable when exact precision is not required.
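The even split can be sketched with in‑memory lists standing in for shards; approximate_page is a hypothetical helper. The demo data is perfectly interleaved, which is the best case: the approximation happens to be exact there, and it drifts as the shards become more skewed.

```python
def approximate_page(shards, offset, limit):
    """Approximate global page by splitting offset and limit evenly."""
    n = len(shards)
    per_offset, per_limit = offset // n, limit // n
    rows = []
    for shard in shards:
        # Stand-in for: SELECT ... ORDER BY key LIMIT per_limit OFFSET per_offset
        rows.extend(sorted(shard)[per_offset : per_offset + per_limit])
    return sorted(rows)

# Hypothetical demo data: even keys on one shard, odd keys on the other.
shard_even = list(range(0, 20000, 2))
shard_odd = list(range(1, 20000, 2))

# Page 100 with 100 rows per page: each shard gets OFFSET 4950 LIMIT 50.
page = approximate_page([shard_even, shard_odd], offset=9900, limit=100)
```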
4. Two‑Phase Query Method
Given a sharded order_tab split into order_tab_0 and order_tab_1:
First query – boundary detection: Divide the global OFFSET evenly among the shards while keeping the full LIMIT on each (e.g., OFFSET 2 LIMIT 4 per shard) to obtain candidate rows and the minimum key value across all shards.
Second query – range expansion: Use the minimum key from step 1 and each shard’s maximum key to construct a BETWEEN query that fetches a slightly larger window, ensuring the target page is fully covered.
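Service‑layer merge: Collect all rows and sort them globally, then compute the exact global offset of the minimum key from the first query. On each shard, its local offset is the first‑query offset minus the number of extra rows the second query returned ahead of that shard's first‑query window; the global offset is the sum of these local offsets.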
Result extraction: Knowing the exact global position, fetch the next n rows starting from that position (e.g., LIMIT 4 OFFSET 5 yields rows 5‑8).
This approach avoids deep OFFSET on any single shard and works without sacrificing accuracy, provided each shard contains enough rows relative to the offset.
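The four steps can be sketched end to end with lists standing in for order_tab_0 and order_tab_1. Everything here is an illustrative assumption: the helper names, the demo keys, the global OFFSET 4 LIMIT 4, and the requirement that keys are unique across shards. The sketch also assumes each shard's first‑query window reaches far enough to contain its share of the target page; heavily skewed shards can violate that and produce a wrong page.

```python
import heapq

def shard_query(rows, offset, limit):
    """Stand-in for: SELECT key FROM shard ORDER BY key LIMIT limit OFFSET offset."""
    return sorted(rows)[offset : offset + limit]

def shard_between(rows, low, high):
    """Stand-in for: SELECT key FROM shard WHERE key BETWEEN low AND high ORDER BY key."""
    return [k for k in sorted(rows) if low <= k <= high]

def two_phase_page(shards, offset, limit):
    per_shard_offset = offset // len(shards)
    # Phase 1: small per-shard offset, full limit.
    first = [shard_query(s, per_shard_offset, limit) for s in shards]
    if any(not page for page in first):
        raise ValueError("a shard has too few rows for this offset")
    k_min = min(page[0] for page in first)
    # Phase 2: widen each shard's window down to the global minimum key.
    second = [shard_between(s, k_min, page[-1]) for s, page in zip(shards, first)]
    # Rows that phase 2 added ahead of a shard's phase-1 window reduce
    # k_min's local offset on that shard.
    global_offset_of_min = 0
    for page1, page2 in zip(first, second):
        extra = sum(1 for k in page2 if k < page1[0])
        global_offset_of_min += per_shard_offset - extra
    # Merge, then skip forward from k_min to the requested global offset.
    merged = list(heapq.merge(*second))
    skip = offset - global_offset_of_min
    return merged[skip : skip + limit]

# Hypothetical demo data for the two shards.
order_tab_0 = [1, 2, 3, 7, 9, 11, 13, 15, 17]
order_tab_1 = [4, 5, 6, 8, 10, 12, 14, 16, 18]
page = two_phase_page([order_tab_0, order_tab_1], offset=4, limit=4)
```

Comparing the result with a brute‑force sort of all shards, as a test would, is a cheap way to check whether a given data distribution satisfies the coverage assumption.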
5. Search‑Engine Aggregation
Similar to the search‑engine option for single‑database join pagination (Option 4), aggregate data from multiple tables into a search engine (e.g., Elasticsearch), query it for IDs, then retrieve the full rows from the database. This adds an external component but can simplify pagination across many shards.
Conclusion
Choose pagination techniques that match the actual workload; never deploy a solution without performance verification.
Technical fixes alone are insufficient—business constraints (e.g., removing deep jump pages or exact total counts) often yield simpler, faster designs.
Avoid introducing unnecessary storage layers unless they bring clear benefits; keep the system maintainable.
Architect-Kip
Daily architecture work and learning summaries. Not seeking lengthy articles—only real practical experience.
