Mastering Sharded Pagination: 3 Proven Techniques to Avoid Performance Pitfalls
This article explains why using traditional LIMIT pagination on sharded databases can cause full‑shard scans and memory‑intensive sorting, and introduces three practical alternatives—anchor pagination, shard‑mark pagination, and reverse pagination—detailing their implementation steps, performance benefits, and common pitfalls to avoid.
Hello, I am Su San. An online issue occurred where exporting three months of orders and clicking to page 100 directly crashed the database.
Sharded LIMIT is a Performance Killer
Using LIMIT offset,size works on a single table, but on sharded databases each shard must scan up to the offset, leading to full‑shard scans and massive memory sorting. For example, LIMIT 10000, 10 (page 1001) forces each of 8 shards to retrieve 10,010 rows, then the application sorts all 80,000+ rows, causing high latency and possible service circuit‑breakers.
Key principle: Do not chase raw performance without matching the business scenario.
Anchor Pagination
Anchor pagination is the most efficient, suitable for “load more” scenarios. It uses an ordered column (e.g., auto‑increment ID or timestamp) as an anchor instead of offset.
Paginate by ID range
Assume an order table is split into three shards by ID range:
Shard 1: ID 1-10000
Shard 2: ID 10001-20000
Shard 3: ID 20001-30000
To fetch page 2 (10 rows per page):
Query the first page with ORDER BY id DESC LIMIT 10 and record the last ID (e.g., last_id=100) as the anchor.
Query the second page with WHERE id < 100 ORDER BY id DESC LIMIT 10.
Subsequent pages use the last ID of the previous page as the new anchor.
Why it’s fast
Each shard executes WHERE id < xxx LIMIT 10, returning only 10 rows, so network and memory costs drop to roughly 1/1000 of the offset‑based approach.
Two pitfalls
Do not use hash sharding on the ID; otherwise each shard’s data is unordered and full scans are required.
Anchor pagination does not support arbitrary page jumps; it works only for sequential “load more” flows.
Shard‑Mark Pagination
When true page jumps are required (e.g., admin reports), maintain metadata that maps each database/table to its ID range and total row count, typically stored in Redis.
Example metadata (simplified):
DB 1‑Table 202401: IDs 1‑5000, 5000 rows
DB 1‑Table 202402: IDs 5001‑12000, 7000 rows
DB 2‑Table 202401: IDs 12001‑18000, 6000 rows
To fetch LIMIT 15000,10 (page 1501):
Locate the target shard by accumulating row counts until the offset falls within a shard (here DB 2‑Table 202401).
Compute the intra‑shard offset: 15000‑12000 = 3000, then query LIMIT 3000,10 on that shard.
Return the result directly, as the global order is preserved.
Two pitfalls
Metadata must be kept up‑to‑date; use periodic aggregation plus incremental logs instead of heavyweight distributed locks.
The method only works when sorting by the sharding key; sorting by a non‑shard column still requires full‑shard scans.
Reverse Pagination
Reverse pagination is useful for fetching the last page only. Instead of a large offset, query the first few rows in ascending order to obtain a reverse anchor, then retrieve the final rows.
Steps
Execute ORDER BY id ASC LIMIT 10 to get the smallest IDs and pick the maximum as the reverse anchor.
Query WHERE id > anchor ORDER BY id DESC LIMIT 10 to obtain the last 10 rows.
If needed, reverse the result order for normal display.
Why it works
Scanning the first 10 rows on each shard is far faster than scanning with LIMIT 9990,10; the combined result yields the desired tail data with minimal overhead.
Two pitfalls
Only applicable to the very last page; cannot fetch arbitrary earlier pages.
Large deletions that affect the tail can invalidate the anchor, requiring recalculation.
Final Thoughts
The essence of sharded pagination is balancing business requirements with technical constraints. Prefer anchor pagination for “load more” UI, shard‑mark pagination when page jumps are mandatory (accepting eventual consistency), and resort to middleware solutions like ShardingSphere only when necessary.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
