
Why Traditional Pagination Fails After Sharding and How to Solve It

When a table grows beyond roughly ten million rows, sharding it across multiple tables or databases improves performance, but the usual LIMIT offset, pagesize pagination breaks and returns missing or incorrect records. This article examines why simple merge approaches fail and evaluates three strategies (the global method, the secondary-query method, and no-skip paging), highlighting their trade-offs.

Java Interview Crash Guide

Impact of Sharding on Pagination

When a table grows large enough (for example, more than 10 million rows in a single MySQL table), sharding the data into multiple tables or databases can greatly improve read/write performance. However, the simple SELECT * FROM table LIMIT offset, pagesize pagination no longer works as-is after sharding, because no single shard sees the globally ordered rows.

Assume a table with 8 ordered records (ID 1‑8). If we split it into two tables using either range‑based segmentation (e.g., monthly tables) or modulo‑based distribution, the naive approach of applying LIMIT 1,2 on each shard and then merging the results can produce incorrect pages when the desired page spans multiple shards.

For example, with range-based shards holding IDs 1‑4 and 5‑8, LIMIT 1,2 on each shard returns (2,3) and (6,7). Merging gives (2,3,6,7), and taking the first two records gives (2,3), which happens to be correct. With LIMIT 3,2, however, the shards return (4) and (8), so the merged result is (4,8) instead of the expected (4,5).
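The failure can be reproduced with a minimal Python sketch. It assumes each shard is a sorted in-memory list standing in for a table ordered by id. The names naive_page, range_shards, and mod_shards are hypothetical and used only for illustration.

```python
def naive_page(shards, offset, pagesize):
    """Apply LIMIT offset, pagesize to every shard, merge, keep pagesize rows."""
    merged = []
    for shard in shards:
        # Stand-in for: SELECT id FROM shard ORDER BY id LIMIT offset, pagesize
        merged.extend(shard[offset:offset + pagesize])
    return sorted(merged)[:pagesize]


# Assumed range-based split of IDs 1-8 into two shards.
range_shards = [[1, 2, 3, 4], [5, 6, 7, 8]]
# Assumed modulo-based split of the same IDs (id % 2).
mod_shards = [[2, 4, 6, 8], [1, 3, 5, 7]]

print(naive_page(range_shards, 1, 2))  # [2, 3]  (matches the true page)
print(naive_page(range_shards, 3, 2))  # [4, 8]  (the true page is [4, 5])
print(naive_page(mod_shards, 1, 2))    # [3, 4]  (the true page is [2, 3])
```

The modulo case goes wrong even at offset 1, because every shard skips one row of its own instead of the shards skipping one row in total.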

Conclusion: Simple merge strategies cannot reliably solve pagination after sharding, regardless of the sharding method.

Global Method (limit x+y)

The idea is to expand the range on each shard: rewrite LIMIT offset, pagesize as LIMIT 0, offset+pagesize, fetch all rows up to the end of the desired page from each shard, merge them in memory, then apply the original offset and pagesize. This works for both range and modulo sharding, but each of the n shards returns up to offset+pagesize rows, so the merging layer handles up to n × (offset+pagesize) rows. For large offsets this leads to query timeouts and excessive memory usage.
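As a rough illustration, the global method might look like the following Python sketch. It again treats each shard as a sorted in-memory list. The function name global_page and the returned row count are assumptions added to make the cost visible.

```python
import heapq
from itertools import islice


def global_page(shards, offset, pagesize):
    """Fetch LIMIT 0, offset+pagesize from every shard, then page in memory."""
    per_shard = [
        # Stand-in for: SELECT id FROM shard ORDER BY id LIMIT 0, offset+pagesize
        shard[:offset + pagesize]
        for shard in shards
    ]
    rows_fetched = sum(len(rows) for rows in per_shard)
    merged = heapq.merge(*per_shard)  # k-way merge of already sorted lists
    page = list(islice(merged, offset, offset + pagesize))
    return page, rows_fetched


range_shards = [[1, 2, 3, 4], [5, 6, 7, 8]]
mod_shards = [[2, 4, 6, 8], [1, 3, 5, 7]]

print(global_page(range_shards, 3, 2))  # ([4, 5], 8)
print(global_page(mod_shards, 3, 2))    # ([4, 5], 8)
```

Both layouts return the correct page, but all 8 rows had to be transferred to produce 2 of them. The transferred volume grows with the offset, which is the weakness described above.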

Secondary Query Method

This method, also mentioned in the “cross‑database pagination” article, assumes that data is evenly distributed across shards. Steps:

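Rewrite the original LIMIT offset, pagesize to LIMIT offset/n, pagesize, where n is the number of shards and the division is integer division, and execute it on each shard.

Find the minimum id among all returned rows (min_id). Because no shard skipped more than offset/n rows below it, min_id sits at or before the requested global offset, so it is a safe starting point.

Run a second query on each shard with WHERE id BETWEEN min_id AND origin_max_id, where origin_max_id is the largest id that shard returned in the first query (or, as an optimization, bound the range by origin_min_id, the smallest id that shard returned, so that only the missing rows are fetched).

Merge and deduplicate the results. The number of extra rows the second query returns on each shard shows how many of the offset/n skipped rows are actually at or after min_id, which gives the number of rows before min_id on that shard. Summing these counts yields the global offset of min_id. Skip (offset minus that global offset) rows in the merged list and take the next pagesize rows.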

Examples demonstrate that this approach can produce correct results for both range‑based and modulo‑based sharding, provided the data is evenly balanced.
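The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: shards are sorted in-memory lists with unique ids, every shard returns at least one row in the first round, and the name secondary_query_page is hypothetical.

```python
def secondary_query_page(shards, offset, pagesize):
    """Two-round paging over sorted shards; assumes evenly distributed ids."""
    n = len(shards)
    local_offset = offset // n

    # Round 1: SELECT id FROM shard ORDER BY id LIMIT offset/n, pagesize
    first = [shard[local_offset:local_offset + pagesize] for shard in shards]
    if not all(first):
        raise ValueError("sketch assumes every shard returns rows in round 1")

    min_id = min(rows[0] for rows in first)

    merged = []
    min_id_global_offset = 0
    for shard, rows in zip(shards, first):
        # Round 2: ... WHERE id BETWEEN min_id AND origin_max_id ORDER BY id
        # (a superset of this shard's round-1 rows, so nothing is duplicated)
        second = [r for r in shard if min_id <= r <= rows[-1]]
        # Rows that round 2 added in front of this shard's round-1 result.
        extra = len(second) - len(rows)
        # Round 1 skipped local_offset rows; `extra` of them are >= min_id.
        min_id_global_offset += local_offset - extra
        merged.extend(second)

    merged.sort()
    start = offset - min_id_global_offset
    return merged[start:start + pagesize]


# Assumed modulo-based split of IDs 1-20 into two shards (id % 2).
mod_shards = [[r for r in range(1, 21) if r % 2 == i] for i in range(2)]
print(secondary_query_page(mod_shards, 5, 3))  # [6, 7, 8]
```

The balance requirement is easy to see with this sketch. With skewed shards such as IDs 1‑8 and 100‑108, calling it with offset 4 and pagesize 2 returns [100, 101] rather than the true page [5, 6], because the rows that belong to the page lie beyond the window the first shard was asked for.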

Prohibit Page Skipping

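Another approach is to allow only next-page and previous-page navigation. Record the maximum id of the current page, query each shard with WHERE id > last_max_id ORDER BY id LIMIT pagesize for the next page, then merge the shard results and keep the first pagesize rows. This sacrifices user experience, since users cannot jump to an arbitrary page, but it avoids the pagination inconsistency, and each shard returns at most pagesize rows no matter how deep the page is.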
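A minimal Python sketch of this forward-only paging is shown below. It assumes shards are sorted in-memory lists and that 0 is smaller than every real id. The name next_page is hypothetical.

```python
import heapq
from itertools import islice


def next_page(shards, last_max_id, pagesize):
    """Return the page that follows last_max_id without using an offset."""
    per_shard = [
        # Stand-in for:
        #   SELECT id FROM shard WHERE id > last_max_id ORDER BY id LIMIT pagesize
        [r for r in shard if r > last_max_id][:pagesize]
        for shard in shards
    ]
    # Merge the sorted shard results and keep only the first pagesize rows.
    return list(islice(heapq.merge(*per_shard), pagesize))


mod_shards = [[2, 4, 6, 8], [1, 3, 5, 7]]

last_max_id = 0  # assumed to be smaller than every real id
while True:
    page = next_page(mod_shards, last_max_id, 3)
    if not page:
        break
    print(page)
    last_max_id = page[-1]
```

Walking the eight rows with a page size of 3 prints [1, 2, 3], then [4, 5, 6], then [7, 8]. The caller only ever carries last_max_id forward, which is why jumping straight to an arbitrary page is not possible with this scheme.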

Conclusion

There is no universal solution for pagination on sharded tables. The global method suffers from performance issues, the secondary‑query method requires balanced data distribution, and prohibiting page skipping compromises usability. Pagination across shards remains an industry‑level challenge.
