
How to Handle Pagination After MySQL Sharding: Pitfalls and Solutions

When a MySQL table is sharded across multiple databases or tables, the usual LIMIT offset, pagesize pagination breaks. This article examines why a simple merge fails, then presents three practical approaches (global limit, secondary queries, and jump-page restriction) and highlights their trade-offs and limitations.

ITPUB

1. Impact of Sharding on Pagination

Assume a table with eight sequential IDs (1-8). With a single table, ORDER BY id LIMIT 1,2 returns records 2 and 3. After splitting the table into two shards, two common strategies are considered:

Segment method: split by contiguous range, such as by time or month, e.g., (1,2,3,4) and (5,6,7,8).

Modulo-distribution method: split by id % 2, e.g., (1,3,5,7) and (2,4,6,8).

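Applying the original LIMIT 1,2 on each shard, merging the results, and keeping the first pagesize rows happens to work for the segment split, because the requested page lies entirely within the first shard. If the page spans shards (e.g., LIMIT 3,2), the merged result is incorrect: the shards return (4) and (8), giving (4,8) instead of (4,5). With the modulo split even LIMIT 1,2 fails, since the shards return (3,5) and (4,6) and row 2 is never fetched. Thus, a simple merge cannot guarantee correct pagination.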
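The failure can be reproduced without a database. The following is a minimal sketch in which plain Python lists stand in for the shards; the helper names shard_limit and naive_merge_page are hypothetical, and the list slice only emulates ORDER BY id LIMIT offset, pagesize.

```python
def shard_limit(rows, offset, pagesize):
    """Emulate `ORDER BY id LIMIT offset, pagesize` on a single shard."""
    return sorted(rows)[offset:offset + pagesize]


def naive_merge_page(shards, offset, pagesize):
    """Send the unchanged LIMIT to every shard, then merge and sort."""
    merged = []
    for rows in shards:
        merged.extend(shard_limit(rows, offset, pagesize))
    return sorted(merged)


segment_shards = [[1, 2, 3, 4], [5, 6, 7, 8]]
modulo_shards = [[1, 3, 5, 7], [2, 4, 6, 8]]

# The correct page for LIMIT 3,2 is [4, 5].
print(naive_merge_page(segment_shards, 3, 2))  # [4, 8]

# The correct page for LIMIT 1,2 is [2, 3]; row 2 is never fetched.
print(naive_merge_page(modulo_shards, 1, 2))   # [3, 4, 5, 6]
```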

2. Global Method (limit x+y)

The idea is to expand each shard's limit to cover all rows up to the end of the desired page: replace LIMIT offset, pagesize with LIMIT 0, offset+pagesize. After fetching these larger result sets from every shard, merge them, sort, and finally apply the original offset and pagesize in memory.

Example for LIMIT 1,2 on the two‑shard segment split:

Shard 1: LIMIT 0,3 → (1,2,3)

Shard 2: LIMIT 0,3 → (5,6,7)

Merge → (1,2,3,5,6,7) → apply original offset → (2,3) – correct.

The same works for LIMIT 3,2 yielding (4,5) after merging.
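As a rough illustration of the rewrite, the sketch below again uses in-memory lists as stand-ins for the shards. The function names are hypothetical; a real implementation would send LIMIT 0, offset+pagesize to each database and merge the returned rows in the application.

```python
import heapq
from itertools import islice


def shard_limit(rows, offset, pagesize):
    """Emulate `ORDER BY id LIMIT offset, pagesize` on a single shard."""
    return sorted(rows)[offset:offset + pagesize]


def global_limit_page(shards, offset, pagesize):
    """Fetch the first offset + pagesize rows of every shard, then page in memory."""
    # Each shard answers the rewritten query LIMIT 0, offset + pagesize.
    partials = [shard_limit(rows, 0, offset + pagesize) for rows in shards]
    # Every partial result is already sorted, so a lazy k-way merge is enough.
    merged = heapq.merge(*partials)
    # Apply the original offset and pagesize to the merged stream.
    return list(islice(merged, offset, offset + pagesize))


segment_shards = [[1, 2, 3, 4], [5, 6, 7, 8]]
modulo_shards = [[1, 3, 5, 7], [2, 4, 6, 8]]

print(global_limit_page(segment_shards, 1, 2))  # [2, 3]
print(global_limit_page(segment_shards, 3, 2))  # [4, 5]
print(global_limit_page(modulo_shards, 3, 2))   # [4, 5]
```

The result does not depend on how the rows are distributed, which is why this method is correct for both splits.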

Drawback: every shard must return offset + pagesize rows, so a deep page retrieves a huge amount of data (e.g., LIMIT 10000000,10 becomes LIMIT 0,10000010 on each shard), causing timeouts or OOM in the application.

3. Secondary Query Method

This method, described in the “cross-database pagination” article, assumes that data is evenly distributed across shards. It proceeds in two rounds:

Rewrite the original LIMIT offset, pagesize as LIMIT offset/n, pagesize where n is the number of shards (rounding down the offset division). Execute this on each shard.

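From the first-round results, find the minimum id (or other ordering key) across all shards: min_id. Then issue a second query on each shard with WHERE id BETWEEN min_id AND max_id, where max_id is that shard's own maximum from the first round (or the optimized BETWEEN min_id AND origin_min_id, where origin_min_id is that shard's own first-round minimum), to fetch the missing rows.

Merge the two-round results, de-duplicate, sort, and finally take pagesize rows starting from the position that corresponds to the original offset. The extra rows each shard returns in the second round show how many rows it holds between min_id and its first-round minimum, which fixes the global offset of min_id.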
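The two rounds can be sketched as follows, with in-memory lists standing in for the shards. This is one reading of the steps above, not a definitive implementation: the function names are hypothetical, ids are assumed unique, every shard is assumed to return at least one row in the first round, and the bookkeeping that locates the original offset (counting the extra rows each shard returns in the second round) is an assumption about how the final cut is made.

```python
def first_round(rows, offset, pagesize):
    """Emulate `ORDER BY id LIMIT offset, pagesize` on a single shard."""
    return sorted(rows)[offset:offset + pagesize]


def between(rows, low, high):
    """Emulate `WHERE id BETWEEN low AND high ORDER BY id` on a single shard."""
    return sorted(r for r in rows if low <= r <= high)


def secondary_query_page(shards, offset, pagesize):
    n = len(shards)
    shard_offset = offset // n  # offset/n, rounded down

    # Round 1: LIMIT offset/n, pagesize on every shard.
    round1 = [first_round(rows, shard_offset, pagesize) for rows in shards]
    if not all(round1):
        raise ValueError("sketch assumes every shard returns rows in round 1")
    min_id = min(page[0] for page in round1)

    # Round 2: BETWEEN min_id AND this shard's own first-round maximum.
    merged = set()
    min_id_global_offset = 0
    for rows, page in zip(shards, round1):
        round2 = between(rows, min_id, page[-1])
        # Rows that appear only in round 2 sit between min_id and the
        # shard's first-round minimum, so fewer rows precede min_id here.
        extra = len(round2) - len(page)
        min_id_global_offset += shard_offset - extra
        merged.update(round2)  # the set removes duplicates

    # Skip forward from min_id to the original offset, then cut the page.
    skip = offset - min_id_global_offset
    return sorted(merged)[skip:skip + pagesize]


modulo_shards = [[1, 3, 5, 7], [2, 4, 6, 8]]
segment_shards = [[1, 2, 3, 4], [5, 6, 7, 8]]

print(secondary_query_page(modulo_shards, 2, 2))   # [3, 4]
print(secondary_query_page(modulo_shards, 1, 2))   # [2, 3]
print(secondary_query_page(segment_shards, 2, 2))  # [3, 5], not [3, 4]
```

The last call uses a segment split, where the even-distribution assumption does not hold, and returns a wrong page.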

Examples:

Scenario 1 (modulo distribution)

Original sequence (1‑8), need LIMIT 2,2 → (3,4).

First round (LIMIT 1,2 on each shard): shard A returns (3,5), shard B returns (4,6); min_id = 3.

Second round: shard A BETWEEN 3 AND 5 → (3,5); shard B BETWEEN 3 AND 6 → (4,6).

Merge → (3,4,5,6) → take first 2 → (3,4) – correct.

Scenario 2 (modulo distribution, offset not divisible)

Need LIMIT 1,2 → (2,3).

First round (offset 1/2 rounded down to 0, so LIMIT 0,2 on each shard): shard A → (1,3), shard B → (2,4); min_id = 1.

Second round: shard A BETWEEN 1 AND 3 → (1,3); shard B BETWEEN 1 AND 4 → (2,4).

Merge → (1,2,3,4) → after discarding the first element (because of rounding) → (2,3) – correct.

Scenario 3 (segment method – fails)

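Need LIMIT 2,2 → (3,4). The first round runs LIMIT 1,2 on each shard and returns (2,3) and (6,7); min_id = 2. The second round adds only row 5 from the second shard, so row 4 is never fetched. The first round already loses the correct rows, and the second round cannot recover them.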

Scenario 4 (modulo distribution with missing data)

If some rows are deleted from a shard, the data is no longer evenly distributed and the second-round query may still return incorrect results; in the source's example, the expected (5,6) becomes (3,5).

4. Disallow Jump Paging

One pragmatic approach is to forbid arbitrary page jumps. Instead, keep track of the maximum id of the previous page and request the next page from every shard with WHERE id > previous_max_id ORDER BY id LIMIT pagesize, then merge the shard results, sort them, and keep the first pagesize rows. This sacrifices user experience but avoids the pagination inconsistency.
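A minimal sketch of sequential paging over two shards, again with lists in place of real tables. The function names are hypothetical, and ids are assumed to be positive so that 0 can serve as the starting value of previous_max_id.

```python
import heapq
from itertools import islice


def shard_next(rows, previous_max_id, pagesize):
    """Emulate `WHERE id > previous_max_id ORDER BY id LIMIT pagesize`."""
    return sorted(r for r in rows if r > previous_max_id)[:pagesize]


def next_page(shards, previous_max_id, pagesize):
    """Ask every shard for its next pagesize rows and keep the smallest ones."""
    partials = [shard_next(rows, previous_max_id, pagesize) for rows in shards]
    return list(islice(heapq.merge(*partials), pagesize))


modulo_shards = [[1, 3, 5, 7], [2, 4, 6, 8]]
segment_shards = [[1, 2, 3, 4], [5, 6, 7, 8]]

first = next_page(modulo_shards, 0, 2)
second = next_page(modulo_shards, first[-1], 2)
print(first, second)                    # [1, 2] [3, 4]
print(next_page(segment_shards, 4, 2))  # [5, 6]
```

Each shard returns at most pagesize rows no matter how deep the page is, which is the gain over the global method.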

Conclusion

There is no universal solution for pagination on sharded tables. The global limit method works but may be too heavy for deep pages; the secondary-query method works only under strict data-distribution assumptions; and restricting navigation to sequential pages avoids the problem altogether at the cost of jump navigation. Pagination after sharding remains an open challenge in practice.

Written by

ITPUB
