Implementing Efficient Pagination Across Sharded Databases

The article analyzes why traditional LIMIT/OFFSET pagination fails when data is split across multiple databases, presents a global query approach with its trade‑offs, and proposes an optimized "no‑skip" method plus practical tips using ShardingSphere and Elasticsearch.

Linyb Geek Road

1. Root Cause of Pagination Under Sharding

When a large‑scale application stores a single logical table across multiple physical databases (e.g., database_0 and database_1) using an order‑id modulo strategy, each database loses the global view of the data.

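Consequently, a query such as SELECT * FROM order ORDER BY create_time ASC LIMIT 3,3 (skip three rows, return the next three), which returns the second page on a single table, no longer does so when the rows are distributed across databases. Each shard applies the offset to its own rows only, so it may skip rows that belong on the first or second page of the global ordering.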
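The failure is easy to reproduce with in-memory lists standing in for the two databases. The sketch below is illustrative only: the row values, the integer create_time, and the helper name limit are assumptions, not part of the original article.

```python
# Hypothetical rows: (order_id, create_time); create_time is a plain integer here.
ORDERS = [(order_id, 1000 + order_id) for order_id in range(1, 13)]

# Shard by order_id modulo 2, as in the example above.
SHARDS = [
    sorted((r for r in ORDERS if r[0] % 2 == shard), key=lambda r: r[1])
    for shard in range(2)
]

def limit(rows, offset, count):
    """Mimic `ORDER BY create_time ASC LIMIT offset, count` on pre-sorted rows."""
    return rows[offset:offset + count]

# Correct answer: the single-table query over all rows.
expected = limit(sorted(ORDERS, key=lambda r: r[1]), 3, 3)

# Naive approach: send the unchanged LIMIT 3,3 to every shard, then merge.
naive_rows = [r for shard in SHARDS for r in limit(shard, 3, 3)]
naive = sorted(naive_rows, key=lambda r: r[1])[:3]

print([r[0] for r in expected])  # [4, 5, 6]
print([r[0] for r in naive])     # [7, 8, 9]
```

Each shard skipped three of its own rows, so orders 4, 5 and 6 were discarded before the service ever saw them.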

2. Global Query Method

To obtain a correct page, the service queries each shard for more rows than needed, merges the results in memory, and then selects the desired page. Three distribution scenarios are considered:

Data evenly split between the two databases – each shard returns half of the required rows.

All required rows reside in a single database – that shard returns the full set while the other returns none.

Required rows are spread across both databases – each shard returns a portion, and the service merges them.

Implementation steps:

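To serve page 2 with a page size of 2, each shard returns its first two pages of data (4 rows). In general, a request for LIMIT offset, size is sent to every shard as LIMIT 0, offset + size.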

The service concatenates the results, performs a global sort by create_time, and extracts the final page.
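A minimal sketch of these two steps follows, again with in-memory lists in place of real databases. The function names query_shard and global_page and the sample rows are hypothetical; a production service would issue the rewritten SQL to each shard instead.

```python
import heapq

def query_shard(shard_rows, offset, count):
    """Stand-in for one shard running ORDER BY create_time ASC LIMIT offset, count."""
    ordered = sorted(shard_rows, key=lambda r: r["create_time"])
    return ordered[offset:offset + count]

def global_page(shards, page, page_size):
    """Return page `page` (1-based) using the global query method."""
    offset = (page - 1) * page_size
    # Rewrite LIMIT offset, page_size into LIMIT 0, offset + page_size per shard.
    partials = [query_shard(s, 0, offset + page_size) for s in shards]
    # Each partial result is already sorted, so a k-way merge is enough.
    merged = heapq.merge(*partials, key=lambda r: r["create_time"])
    return list(merged)[offset:offset + page_size]

# Hypothetical data: even ids in database_0, odd ids in database_1.
database_0 = [{"id": 2, "create_time": 1}, {"id": 4, "create_time": 2},
              {"id": 6, "create_time": 3}, {"id": 8, "create_time": 7}]
database_1 = [{"id": 1, "create_time": 4}, {"id": 3, "create_time": 5},
              {"id": 5, "create_time": 6}, {"id": 7, "create_time": 8}]

page_1 = global_page([database_0, database_1], page=1, page_size=2)
page_2 = global_page([database_0, database_1], page=2, page_size=2)
print([r["id"] for r in page_1])  # [2, 4]
print([r["id"] for r in page_2])  # [6, 1]
```

In this data set page 1 comes entirely from database_0, while page 2 takes one row from each database, which covers two of the distribution scenarios listed earlier.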

Advantages: provides a complete, accurate view of the data without loss.

Disadvantages: each shard must transmit more rows than the page needs, increasing network traffic; the service layer performs an extra sort, raising CPU load; and the cost grows with page depth, because every shard must return all rows up to and including the requested page.
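The deep-pagination cost can be estimated with simple arithmetic. The helper below is a hypothetical illustration; it gives an upper bound, since a shard holding fewer rows than requested returns fewer.

```python
def rows_transferred(shard_count, page, page_size):
    """Upper bound on rows sent to the service for one page request."""
    offset = (page - 1) * page_size
    return shard_count * (offset + page_size)

for page in (1, 10, 1000):
    print(page, rows_transferred(shard_count=2, page=page, page_size=10))
# 1 20
# 10 200
# 1000 20000
```

With two shards and a page size of 10, page 1000 can move up to 20,000 rows across the network in order to display 10.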

3. Optimized Approach – "No‑Skip" Method

Because many products only support "next page" navigation, the article adopts a similar compromise to reduce overhead. The method works as follows:

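Query the first page from each shard, merge and sort the results, and record the maximum id in the page that is actually returned to the user. This assumes the results are ordered by a unique, increasing id; if the page is ordered by another column such as create_time, the last value of that column plays the same role.

When fetching the next page, each shard adds a condition such as WHERE id > max_id_of_previous_page, together with ORDER BY id and LIMIT page_size, so that it continues from where the previous page ended instead of skipping rows with an offset.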

This "no‑skip" strategy reduces the amount of data transferred per request and mitigates the deep‑pagination penalty of the pure global query method.
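The no-skip idea can be sketched as follows, assuming the results are ordered by a unique id greater than zero. The function names and sample ids are hypothetical, and plain integer lists stand in for the shards.

```python
def query_shard_after(shard_ids, last_id, page_size):
    """Stand-in for: SELECT ... WHERE id > last_id ORDER BY id ASC LIMIT page_size."""
    matching = sorted(i for i in shard_ids if i > last_id)
    return matching[:page_size]

def next_page(shards, last_id, page_size):
    """Fetch the page that follows `last_id` without using any OFFSET."""
    candidates = []
    for shard in shards:
        candidates.extend(query_shard_after(shard, last_id, page_size))
    page = sorted(candidates)[:page_size]
    # The cursor is the largest id in the merged page, not the largest id
    # any shard returned; rows beyond the page are fetched again next time.
    new_last_id = page[-1] if page else last_id
    return page, new_last_id

# Hypothetical shards: even ids in database_0, odd ids in database_1.
database_0 = [2, 4, 6, 8, 20]
database_1 = [1, 3, 5, 7, 9, 11]

pages, last_id = [], 0
while True:
    page, last_id = next_page([database_0, database_1], last_id, page_size=3)
    if not page:
        break
    pages.append(page)

print(pages)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [11, 20]]
```

Every request asks each shard for at most page_size rows, no matter how far the reader has paged, which is where the saving over the global query method comes from.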

4. Practical Takeaways

(1) The global query method guarantees loss‑less, precise pagination but suffers from deep‑page performance issues.

(2) When business rules allow, the no‑skip method can improve efficiency.

(3) A search engine such as Elasticsearch can hold a copy of the data and serve paginated queries, but its own deep‑pagination limits and the work of keeping the copy synchronized with the databases must be considered.

(4) ShardingSphere offers built‑in pagination optimizations (e.g., automatic SQL rewrite, global lookup, streaming processing).

(5) If permissible, limiting the UI to "partial data" (e.g., only the first few pages) further reduces load.
