Implementing Efficient Pagination in Sharded Databases: The No‑Skip Approach
The article analyzes why traditional offset‑based pagination breaks after data is split across multiple databases, explains the global query method with its pros and cons, and introduces the no‑skip optimization that reduces data transfer and improves deep‑page performance.
With the rapid growth of internet services, large companies often store billions of rows, such as orders or logistics records, in a single table, which puts heavy pressure on the database. The common solution at this scale is sharding (splitting data across multiple databases and tables), which distributes the rows to improve read/write performance.
Sharding solves the large-table problem but creates a new challenge: pagination. On a single table, SELECT * FROM order ORDER BY create_time ASC LIMIT 3,3 (skip three rows, return the next three) yields the second page. Once the data is horizontally split, running the same statement on each shard no longer returns the correct second-page rows, because those rows may reside in different databases and no single shard sees the global order. The article illustrates this with an order-id modulo strategy that distributes orders between database_0 and database_1, and uses two diagrams to show the loss of a global view.
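The failure can be reproduced with a small in-memory sketch. The eight sample orders, the `limit` helper, and the variable names are assumptions made for illustration; only the modulo routing and the LIMIT 3,3 query come from the article.

```python
# Rows are (order_id, create_time); all values are made up for illustration.
orders = [(i, i * 10) for i in range(1, 9)]


def limit(rows, offset, size):
    """Emulate: ORDER BY create_time ASC LIMIT offset, size."""
    return sorted(rows, key=lambda r: r[1])[offset:offset + size]


# Order-id modulo routing: even ids go to database_0, odd ids to database_1.
database_0 = [r for r in orders if r[0] % 2 == 0]
database_1 = [r for r in orders if r[0] % 2 == 1]

# Second page on the unsharded table.
single_table_page = limit(orders, 3, 3)

# Naive approach: send the same LIMIT 3,3 to each shard, then combine.
naive_sharded_page = limit(
    limit(database_0, 3, 3) + limit(database_1, 3, 3), 0, 3
)

print(single_table_page)
print(naive_sharded_page)
```

In this sample the single table returns orders 4, 5, and 6, while the naive per-shard query returns orders 7 and 8: each shard skipped three of its own rows, not three rows of the global order.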
To obtain correct pagination results, the article proposes the global query method. The idea is to query each shard for enough rows, merge the results in the service layer, sort them globally, and then select the desired page. Three data-distribution scenarios are examined:
Data evenly split across both databases – each shard returns half the page and the merged result yields the correct rows.
All required rows reside in a single shard – that shard returns the full page while the other returns none.
Required rows are spread across both shards – each shard returns partial rows, which are combined to form the final page.
Each scenario is illustrated with diagrams.
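A minimal sketch of the global query method, assuming each shard is asked for its first offset + size rows, which is the usual way to guarantee "enough rows". The in-memory shards and the names `shard_query` and `global_query_page` are hypothetical. The same function handles all three distribution scenarios because it never assumes where the rows live.

```python
import heapq


def sort_key(row):
    # Row is (order_id, create_time); order_id breaks ties deterministically.
    return (row[1], row[0])


def shard_query(rows, count):
    """Emulate on one shard: ORDER BY create_time ASC LIMIT 0, count."""
    return sorted(rows, key=sort_key)[:count]


def global_query_page(shards, offset, size):
    """Fetch offset + size rows per shard, merge globally, then slice."""
    needed = offset + size
    partials = [shard_query(rows, needed) for rows in shards]
    # Each partial is already sorted, so a k-way merge gives the global order.
    merged = list(heapq.merge(*partials, key=sort_key))
    return merged[offset:offset + size]


# Scenario: rows interleaved evenly across both shards.
even = [[(2, 20), (4, 40), (6, 60), (8, 80)], [(1, 10), (3, 30), (5, 50), (7, 70)]]
# Scenario: every row of the requested page sits in the first shard.
skewed = [[(i, i * 10) for i in range(1, 7)], [(7, 700), (8, 800)]]

print(global_query_page(even, 3, 3))
print(global_query_page(skewed, 3, 3))
```

Both calls return orders 4, 5, and 6, matching what a single table would return for LIMIT 3,3.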
The global query method has a clear advantage: it returns lossless, precise results across shards. It also has drawbacks. Each shard must return every row up to the end of the requested page rather than just the page itself, which increases network traffic; the service layer must perform a second sort, which adds CPU load; and both costs grow as the page gets deeper.
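The deep-page cost can be estimated with simple arithmetic. This sketch assumes every shard is asked for offset + size rows and holds at least that many, so the result is an upper bound; the function name and the sample numbers are illustrative, not figures from the article.

```python
def rows_transferred(num_shards, page_number, page_size):
    """Upper bound on rows sent to the service layer for one page request.

    page_number is 1-based, so offset + page_size == page_number * page_size.
    """
    if num_shards < 1 or page_number < 1 or page_size < 1:
        raise ValueError("all arguments must be positive")
    rows_per_shard = page_number * page_size
    return num_shards * rows_per_shard


# Two shards, page 2 of size 3 (the LIMIT 3,3 example).
print(rows_transferred(2, 2, 3))
# Two shards, page 1000 of size 10.
print(rows_transferred(2, 1000, 10))
```

For the LIMIT 3,3 example, up to 12 rows cross the network to produce 3. At page 1000 with a page size of 10, the bound is 20,000 rows for 10 results.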
To mitigate these issues, the article introduces the no-skip method. Instead of allowing arbitrary page jumps, the system only lets the reader move to the next page. It fetches the first page from each shard, merges the results, and records the maximum ID on the page it returns; for each subsequent page it asks every shard only for rows after that ID. Each shard therefore returns at most one page of rows no matter how deep the reader goes, which eliminates the large intermediate result sets and improves efficiency, as shown in the accompanying diagram.
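A minimal sketch of the no-skip idea over in-memory shards. One assumption is worth stating loudly: where the article records the "maximum ID", this sketch records the sort key of the last row on the page as a (create_time, order_id) pair, so it also works when IDs are not in time order. The names `shard_next` and `next_page` are hypothetical.

```python
import heapq


def sort_key(row):
    # Row is (order_id, create_time); order_id breaks ties deterministically.
    return (row[1], row[0])


def shard_next(rows, cursor, size):
    """Emulate on one shard: WHERE key > cursor ORDER BY key LIMIT size."""
    candidates = [r for r in rows if cursor is None or sort_key(r) > cursor]
    return sorted(candidates, key=sort_key)[:size]


def next_page(shards, cursor, size):
    """Return (page, new_cursor); each shard sends at most `size` rows."""
    partials = [shard_next(rows, cursor, size) for rows in shards]
    page = list(heapq.merge(*partials, key=sort_key))[:size]
    # Assumption: the cursor is the key of the last row handed to the caller.
    new_cursor = sort_key(page[-1]) if page else cursor
    return page, new_cursor


shards = [[(2, 20), (4, 40), (6, 60), (8, 80)], [(1, 10), (3, 30), (5, 50), (7, 70)]]

cursor = None
pages = []
while True:
    page, cursor = next_page(shards, cursor, 3)
    if not page:
        break
    pages.append(page)

print(pages)
```

The trade-off is the one the article names: the caller can only walk forward page by page, because the cursor for page N exists only after page N-1 has been fetched.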
In summary:
The global query method enables lossless, accurate pagination but suffers from deep‑page performance problems.
When business logic permits, the no‑skip method can substantially improve efficiency.
Middleware such as Elasticsearch can store data for additional capabilities, but its own deep‑pagination and synchronization issues must be considered.
ShardingSphere offers built‑in pagination optimizations (SQL rewrite, global lookup, streaming sort/filter).
If acceptable, providing only partial data per request can further reduce load.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
