Databases 6 min read

How to Paginate Efficiently Across Sharded Databases Without OOM

This article explains why pagination only becomes complex when querying across sharded databases, compares three common solutions—including global search, sequential ID‑based paging, and the discouraged secondary query method—provides practical SQL examples, outlines pros and cons, and offers a STAR‑style interview answer.

Selected Java Interview Questions
Selected Java Interview Questions
Selected Java Interview Questions
How to Paginate Efficiently Across Sharded Databases Without OOM

Identify Whether Pagination Is Cross‑Shard

Pagination becomes a problem only when a query spans multiple shards. If the WHERE clause always contains the sharding key (e.g., a specific user_id or a time bucket), the query is routed to a single shard and can use ordinary LIMIT/OFFSET logic.

First step: Determine whether the query can be satisfied by a single shard.

Typical Solutions for True Cross‑Shard Pagination

When the query really needs to aggregate results from several shards, three common approaches are used:

Solution Overview Diagram
Solution Overview Diagram

1. Global Search (Middleware/Database)

The middleware issues the same SELECT on every shard, collects the rows, sorts them globally, and then returns the requested page. This method is correct but its cost grows linearly with the page number because each shard must scan and sort all rows up to the requested offset.

Example (ShardingSphere‑style):

SELECT * FROM user_log
WHERE ...   -- executed on every shard
ORDER BY create_time DESC
LIMIT 100000, 20;

For deep pages (e.g., page 100 000) the memory consumption can reach OOM and response time becomes unacceptable.

2. Sequential‑ID Paging (Recommended for Deep Pagination)

Instead of using OFFSET, keep the maximum primary‑key value of the previous page and request the next page with a “greater‑than” predicate. This turns the query into a range scan, which is O(1) with respect to page depth.

SQL pattern for the first page:

SELECT * FROM user_log
ORDER BY id ASC
LIMIT 20;

SQL pattern for subsequent pages (assuming the last row of the previous page had id = 12345):

SELECT * FROM user_log
WHERE id > 12345
ORDER BY id ASC
LIMIT 20;

Advantages

Each request reads only one page; performance is independent of the page number.

Uses index range scans, avoiding large offsets and in‑memory sorting.

Disadvantages

Cannot jump directly to an arbitrary page; the client must request pages sequentially.

Requires a monotonically increasing identifier (or a composite key that preserves order).

3. Secondary Query Method (Not Recommended)

This technique performs an initial query to obtain the IDs of the desired page, then issues a second query to fetch the full rows by those IDs. It is complex, sensitive to data skew, and often yields inconsistent results, so it is generally discouraged.

Practical Guidance for Interviews

When asked about cross‑shard pagination, follow a concise structure:

Confirm whether the business can limit the query to a single shard.

If cross‑shard is unavoidable, mention middleware‑level global search but note its deep‑page limitation.

Recommend delegating the problem to a dedicated search engine (Elasticsearch) or a distributed SQL engine (TiDB) when possible.

If the solution must stay inside the relational database, propose sequential‑ID paging as the performant fallback.

Summary

Only queries that span shards need special pagination logic. The three main categories of solutions are:

Middleware/database global search (simple but costly for deep pages).

Open‑source frameworks that implement similar logic (convenient but share the same performance pitfall).

Business‑level compromises, such as avoiding deep pagination or using a dedicated search system.

For most production scenarios—feeds, message lists, audit logs—the best practice is to disallow jump paging and use sequential paging based on the previous page’s maximum ID .

SQLshardingPagination
Selected Java Interview Questions
Written by

Selected Java Interview Questions

A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.