Backend Development 24 min read

Mastering Pagination in Billion‑Row Sharded Databases: Interview‑Ready Strategies

This article dissects the challenges of implementing pagination on billion‑row sharded tables, explains common sharding strategies, compares SDK, Proxy, and Sidecar architectures, and presents practical solutions such as global query, infinite scroll, two‑phase queries, index tables, and external storage to help engineers ace interview questions.

Su San Talks Tech

Sep 22, 2025

Mastering Pagination in Billion‑Row Sharded Databases: Interview‑Ready Strategies

1. Common Sharding Strategies

Before tackling pagination, we first clarify the basic components and concepts of sharding, focusing on horizontal partitioning where the table schema remains unchanged and data is split across multiple tables for high performance.

1.1 Horizontal Sharding

Typical approaches include:

Hash Modulo : Compute a hash of a key (e.g., user_id) and take modulo of the number of shards, e.g., user_id % 16 to distribute rows across 16 tables.

Pros : Even data distribution, low risk of hot spots. Cons : Range queries are difficult because data cannot be located by range.

Range Partitioning : Split data by a field range, typically time, e.g., orders_202401, orders_202402, …

Pros : Efficient range queries, natural hot‑cold data separation. Cons : Hot‑spot risk as new writes concentrate on the latest table.

Routing Table : Maintain a mapping table that records which physical shard stores a given primary key. This adds an extra lookup and complexity.

These strategies are often combined, e.g., hash‑shard by user_id then range‑shard by time within each database.

1.2 Sharding Implementation Modes

Sharding logic is usually handled by middleware, which comes in three forms:

SDK Mode : A JAR is directly included in the application. It offers the lowest latency because routing and aggregation happen inside the process, but it tightly couples to a specific language and raises version‑management complexity.

Proxy Mode : An independent proxy service pretends to be a database. All SQL passes through the proxy for parsing, routing, and execution, then merges results. It is language‑agnostic but adds network overhead and can become a bottleneck or single point of failure.

Sidecar Mode : Deployed alongside the service in a service‑mesh, combining low coupling of SDK with language‑agnostic proxy. Mature open‑source products are still scarce.

Among these, SDK offers the best performance but is language‑specific; Proxy has the weakest performance but works with any stack.

2. Global Query Method

In a sharded environment, a simple LIMIT offset, size fails because data is physically isolated. The solution is to broadcast the query to all shards, collect results, perform a global sort in middleware, and then apply pagination.

SELECT * FROM order_tab ORDER BY id LIMIT 4 OFFSET 2;

Assuming data is split by user_id % 2 into order_tab_0 and order_tab_1, various distribution scenarios illustrate why the original offset/size semantics break.

To guarantee correctness, the middleware rewrites the query to LIMIT size+offset OFFSET 0, fetches that many rows from each shard, merges and sorts them, then discards the first offset rows.

Network Overhead : For LIMIT 10 OFFSET 10000 across 10 shards, roughly 100 MB of unnecessary data may be transferred. Memory Consumption : Loading all fetched rows can cause OOM on deep pagination. CPU Load : Sorting large datasets is CPU‑intensive.

An optimization is to use multi‑way merge sort, keeping only a small heap of the current smallest rows from each shard, thus reducing memory usage.

3. Interview‑Ready Solutions

Beyond the global query, three practical approaches can be discussed in interviews:

3.1 Disable Offset‑Based Pagination (Infinite Scroll)

Restrict the UI to forward‑only loading. The client sends the last record’s key (e.g., max_id) instead of an OFFSET. Example:

-- First page
SELECT * FROM order_tab ORDER BY id LIMIT 50;

-- Subsequent page
SELECT * FROM order_tab WHERE id > 1050 ORDER BY id LIMIT 50;

This keeps LIMIT constant and eliminates offset‑related performance degradation.

3.2 Two‑Phase Query (Highlight 1)

When precise jump‑page is required, perform an initial query with a reduced offset (global offset divided by shard count), find the minimum key across shards, then issue a second BETWEEN query to fetch a superset, finally compute the exact page.

Example steps:

Rewrite LIMIT 5 OFFSET 10 to LIMIT 5 OFFSET 3 (10/3 shards ≈ 3) and run on each shard.

Identify the smallest age_min among returned rows.

Issue a second query WHERE age BETWEEN age_min AND max_returned on each shard.

Merge, sort, and apply the original offset/limit to obtain the final result.

The method works well when data is evenly distributed; otherwise it may produce incorrect pages.

3.3 Index Table (Highlight 2)

Create a non‑sharded index table (or materialized view) that stores only the sorting key and primary key of each row, e.g., (id, update_time, shard_id). Pagination is performed on this small table:

SELECT primary_key, shard_id FROM index_table ORDER BY update_time DESC LIMIT 10 OFFSET 100;

Then fetch the full rows from the corresponding shards using IN. Challenges include keeping the index table consistent (dual‑write or async binlog replication) and the limitation that all filter conditions must exist in the index table.

3.4 External Storage

For complex sorting and filtering, offload data to specialized systems:

Search Engine : Sync data to Elasticsearch and let it handle pagination, sorting, and filtering, then retrieve full records by ID.

NewSQL Distributed DB : Use TiDB/TiKV which provides a globally ordered key‑value store, allowing efficient ORDER BY and LIMIT across shards without application‑level sharding logic.

4. Summary

Pagination on sharded databases is not a simple SQL problem; it involves performance, consistency, and user‑experience trade‑offs. No single solution fits all scenarios—engineers must understand business requirements and choose between global query, infinite scroll, two‑phase queries, index tables, or external storage based on acceptable complexity and performance.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Performance Optimization Sharding distributed databases backend interview

Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.