How to Design Pagination for Billion‑Row Sharded Databases in an Interview
The article systematically breaks down pagination challenges in billion‑row sharded databases, compares common sharding strategies and middleware architectures, analyzes the performance drawbacks of a naïve global‑query approach, and presents several practical alternatives—including keyset pagination, two‑stage queries, index‑table tricks, and external search or NewSQL solutions—while highlighting their trade‑offs for interview discussions.
Common Sharding Strategies
Hash Modulo – e.g., user_id % 16 distributes rows across 16 tables. Even data distribution, but range queries must scan all shards.
Range Partitioning – tables split by a field such as month, e.g., orders_202401, orders_202402. Fast range queries and hot‑cold separation, yet recent shards become write hotspots.
Routing (Middle) Table – a separate mapping table records which physical shard holds a primary key. Adds an extra lookup and complexity but provides flexible routing.
In practice these strategies are often combined, e.g., hash‑by‑user‑id then range‑by‑time within each database.
Sharding Implementation Modes
SDK Mode – a jar is linked directly into the application. Routing and aggregation happen in‑process, giving the lowest latency, but the client is tightly coupled to a specific language and version upgrades require coordinated redeployment.
Proxy Mode – an independent proxy pretends to be a database. All SQL passes through the proxy, which parses, routes, executes and merges results. Language‑agnostic, but introduces an extra network hop, becomes a potential bottleneck and single point of failure, requiring HA components such as LVS/Nginx and Keepalived.
Sidecar Mode – the sharding capability runs as a sidecar alongside the service in a service‑mesh environment. It aims to combine the low‑coupling of SDK with the language‑agnostic nature of Proxy, but mature open‑source implementations are scarce.
Performance ranking: SDK > Sidecar > Proxy.
Global Query Method
A plain SELECT … ORDER BY id LIMIT 4 OFFSET 2 works on a single table but fails on sharded tables because OFFSET and SIZE are meaningless across shards. The naive fix is to broadcast the query to every shard, request size+offset rows from each, merge them in memory, then apply the global offset and limit.
SELECT * FROM order_tab ORDER BY id LIMIT 6 OFFSET 0; -- (4 + 2 = 6)Performance analysis for deep pagination (e.g., LIMIT 10 OFFSET 10000) with 10 shards:
Network overhead : each shard would transmit ~10 001 rows. Assuming 1 KB per row, total traffic ≈ 97.7 MB while only 10 KB of useful data is needed.
Memory consumption : the proxy must load all transmitted rows for sorting, risking OOM on large offsets.
CPU load : merge‑sort of the combined result set is CPU‑intensive and becomes the system bottleneck.
Optimized Pagination Designs
Keyset Pagination (disable OFFSET)
For infinite‑scroll scenarios, remember the last row’s sorting key and request the next page with a WHERE id > last_id clause. Example for a page size of 50:
-- First page
SELECT * FROM order_tab ORDER BY id LIMIT 50;
-- Subsequent page (last_id = 1050)
SELECT * FROM order_tab WHERE id > 1050 ORDER BY id LIMIT 50;Composite keys are handled by extending the predicate, e.g.,
WHERE (create_time < last_create_time) OR (create_time = last_create_time AND id < last_id).
Two‑Stage Query
Works by first querying each shard with a reduced offset ( global_offset / shard_count) and then issuing a second query that expands the range using a BETWEEN clause based on the minimum value returned in the first stage.
Concrete example with three shards storing ages 1‑30 and a global request LIMIT 5 OFFSET 10:
Rewrite to LIMIT 5 OFFSET 3 (10/3) and run on every shard.
Collect the smallest returned age ( age_min) and the per‑shard offsets.
Issue a second query on each shard: SELECT * FROM T WHERE age BETWEEN age_min AND max_age, where max_age is the maximum age returned by that shard in step 1.
Merge, sort, and finally apply the original OFFSET 10 LIMIT 5 on the merged list.
The method assumes roughly even data distribution; otherwise rows may be missed or duplicated.
Index Table (Materialised View)
Create an unsharded index table that stores only the sorting key and primary key of each row. Pagination is performed on this table using normal LIMIT/OFFSET. The resulting primary keys are then used to fetch full rows from the appropriate shards via an IN query.
Challenges:
Data consistency : keeping the index table in sync with source tables. Options are synchronous double‑write (adds latency and distributed‑transaction complexity) or asynchronous binlog replication (e.g., Canal), which yields eventual consistency.
Query capability : the index table can only filter on columns it contains; adding more columns increases storage and maintenance cost.
External Storage
Search Engine – replicate data to Elasticsearch via double‑write or binlog sync. Elasticsearch handles pagination, sorting and complex filters efficiently; the application then retrieves full records from the primary DB using the returned IDs.
Distributed NewSQL – TiDB (built on TiKV) provides a globally ordered key‑value store, allowing ORDER BY and LIMIT to be executed across shards natively, making pagination transparent to the application.
Summary
Pagination over sharded tables is not a trivial SQL tweak; it involves network, memory and CPU trade‑offs, data‑consistency considerations, and user‑experience constraints. No single approach fits all scenarios. The appropriate design must be chosen based on data distribution, query patterns, performance requirements and operational complexity.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
