Databases 12 min read

Optimizing Distributed Database Sorting: Memory, Disk, and Proxy Strategies

This article examines the architecture of distributed databases, identifies challenges in sorting large result sets across shards, and presents step-by-step memory‑based, disk‑based, and incremental proxy sorting techniques, including buffer sizing, priority‑queue merging, and trade‑offs to avoid OOM and excessive I/O.

ITPUB
ITPUB
ITPUB
Optimizing Distributed Database Sorting: Memory, Disk, and Proxy Strategies

Background

Distributed database architectures typically consist of a coordinator (proxy), data shards, a metadata catalog, and a global clock (gtm). The proxy receives client queries, forwards them to the appropriate shards based on sharding keys, aggregates results, and ensures distributed transaction consistency.

Sorting queries such as SELECT * FROM t1 ORDER BY field1 require the proxy to merge ordered results from multiple shards. For small result sets the proxy can hold all shard data in memory and perform a merge. For large result sets this approach can cause OOM, while spilling to disk risks disk exhaustion and heavy I/O.

Solution Overview

Initial Sorting Scheme

Each shard performs local sorting before sending ordered rows to the proxy. The proxy merges these streams using either merge‑sort or a priority‑queue algorithm, greatly reducing its workload.

The proxy’s sort_buffer size (default 10 MB) is divided among the N participating shards (10 MB/N per shard). The following steps illustrate the process:

Client sends SELECT * FROM t1 ORDER BY id to the proxy.

Proxy forwards the query to each relevant shard group.

Each shard sorts locally and returns ordered rows.

Proxy stores each shard’s rows in its dedicated sort buffer and performs a merge.

Proxy streams the merged result back to the client.

Limitations of the Initial Scheme

The approach works only for modest data volumes. Scaling up the sort_buffer consumes more memory, and unlimited growth is impractical.

Disk‑Based Optimization

For large result sets, shards write their ordered rows to disk, and the proxy merges them from disk files. This avoids memory pressure but introduces significant disk I/O and storage requirements.

Steps:

Client issues the sorting query.

Proxy forwards it to each shard.

Shards sort locally and write ordered rows to disk.

Proxy stores each shard’s disk file reference.

Proxy uses a priority‑queue (heap) to merge rows, pre‑filling each shard’s sort buffer from its disk file to reduce I/O.

When a buffer is exhausted, the proxy reads the next block from the corresponding disk file.

Process repeats until all rows are sent to the client.

Drawbacks include the need for large disk space on the proxy and heavy disk I/O, especially when the client only needs a subset of the total rows (e.g., LIMIT 100w on 50 shards would require storing 5 000 w rows on disk).

Final Optimized Scheme

The refined approach avoids persisting shard data on the proxy’s disk. Instead, the proxy repeatedly pulls fixed‑size ordered chunks from each shard, fills the per‑shard sort buffers, and merges them on‑the‑fly using a priority queue. This reduces both memory and disk usage while maintaining high throughput.

Procedure:

Client sends SELECT * FROM t1 ORDER BY id to the proxy.

Proxy forwards the query to each shard group.

Shards sort locally and return a fixed‑size ordered batch.

Proxy stores each batch in its corresponding sort buffer.

Proxy merges rows using a priority‑queue heap, emitting rows to the client as soon as they are ready.

When a buffer is depleted, the proxy fetches the next batch from that shard.

Process continues until the client receives the required number of rows.

Analysis of the final scheme shows:

Elimination of disk storage on the proxy, removing I/O bottlenecks.

Reduced network traffic because only the necessary amount of data is transferred from each shard.

Predictable memory usage based on the configured sort buffer size (e.g., 2 MB per shard).

Best‑case scenario: data is evenly distributed, and after sending the required 100 w rows, each shard’s buffer is nearly empty, resulting in minimal extra data transfer. Worst‑case scenario: the proxy pulls 2 MB × 50 + 100 w rows, still without disk usage.

Prerequisites and Limitations

1) Shard nodes must support ordered queries (most modern databases do).

2) Shards need to allow batch fetching, e.g., MySQL can use streaming or cursor queries.

References

JDBC操作MySQL(3)—查询

MySQL JDBC StreamResult通信原理浅析

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Memory ManagementProxydistributed databasesSorting
ITPUB
Written by

ITPUB

Official ITPUB account sharing technical insights, community news, and exciting events.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.