Unveiling Sharding-Proxy: Transparent Database Sharding in Apache ShardingSphere
This talk walks through Sharding-Proxy’s role within Apache ShardingSphere, detailing its architecture, SQL lifecycle from protocol decoding to routing, rewriting, execution, and result merging, while highlighting core principles, performance bottlenecks, and practical optimization techniques for high‑throughput, low‑latency database proxying.
Sharding-Proxy Overview
Sharding-Proxy is a transparent database proxy component of Apache ShardingSphere, positioned as a protocol‑level gateway that supports MySQL and PostgreSQL clients. It encapsulates the binary protocol, providing unified access to sharding, distributed transactions, and governance features.
SQL Lifecycle in Sharding-Proxy
1. MySQL Protocol Decoding
Incoming packets are captured (e.g., via Wireshark) on the default port 3307. The protocol follows the TLV format: TYPE=Query, LENGTH=58, VALUE=SQL ASCII bytes.
2. SQL Parsing
SQL is parsed into an abstract syntax tree (AST) generated by Antlr. Lexical analysis tokenizes the statement, then syntactic analysis builds the AST, enabling extraction of sharding keys and rewrite points.
SELECT * FROM t_order WHERE user_id = 10 AND order_id = 1;3. Routing
Based on sharding rules (e.g., modulo), the logical query is routed to physical databases and tables. Example: user_id % 2 → ds_0, order_id % 2 → t_order_1. ShardingSphere supports multiple strategies such as hash, range, and time‑based routing.
4. SQL Rewriting
The logical SQL is rewritten into a concrete statement that can be executed on the target shards. Complex cases include aggregation, LIMIT, and column supplementation.
5. Execution
The rewritten SQL is sent through a HikariCP connection pool to the underlying MySQL instances. The execution engine may use streaming result sets to reduce memory pressure.
6. Result Merging
Results from multiple shards are merged. Simple merge concatenates rows; more advanced merges handle ORDER BY, GROUP BY, and pagination while minimizing memory usage.
7. MySQL Protocol Encoding
The merged result set is encoded back into the MySQL protocol and transmitted to the client, including column metadata.
Core Principles
IO & Thread Model
The front‑end uses Netty with NIO (epoll on Linux, NIO on Windows/macOS). BossGroup acts as the reactor, WorkerGroup processes client requests, and a UserExecutorGroup handles MySQL command execution. The back‑end relies on HikariCP for synchronous DB access. XA transactions require thread‑local resources.
Streaming Merge
Streaming merge processes rows as they arrive, requiring a streaming ResultSet on the JDBC side. This reduces memory consumption but depends on client consumption speed.
Performance Optimizations
Memory Leak Mitigation
When many JDBC clients fetch large result sets, Proxy buffers can grow unchecked, leading to OOM. Enabling streaming ResultSet (fetchSize) or cursor‑based fetching limits buffer growth.
Flow Control (Back‑pressure)
Netty’s ChannelOutboundBuffer can become unbounded if the client is slow. Setting high and low watermarks stops reading from MySQL when the buffer exceeds the high watermark, preventing memory exhaustion.
Distributed Tracing
SkyWalking agents instrument Sharding‑Proxy without code changes, providing end‑to‑end SQL call‑chain visibility.
Common Code Pitfalls
Using LinkedList with index‑based access leads to O(n²) complexity; replace with foreach iteration.
Storing global configuration in java.util.Properties causes synchronized access bottlenecks; prefer ConcurrentHashMap.
Unintended extra SQL (e.g., SELECT user()) doubles query count; caching usernames can improve performance by ~30%.
IO & System Calls
Netty’s write() followed by flush() for each row incurs high system‑call overhead. Batching flushes reduces TCP packet overhead and can improve throughput by up to 50%.
Full‑Route Issues
Incorrect parsing may cause a query intended for a single shard to be routed to all shards, exploding response time. Tools like JMC, JProfiler, or APM solutions (SkyWalking, SGM) help locate such anomalies.
Testing & Monitoring
Performance regressions are caught by automated PR tests and daily long‑duration stress tests (~10 h). Build numbers link to GitHub commits for quick root‑cause analysis.
Sharding‑Proxy has matured after more than a year of production use and is ready for adoption in real‑world workloads.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
