How Druid’s PreparedStatement Cache Breaks ShardingSphere Routing, and How to Fix It
An unexpected data loss issue in a ShardingSphere‑JDBC setup was traced to Druid’s poolPreparedStatements cache, which reuses PreparedStatement objects across different shards, bypassing dynamic routing; disabling the cache or switching to HikariCP restores correct sharding behavior with minimal performance impact.
Problem: Data written to wrong shard, queries return empty
On a Friday afternoon, a tester reported that an order for user ID 1001 could not be found even though the backend log showed a successful write. The system uses ShardingSphere‑JDBC with Druid connection pool, sharding on user_id across four databases. Theoretically, user_id=1001 should route to db1, but the query on db1 returned no record.
@me Data mismatch! User ID 1001 order not found, but backend log shows write succeeded.
Further symptoms:
The same user’s orders were scattered across different databases at different times.
After restarting the application the routing seemed correct, but later became chaotic again.
SQL execution logs showed success, yet data was “missing”.
Initial suspicion was that the ShardingSphere sharding algorithm was wrong.
First investigation: suspect sharding logic
// Imports assume the ShardingSphere 5.x API.
import java.util.Collection;

import org.apache.shardingsphere.sharding.api.sharding.standard.PreciseShardingValue;
import org.apache.shardingsphere.sharding.api.sharding.standard.StandardShardingAlgorithm;

public final class ModShardingTableAlgorithm implements StandardShardingAlgorithm<Long> {

    @Override
    public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<Long> shardingValue) {
        long userId = shardingValue.getValue();
        int dbIndex = (int) (userId % 4); // mod-4 sharding on user_id
        return "orders_" + dbIndex;
    }
}

The logic is clear: 1001 % 4 = 1, so the row should go to orders_1, yet the data appeared in orders_3.
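To confirm, a log line was added inside the algorithm. A minimal sketch, assuming SLF4J is on the classpath (this fragment belongs inside ModShardingTableAlgorithm):

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

private static final Logger log = LoggerFactory.getLogger(ModShardingTableAlgorithm.class);

@Override
public String doSharding(Collection<String> availableTargetNames, PreciseShardingValue<Long> shardingValue) {
    long userId = shardingValue.getValue();
    String target = "orders_" + (int) (userId % 4);
    // Print every routing decision so misrouted writes become visible.
    log.info("sharding route: user_id={} -> {}", userId, target);
    return target;
}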
Logging the routing result revealed:
Earlier executions routed correctly, but later user_id=1005 (1005 % 4 = 1, correctly orders_1) still landed on orders_1, while user_id=1002 (which should route to orders_2) wrote to orders_3!
This was not a sharding algorithm bug; the connection had become “fixed”.
Insight: PreparedStatement cache vs dynamic routing
Druid has a configuration:
druid:
  poolPreparedStatements: true

Enabling this caches PreparedStatement objects on the physical connection to improve performance. In a sharding scenario, ShardingSphere’s workflow is:
SQL execution -> Parse -> Compute sharding -> Get target datasource -> Get connection -> Create PreparedStatement -> Execute

If Druid caches the PreparedStatement, the following happens:
First execution: user_id=1001 → routes to db1 → creates and caches a PreparedStatement on conn‑db1.
Second execution: user_id=1002 (should route to db2) → Druid finds a reusable PreparedStatement on conn‑db1 → reuses it → SQL runs on db1!
This is the root cause: the cache bypasses ShardingSphere’s routing decision, causing all subsequent SQL to execute on the initially cached connection.
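To make the pinning effect concrete, here is a deliberately simplified toy model (not Druid’s actual implementation) of the behavior described above: once a connection has cached a statement for a SQL template, that connection keeps serving the template regardless of the routing result.

import java.util.ArrayList;
import java.util.List;

final class StickyStatementPoolDemo {

    record Conn(String db, List<String> cachedSql) {}

    public static void main(String[] args) {
        List<Conn> pool = List.of(
                new Conn("db1", new ArrayList<>()),
                new Conn("db2", new ArrayList<>()));
        String sql = "INSERT INTO orders VALUES (?, ?)";

        execute(pool, sql, "db1"); // user_id=1001 -> routed to db1, caches there
        execute(pool, sql, "db2"); // user_id=1002 -> routed to db2, but pinned to db1!
    }

    static void execute(List<Conn> pool, String sql, String routedDb) {
        // Prefer any connection that already holds a cached statement for this
        // SQL template; fall back to the connection the router actually chose.
        Conn conn = pool.stream()
                .filter(c -> c.cachedSql().contains(sql))
                .findFirst()
                .orElseGet(() -> pool.stream()
                        .filter(c -> c.db().equals(routedDb))
                        .findFirst()
                        .orElseThrow());
        if (!conn.cachedSql().contains(sql)) {
            conn.cachedSql().add(sql); // simulate caching the compiled statement
        }
        System.out.println("routed to " + routedDb + ", executed on " + conn.db());
    }
}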
Deep analysis: why poolPreparedStatements=true is “poison” in sharding
1. Conflict of design goals
Design goal: Druid cache aims to boost performance by reducing SQL parsing overhead; ShardingSphere aims for horizontal data scaling.
Cache granularity: Druid caches per physical connection, while the sharding middleware must decide the route per SQL statement + sharding‑key value.
Execution: Druid reuses compiled SQL templates; ShardingSphere determines the target datasource on each execution.
Conflict: Druid’s cache locks the connection, preventing dynamic data‑source switching required by sharding.
Conclusion: the two design philosophies clash, and the cache destroys routing dynamism.
2. Druid cache mechanism diagram
[App] → Execute SQL: INSERT INTO orders VALUES(?, ?)
[ShardingSphere] → Parse → Route to db1
↓ Get connection conn‑db1
[Druid] → Check if conn‑db1 has cached PreparedStatement
→ Yes? Reuse! → Execute (even if next should go to db2)
→ No? Create and cache

Once the cache is established, all later identical SQL templates reuse the same connection’s PreparedStatement, completely bypassing routing decisions.
3. Why it works after a restart
Restarting rebuilds the connection pool and clears the cache, so the first execution routes correctly. As soon as a connection caches a PreparedStatement, it becomes a “black hole” that absorbs subsequent requests.
Ultimate solution: three‑pronged fix
Solution 1: Disable Druid’s PreparedStatement cache (recommended)
# application.yml
spring:
  datasource:
    druid:
      pool-prepared-statements: false
      max-pool-prepared-statement-per-connection-size: -1

This is the simplest and safest approach. In a sharding scenario, correct routing outweighs the marginal performance gain from caching.
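If the pool is built programmatically rather than through Spring Boot properties, the equivalent setters on DruidDataSource look like this (a minimal sketch; the URL and credentials are placeholders):

import com.alibaba.druid.pool.DruidDataSource;

public final class DruidPoolFactory {

    public static DruidDataSource create(String url, String user, String password) {
        DruidDataSource ds = new DruidDataSource();
        ds.setUrl(url);
        ds.setUsername(user);
        ds.setPassword(password);
        // Disable the per-connection PreparedStatement cache so every execution
        // follows ShardingSphere's routing decision.
        ds.setPoolPreparedStatements(false);
        ds.setMaxPoolPreparedStatementPerConnectionSize(-1);
        return ds;
    }
}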
Solution 2: Switch to HikariCP (more elegant)
spring:
  datasource:
    type: com.zaxxer.hikari.HikariDataSource
    hikari:
      # HikariCP deliberately has no built-in PreparedStatement cache;
      # if you really need caching, enable it at the JDBC driver level,
      # e.g. for MySQL Connector/J:
      # data-source-properties:
      #   cachePrepStmts: true
      #   prepStmtCacheSize: 256
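For programmatic setup, a minimal sketch using HikariCP’s API (URL and credentials are placeholders; the data-source properties are MySQL Connector/J options, not HikariCP’s own):

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public final class HikariPoolFactory {

    public static HikariDataSource create(String url, String user, String password) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl(url);
        config.setUsername(user);
        config.setPassword(password);
        // Optional driver-level statement caching. The cache lives inside
        // whichever connection ShardingSphere routes to, so routing is
        // never bypassed.
        config.addDataSourceProperty("cachePrepStmts", "true");
        config.addDataSourceProperty("prepStmtCacheSize", "256");
        return new HikariDataSource(config);
    }
}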
Solution 3: Monitoring + alerts

druid:
  connection-properties: druid.stat.mergeSql=true;druid.stat.slowSqlMillis=1

logging:
  level:
    org.apache.shardingsphere: DEBUG

Performance impact assessment
Druid + poolPreparedStatements=true: throughput 8,200 ops/s, avg latency 12.1 ms.
Druid + poolPreparedStatements=false: throughput 7,950 ops/s, avg latency 12.6 ms.
HikariCP (default): throughput 8,100 ops/s, avg latency 12.3 ms.
Disabling the cache reduces performance by less than 4%, but guarantees data consistency—well worth it.
Why many “best practices” still enable it
Most Druid best‑practice guides were written for single‑database scenarios, where enabling poolPreparedStatements can improve performance by over 10%. In a sharding architecture, that same “best practice” becomes an accident waiting to happen.
Architecture evolves; configuration must keep pace.
Takeaways for Java engineers
Sharding + Druid → must set poolPreparedStatements to false.
Connection‑pool configuration is not one‑size‑fits‑all; evaluate with your middleware.
In production, suspect configuration before blaming code.
Performance optimization must never sacrifice correctness.
HikariCP + ShardingSphere is currently the most stable combination.
Deep thoughts: upgrading cognition for sharding architectures
1. From performance optimization to architectural coordination
Sharding is not just “splitting tables”; it is a dynamic routing system. Any connection‑pool, cache, or transaction setting must cooperate with the routing layer.
In distributed systems there is no “independent optimization”, only global coordination.
2. Why the issue is hidden
Triggers only on specific SQL patterns where the sharding key changes.
Appears under high concurrency when cache hit rate is high.
No error is thrown—data is merely misplaced, making diagnosis difficult.
3. Golden rule
Any performance tweak must pass a “routing consistency test”, otherwise it is a ticking time bomb.
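What might such a test look like? A hedged sketch, assuming JUnit 5 and two hypothetical helpers (insertOrder writes through the sharded DataSource; shardContaining probes each physical shard for the row):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class RoutingConsistencyTest {

    @Test
    void everyWriteLandsOnTheComputedShard() {
        for (long userId = 1000; userId < 1100; userId++) {
            String expected = "orders_" + (userId % 4); // the mod-4 rule from above
            insertOrder(userId);
            assertEquals(expected, shardContaining(userId),
                    "routing drifted for user_id=" + userId);
        }
    }

    // Hypothetical helpers: wire these to your sharded DataSource and to
    // direct connections against each physical shard.
    private void insertOrder(long userId) {
        throw new UnsupportedOperationException("wire to the sharded DataSource");
    }

    private String shardContaining(long userId) {
        throw new UnsupportedOperationException("probe each physical shard");
    }
}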
Pre‑sharding checklist
Connection‑pool: ensure poolPreparedStatements is false (★★★★★).
Middleware compatibility: verify ShardingSphere + Druid official compatibility list (★★★★).
SQL execution mode: avoid using PreparedStatement templates that bypass sharding (★★★).
Monitoring & alerts: add SQL routing error monitoring (★★★★).
Data validation: regularly run shard data‑consistency checks (★★★★); a minimal sketch follows below.
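A hedged sketch of such a sweep, assuming each physical shard holds an orders table and mod‑N sharding on user_id (DataSource wiring is left to the reader):

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

import javax.sql.DataSource;

final class ShardConsistencyChecker {

    // For each physical shard i, flag any order whose user_id does not
    // satisfy user_id % shardCount == i.
    void check(List<DataSource> shards) throws SQLException {
        for (int i = 0; i < shards.size(); i++) {
            String sql = "SELECT user_id FROM orders WHERE MOD(user_id, "
                    + shards.size() + ") <> " + i;
            try (Connection conn = shards.get(i).getConnection();
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    System.err.printf("misplaced row: user_id=%d found in shard %d%n",
                            rs.getLong("user_id"), i);
                }
            }
        }
    }
}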
Conclusion: not a technical problem but a cognition problem
The root cause of sharding failures is often not the technology itself but our limited understanding of distributed systems. Chasing performance while ignoring overall system coordination inevitably leads to costly mistakes.
Remember: correct routing is more important than a 1% query‑speed gain.
Now check your Druid configuration—it may be destroying your sharding system.