Databases 55 min read

Sharding-JDBC Source Code Analysis and Custom Development

The article dissects Sharding‑JDBC’s five core engines—parsing, routing, rewriting, execution, and merging—highlights production pain points, and details custom extensions such as skipping sharding for specific tables, forcing master routing, runtime configuration refresh, batch‑update handling, sharding‑condition deduplication, full‑route validation, and a simplified component wrapper to ease integration and improve performance.

vivo Internet Technology
vivo Internet Technology
vivo Internet Technology
Sharding-JDBC Source Code Analysis and Custom Development

Author: vivo IT Platform Team - Xiong Huanxin

Sharding-JDBC is a JDBC‑layer database middleware widely used in sharding scenarios. This article parses the five core engines of Sharding-JDBC (parsing, routing, rewriting, execution, merging), shares pain points encountered in production, and presents custom development and refactoring solutions.

Business Background

Rapid growth of daily data in marketing inventory, transaction orders, finance ledgers, and attendance records creates pressure on single‑node databases. By distributing unrelated data across multiple databases and tables, system pressure is reduced and query performance improved.

Technical Selection

Sharding-JDBC was chosen over other middleware for its lightweight nature, better performance, and lower integration difficulty. However, several limitations were discovered during use.

1. Parsing Engine

The parsing engine tokenizes SQL into an abstract syntax tree (AST) which is the basis for routing and rewriting. Example of JDBC execution code:

//获取数据库连接
try (Connection conn = DriverManager.getConnection("mysqlUrl", "userName", "password")) {
String sql = "SELECT * FROM  t_user WHERE name = ?";
//预编译SQL
try (PreparedStatement preparedStatement = conn.prepareStatement(sql)) {
preparedStatement.setString(1, "vivo");
preparedStatement.execute(sql);
try (ResultSet resultSet = preparedStatement.getResultSet()) {
while (resultSet.next()) {
//处理结果
}
}
}
}

The core parsing method:

org.apache.shardingsphere.sql.parser.SQLParserEngine#parse0
private SQLStatement parse0(final String sql, final boolean useCache) {
ParseTree parseTree = new SQLParserExecutor(databaseTypeName, sql).execute().getRootNode();
SQLStatement result ;
if(RuleContextManager.isSkipSharding() && !VisitorRule.SELECT.equals(VisitorRule.valueOf(parseTree.getClass()))){
RuleContextManager.setMasterRoute(true);
result = new SkipShardingStatement();
} else {
result = (SQLStatement) ParseTreeVisitorFactory.newInstance(databaseTypeName, VisitorRule.valueOf(parseTree.getClass())).visit(parseTree);
}
return result;
}

2. Routing Engine

Routing determines which physical data nodes a SQL should be executed on. Example of route result creation:

org.apache.shardingsphere.sharding.route.engine.ShardingRouteDecorator#decorate
private RouteContext decorate(final RouteContext routeContext, final ShardingSphereMetaData metaData, final ShardingRule shardingRule, final ConfigurationProperties properties) {
ShardingConditions shardingConditions = getShardingConditions(parameters, sqlStatementContext, metaData.getSchema(), shardingRule);
// ... routing logic ...
return new RouteContext(sqlStatementContext, parameters, routeResult);
}

Sharding and master‑slave routing are combined when needed.

3. Rewrite Engine

After routing, logical table names are rewritten to physical ones and the SQL is split for each target node.

org.apache.shardingsphere.sharding.rewrite.SQLRewriteEntry#createSQLRewriteContext
public SQLRewriteContext createSQLRewriteContext(final String sql, final List
parameters, final SQLStatementContext sqlStatementContext, final RouteContext routeContext) {
SQLRewriteContext result = new SQLRewriteContext(schemaMetaData, sqlStatementContext, sql, parameters);
result.generateSQLTokens();
return result;
}

The rewrite result is turned into ExecutionUnit objects:

public final class ExecutionUnit {
private final String dataSourceName;
private final SQLUnit sqlUnit;
}

4. Execution Engine

Execution groups SQL statements by data source and creates connections according to maxConnectionsSizePerQuery . Connection mode (STRICT or MEMORY) determines whether connections are created per SQL or shared.

org.apache.shardingsphere.sharding.execute.sql.prepare.SQLExecutePrepareTemplate#getSQLExecuteGroups
private List
> getSQLExecuteGroups(final String dataSourceName, final List
sqlUnits, final SQLExecutePrepareCallback callback) throws SQLException {
int desiredPartitionSize = Math.max(0 == sqlUnits.size() % maxConnectionsSizePerQuery ? sqlUnits.size() / maxConnectionsSizePerQuery : sqlUnits.size() / maxConnectionsSizePerQuery + 1, 1);
List
> sqlUnitPartitions = Lists.partition(sqlUnits, desiredPartitionSize);
ConnectionMode connectionMode = maxConnectionsSizePerQuery < sqlUnits.size() ? ConnectionMode.CONNECTION_STRICTLY : ConnectionMode.MEMORY_STRICTLY;
List
connections = callback.getConnections(connectionMode, dataSourceName, sqlUnitPartitions.size());
// ... create StatementExecuteUnit for each partition ...
}

Result merging is performed based on the SQL type.

5. Merge Engine

Merge combines result sets from multiple nodes. For SELECT statements, ShardingDQLResultMerger handles grouping, ordering, and pagination.

org.apache.shardingsphere.sharding.merge.dql.ShardingDQLResultMerger#merge
public MergedResult merge(final List
queryResults, final SQLStatementContext sqlStatementContext, final SchemaMetaData schemaMetaData) throws SQLException {
if (queryResults.size() == 1) {
return new IteratorStreamMergedResult(queryResults);
}
// build group‑by, order‑by or iterator merged result
MergedResult mergedResult = build(queryResults, selectStatementContext, columnLabelIndexMap, schemaMetaData);
return decorate(queryResults, selectStatementContext, mergedResult);
}

Custom Development Highlights

1) Skip Sharding Syntax Restrictions – For non‑sharding tables the parsing, routing, rewriting, and merging steps are bypassed. The decision is stored in RuleContextManager (ThreadLocal) and the execution path returns a SkipShardingStatement and a manually constructed ExecutionUnit .

public final class RuleContextManager {
private static final ThreadLocal
SKIP_CONTEXT_HOLDER = ThreadLocal.withInitial(RuleContextManager::new);
private boolean skipSharding;
private boolean masterRoute;
public static boolean isSkipSharding() { return SKIP_CONTEXT_HOLDER.get().skipSharding; }
public static void setSkipSharding(boolean skip) { SKIP_CONTEXT_HOLDER.get().skipSharding = skip; }
}

2) Force Master Routing – A configuration property MASTER_ROUTE_ONLY sets MasterVisitedManager.setMasterVisited() so that MasterSlaveDataSourceRouter always routes to the master.

org.apache.shardingsphere.masterslave.route.engine.MasterSlaveRouteDecorator#decorate
if (properties.
getValue(ConfigurationPropertyKey.MASTER_ROUTE_ONLY)) {
MasterVisitedManager.setMasterVisited();
}

3) Dynamic Configuration Refresh – Added TypedProperties.refreshValue(String key, String value) to update configuration at runtime, exposed through the datasource’s runtime context.

public boolean refreshValue(String key, String value) {
for (E each : enumConstants) {
if (each.getKey().equals(key)) {
TypedPropertyValue typedPropertyValue = new TypedPropertyValue(each, value);
cache.put(each, typedPropertyValue);
props.put(key, value);
return true;
}
}
return false;
}

4) Batch UPDATE Support – The original prepareBatch method was extended to split multi‑statement batches, route each statement individually, and keep an EXECUTION_UNIT_LIST in ExecutionContext to preserve all units.

private ExecutionContext prepareBatch(List
splitSqlList, final List
allParameters) {
List
sqlList = splitSqlList.stream().distinct().collect(Collectors.toList());
String sql = sqlList.get(0);
Collection
globalExecutionUnitList = new ArrayList<>();
// route each parameter set
for (List
eachSqlParameterList : eachSqlParameterListList) {
RouteContext routeContext = executeRoute(sql, eachSqlParameterList);
globalExecutionUnitList.addAll(executeRewrite(sql, eachSqlParameterList, routeContext));
}
executionContextResult.getExtendMap().put(EXECUTION_UNIT_LIST, globalExecutionUnitList);
return executionContextResult;
}

5) ShardingCondition Deduplication – Added @EqualsAndHashCode to ListRouteValue and RangeRouteValue , then deduplicated the list of ShardingCondition objects.

private Collection
createShardingConditions(...){
// ... build conditions ...
Collection
distinctResult = result.stream().distinct().collect(Collectors.toCollection(LinkedList::new));
return distinctResult;
}

6) Full‑Route Validation – When ALLOW_EMPTY_SHARDING_CONDITIONS is false, the router throws an exception if a DML statement lacks sharding keys.

if (!properties.
getValue(ConfigurationPropertyKey.ALLOW_EMPTY_SHARDING_CONDITIONS)) {
if (sqlStatementContext.getSqlStatement() instanceof DMLStatement) {
if (shardingConditions.getConditions().isEmpty()) {
throw new ShardingSphereException("SQL does not contain sharding key");
}
}
}

7) Component Packaging – A wrapper component simplifies datasource configuration, sharding rule definition, and provides built‑in listeners for dynamic properties.

Usage Recommendations

• Always include sharding keys in SQL to avoid full routing. • Avoid routing a single SQL to multiple databases. • Do not apply functions or arithmetic on sharding keys. • Do not use sub‑queries on sharding tables. • CASE, HAVING, UNION(ALL) on sharding tables limit routing to a single node.

Additional best‑practice suggestions include using globally unique IDs, aligning GROUP BY and ORDER BY columns, and using incremental IDs for efficient pagination.

Conclusion

The article provides a detailed walkthrough of Sharding‑JDBC’s core engines, custom extensions to bypass syntax restrictions, force master routing, enable dynamic configuration, support batch updates, and improve performance. The customizations aim to lower integration effort and reduce the impact on existing SQL while maintaining the robustness of Sharding‑JDBC.

source code analysisDatabase MiddlewareMERGE EngineSQL RewriteExecution EngineCustom DevelopmentRouting EngineSharding-JDBC
vivo Internet Technology
Written by

vivo Internet Technology

Sharing practical vivo Internet technology insights and salon events, plus the latest industry news and hot conferences.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.