Why Cross‑Shard Queries Slow Down: Sharding‑Sphere Performance Test Insights
This article presents a practical performance test of Sharding‑Sphere using MySQL, showing how cross‑shard queries incur significant latency due to parallel routing and result merging, and discusses the underlying thread‑pool execution model and its impact on OLTP versus OLAP workloads.
In a previous article the author introduced the three main query patterns for e‑commerce order tables—by order ID, merchant ID, and user ID—and argued that all three need separate sharding; using only one column such as user ID is insufficient.
Test Environment
128 tables: image_${0..127} Database server: 32 CPU / 64 GB RAM
MySQL version: 5.7.23
OS: CentOS 6.9
Connection pool: Druid 1.1.6
JDBC driver: mysql‑connector‑java 6.0.5
MyBatis 3.4.5
MyBatis‑Spring 1.3.1
Spring Boot 1.5.9.RELEASE
Sharding‑Sphere 3.1.0
JVM options: -Xmx2g -Xms2g -Xmn1g -Xss256k -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=512m -XX:+AlwaysPreTouch
Druid default configuration
Table Definition
-- id is the sharding key. DDL is pseudo‑SQL
CREATE TABLE `image_${0..127}` (
`id` varchar(32) NOT NULL,
`image_no` varchar(50) NOT NULL,
`file_name` varchar(200) NOT NULL COMMENT 'image file name',
`source` varchar(32) DEFAULT NULL COMMENT 'image source',
`create_date` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'image creation time',
PRIMARY KEY (`id`),
KEY `idx_image_no` (`image_no`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;Test Scenario 1
Each table holds about 1.6 million rows.
10 000 queries that join on imageNo (cross‑shard) while also using the sharding key id.
Result:
Conclusion: Cross‑shard queries perform far worse than queries that include the sharding key.
Test Scenario 2
Each table holds about 1.6 million rows.
10 000 queries testing across 1, 8, 16, 32, 64, and 128 tables.
Result:
Conclusion: The more tables a query spans, the worse its performance.
Why Cross‑Shard Queries Are Slow
Sharding‑Sphere routes a cross‑shard query by concurrently sending sub‑queries to all matching tables via a thread pool, then merges the results. When the routing result contains only one shard, the engine executes the query synchronously to avoid thread overhead (see ShardingExecuteEngine.java).
Because the execution itself is fast, the bottleneck lies in the result‑merging phase. To verify this, the author benchmarked the underlying concurrency library (Google Guava MoreExecutors) used by Sharding‑Sphere.
Conclusion: As the level of concurrent execution increases, the cost of merging results grows dramatically.
Relevant Source Code
public class ConcurrentExecutorTest {
private static final ListeningExecutorService executorService;
public static final int CONCURRENT_COUNT = 64;
public static final int batchSize = CONCURRENT_COUNT;
public static final int EXECUTOR_SIZE = 8;
static {
executorService = MoreExecutors.listeningDecorator(Executors.newFixedThreadPool(EXECUTOR_SIZE));
MoreExecutors.addDelayedShutdownHook(executorService, 60, TimeUnit.SECONDS);
}
private static <I, O> List<O> execute(final Collection<I> inputs) {
if (inputs.isEmpty()) {
return Collections.emptyList();
}
// concurrent execution
Collection<ListenableFuture<O>> allFutures = asyncExecute(inputs);
// result merging
return getResults(allFutures);
}
private static <I, O> Collection<ListenableFuture<O>> asyncExecute(final Collection<I> inputs) {
Collection<ListenableFuture<O>> result = new ArrayList<>(inputs.size());
for (final I each : inputs) {
// async execution returns result directly
result.add(executorService.submit(() -> (O) each));
}
return result;
}
private static <O> List<O> getResults(final Collection<ListenableFuture<O>> allFutures) {
List<O> result = new LinkedList<>();
for (ListenableFuture<O> each : allFutures) {
result.add(each.get());
}
return result;
}
}Final Thoughts
Although cross‑shard queries are costly, Sharding‑Sphere remains valuable as a generic sharding middleware that supports any SQL, making it suitable for low‑frequency OLAP or administrative operations where occasional full‑table scans are acceptable. In such scenarios the middleware can reduce development effort and accelerate time‑to‑market without introducing additional search systems.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
