Boosting Throughput 10×: Architecture Evolution and Tuning for High‑Concurrency Batch Processing

The article details how a sluggish batch‑processing system handling millions of records was redesigned with XXL‑JOB sharding, Redis‑based dynamic task distribution, cursor pagination, and selective transaction scopes, achieving nearly ten‑fold throughput improvement while addressing resource contention, load‑balancing, and reliable result reconciliation.

Shepherd Advanced Notes

Jun 24, 2026

Problem

The system processed tens of thousands to millions of records per batch in a serial flow on a single node. This caused frequent out-of-memory (OOM) errors, high CPU usage, saturated DB connection pools, and deep pagination queries that blocked the database.

Architecture Evolution

Initial design

The front end submitted one massive request, the back end accepted it asynchronously, and a single worker processed the data sequentially. Because all work ran on one node, horizontal scaling was impossible.

Distributed sharding with XXL‑JOB

XXL‑JOB's sharding‑broadcast strategy dispatches a task to all registered executor nodes. Each node reads its shard index (shardingIndex, the current slice) and the shard count (shardingTotal, the total number of slices) via int nodeIndex = XxlJobHelper.getShardIndex(); and int nodeTotal = XxlJobHelper.getShardTotal();. The SQL routes rows to nodes by ID modulo:

SELECT * FROM tb_data
WHERE field = 'queryCondition'
  AND MOD(id, #{shardingTotal}) = #{shardingIndex}

XXL‑JOB derives shardingTotal from the executor nodes registered when the job is triggered, so newly registered healthy nodes are included from the next trigger onward. This allows horizontal scaling without code changes.

Sharding logic 1.0 – ID modulo

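Data IDs are partitioned by id % total = index. This works when IDs are evenly distributed, but a skewed distribution creates hot spots. Wrapping the column in MOD() also keeps the optimizer from using the primary‑key index for that predicate, so each node still reads every row that matches the other conditions.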
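The skew risk can be illustrated with a small sketch. The class and method names below are hypothetical and not part of the original system; the example only counts how many IDs each shard would receive under id % total.

```java
import java.util.Arrays;
import java.util.stream.LongStream;

// Hypothetical demo class; not part of the original system.
class ModuloShardDemo {

    /** Counts how many of the given IDs land on each shard under id % shardTotal. */
    static int[] countPerShard(long[] ids, int shardTotal) {
        int[] counts = new int[shardTotal];
        for (long id : ids) {
            counts[(int) Math.floorMod(id, (long) shardTotal)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Dense IDs 1..1000 spread evenly over 4 shards.
        long[] dense = LongStream.rangeClosed(1, 1000).toArray();
        System.out.println(Arrays.toString(countPerShard(dense, 4)));

        // IDs that are all multiples of 4 (for example, from a generator with step 4)
        // all land on shard 0 and leave the other three nodes idle.
        long[] stepped = LongStream.rangeClosed(1, 1000).map(i -> i * 4).toArray();
        System.out.println(Arrays.toString(countPerShard(stepped, 4)));
    }
}
```

With dense IDs every shard gets an equal share; with stepped IDs one shard gets everything, which is the hot spot described above.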

Sharding logic 2.0 – Redis Set of IDs

All IDs for a task are stored in a Redis Set. Each node reads the set, sorts it so that every node sees the same order, and keeps the elements whose position modulo shardTotal equals its shardIndex:

List<Long> sharding(Long taskId) {
    // Redis Sets are unordered, so sort to give every node the same sequence.
    List<Long> idList = redisUtil.sGet(KeyCache.BATCH_SEND_PDF_TASK_LIST + taskId)
        .stream()
        .map(Long::parseLong)
        .sorted()
        .collect(Collectors.toList());
    int shardIndex = XxlJobHelper.getShardIndex();
    int shardTotal = XxlJobHelper.getShardTotal();
    // Keep the elements whose position (not value) maps to this node.
    List<Long> dataIds = new ArrayList<>();
    for (int i = shardIndex; i < idList.size(); i += shardTotal) {
        dataIds.add(idList.get(i));
    }
    return dataIds;
}

This approach avoids full table scans and keeps sharding logic lightweight.

Sharding logic 3.0 – Dynamic load‑balancing via Redis list

All IDs are pushed into a Redis List. Each node repeatedly pops a batch (e.g., 200) from the tail, so faster nodes consume more IDs and load balances naturally. Because no slice is computed in advance, adding or removing a node does not reshuffle assignments:

while (true) {
    List<String> detailIds = redisTemplate.execute(popLuaScript,
        Collections.singletonList(redisKey), String.valueOf(batchSize));
    if (CollectionUtils.isEmpty(detailIds)) {
        break; // all IDs consumed
    }
    // process the batch ...
}
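The article does not show popLuaScript, so the following is only an in-memory sketch of the pull model, under the assumption that the script removes up to batchSize elements from the tail atomically. A synchronized ArrayDeque stands in for the Redis List, and threads with different per-batch delays stand in for nodes of different speed. All names here are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for the Redis List; hypothetical, for illustration only.
class PullQueueDemo {
    private final Deque<Long> ids = new ArrayDeque<>();

    PullQueueDemo(List<Long> allIds) {
        ids.addAll(allIds);
    }

    /** Atomically removes up to batchSize IDs from the tail (assumed behavior of the pop script). */
    synchronized List<Long> popBatch(int batchSize) {
        List<Long> batch = new ArrayList<>(batchSize);
        while (batch.size() < batchSize && !ids.isEmpty()) {
            batch.add(ids.pollLast());
        }
        return batch;
    }

    /** Runs one simulated node per entry of delaysMillis; returns how many IDs each node consumed. */
    static Map<String, Integer> run(List<Long> allIds, int batchSize, long[] delaysMillis) {
        PullQueueDemo queue = new PullQueueDemo(allIds);
        Map<String, Integer> consumed = new ConcurrentHashMap<>();
        List<Thread> nodes = new ArrayList<>();
        for (int n = 0; n < delaysMillis.length; n++) {
            String name = "node-" + n;
            long delay = delaysMillis[n];
            Thread t = new Thread(() -> {
                int count = 0;
                while (true) {
                    List<Long> batch = queue.popBatch(batchSize);
                    if (batch.isEmpty()) {
                        break; // all IDs consumed
                    }
                    count += batch.size();
                    try {
                        Thread.sleep(delay); // simulated per-batch processing time
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
                consumed.put(name, count);
            });
            nodes.add(t);
            t.start();
        }
        for (Thread t : nodes) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return consumed;
    }
}
```

Calling PullQueueDemo.run(ids, 50, new long[]{1, 5}) typically shows the node with the shorter delay consuming more IDs, while the per-node counts always add up to the total. No ID is handed out twice because each pop is atomic.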

In‑node parallel processing

After sharding, each node partitions its data list into batches (e.g., 200 records) and processes them concurrently with CompletableFuture:

List<List<Data>> partitions = Lists.partition(dataList, 200);
List<CompletableFuture<PartitionResult>> futures = partitions.stream()
    .map(subList -> CompletableFuture.supplyAsync(() ->
        doSendLetter(param, subList, session, orgSwitch), threadExecutor))
    .collect(Collectors.toList());
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();

Concurrency is tuned based on queue depth, average latency, downstream error rate and resource usage rather than being unbounded.
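One way to keep in-node concurrency bounded is a fixed-size pool with a bounded queue and a caller-runs rejection policy. The article does not say how threadExecutor is configured, so this is a sketch under that assumption. The class name, pool size, and queue capacity are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; the real executor configuration is not shown in the article.
class BoundedPoolDemo {

    /** Builds a pool with a fixed thread count and a bounded work queue. */
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity),
                // When the queue is full, the submitting thread runs the task itself,
                // which slows submission down instead of queueing without limit.
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    /** Splits `total` records into batches and sums the batch sizes in parallel. */
    static int processAll(int total, int batchSize, ThreadPoolExecutor pool) {
        List<CompletableFuture<Integer>> futures = new ArrayList<>();
        for (int start = 0; start < total; start += batchSize) {
            int size = Math.min(batchSize, total - start);
            // Stand-in for real work: each task just reports its batch size.
            futures.add(CompletableFuture.supplyAsync(() -> size, pool));
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        return futures.stream().mapToInt(CompletableFuture::join).sum();
    }
}
```

The thread count and queue capacity are the knobs to adjust as queue depth, latency, and downstream error rate are observed. Remember to call shutdown() on the pool when the node is done.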

Batch‑oriented I/O reduction

Frequent single‑record DB queries or remote calls dominate latency. The recommended batch size is 200‑500 records per IN clause or bulk insert, which reduces round‑trips and lock contention.
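The batching idea can be sketched as follows: split the ID list into chunks and issue one parameterized IN query per chunk. The class and method names are hypothetical. Only the table name tb_data comes from the article, and the chunk size of 200 is one value from the recommended range.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; shows chunking and placeholder generation only, no database access.
class InClauseBatcher {

    /** Splits ids into consecutive chunks of at most chunkSize elements. */
    static List<List<Long>> chunk(List<Long> ids, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<Long>> chunks = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, ids.size());
            chunks.add(new ArrayList<>(ids.subList(from, to)));
        }
        return chunks;
    }

    /** Builds a parameterized query with one "?" placeholder per ID in the chunk. */
    static String buildSelect(int idCount) {
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "SELECT * FROM tb_data WHERE id IN (" + placeholders + ")";
    }
}
```

For 450 IDs and a chunk size of 200, this yields three queries (200, 200, and 50 IDs) instead of 450 single-row lookups.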

Result aggregation and final state convergence

Updating the task status per record causes row‑level lock contention. The pattern aggregates results locally and writes back once per node or per batch:

long successTotal = 0;
long failTotal = 0;
for (CompletableFuture<PartitionResult> f : futures) {
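    // join() returns the real result; allOf(...).join() has already waited for every future,
    // so this does not block and cannot silently fall back to a zero-count default.
    PartitionResult r = f.join();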
    successTotal += r.getSuccessCount();
    failTotal += r.getFailCount();
}
handleTaskProgress(taskId, successTotal, failTotal);

Two concrete approaches for the final status transition are provided:

Solution A – Redis counter: Store the total expected count in Redis (e.g., flow:task:total:{taskId}=5000). After processing, each node atomically decrements the counter by the number of records it handled. The node whose decrement brings the counter to zero updates MySQL to status='COMPLETED' and deletes the Redis key.

Solution B – Optimistic‑lock SQL: Each node first adds its counts to the task row, then executes an UPDATE that sets status='COMPLETED' only when (success_count + fail_count) = total_count. The update succeeds (affected_rows > 0) only once all counts have been reported.

UPDATE task_main
SET success_count = success_count + #{successInc},
    fail_count = fail_count + #{failInc},
    updated_at = NOW()
WHERE id = #{taskId};

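-- final status attempt; the status guard ensures only one node wins
-- even if several nodes reach this statement after all counts are in
UPDATE task_main
SET status = 'COMPLETED', updated_at = NOW()
WHERE id = #{taskId}
  AND status <> 'COMPLETED'
  AND (success_count + fail_count) = total_count;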
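The counter logic of Solution A can be sketched in memory, with an AtomicLong standing in for the Redis key and addAndGet standing in for an atomic decrement such as DECRBY. The class and method names are hypothetical. The point is that exactly one caller observes zero, so exactly one node performs the final status update.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory model of the Redis countdown; no Redis or MySQL involved.
class CompletionCounterDemo {
    private final AtomicLong remaining;                         // stand-in for the Redis counter
    private final AtomicInteger finalUpdates = new AtomicInteger(); // times the final update fired

    CompletionCounterDemo(long totalExpected) {
        this.remaining = new AtomicLong(totalExpected);
    }

    /** Reports `processed` finished records; returns true only for the call that reaches zero. */
    boolean report(long processed) {
        long left = remaining.addAndGet(-processed); // atomic decrement
        if (left == 0) {
            finalUpdates.incrementAndGet(); // the real system would mark the task COMPLETED here
            return true;
        }
        return false;
    }

    int finalUpdateCount() {
        return finalUpdates.get();
    }

    /** Simulates several nodes reporting batches concurrently; returns the number of final updates. */
    static int simulate(int nodes, int batchesPerNode, long batchSize) {
        CompletionCounterDemo counter =
                new CompletionCounterDemo((long) nodes * batchesPerNode * batchSize);
        List<Thread> threads = new ArrayList<>();
        for (int n = 0; n < nodes; n++) {
            Thread t = new Thread(() -> {
                for (int b = 0; b < batchesPerNode; b++) {
                    counter.report(batchSize);
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.finalUpdateCount();
    }
}
```

This works only because the decrement and the read of the new value are a single atomic step. Reading the counter and then decrementing it in two separate calls would let two nodes both believe they were last.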

Database‑level optimizations

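Traditional LIMIT offset, size pagination forces the database to read and discard the first offset rows before returning a page, so deep pages on massive tables become progressively slower. The redesign switches to cursor‑based pagination keyed on the largest primary key of the previous page: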

SELECT * FROM tb_data
WHERE id > #{lastMaxId}
ORDER BY id ASC
LIMIT #{pageSize};
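The loop that drives this query can be sketched as follows. A TreeSet stands in for the primary-key index of tb_data, and fetchPage mimics the SQL above. The class and method names are hypothetical, and the starting cursor of 0 assumes positive IDs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical in-memory model of keyset (cursor) pagination; no database involved.
class CursorPagingDemo {
    private final NavigableSet<Long> table; // stand-in for the primary-key index

    CursorPagingDemo(List<Long> ids) {
        this.table = new TreeSet<>(ids);
    }

    /** Equivalent of: WHERE id > lastMaxId ORDER BY id ASC LIMIT pageSize. */
    List<Long> fetchPage(long lastMaxId, int pageSize) {
        List<Long> page = new ArrayList<>(pageSize);
        for (Long id : table.tailSet(lastMaxId, false)) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(id);
        }
        return page;
    }

    /** Walks the whole table page by page, appends every ID to sink, and returns the page count. */
    int scanAll(int pageSize, List<Long> sink) {
        long lastMaxId = 0L; // assumes all IDs are positive
        int pages = 0;
        while (true) {
            List<Long> page = fetchPage(lastMaxId, pageSize);
            if (page.isEmpty()) {
                break;
            }
            sink.addAll(page);
            lastMaxId = page.get(page.size() - 1); // the cursor is the last ID of this page
            pages++;
        }
        return pages;
    }
}
```

Each page starts directly after the previous cursor, so gaps in the ID sequence do not matter and the cost of a page does not grow with its depth.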

Only the minimal writes (status and audit updates) run inside @Transactional methods, which keeps transactions short and reduces how long locks are held.

Outcome

The combined redesign (horizontal sharding with XXL‑JOB, Redis‑driven dynamic distribution, in‑node multi‑threaded batch processing, batch‑oriented I/O, and lightweight transaction scopes) raised overall throughput nearly tenfold while preserving correctness and leaving clear paths for scaling, failure recovery, and high availability.
