Improving Million-Row Insert Performance with Spring Boot ThreadPoolTaskExecutor
This article shows how to speed up million-row inserts by configuring a ThreadPoolTaskExecutor in Spring Boot 2.1.1 with MyBatis‑Plus and PostgreSQL. It walks through the code, presents performance test results, and closes with best‑practice recommendations for multithreaded batch insertion.
Preface
Hello, I am a senior architect.
Development Purpose
Improve the efficiency of inserting data at million‑level scale.
Solution
Use ThreadPoolTaskExecutor for multithreaded batch insertion.
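Before diving into the Spring wiring, the core pattern is simple: split the data into batches, submit one task per batch to a thread pool, and block on a CountDownLatch until every batch is done. A minimal, self-contained sketch using plain java.util.concurrent (no Spring; the class and method names here are illustrative, not from the article):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class BatchInsertSketch {

    // Submit one task per batch, wait for all of them, return total rows "inserted".
    public static int runBatches(int batches, int batchSize) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(3); // stand-in for the Spring-managed pool
        CountDownLatch latch = new CountDownLatch(batches);
        AtomicInteger inserted = new AtomicInteger();
        for (int i = 0; i < batches; i++) {
            pool.execute(() -> {
                try {
                    inserted.addAndGet(batchSize); // stand-in for one batch INSERT
                } finally {
                    latch.countDown();             // always release the latch, even on failure
                }
            });
        }
        latch.await();   // block until every batch task has finished
        pool.shutdown();
        return inserted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rows inserted: " + runBatches(5, 100)); // prints 500
    }
}
```

The rest of the article implements exactly this shape, with Spring's ThreadPoolTaskExecutor as the pool and a MyBatis-Plus mapper doing the real insert.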
Technology Stack
Spring Boot 2.1.1
MyBatis‑Plus 3.0.6
Swagger 2.5.0
Lombok 1.18.4
PostgreSQL
ThreadPoolTaskExecutor
Implementation Details
Add the thread‑pool configuration to application-dev.properties:
# Async thread configuration
# Core pool size
async.executor.thread.core_pool_size = 30
# Max pool size
async.executor.thread.max_pool_size = 30
# Queue capacity
async.executor.thread.queue_capacity = 99988
# Thread name prefix
async.executor.thread.name.prefix = async-importDB-

Inject the thread‑pool bean into the Spring container:
@Configuration
@EnableAsync
@Slf4j
public class ExecutorConfig {
@Value("${async.executor.thread.core_pool_size}")
private int corePoolSize;
@Value("${async.executor.thread.max_pool_size}")
private int maxPoolSize;
@Value("${async.executor.thread.queue_capacity}")
private int queueCapacity;
@Value("${async.executor.thread.name.prefix}")
private String namePrefix;
@Bean(name = "asyncServiceExecutor")
public Executor asyncServiceExecutor() {
log.warn("start asyncServiceExecutor");
// VisiableThreadPoolTaskExecutor is a custom subclass of ThreadPoolTaskExecutor
// (its definition is not shown in this article); a plain ThreadPoolTaskExecutor works too
ThreadPoolTaskExecutor executor = new VisiableThreadPoolTaskExecutor();
executor.setCorePoolSize(corePoolSize);
executor.setMaxPoolSize(maxPoolSize);
executor.setQueueCapacity(queueCapacity);
executor.setThreadNamePrefix(namePrefix);
executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
executor.initialize();
return executor;
}
}

Create the asynchronous service class:
@Service
@Slf4j
public class AsyncServiceImpl implements AsyncService {
@Override
@Async("asyncServiceExecutor")
public void executeAsync(List<LogOutputResult> logOutputResults,
                         LogOutputResultMapper logOutputResultMapper,
                         CountDownLatch countDownLatch) {
try {
log.warn("start executeAsync");
// async work
logOutputResultMapper.addLogOutputResultBatch(logOutputResults);
log.warn("end executeAsync");
} finally {
countDownLatch.countDown(); // ensure latch release
}
}
}

Batch insertion method using multithreading:
@Override
public int testMultiThread() {
List<LogOutputResult> logOutputResults = getTestData();
// split the data into sub-lists of 100 records, one task per sub-list
List<List<LogOutputResult>> lists = ConvertHandler.splitList(logOutputResults, 100);
CountDownLatch countDownLatch = new CountDownLatch(lists.size());
for (List<LogOutputResult> listSub : lists) {
asyncService.executeAsync(listSub, logOutputResultMapper, countDownLatch);
}
try {
countDownLatch.await(); // block until every batch task has finished
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
log.error("Blocking exception: " + e.getMessage());
}
return logOutputResults.size();
}

Tested with 2,000,003 records:
The multithreaded test (30 threads) took 1.67 minutes, while the single‑threaded run took 5.75 minutes, a roughly 3.4× speedup.
Data integrity checks confirmed no duplicate or missing records.
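The ConvertHandler.splitList helper used above is referenced but never shown in the article. A minimal sketch of such a partitioning helper (the class and method names match the article's call site; the implementation is assumed):

```java
import java.util.ArrayList;
import java.util.List;

public class ConvertHandler {

    // Partition a list into consecutive sub-lists of at most `size` elements.
    public static <T> List<List<T>> splitList(List<T> source, int size) {
        List<List<T>> result = new ArrayList<>();
        if (source == null || size <= 0) {
            return result;
        }
        for (int i = 0; i < source.size(); i += size) {
            // copy each slice so every async task owns an independent sub-list
            int end = Math.min(i + size, source.size());
            result.add(new ArrayList<>(source.subList(i, end)));
        }
        return result;
    }
}
```

Copying each slice (rather than handing out views via subList) matters here, because the sub-lists are consumed concurrently by the async tasks.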
Test Results
Different thread counts were evaluated; beyond a certain point, adding threads brought no further improvement. For this workload the best-performing pool size followed the rule of thumb "CPU cores × 2 + 2".
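The sizing rule above can be computed from the JVM's reported processor count. A small sketch (the heuristic is the article's rule of thumb for this workload, not a universal guarantee):

```java
public class ExecutorSizing {

    // Article's rule of thumb for this batch-insert workload: CPU cores * 2 + 2
    public static int optimalThreads(int cpuCores) {
        return cpuCores * 2 + 2;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("suggested pool size: " + optimalThreads(cores));
    }
}
```

The result could be fed into the core_pool_size and max_pool_size properties instead of the hard-coded 30 used earlier.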
Conclusion
Multithreading dramatically reduces insertion time, but adding more threads beyond the optimal number does not yield further gains; proper thread‑pool sizing is essential for efficient bulk inserts.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, as well as architecture evolution driven by internet technologies. Idea‑driven, sharing‑oriented architects are welcome to exchange ideas and learn together.