Improving Million-Scale Data Insertion Efficiency with Spring Boot ThreadPoolTaskExecutor
This article shows how to speed up the insertion of over two million records by configuring a ThreadPoolTaskExecutor in a Spring Boot application. It walks through the implementation code, presents test results comparing the multithreaded and single-threaded approaches, and offers practical guidance on thread pool sizing.
Purpose: Increase the efficiency of inserting massive amounts of data (over one million rows) by leveraging multithreaded batch insertion with Spring Boot, MyBatis-Plus, PostgreSQL, and ThreadPoolTaskExecutor.

Implementation Details:

application-dev.properties – Thread pool configuration
# Async thread pool configuration
# Core pool size
async.executor.thread.core_pool_size=30
# Maximum pool size
async.executor.thread.max_pool_size=30
# Queue capacity
async.executor.thread.queue_capacity=99988
# Name prefix for threads in the pool
async.executor.thread.name.prefix=async-importDB-

Spring container bean for the thread pool
@Configuration
@EnableAsync
@Slf4j
public class ExecutorConfig {

    @Value("${async.executor.thread.core_pool_size}")
    private int corePoolSize;
    @Value("${async.executor.thread.max_pool_size}")
    private int maxPoolSize;
    @Value("${async.executor.thread.queue_capacity}")
    private int queueCapacity;
    @Value("${async.executor.thread.name.prefix}")
    private String namePrefix;

    @Bean(name = "asyncServiceExecutor")
    public Executor asyncServiceExecutor() {
        log.warn("start asyncServiceExecutor");
        ThreadPoolTaskExecutor executor = new VisiableThreadPoolTaskExecutor();
        executor.setCorePoolSize(corePoolSize);
        executor.setMaxPoolSize(maxPoolSize);
        executor.setQueueCapacity(queueCapacity);
        executor.setThreadNamePrefix(namePrefix);
        // When the queue is full, run the task in the caller's thread instead of rejecting it
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.initialize();
        return executor;
    }
}

Asynchronous service implementation
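As an aside, the bean above is built on a VisiableThreadPoolTaskExecutor, a custom subclass whose source the article does not include. For readers who want to see what that Spring configuration amounts to under the hood, here is a plain java.util.concurrent equivalent (a sketch of my own, not from the article; the class and method names are assumed, and the parameter values mirror the properties file):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ImportPoolSketch {

    // Mirrors the Spring ThreadPoolTaskExecutor configuration above:
    // core 30, max 30, bounded queue of 99988, named threads, CallerRunsPolicy.
    public static ThreadPoolExecutor build() {
        AtomicInteger seq = new AtomicInteger(1);
        ThreadFactory factory = r -> {
            Thread t = new Thread(r);
            t.setName("async-importDB-" + seq.getAndIncrement());
            return t;
        };
        return new ThreadPoolExecutor(
                30, 30,                 // corePoolSize == maxPoolSize: a fixed-size pool
                60L, TimeUnit.SECONDS,  // keep-alive (irrelevant while core == max)
                new LinkedBlockingQueue<>(99988),
                factory,
                new ThreadPoolExecutor.CallerRunsPolicy()); // caller runs overflow tasks
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = build();
        pool.execute(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The CallerRunsPolicy choice matters for a bulk import: when the queue fills, the submitting thread does the insert itself, which throttles submission instead of dropping work.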
@Service
@Slf4j
public class AsyncServiceImpl implements AsyncService {

    @Override
    @Async("asyncServiceExecutor")
    public void executeAsync(List<LogOutputResult> logOutputResults,
                             LogOutputResultMapper logOutputResultMapper,
                             CountDownLatch countDownLatch) {
        try {
            log.warn("start executeAsync");
            // Asynchronous work: batch-insert this chunk of records
            logOutputResultMapper.addLogOutputResultBatch(logOutputResults);
            log.warn("end executeAsync");
        } finally {
            // Always release the latch, even if the insert throws
            countDownLatch.countDown();
        }
    }
}

Batch insertion logic using multiple threads
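The batch logic below relies on ConvertHandler.splitList to chunk the full result list into sub-lists of 100 records. The article does not show that helper, but its behavior is clear from the call site; a typical implementation (a hypothetical sketch, with the class name and signature assumed from usage) looks like this:

```java
import java.util.ArrayList;
import java.util.List;

public class ConvertHandler {

    // Partition a list into consecutive sub-lists of at most `size` elements.
    public static <T> List<List<T>> splitList(List<T> list, int size) {
        List<List<T>> parts = new ArrayList<>();
        if (list == null || size <= 0) {
            return parts;
        }
        for (int i = 0; i < list.size(); i += size) {
            // subList is a view over the source; copy it so each chunk
            // handed to a worker thread is independent
            parts.add(new ArrayList<>(list.subList(i, Math.min(i + size, list.size()))));
        }
        return parts;
    }
}
```

Copying each sub-list matters here because the chunks are handed to other threads; sharing live subList views of one backing list across threads would be fragile.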
@Override
public int testMultiThread() {
    List<LogOutputResult> logOutputResults = getTestData();
    // Split into sub-lists of 100 records each
    List<List<LogOutputResult>> lists = ConvertHandler.splitList(logOutputResults, 100);
    CountDownLatch countDownLatch = new CountDownLatch(lists.size());
    for (List<LogOutputResult> listSub : lists) {
        asyncService.executeAsync(listSub, logOutputResultMapper, countDownLatch);
    }
    try {
        countDownLatch.await(); // wait for all threads to finish
    } catch (InterruptedException e) {
        log.error("Interrupted while waiting for batch inserts: " + e.getMessage());
        Thread.currentThread().interrupt();
    }
    return logOutputResults.size();
}

Test Results :
2000003 records inserted with 30 threads: 1.67 minutes
Same data inserted with a single thread: 5.75 minutes
Various thread‑count experiments show diminishing returns after a certain point.
Conclusion: Multithreaded insertion dramatically reduces processing time, but returns diminish beyond an optimal thread count, for which the article uses the rule of thumb CPU cores * 2 + 2. The article also confirms data integrity and the absence of duplicate inserts.
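Rather than hard-coding 30 threads in the properties file, the rule of thumb above can be computed at runtime. A small sketch (my own illustration, not code from the article):

```java
public class PoolSizing {

    // The article's rule of thumb for this I/O-heavy batch-insert workload:
    // optimal threads ≈ CPU cores * 2 + 2.
    public static int suggestedPoolSize() {
        int cores = Runtime.getRuntime().availableProcessors();
        return cores * 2 + 2;
    }

    public static void main(String[] args) {
        System.out.println("Suggested pool size: " + suggestedPoolSize());
    }
}
```

On an 8-core machine this yields 18; as the experiments above show, pushing far past such a value buys little, since the database becomes the bottleneck.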
Test environment: Spring Boot 2.1.1, MyBatis‑Plus 3.0.6, PostgreSQL, and a machine with the specifications shown in the accompanying screenshots.