Improving Million-Scale Data Insertion Efficiency with Spring Boot ThreadPoolTaskExecutor
This article shows how to speed up the insertion of over two million records by configuring a ThreadPoolTaskExecutor in a Spring Boot application. It walks through the implementation code, presents test results comparing the multithreaded and single-threaded approaches, and offers practical guidance on thread pool sizing.
Purpose: Increase the efficiency of inserting massive amounts of data (over one million rows) by leveraging multithreaded batch insertion with Spring Boot, MyBatis-Plus, PostgreSQL, and ThreadPoolTaskExecutor.

Implementation Details:

application-dev.properties – Thread pool configuration
# Async thread pool configuration
# Core pool size
async.executor.thread.core_pool_size=30
# Maximum pool size
async.executor.thread.max_pool_size=30
# Queue capacity
async.executor.thread.queue_capacity=99988
# Name prefix for threads in the pool
async.executor.thread.name.prefix=async-importDB-

Spring container bean for the thread pool
@Configuration
@EnableAsync
@Slf4j
public class ExecutorConfig {

    @Value("${async.executor.thread.core_pool_size}")
    private int corePoolSize;
    @Value("${async.executor.thread.max_pool_size}")
    private int maxPoolSize;
    @Value("${async.executor.thread.queue_capacity}")
    private int queueCapacity;
    @Value("${async.executor.thread.name.prefix}")
    private String namePrefix;

    @Bean(name = "asyncServiceExecutor")
    public Executor asyncServiceExecutor() {
        log.warn("start asyncServiceExecutor");
        ThreadPoolTaskExecutor executor = new VisiableThreadPoolTaskExecutor();
        executor.setCorePoolSize(corePoolSize);
        executor.setMaxPoolSize(maxPoolSize);
        executor.setQueueCapacity(queueCapacity);
        executor.setThreadNamePrefix(namePrefix);
        // When the queue is full, run the task in the caller's thread instead of rejecting it
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.initialize();
        return executor;
    }
}

Asynchronous service implementation
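As an aside, the bean above is built on a VisiableThreadPoolTaskExecutor, a custom subclass whose source the article does not include. For readers who want to see what that Spring configuration amounts to under the hood, here is a plain java.util.concurrent equivalent (a sketch of my own, not from the article; the class and method names are assumed, and the parameter values mirror the properties file):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ImportPoolSketch {

    // Mirrors the Spring ThreadPoolTaskExecutor configuration above:
    // core 30, max 30, bounded queue of 99988, named threads, CallerRunsPolicy.
    public static ThreadPoolExecutor build() {
        AtomicInteger seq = new AtomicInteger(1);
        ThreadFactory factory = r -> {
            Thread t = new Thread(r);
            t.setName("async-importDB-" + seq.getAndIncrement());
            return t;
        };
        return new ThreadPoolExecutor(
                30, 30,                 // corePoolSize == maxPoolSize: a fixed-size pool
                60L, TimeUnit.SECONDS,  // keep-alive (irrelevant while core == max)
                new LinkedBlockingQueue<>(99988),
                factory,
                new ThreadPoolExecutor.CallerRunsPolicy()); // caller runs overflow tasks
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = build();
        pool.execute(() -> System.out.println(Thread.currentThread().getName()));
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The CallerRunsPolicy choice matters for a bulk import: when the queue fills, the submitting thread does the insert itself, which throttles submission instead of dropping work.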
@Service
@Slf4j
public class AsyncServiceImpl implements AsyncService {

    @Override
    @Async("asyncServiceExecutor")
    public void executeAsync(List<LogOutputResult> logOutputResults,
                             LogOutputResultMapper logOutputResultMapper,
                             CountDownLatch countDownLatch) {
        try {
            log.warn("start executeAsync");
            // Asynchronous work: batch-insert this chunk of records
            logOutputResultMapper.addLogOutputResultBatch(logOutputResults);
            log.warn("end executeAsync");
        } finally {
            // Always release the latch, even if the insert throws
            countDownLatch.countDown();
        }
    }
}

Batch insertion logic using multiple threads
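The batch logic below relies on ConvertHandler.splitList to chunk the full result list into sub-lists of 100 records. The article does not show that helper, but its behavior is clear from the call site; a typical implementation (a hypothetical sketch, with the class name and signature assumed from usage) looks like this:

```java
import java.util.ArrayList;
import java.util.List;

public class ConvertHandler {

    // Partition a list into consecutive sub-lists of at most `size` elements.
    public static <T> List<List<T>> splitList(List<T> list, int size) {
        List<List<T>> parts = new ArrayList<>();
        if (list == null || size <= 0) {
            return parts;
        }
        for (int i = 0; i < list.size(); i += size) {
            // subList is a view over the source; copy it so each chunk
            // handed to a worker thread is independent
            parts.add(new ArrayList<>(list.subList(i, Math.min(i + size, list.size()))));
        }
        return parts;
    }
}
```

Copying each sub-list matters here because the chunks are handed to other threads; sharing live subList views of one backing list across threads would be fragile.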
@Override
public int testMultiThread() {
    List<LogOutputResult> logOutputResults = getTestData();
    // Split into sub-lists of 100 records each
    List<List<LogOutputResult>> lists = ConvertHandler.splitList(logOutputResults, 100);
    CountDownLatch countDownLatch = new CountDownLatch(lists.size());
    for (List<LogOutputResult> listSub : lists) {
        asyncService.executeAsync(listSub, logOutputResultMapper, countDownLatch);
    }
    try {
        countDownLatch.await(); // wait for all threads to finish
    } catch (InterruptedException e) {
        log.error("Interrupted while waiting for batch inserts: " + e.getMessage());
        Thread.currentThread().interrupt();
    }
    return logOutputResults.size();
}

Test Results :
2000003 records inserted with 30 threads: 1.67 minutes
Same data inserted with a single thread: 5.75 minutes
Various thread‑count experiments show diminishing returns after a certain point.
Conclusion: Multithreaded insertion dramatically reduces processing time, but returns diminish beyond an optimal thread count, for which the article uses the rule of thumb CPU cores * 2 + 2. The article also confirms data integrity and the absence of duplicate inserts.
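Rather than hard-coding 30 threads in the properties file, the rule of thumb above can be computed at runtime. A small sketch (my own illustration, not code from the article):

```java
public class PoolSizing {

    // The article's rule of thumb for this I/O-heavy batch-insert workload:
    // optimal threads ≈ CPU cores * 2 + 2.
    public static int suggestedPoolSize() {
        int cores = Runtime.getRuntime().availableProcessors();
        return cores * 2 + 2;
    }

    public static void main(String[] args) {
        System.out.println("Suggested pool size: " + suggestedPoolSize());
    }
}
```

On an 8-core machine this yields 18; as the experiments above show, pushing far past such a value buys little, since the database becomes the bottleneck.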
Test environment: Spring Boot 2.1.1, MyBatis‑Plus 3.0.6, PostgreSQL, and a machine with the specifications shown in the accompanying screenshots.