Improving Million-Scale Data Insertion Efficiency with Spring Boot ThreadPoolTaskExecutor

This article demonstrates how to boost the insertion speed of over two million records by configuring a Spring Boot ThreadPoolTaskExecutor for multithreaded batch inserts, detailing the setup, code implementation, performance testing, and analysis of optimal thread counts.

Java Architect Essentials

The goal of this tutorial is to speed up the insertion of millions of records into a database.

The chosen solution is to use ThreadPoolTaskExecutor for multithreaded batch insertion within a Spring Boot 2.1.1 application.

Key technologies employed include Spring Boot, MyBatis‑Plus, Swagger, Lombok, PostgreSQL, and the ThreadPoolTaskExecutor itself.

Implementation Details

Thread pool configuration (application-dev.properties)

# Async thread pool configuration
# Core pool size
async.executor.thread.core_pool_size = 30
# Maximum pool size
async.executor.thread.max_pool_size = 30
# Queue capacity
async.executor.thread.queue_capacity = 99988
# Name prefix for threads in the pool
async.executor.thread.name.prefix = async-importDB-

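The following configuration class reads these properties and registers the thread pool as a Spring bean. VisiableThreadPoolTaskExecutor is a custom ThreadPoolTaskExecutor subclass whose source is not included in this article: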

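import java.util.concurrent.Executor;
import java.util.concurrent.ThreadPoolExecutor;

import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableAsync;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
@EnableAsync
@Slf4j
public class ExecutorConfig {
    @Value("${async.executor.thread.core_pool_size}")
    private int corePoolSize;
    @Value("${async.executor.thread.max_pool_size}")
    private int maxPoolSize;
    @Value("${async.executor.thread.queue_capacity}")
    private int queueCapacity;
    @Value("${async.executor.thread.name.prefix}")
    private String namePrefix;

    @Bean(name = "asyncServiceExecutor")
    public Executor asyncServiceExecutor() {
        log.warn("start asyncServiceExecutor");
        // Custom subclass of ThreadPoolTaskExecutor (source not shown in this article)
        ThreadPoolTaskExecutor executor = new VisiableThreadPoolTaskExecutor();
        executor.setCorePoolSize(corePoolSize);
        executor.setMaxPoolSize(maxPoolSize);
        executor.setQueueCapacity(queueCapacity);
        executor.setThreadNamePrefix(namePrefix);
        // When the pool and the queue are both full, run the task on the submitting thread
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        executor.initialize();
        return executor;
    }
}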
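To make the effect of these settings concrete, here is a minimal sketch that uses the plain JDK ThreadPoolExecutor rather than Spring's wrapper. The class CallerRunsDemo and the method countCallerRuns are hypothetical names invented for this illustration, and the small numbers are chosen only to force the queue to fill. It counts how many tasks fall back to the submitting thread once every worker is busy and the queue is full.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the original project.
final class CallerRunsDemo {

    /** Submits taskCount tasks and returns how many ran on the submitting thread. */
    static int countCallerRuns(int poolSize, int queueCapacity, int taskCount) {
        // core == max, bounded queue, CallerRunsPolicy: same shape as the bean above
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(queueCapacity),
                new ThreadPoolExecutor.CallerRunsPolicy());
        Thread caller = Thread.currentThread();
        CountDownLatch gate = new CountDownLatch(1);         // holds workers until submission ends
        CountDownLatch done = new CountDownLatch(taskCount); // one count per task
        AtomicInteger callerRuns = new AtomicInteger();

        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                try {
                    if (Thread.currentThread() == caller) {
                        callerRuns.incrementAndGet(); // rejected task, run by the submitter
                    } else {
                        gate.await();                 // keep the worker busy for now
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        gate.countDown(); // release the workers
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return callerRuns.get();
    }

    public static void main(String[] args) {
        // 2 workers + 2 queue slots absorb 4 tasks; the remaining 6 run on the caller
        System.out.println(countCallerRuns(2, 2, 10));
    }
}
```

With the article's settings (30 threads and a queue of 99,988), a run that splits 2,000,003 records into batches of 100 produces about 20,001 tasks, so the queue would probably never fill and the caller-runs fallback acts only as a safety net.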

The asynchronous service that performs the actual batch insert:

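import java.util.List;
import java.util.concurrent.CountDownLatch;

import lombok.extern.slf4j.Slf4j;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;

@Service
@Slf4j
public class AsyncServiceImpl implements AsyncService {
    @Override
    @Async("asyncServiceExecutor") // run on the pool defined in ExecutorConfig
    public void executeAsync(List<LogOutputResult> logOutputResults, LogOutputResultMapper logOutputResultMapper, CountDownLatch countDownLatch) {
        try {
            log.warn("start executeAsync");
            // insert one sub-list in a single batch statement
            logOutputResultMapper.addLogOutputResultBatch(logOutputResults);
            log.warn("end executeAsync");
        } finally {
            countDownLatch.countDown(); // always release the latch, even if the insert throws
        }
    }
}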

The method that drives the multithreaded insertion test:

@Override
public int testMultiThread() {
    List<LogOutputResult> logOutputResults = getTestData();
    // split into sub-lists of 100 records each
    List<List<LogOutputResult>> lists = ConvertHandler.splitList(logOutputResults, 100);
    // one latch count per batch
    CountDownLatch countDownLatch = new CountDownLatch(lists.size());
    for (List<LogOutputResult> listSub : lists) {
        asyncService.executeAsync(listSub, logOutputResultMapper, countDownLatch);
    }
    try {
        countDownLatch.await(); // wait for all batches to finish
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        log.error("Interrupted while waiting for batches: " + e.getMessage());
    }
    // number of records submitted, not a count confirmed by the database
    return logOutputResults.size();
}
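The article does not show ConvertHandler.splitList, so the following is only a sketch of what such a helper might look like. The class name BatchSplitter is hypothetical. It cuts a list into consecutive sub-lists of at most batchSize elements and copies each one so that worker threads do not share views of the source list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ConvertHandler.splitList, whose source is not shown.
final class BatchSplitter {

    /** Splits source into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> splitList(List<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < source.size(); start += batchSize) {
            int end = Math.min(start + batchSize, source.size());
            // copy the view so each batch is independent of the source list
            batches.add(new ArrayList<>(source.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            data.add(i);
        }
        List<List<Integer>> batches = splitList(data, 100);
        // 250 elements in batches of 100: sizes 100, 100, 50
        System.out.println(batches.size() + " batches, last has "
                + batches.get(batches.size() - 1).size());
    }
}
```

Under this splitting scheme, 2,000,003 records in batches of 100 would yield 20,001 batches, the last holding 3 records, and the CountDownLatch would start at that batch count.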

Performance testing with 2,000,003 records showed:

Multithreaded execution (30 threads) completed in 1.67 minutes.

Single‑threaded execution took 5.75 minutes.

Additional tests with varying thread counts confirmed that more threads do not always mean better performance; an empirical rule of thumb is CPU cores × 2 + 2 threads for optimal throughput.
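The rule of thumb can be written out as a small helper. This is a sketch only: the class PoolSizing and the method suggestedThreads are hypothetical names, and the result is a starting point for measurement rather than a guaranteed optimum.

```java
// Hypothetical helper illustrating the "CPU cores x 2 + 2" rule of thumb.
final class PoolSizing {

    /** Returns cpuCores * 2 + 2, the empirical thread count quoted in the article. */
    static int suggestedThreads(int cpuCores) {
        if (cpuCores < 1) {
            throw new IllegalArgumentException("cpuCores must be at least 1");
        }
        return cpuCores * 2 + 2;
    }

    public static void main(String[] args) {
        // availableProcessors() reports the logical cores visible to the JVM
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores=" + cores + ", suggested threads=" + suggestedThreads(cores));
    }
}
```

For example, a machine with 4 cores would get 10 threads under this rule. The value could then be used for async.executor.thread.core_pool_size and max_pool_size, and checked against timings on the target hardware and database.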

Conclusion

In this test, a properly configured ThreadPoolTaskExecutor cut the bulk-insert time from 5.75 minutes to 1.67 minutes, a speedup of roughly 3.4 times. The source also reports that the data was inserted completely and without duplicates.
