How to Speed Up Excel-to-Database Imports with Async and Thread Pool Tuning

Learn how to efficiently import large Excel files into a database using POI, JDBC, MyBatis, and Spring Boot, with step-by-step code examples, performance bottlenecks, and optimization techniques such as caching, asynchronous processing, multi-threaded batch inserts, and thread‑pool configuration for maximum throughput.

macrozheng
macrozheng
macrozheng
How to Speed Up Excel-to-Database Imports with Async and Thread Pool Tuning

Basic Import Steps

Read the Excel file with POI.

Use the file name as the table name and the column headers as column names, then concatenate the data into SQL statements.

Insert the data into the database via JDBC or MyBatis.

Performance Issue

When processing a large file (e.g., 100,000 rows), the naive approach can take more than 190 seconds, making the operation appear frozen.

private void readXls(String filePath, String filename) throws Exception {
    @SuppressWarnings("resource")
    XSSFWorkbook xssfWorkbook = new XSSFWorkbook(new FileInputStream(filePath));
    // read first sheet
    XSSFSheet sheet = xssfWorkbook.getSheetAt(0);
    int maxRow = sheet.getLastRowNum();
    StringBuilder insertBuilder = new StringBuilder();
    insertBuilder.append("insert into ").append(filename).append(" ( UUID,");
    XSSFRow row = sheet.getRow(0);
    for (int i = 0; i < row.getPhysicalNumberOfCells(); i++) {
        insertBuilder.append(row.getCell(i)).append(",");
    }
    insertBuilder.deleteCharAt(insertBuilder.length() - 1);
    insertBuilder.append(" ) values ( ");
    StringBuilder stringBuilder = new StringBuilder();
    for (int i = 1; i <= maxRow; i++) {
        XSSFRow xssfRow = sheet.getRow(i);
        String id = "";
        String name = "";
        for (int j = 0; j < row.getPhysicalNumberOfCells(); j++) {
            if (j == 0) {
                id = xssfRow.getCell(j) + "";
            } else if (j == 1) {
                name = xssfRow.getCell(j) + "";
            }
        }
        boolean flag = isExisted(id, name);
        if (!flag) {
            stringBuilder.append(insertBuilder);
            stringBuilder.append('\'' + uuid() + '\'').append(",");
            for (int j = 0; j < row.getPhysicalNumberOfCells(); j++) {
                stringBuilder.append('\'' + value + '\'').append(",");
            }
            stringBuilder.deleteCharAt(stringBuilder.length() - 1);
            stringBuilder.append(" )
");
        }
    }
    List<String> collect = Arrays.stream(stringBuilder.toString().split("
")).collect(Collectors.toList());
    int sum = JdbcUtil.executeDML(collect);
}

private static boolean isExisted(String id, String name) {
    String sql = "select count(1) as num from " + static_TABLE + " where ID = '" + id + "' and NAME = '" + name + "'";
    String num = JdbcUtil.executeSelect(sql, "num");
    return Integer.valueOf(num) > 0;
}

private static String uuid() {
    return UUID.randomUUID().toString().replace("-", "");
}

Optimization Strategies

Cache all existing data in a map before inserting, which speeds up existence checks.

When a single Excel file is large, use asynchronous processing combined with multithreading to read rows in batches and insert them incrementally.

If there are many files, process each file in its own asynchronous task to achieve double‑async parallelism.

Async Controller Example

@RequestMapping(value = "/readExcelCacheAsync", method = RequestMethod.POST)
@ResponseBody
public String readExcelCacheAsync() {
    String path = "G:\\测试\\data\\";
    try {
        // Cache all data before reading Excel
        USER_INFO_SET = getUserInfo();
        File file = new File(path);
        String[] xlsxArr = file.list();
        for (int i = 0; i < xlsxArr.length; i++) {
            File fileTemp = new File(path + "\\" + xlsxArr[i]);
            String filename = fileTemp.getName().replace(".xlsx", "");
            readExcelCacheAsyncService.readXls(path + filename + ".xlsx", filename);
        }
    } catch (Exception e) {
        logger.error("|#ReadDBCsv|#异常: ", e);
        return "error";
    }
    return "success";
}

Thread‑Pool Configuration

Core pool size is often set to the number of CPU cores (or N+1 for CPU‑bound tasks). For I/O‑bound workloads, a size of 2N is common. Example YAML configuration:

spring:
  task:
    execution:
      pool:
        max-size: 10
        core-size: 5
        keep-alive: 3s
        queue-capacity: 1000
        thread-name-prefix: my-executor

Custom executor using Guava ThreadFactory and ThreadPoolExecutor:

@EnableAsync
@Configuration
public class AsyncTaskConfig {
    @Bean("my-executor")
    public Executor firstExecutor() {
        ThreadFactory threadFactory = new ThreadFactoryBuilder().setNameFormat("my-executor").build();
        int curSystemThreads = Runtime.getRuntime().availableProcessors() * 2;
        ThreadPoolExecutor threadPool = new ThreadPoolExecutor(curSystemThreads, 100,
                200, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(), threadFactory);
        threadPool.allowsCoreThreadTimeOut();
        return threadPool;
    }
    @Bean("async-executor")
    public Executor asyncExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(24);
        taskExecutor.setMaxPoolSize(200);
        taskExecutor.setQueueCapacity(50);
        taskExecutor.setKeepAliveSeconds(200);
        taskExecutor.setThreadNamePrefix("async-executor-");
        taskExecutor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        taskExecutor.initialize();
        return taskExecutor;
    }
}

EasyExcel Alternative

Using EasyExcel simplifies reading and batch inserting:

@RequestMapping(value = "/readEasyExcel", method = RequestMethod.POST)
@ResponseBody
public String readEasyExcel() {
    try {
        String path = "G:\\测试\\data\\";
        String[] xlsxArr = new File(path).list();
        for (int i = 0; i < xlsxArr.length; i++) {
            String filePath = path + xlsxArr[i];
            File fileTemp = new File(path + xlsxArr[i]);
            String fileName = fileTemp.getName().replace(".xlsx", "");
            List<UserInfo> list = new ArrayList<>();
            EasyExcel.read(filePath, UserInfo.class, new ReadEasyExeclAsyncListener(readEasyExeclService, fileName, batchCount, list))
                     .sheet().doRead();
        }
    } catch (Exception e) {
        logger.error("readEasyExcel 异常:", e);
        return "error";
    }
    return "success";
}
public class ReadEasyExeclAsyncListener implements ReadListener<UserInfo> {
    private final ReadEasyExeclService readEasyExeclService;
    private final String TABLE_NAME;
    private final int BATCH_COUNT;
    private final List<UserInfo> LIST;
    public ReadEasyExeclAsyncListener(ReadEasyExeclService service, String tableName, int batchCount, List<UserInfo> list) {
        this.readEasyExeclService = service;
        this.TABLE_NAME = tableName;
        this.BATCH_COUNT = batchCount;
        this.LIST = list;
    }
    @Override
    public void invoke(UserInfo data, AnalysisContext context) {
        data.setUuid(uuid());
        data.setTableName(TABLE_NAME);
        LIST.add(data);
        if (LIST.size() >= BATCH_COUNT) {
            readEasyExeclService.saveDataBatch(LIST);
        }
    }
    @Override
    public void doAfterAllAnalysed(AnalysisContext context) {
        if (!LIST.isEmpty()) {
            readEasyExeclService.saveDataBatch(LIST);
        }
    }
    public static String uuid() {
        return UUID.randomUUID().toString().replace("-", "");
    }
}
@Service
public class ReadEasyExeclServiceImpl implements ReadEasyExeclService {
    @Resource
    private ReadEasyExeclMapper readEasyExeclMapper;
    @Override
    public void saveDataBatch(List<UserInfo> list) {
        readEasyExeclMapper.saveDataBatch(list);
        list.clear();
    }
    private void insertByJdbc(List<UserInfo> list) {
        List<String> sqlList = new ArrayList<>();
        for (UserInfo u : list) {
            StringBuilder sqlBuilder = new StringBuilder();
            sqlBuilder.append("insert into ").append(u.getTableName())
                      .append(" ( UUID,ID,NAME,AGE,ADDRESS,PHONE,OP_TIME ) values ( '")
                      .append(ReadEasyExeclAsyncListener.uuid()).append("','")
                      .append(u.getId()).append("','")
                      .append(u.getName()).append("','")
                      .append(u.getAge()).append("','")
                      .append(u.getAddress()).append("','")
                      .append(u.getPhone()).append("',sysdate)");
            sqlList.add(sqlBuilder.toString());
        }
        JdbcUtil.executeDML(sqlList);
    }
}
@Data
public class UserInfo {
    private String tableName;
    private String uuid;
    @ExcelProperty("ID")
    private String id;
    @ExcelProperty("NAME")
    private String name;
    @ExcelProperty("AGE")
    private String age;
    @ExcelProperty("ADDRESS")
    private String address;
    @ExcelProperty("PHONE")
    private String phone;
}

The above examples demonstrate how to handle massive Excel imports efficiently by combining caching, asynchronous processing, proper thread‑pool tuning, and batch operations.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

JavadatabaseSpring Bootthread poolAsyncExcel
macrozheng
Written by

macrozheng

Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.