
How to Export 400GB of Images 45× Faster with Async and Thread‑Pool Optimizations

This article explains why a naïve sequential export of hundreds of gigabytes of images from a database can take many hours, and demonstrates how adding indexes, using asynchronous processing and multithreading, and eliminating unnecessary Base64 decoding and local storage can reduce the runtime to minutes.

Su San Talks Tech

Jan 29, 2024

In many projects we need to export images stored in a database to the local file system and then deliver them to others.

Typical approach

Use an API or scheduled task.

Read from Oracle or MySQL.

Decode the Base64 data and write the resulting byte[] to disk with FileOutputStream.

Iterate the local folder and upload each image to a third‑party server via FTP.

The process becomes a bottleneck when the data volume reaches around 400 GB; the export can run for more than 12 hours.

Optimization 1: Add proper indexes

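Creating indexes on the columns that the export query filters on lets the database locate matching rows without scanning each table in full, which speeds up the query phase considerably.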
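
The article does not say which columns were indexed. As a rough sketch only, assuming the export query filters on a STATUS column (the code later in this article passes a `status` parameter) and that the tables follow the `USER_INFO_<month>` naming, a small helper could generate one index statement per monthly table. The `IDX_` prefix, the column name, and the class itself are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class IndexDdlSketch {
    // Builds one CREATE INDEX statement per monthly table, e.g. USER_INFO_202301.
    // The STATUS column and the IDX_ naming are assumptions for illustration.
    static List<String> buildIndexDdl(String months) {
        List<String> ddl = new ArrayList<>();
        for (String month : months.split(",")) {
            String table = "USER_INFO_" + month.trim();
            ddl.add("CREATE INDEX IDX_" + table + "_STATUS ON " + table + " (STATUS)");
        }
        return ddl;
    }

    public static void main(String[] args) {
        buildIndexDdl("202301,202302").forEach(System.out::println);
    }
}
```

Whether an index on a status flag actually helps depends on how selective it is, so it is worth checking the execution plan before and after.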

Optimization 2: Combine indexes, async and multithreading

By executing the export in parallel threads and using asynchronous methods, the overall runtime drops from hours to minutes.

Optimization 3: Skip Base64 decoding and local storage

Instead of decoding Base64 and writing files to disk, stream the raw bytes directly to the FTP server.
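
To make the idea concrete, here is a minimal sketch of sending the bytes straight from memory to the upload target. `RemoteStore` and `InMemoryStore` are hypothetical stand-ins so the example runs without a server. A real implementation might delegate to an FTP client such as Apache Commons Net's `FTPClient.storeFile(String, InputStream)`, but that wiring is an assumption and is not shown in the article.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class StreamingUploadSketch {
    /** Hypothetical upload target; a real one would wrap an FTP client. */
    interface RemoteStore {
        void store(String remoteName, InputStream data) throws IOException;
    }

    /** In-memory stand-in so the sketch runs without a server. */
    static class InMemoryStore implements RemoteStore {
        final Map<String, byte[]> files = new LinkedHashMap<>();

        @Override
        public void store(String remoteName, InputStream data) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int read;
            while ((read = data.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            files.put(remoteName, out.toByteArray());
        }
    }

    /** Sends the bytes exactly as read from the database: no decode step, no temp file. */
    static void upload(RemoteStore store, String userId, String userName, byte[] content) {
        try (InputStream in = new ByteArrayInputStream(content)) {
            store.store(userId + "-" + userName + ".png", in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        upload(store, "1001", "alice", new byte[] {1, 2, 3});
        System.out.println(store.files.keySet());
    }
}
```

Keeping the target behind a small interface also makes the upload path easy to test without touching the network.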

After applying these three steps, exporting 400 GB of images was reduced from over 12 hours to about 15 minutes.

Key code: original export method

@Value("${months}")
private String months; // comma-separated month suffixes of the USER_INFO_<month> tables

@Value("${imgDir}")
private String imgDir; // local directory that receives the exported images

@Resource
private UserDao userDao;

@Override
public void getUserInfoImg() {
    try {
        String[] monthArr = months.split(",");
        for (int i = 0; i < monthArr.length; i++) {
            Map<String, Object> map = new HashMap<>();
            String tableName = "USER_INFO_" + monthArr[i];
            map.put("tableName", tableName);
            map.put("status", 1);
            List<UserInfo> userInfoList = userDao.getUserInfoImg(map);
            if (userInfoList == null || userInfoList.isEmpty()) {
                // no rows for this month; move on to the next table instead of aborting
                continue;
            }
            for (UserInfo user : userInfoList) {
                String userId = user.getUserId();
                String userName = user.getUserName();
                byte[] content = user.getImgContent();
                // write the image to a local file
                FileUtil.downloadImage(imgDir + userId + "-" + userName + ".png", content);
                // upload the local folder via FTP
                FileUtil.uploadByFtp(imgDir);
            }
        }
    } catch (Exception e) {
        serviceLogger.error("Failed to export images:", e);
    }
}

Async version (no local file)

@Resource
private UserAsyncService userAsyncService;

@Override
public void getUserInfoImg() {
    try {
        String[] monthArr = months.split(",");
        for (String month : monthArr) {
            userAsyncService.getUserInfoImgAsync(month);
        }
    } catch (Exception e) {
        serviceLogger.error("Failed to export images:", e);
    }
}

Async service method

@Async("async-executor")
@Override
public void getUserInfoImgAsync(String month) {
    try {
        Map<String, Object> map = new HashMap<>();
        String tableName = "USER_INFO_" + month;
        map.put("tableName", tableName);
        map.put("status", 1);
        List<UserInfo> userInfoList = userDao.getUserInfoImg(map);
        if (userInfoList == null || userInfoList.isEmpty()) {
            return;
        }
        for (UserInfo user : userInfoList) {
            byte[] content = user.getImgContent();
            // directly stream to FTP without writing to disk
            FileUtil.uploadByFtp(content);
        }
    } catch (Exception e) {
        serviceLogger.error("Failed to export images:", e);
    }
}
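
The async methods above return void, so the caller cannot tell when every month has finished. One possible variation, sketched here with plain `CompletableFuture` rather than Spring's `@Async`, is to have each per-month task return a future and join them all at the end. `exportMonth` is a hypothetical placeholder for the real per-month export, and its return value (a count of handled images) is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class MonthFanOutSketch {
    // Hypothetical stand-in for the per-month export; returns a fake image count.
    static int exportMonth(String month) {
        return month.length(); // placeholder work
    }

    // Starts one task per month, then waits for all of them and sums the counts.
    static int exportAll(String months, ExecutorService pool) {
        List<CompletableFuture<Integer>> futures = Arrays.stream(months.split(","))
            .map(m -> CompletableFuture.supplyAsync(() -> exportMonth(m), pool))
            .collect(Collectors.toList()); // all tasks are submitted before any join
        // join() blocks until a task is done and rethrows its failure, if any
        return futures.stream().mapToInt(CompletableFuture::join).sum();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(exportAll("202301,202302,202303", pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

In a Spring service the same shape can be reached by letting the `@Async` method return a `CompletableFuture`, which Spring supports as an async return type.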

Custom thread‑pool configuration

@EnableAsync
@Configuration
public class AsyncTaskConfig {
    @Bean("my-executor")
    public Executor firstExecutor() {
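        ThreadFactory threadFactory = new ThreadFactoryBuilder()
            .setNameFormat("my-executor-%d") // %d gives each thread a distinct name
            .build();
        int curSystemThreads = Runtime.getRuntime().availableProcessors() * 2;
        // With an unbounded queue the pool never grows past its core size,
        // so the maximum of 100 only matters if the queue is given a capacity.
        ThreadPoolExecutor threadPool = new ThreadPoolExecutor(
            curSystemThreads, 100, 200, TimeUnit.SECONDS,
            new LinkedBlockingQueue<>(), threadFactory);
        // Let idle core threads exit after the keep-alive time
        // (allowsCoreThreadTimeOut() only reads the flag and changes nothing).
        threadPool.allowCoreThreadTimeOut(true);
        return threadPool;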
    }

    @Bean("async-executor")
    public Executor asyncExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(24);
        taskExecutor.setMaxPoolSize(200);
        taskExecutor.setQueueCapacity(50);
        taskExecutor.setKeepAliveSeconds(200);
        taskExecutor.setThreadNamePrefix("async-executor-");
        taskExecutor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        taskExecutor.initialize();
        return taskExecutor;
    }
}
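
How the queue interacts with the maximum pool size is easy to get wrong. A `ThreadPoolExecutor` only creates threads beyond the core size when its queue is full, so an unbounded `LinkedBlockingQueue` keeps the pool at its core size, while a bounded queue lets it grow. The following self-contained sketch illustrates this with deliberately small numbers (core 2, maximum 4, six blocking tasks); the class and method names are made up for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolGrowthSketch {
    // Submits `tasks` blocking jobs to a pool (core 2, max 4) and reports
    // how many threads the pool actually created.
    static int largestPoolSize(BlockingQueue<Runnable> queue, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(2, 4, 60, TimeUnit.SECONDS, queue);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // keep the worker busy until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getLargestPoolSize();
        } finally {
            release.countDown(); // let all tasks finish
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("unbounded queue: " + largestPoolSize(new LinkedBlockingQueue<>(), 6));
        System.out.println("queue of 2:      " + largestPoolSize(new LinkedBlockingQueue<>(2), 6));
    }
}
```

On a standard JDK this reports 2 threads for the unbounded queue and 4 for the queue of two. By the same rule, the `async-executor` bean above, with a queue capacity of 50 and a maximum of 200, can grow past its 24 core threads once 50 tasks are waiting.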

CompletableFuture example

public static void main(String[] args) {
    ExecutorService executor = Executors.newFixedThreadPool(10);
    // run the first step on the custom pool
    CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
        // some hidden operation
        return "result";
    }, executor);
    // transform the first result once it is available
    CompletableFuture<String> result = future.thenApply(r -> {
        // another hidden operation
        return "result";
    });
    String finalResult = result.join(); // block until both steps finish
    executor.shutdown();
}

When to use Stream.parallel()

Parallel streams are beneficial for large data sets (e.g., >10,000 elements), CPU‑intensive transformations, and when the hardware provides multiple cores. For small collections or simple calculations, the overhead may outweigh the gains.
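
As a small illustration of the trade-off, the sketch below runs the same computation sequentially and in parallel. The workload (a sum of squares) and the class name are chosen only for the example; the timings it prints will vary by machine, and for a computation this cheap the parallel version may not win at all.

```java
import java.util.stream.LongStream;

class ParallelStreamSketch {
    // Sums x*x for x in 1..n, optionally splitting the range across cores.
    static long sumOfSquares(long n, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel();
        }
        return range.map(x -> x * x).sum();
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        long t0 = System.nanoTime();
        long sequential = sumOfSquares(n, false);
        long t1 = System.nanoTime();
        long parallel = sumOfSquares(n, true);
        long t2 = System.nanoTime();
        System.out.println("same result: " + (sequential == parallel));
        System.out.println("sequential ns: " + (t1 - t0) + ", parallel ns: " + (t2 - t1));
    }
}
```

Both calls return the same value because addition is associative, which is the property a reduction needs to be safe to parallelize.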

Algorithmic considerations

Removing unnecessary Base64 decoding, avoiding intermediate file writes, and choosing efficient data structures (e.g., ConcurrentHashMap over Hashtable, which synchronizes every method on a single lock) are examples of optimizations that reduce the work done per image and can dramatically improve performance.
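
For instance, if several worker threads needed to keep a shared tally of uploaded images per month, a `ConcurrentHashMap` with its atomic `merge` would do so without external locking. This is a hypothetical use case, not something the article's code does; the key `"202301"` and the class name are placeholders.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ConcurrentCounterSketch {
    // Each worker records `perWorker` uploads for the same month; returns the final tally.
    static int countUploads(int workers, int perWorker) {
        Map<String, Integer> uploadedPerMonth = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            CompletableFuture<?>[] jobs = new CompletableFuture<?>[workers];
            for (int w = 0; w < workers; w++) {
                jobs[w] = CompletableFuture.runAsync(() -> {
                    for (int i = 0; i < perWorker; i++) {
                        // merge is atomic on ConcurrentHashMap, so no update is lost
                        uploadedPerMonth.merge("202301", 1, Integer::sum);
                    }
                }, pool);
            }
            CompletableFuture.allOf(jobs).join(); // wait for every worker
        } finally {
            pool.shutdown();
        }
        return uploadedPerMonth.getOrDefault("202301", 0);
    }

    public static void main(String[] args) {
        System.out.println(countUploads(8, 1000));
    }
}
```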

Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
