How to Export 400GB of Images 45× Faster with Async and Thread‑Pool Optimizations
This article explains why a naïve sequential export of hundreds of gigabytes of images from a database can take more than 12 hours, and shows how adding indexes, processing asynchronously on a thread pool, and skipping unnecessary Base64 decoding and local storage can cut the runtime to about 15 minutes.
In many projects we need to export images stored in a database to the local file system and then deliver them to a third party.
Typical approach
Use an API or scheduled task.
Read from Oracle or MySQL.
Decode the Base64 data and write the resulting byte[] to disk with FileOutputStream.
Iterate the local folder and upload each image to a third‑party server via FTP.
The process becomes a bottleneck when the data volume reaches around 400 GB; the export can run for more than 12 hours.
Optimization 1: Add proper indexes
Creating the right indexes on the tables speeds up the query phase dramatically.
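The export query shown later filters each monthly USER_INFO_<month> table by status, so that column is a natural index candidate. The article does not say which columns were actually indexed, so the following is only a sketch: the column name STATUS and the index naming pattern are assumptions, and whether such an index helps depends on how selective the filter is. It builds one CREATE INDEX statement per configured month.

```java
import java.util.ArrayList;
import java.util.List;

class IndexDdlSketch {

    // Builds one CREATE INDEX statement per monthly table.
    // The column name "STATUS" and the index name pattern are assumptions,
    // not taken from the original schema.
    static List<String> buildIndexStatements(String months) {
        List<String> statements = new ArrayList<>();
        for (String month : months.split(",")) {
            String m = month.trim();
            if (m.isEmpty()) {
                continue; // tolerate stray commas or spaces in the config value
            }
            String table = "USER_INFO_" + m;
            statements.add("CREATE INDEX IDX_" + table + "_STATUS ON " + table + " (STATUS)");
        }
        return statements;
    }

    public static void main(String[] args) {
        // Same comma-separated format as the ${months} property used by the export code.
        buildIndexStatements("202301,202302").forEach(System.out::println);
    }
}
```

Checking the execution plan before and after adding the index is the only reliable way to confirm that the query phase really benefits.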
Optimization 2: Combine indexes, async and multithreading
By executing the export in parallel threads and using asynchronous methods, the overall runtime drops from hours to minutes.
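The idea of exporting each monthly table on its own worker thread can be sketched with plain java.util.concurrent, independent of Spring. In this sketch the exportMonth function is a hypothetical stand-in for the real "query one table and upload its images" step, and the pool size of 4 is arbitrary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class ParallelExportSketch {

    // Submits one task per month to the pool, then waits for all of them.
    // Results are returned in the same order as the input months.
    static List<Integer> exportAll(List<String> months,
                                   Function<String, Integer> exportMonth,
                                   ExecutorService pool) {
        List<CompletableFuture<Integer>> futures = new ArrayList<>();
        for (String month : months) {
            futures.add(CompletableFuture.supplyAsync(() -> exportMonth.apply(month), pool));
        }
        List<Integer> counts = new ArrayList<>();
        for (CompletableFuture<Integer> future : futures) {
            counts.add(future.join()); // blocks until this month's task has finished
        }
        return counts;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Hypothetical stand-in: pretend each month exports month.length() images.
            List<Integer> counts =
                    exportAll(Arrays.asList("202301", "202302", "202303"), String::length, pool);
            System.out.println("images exported per month: " + counts);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the work is dominated by database reads and network uploads rather than CPU, the months overlap well; the practical limit is usually the database or FTP server rather than the number of cores.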
Optimization 3: Skip decoding and local storage
Instead of decoding Base64 and writing files to disk, stream the raw bytes directly to the FTP server.
After applying these three steps, exporting 400 GB of images was reduced from over 12 hours to about 15 minutes.
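Skipping the local file means the bytes read from the database go straight into the outgoing connection. The sketch below shows only that copy step: the OutputStream stands in for whatever stream a real FTP client would provide, and a ByteArrayOutputStream plays the remote side so the example runs on its own. The class and method names are illustrative, not the article's FileUtil.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class DirectUploadSketch {

    // Copies the image bytes straight into the remote stream.
    // No temporary file is created at any point.
    static long upload(byte[] content, OutputStream remote) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = new ByteArrayInputStream(content)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                remote.write(buffer, 0, n);
                total += n;
            }
        }
        remote.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] image = {(byte) 0x89, 'P', 'N', 'G'}; // a few placeholder bytes, not a real image
        ByteArrayOutputStream fakeRemote = new ByteArrayOutputStream(); // stand-in for the FTP stream
        long sent = upload(image, fakeRemote);
        System.out.println(sent + " bytes sent without touching the disk");
    }
}
```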
Key code: original export method
@Value("${months}")
private String months;   // comma-separated month suffixes of the USER_INFO_<month> tables
@Value("${imgDir}")
private String imgDir;   // local directory the images are written to
@Resource
private UserDao userDao;

@Override
public void getUserInfoImg() {
    try {
        String[] monthArr = months.split(",");
        for (int i = 0; i < monthArr.length; i++) {
            Map<String, Object> map = new HashMap<>();
            String tableName = "USER_INFO_" + monthArr[i];
            map.put("tableName", tableName);
            map.put("status", 1);
            List<UserInfo> userInfoList = userDao.getUserInfoImg(map);
            if (userInfoList == null || userInfoList.isEmpty()) {
                // nothing to export for this month; continue with the next table
                // (a return here would silently skip all remaining months)
                continue;
            }
            for (UserInfo user : userInfoList) {
                String userId = user.getUserId();
                String userName = user.getUserName();
                byte[] content = user.getImgContent();
                // download image to local file
                FileUtil.dowmloadImage(imgDir + userId + "-" + userName + ".png", content);
                // upload the local file via FTP
                FileUtil.uploadByFtp(imgDir);
            }
        }
    } catch (Exception e) {
        serviceLogger.error("Failed to export images:", e);
    }
}
Async version (no local file)
@Resource
private UserAsyncService userAsyncService;

@Override
public void getUserInfoImg() {
    try {
        String[] monthArr = months.split(",");
        for (String month : monthArr) {
            // returns immediately; each month is processed on the async executor
            userAsyncService.getUserInfoImgAsync(month);
        }
    } catch (Exception e) {
        serviceLogger.error("Failed to export images:", e);
    }
}
Async service method
@Async("async-executor")
@Override
public void getUserInfoImgAsync(String month) {
    try {
        Map<String, Object> map = new HashMap<>();
        String tableName = "USER_INFO_" + month;
        map.put("tableName", tableName);
        map.put("status", 1);
        List<UserInfo> userInfoList = userDao.getUserInfoImg(map);
        if (userInfoList == null || userInfoList.isEmpty()) {
            // nothing to export for this month
            return;
        }
        for (UserInfo user : userInfoList) {
            byte[] content = user.getImgContent();
            // directly stream to FTP without writing to disk
            FileUtil.uploadByFtp(content);
        }
    } catch (Exception e) {
        serviceLogger.error("Failed to export images:", e);
    }
}
Custom thread‑pool configuration
@EnableAsync
@Configuration
public class AsyncTaskConfig {

    @Bean("my-executor")
    public Executor firstExecutor() {
        // %d gives every thread a distinct name, e.g. my-executor-0
        ThreadFactory threadFactory = new ThreadFactoryBuilder()
                .setNameFormat("my-executor-%d")
                .build();
        int curSystemThreads = Runtime.getRuntime().availableProcessors() * 2;
        // note: with an unbounded queue the pool never grows beyond its core size,
        // so the maximum of 100 is effectively unused
        ThreadPoolExecutor threadPool = new ThreadPoolExecutor(
                curSystemThreads, 100, 200, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(), threadFactory);
        // let idle core threads time out as well
        // (allowsCoreThreadTimeOut() only reads the flag and changes nothing)
        threadPool.allowCoreThreadTimeOut(true);
        return threadPool;
    }

    @Bean("async-executor")
    public Executor asyncExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(24);
        taskExecutor.setMaxPoolSize(200);
        taskExecutor.setQueueCapacity(50);
        taskExecutor.setKeepAliveSeconds(200);
        taskExecutor.setThreadNamePrefix("async-executor-");
        // when pool and queue are both full, run the task on the submitting thread
        taskExecutor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        taskExecutor.initialize();
        return taskExecutor;
    }
}
CompletableFuture example
public static void main(String[] args) {
    ExecutorService executor = Executors.newFixedThreadPool(10);
    try {
        // first stage runs on the custom pool
        CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
            // some hidden operation
            return "result";
        }, executor);
        // second stage transforms the first stage's value once it completes
        CompletableFuture<String> result = future.thenApply(r -> {
            // another hidden operation
            return "result";
        });
        // block until the whole chain has finished
        String finalResult = result.join();
    } finally {
        executor.shutdown();
    }
}
When to use Stream.parallel()
Parallel streams are beneficial for large data sets (e.g., >10,000 elements), CPU‑intensive transformations, and when the hardware provides multiple cores. For small collections or simple calculations, the overhead may outweigh the gains.
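To make the trade-off concrete, here is a small sketch that runs the same CPU-bound job, counting primes by trial division, once sequentially and once with parallel(). The workload and the upper bound are arbitrary choices for illustration; timings vary by machine, so the program prints them rather than promising a particular speed-up.

```java
import java.util.stream.LongStream;

class ParallelStreamSketch {

    // Deliberately CPU-heavy primality check by trial division.
    static boolean isPrime(long n) {
        if (n < 2) {
            return false;
        }
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) {
                return false;
            }
        }
        return true;
    }

    // Counts the primes in [1, upTo], optionally on the common fork-join pool.
    static long countPrimes(long upTo, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, upTo);
        if (parallel) {
            range = range.parallel();
        }
        return range.filter(ParallelStreamSketch::isPrime).count();
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long sequential = countPrimes(2_000_000, false);
        long t1 = System.nanoTime();
        long parallel = countPrimes(2_000_000, true);
        long t2 = System.nanoTime();
        System.out.printf("sequential: %d primes in %d ms%n", sequential, (t1 - t0) / 1_000_000);
        System.out.printf("parallel:   %d primes in %d ms%n", parallel, (t2 - t1) / 1_000_000);
    }
}
```

Both runs return the same count; only the elapsed time differs. For I/O-bound work such as the image upload, a dedicated executor is generally a better fit than a parallel stream, because parallel streams share the common fork-join pool.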
Algorithmic considerations
Removing unnecessary Base64 decoding, avoiding intermediate file writes, and choosing efficient data structures (e.g., ConcurrentHashMap over Hashtable) are examples of optimizations that reduce the work done per image rather than just spreading it across more threads.
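As an illustration of the data-structure point, the sketch below counts finished uploads per month from several threads using ConcurrentHashMap.merge, which updates each key atomically without locking the whole map. The scenario (a per-month upload counter) is a hypothetical example and is not part of the original export code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class UploadCounterSketch {

    // Simulates uploadsPerMonth finished uploads for every month, reported from a thread pool.
    static Map<String, Integer> countConcurrently(String[] months, int uploadsPerMonth, int threads)
            throws InterruptedException {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String month : months) {
            for (int i = 0; i < uploadsPerMonth; i++) {
                // merge() is atomic per key, so concurrent increments are not lost
                pool.execute(() -> counts.merge(month, 1, Integer::sum));
            }
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("counter tasks did not finish in time");
        }
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Integer> counts = countConcurrently(new String[] {"202301", "202302"}, 1000, 8);
        System.out.println(counts);
    }
}
```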
Su San Talks Tech
Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
