Optimizing Java File Compression: From Buffered Streams to NIO Channels and Memory‑Mapped Files
This article shows how to speed up Java file compression step by step: first by replacing an unbuffered FileInputStream with a BufferedInputStream, then by using NIO channels with transferTo, memory-mapped files, and Pipe. Code examples and timing results accompany each step, and together they cut processing time from about 30 seconds to about 1 second.
The original implementation compressed ten 2 MB images using a plain FileInputStream inside a loop, which required a native read call for each byte and took about 30 seconds for a 20 MB archive.
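The original code is not reproduced here, so the following is only a sketch of what such a byte-at-a-time baseline might look like. The class name, method name, parameters, and entry naming are assumptions made for illustration, not the author's code.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class UnbufferedZipSketch {
    /**
     * Hypothetical baseline: zips `copies` copies of `source` into `zipFile`,
     * reading one byte per read() call. Returns the elapsed time in milliseconds.
     */
    static long zipUnbuffered(File source, File zipFile, int copies) throws IOException {
        long start = System.nanoTime();
        try (ZipOutputStream zipOut = new ZipOutputStream(new FileOutputStream(zipFile))) {
            for (int i = 0; i < copies; i++) {
                // Entry names must be unique inside one archive
                zipOut.putNextEntry(new ZipEntry(i + "-" + source.getName()));
                try (FileInputStream in = new FileInputStream(source)) {
                    int b;
                    // Every iteration asks the unbuffered stream for a single byte
                    while ((b = in.read()) != -1) {
                        zipOut.write(b);
                    }
                }
                zipOut.closeEntry();
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

Calling `zipUnbuffered(new File("photo.jpg"), new File("out.zip"), 10)` with a 2 MB image would correspond roughly to the scenario described, although the measured time depends on the machine and JDK.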
First optimization – Buffered streams
By wrapping the FileInputStream with a BufferedInputStream (default 8 KB buffer) the number of native calls drops dramatically, reducing the total time to roughly 2 seconds.
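To make the drop in underlying reads concrete, here is a small sketch that counts how often the wrapped stream is asked for data, with and without a BufferedInputStream in front of it. CountingInputStream and the two helper methods are hypothetical names introduced for this illustration. The sketch counts Java-level calls on an in-memory stream, which stands in for (but is not the same as) native read calls on a file.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadCallCounter {
    /** Hypothetical wrapper that counts every read request reaching the underlying stream. */
    static final class CountingInputStream extends FilterInputStream {
        long calls;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads byte by byte straight from the counted stream. */
    static long countUnbuffered(byte[] data) throws IOException {
        CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(data));
        while (counting.read() != -1) {
            // consume every byte
        }
        return counting.calls;
    }

    /** Reads byte by byte through a BufferedInputStream (default 8 KB buffer). */
    static long countBuffered(byte[] data) throws IOException {
        CountingInputStream counting = new CountingInputStream(new ByteArrayInputStream(data));
        try (InputStream in = new BufferedInputStream(counting)) {
            while (in.read() != -1) {
                // consume every byte; most calls are served from the buffer
            }
        }
        return counting.calls;
    }
}
```

For a 20,000-byte input, the unbuffered loop reaches the underlying stream once per byte plus once for end of stream, while the buffered loop reaches it only a handful of times (about one request per 8 KB chunk).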
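import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public static void zipFileBuffer() {
    File zipFile = new File(ZIP_FILE);
    try (ZipOutputStream zipOut = new ZipOutputStream(new FileOutputStream(zipFile));
         BufferedOutputStream bufferedOut = new BufferedOutputStream(zipOut)) {
        long beginTime = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            try (BufferedInputStream bis = new BufferedInputStream(new FileInputStream(JPG_FILE))) {
                zipOut.putNextEntry(new ZipEntry(FILE_NAME + i));
                int temp;
                // Each read() is served from the in-memory buffer, not a native call
                while ((temp = bis.read()) != -1) {
                    bufferedOut.write(temp);
                }
                // Flush before the next putNextEntry so buffered bytes land in this entry
                bufferedOut.flush();
            }
        }
        printInfo(beginTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Second optimization – NIO Channel and transferTo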
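Using a FileChannel and its transferTo method moves data in bulk from the source channel to the target. When the target is itself a file or socket channel, many operating systems can do this without copying the bytes through user space. Here the target wraps a ZipOutputStream, so the data still passes through the JVM to be compressed, but the time drops further to about 1.4 seconds.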
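For contrast with the zip example that follows, here is a minimal sketch of transferTo between two file channels, the case where a direct kernel-level transfer can apply. The class and method names are assumptions made for illustration. The loop is there because transferTo may move fewer bytes than requested.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TransferToSketch {
    /** Hypothetical helper: copies `source` to `target` and returns the bytes transferred. */
    static long copy(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                // A single call is allowed to transfer only part of the requested range
                long moved = in.transferTo(position, size - position, out);
                if (moved <= 0) {
                    break; // nothing more could be transferred, e.g. the file shrank
                }
                position += moved;
            }
            return position;
        }
    }
}
```

Whether the transfer actually bypasses user space depends on the operating system and the channel types involved, so this is best read as the shape of the call rather than a performance guarantee.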
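import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public static void zipFileChannel() {
    File zipFile = new File(ZIP_FILE);
    try (ZipOutputStream zipOut = new ZipOutputStream(new FileOutputStream(zipFile));
         WritableByteChannel out = Channels.newChannel(zipOut)) {
        long beginTime = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            zipOut.putNextEntry(new ZipEntry(i + SUFFIX_FILE));
            try (FileChannel fileChannel = new FileInputStream(JPG_FILE).getChannel()) {
                // transferTo may move fewer bytes than requested, so loop until done
                long position = 0;
                while (position < FILE_SIZE) {
                    long moved = fileChannel.transferTo(position, FILE_SIZE - position, out);
                    if (moved <= 0) {
                        break;
                    }
                    position += moved;
                }
            }
        }
        printInfo(beginTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Third optimization – Memory‑mapped file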
A MappedByteBuffer maps the file into the process's virtual memory, so its contents can be read like an in-memory buffer without explicit read calls. Performance is about the same as the channel approach (≈1.3 seconds).
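import java.io.File;
import java.io.FileOutputStream;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public static void zipFileMap() {
    File zipFile = new File(ZIP_FILE);
    try (ZipOutputStream zipOut = new ZipOutputStream(new FileOutputStream(zipFile));
         WritableByteChannel out = Channels.newChannel(zipOut)) {
        long beginTime = System.currentTimeMillis();
        for (int i = 0; i < 10; i++) {
            zipOut.putNextEntry(new ZipEntry(i + SUFFIX_FILE));
            // Close the file and channel each iteration; a mapping stays valid after its channel is closed
            try (RandomAccessFile file = new RandomAccessFile(JPG_FILE_PATH, "r");
                 FileChannel channel = file.getChannel()) {
                MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, FILE_SIZE);
                out.write(mapped);
            }
        }
        printInfo(beginTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Fourth optimization – Pipe with asynchronous task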
Using a Pipe splits compression and file writing across two threads: an asynchronous task compresses the source files and writes the result into the pipe's sink channel, while the main thread reads from the pipe's source channel and writes the bytes to the zip file. This approach also stays around the 1-second mark.
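The direction of a Pipe is easy to mix up: data is written into the sink and read back from the source. A minimal sketch of that round trip, independent of zip handling, might look like this. PipeSketch and roundTrip are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

class PipeSketch {
    /** Hypothetical demo: sends `message` through a Pipe from a background task to the caller. */
    static String roundTrip(String message) throws IOException {
        Pipe pipe = Pipe.open();

        // Producer: writes into the sink, then closes it to signal end of stream
        CompletableFuture<Void> producer = CompletableFuture.runAsync(() -> {
            try (Pipe.SinkChannel sink = pipe.sink()) {
                ByteBuffer data = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                while (data.hasRemaining()) {
                    sink.write(data);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });

        // Consumer: reads from the source until the sink has been closed and drained
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        try (Pipe.SourceChannel source = pipe.source()) {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            while (source.read(buffer) != -1) {
                buffer.flip();
                received.write(buffer.array(), 0, buffer.limit());
                buffer.clear();
            }
        }
        producer.join(); // surface any failure from the producer task
        return new String(received.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The consumer loop ends only because the producer closes the sink. If the sink were left open, the blocking read on the source would wait indefinitely.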
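import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.Pipe;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.util.concurrent.CompletableFuture;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public static void zipFilePipe() {
    long beginTime = System.currentTimeMillis();
    try (WritableByteChannel out = Channels.newChannel(new FileOutputStream(ZIP_FILE))) {
        Pipe pipe = Pipe.open();
        // Compress on a background thread; the compressed bytes arrive through the pipe
        CompletableFuture.runAsync(() -> runTask(pipe));
        ReadableByteChannel source = pipe.source();
        ByteBuffer buffer = ByteBuffer.allocate((int) FILE_SIZE * 10);
        // read() returns -1 once runTask has closed the sink and the pipe is drained
        while (source.read(buffer) >= 0) {
            buffer.flip();
            out.write(buffer);
            buffer.clear();
        }
        printInfo(beginTime);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

// Runs on the background thread: zips the source files into the pipe's sink
private static void runTask(Pipe pipe) {
    // Closing the ZipOutputStream also closes the sink, which ends the reader's loop
    try (ZipOutputStream zos = new ZipOutputStream(Channels.newOutputStream(pipe.sink()))) {
        for (int i = 0; i < 10; i++) {
            zos.putNextEntry(new ZipEntry(i + SUFFIX_FILE));
            try (FileChannel jpgChannel = new FileInputStream(JPG_FILE_PATH).getChannel()) {
                jpgChannel.transferTo(0, FILE_SIZE, Channels.newChannel(zos));
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
The article also explains why kernel‑space to user‑space copying is costly, the role of system calls, and the trade‑offs of direct versus non‑direct buffers (security, GC pressure, and write‑back timing).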
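As a small illustration of the two buffer kinds, the sketch below writes the same payload to a file through whichever ByteBuffer it is given. The class name, method name, and the 8 KB capacity in the usage note are assumptions chosen for the example. It shows only how the two kinds are used, not a benchmark.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class BufferKindsSketch {
    /**
     * Hypothetical helper: writes `payload` to `target` in chunks staged through `buffer`,
     * which may be a heap buffer or a direct buffer. Returns the number of bytes written.
     */
    static long writeWith(ByteBuffer buffer, Path target, byte[] payload) throws IOException {
        long written = 0;
        try (FileChannel out = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            int offset = 0;
            while (offset < payload.length) {
                buffer.clear();
                int chunk = Math.min(buffer.capacity(), payload.length - offset);
                buffer.put(payload, offset, chunk); // stage the next chunk
                buffer.flip();
                while (buffer.hasRemaining()) {
                    written += out.write(buffer);   // a write may be partial
                }
                offset += chunk;
            }
        }
        return written;
    }
}
```

Passing `ByteBuffer.allocate(8192)` uses a heap buffer backed by a byte array that the garbage collector manages, while `ByteBuffer.allocateDirect(8192)` uses memory outside the Java heap that the JVM can hand to native I/O more directly. Direct buffers tend to be more expensive to allocate and release, so they are usually reused rather than created per operation.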
Conclusion
Even a simple change from unbuffered streams to NIO‑based techniques can shrink a 20 MB compression task from 30 seconds to about 1 second, illustrating the importance of understanding Java I/O internals and applying the right abstraction for performance‑critical code.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn