Backend Development · 5 min read

Efficiently Reading Large Files in Java without Exhausting Memory

This tutorial demonstrates how to read large files in Java efficiently, without loading all lines into memory: it compares in‑memory approaches with streaming techniques based on Scanner and Apache Commons IO, and shows memory usage figures for each method.

Java Captain

Overview: This tutorial shows how to read large files in Java efficiently, without loading the entire file into memory at once.

In‑memory reading: Using Guava's Files.readLines or Apache Commons IO's FileUtils.readLines loads every line into memory at once, which quickly leads to an OutOfMemoryError for large files (e.g., a 1 GB file):

// Guava: reads the whole file into a List<String>
Files.readLines(new File(path), Charsets.UTF_8);

// Apache Commons IO: same idea (the charset-less overload is deprecated)
FileUtils.readLines(new File(path), StandardCharsets.UTF_8);

Running a test that reads a 1 GB file starts out consuming almost no memory, but once all lines are loaded, heap usage spikes to roughly 2 GB, as the log output shows:

[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Total Memory: 128 Mb
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Free Memory: 116 Mb
...
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Total Memory: 2666 Mb
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Free Memory: 490 Mb
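Figures like the ones in the log above can be produced with the Runtime API. A minimal sketch (the class name and exact log format here are illustrative, not the original test's code):

```java
public class MemoryLogger {

    // Heap size currently reserved by the JVM, in megabytes
    static long totalMb() {
        return Runtime.getRuntime().totalMemory() / (1024 * 1024);
    }

    // Unused portion of that reserved heap, in megabytes
    static long freeMb() {
        return Runtime.getRuntime().freeMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Total Memory: " + totalMb() + " Mb");
        System.out.println("Free Memory: " + freeMb() + " Mb");
    }
}
```

Calling these before and after the read makes the memory spike visible: total minus free approximates the heap actually in use.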

File stream approach: java.util.Scanner can read the file line by line without keeping the lines in memory. The code opens a FileInputStream, wraps it in a Scanner, iterates with hasNextLine(), processes each line, and closes both resources when done. Memory usage stays at around 150 MB.

// try-with-resources closes the Scanner and the stream in all cases,
// avoiding the leak that manual close ordering can cause
try (FileInputStream inputStream = new FileInputStream(path);
     Scanner sc = new Scanner(inputStream, "UTF-8")) {
    while (sc.hasNextLine()) {
        String line = sc.nextLine();
        // process the line
    }
    // Scanner suppresses IOExceptions; surface any that occurred
    if (sc.ioException() != null) {
        throw sc.ioException();
    }
}

Apache Commons IO stream: The library's LineIterator offers another streaming solution that also keeps memory consumption low (≈150 MB):

LineIterator it = FileUtils.lineIterator(theFile, "UTF-8");
try {
    while (it.hasNext()) {
        String line = it.nextLine();
        // do something with the line
    }
} finally {
    // deprecated in newer Commons IO versions; it.close() in a
    // try-with-resources block works as well
    LineIterator.closeQuietly(it);
}
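Since Java 8, the JDK itself offers a comparable lazy stream over file lines via Files.lines, which pulls lines from disk on demand rather than all at once. A minimal sketch (the file name is hypothetical):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class NioLineStream {

    // Counts lines lazily; the stream reads from disk one buffer at a time
    static long countLines(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Paths.get("large-file.txt"); // hypothetical file name
        System.out.println(countLines(path) + " lines");
    }
}
```

As with Scanner and LineIterator, the try-with-resources block matters: Files.lines holds an open file handle until the stream is closed.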

Conclusion: By iterating over file lines instead of loading the entire file into memory, large files can be processed efficiently without exhausting system memory.

Tags: Java · Memory Management · File I/O · Large Files · Scanner · Apache Commons IO
Written by Java Captain

Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.
