Efficiently Reading Large Files in Java without Exhausting Memory
This tutorial demonstrates how to read large files in Java efficiently, without loading all lines into memory. It compares in-memory approaches with streaming techniques based on java.util.Scanner and Apache Commons IO, and shows memory usage figures for each method.
In‑memory reading: Using Guava's Files.readLines or Apache Commons IO's FileUtils.readLines loads every line into memory at once, which quickly leads to an OutOfMemoryError for large files (e.g., a 1 GB file). Example code is shown below, and the memory consumption logs demonstrate the issue.
Files.readLines(new File(path), Charsets.UTF_8);
FileUtils.readLines(new File(path));

Running a test that reads a 1 GB file initially consumes almost no memory, but once all lines are loaded, memory usage spikes to about 2 GB, as the log output shows:
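The same all-in-memory pattern also exists in the JDK itself via Files.readAllLines. A minimal, self-contained sketch (not from the original tutorial; the temporary file stands in for the 1 GB test file, which is exactly what you should not read this way):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class InMemoryReadDemo {

    // Reads every line of the file into a single List. Fine for small
    // files, but the entire content (plus per-String overhead) sits on
    // the heap at once - the cause of the OutOfMemoryError described above.
    static List<String> readWholeFile(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, List.of("line 1", "line 2", "line 3"));
        List<String> lines = readWholeFile(tmp);
        System.out.println(lines.size() + " lines in memory"); // prints "3 lines in memory"
        Files.delete(tmp);
    }
}
```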
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Total Memory: 128 Mb
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Free Memory: 116 Mb
...
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Total Memory: 2666 Mb
[main] INFO org.baeldung.java.CoreJavaIoUnitTest - Free Memory: 490 Mb

File stream approach: Using java.util.Scanner to read the file line by line avoids keeping all lines in memory. The code opens a FileInputStream, wraps it in a Scanner, iterates with hasNextLine(), processes each line, and closes both resources in a finally block. Memory usage stays at around 150 MB.
FileInputStream inputStream = null;
Scanner sc = null;
try {
    inputStream = new FileInputStream(path);
    sc = new Scanner(inputStream, "UTF-8");
    while (sc.hasNextLine()) {
        String line = sc.nextLine();
        // process the line
    }
    // Scanner suppresses IOExceptions, so surface any that occurred
    if (sc.ioException() != null) {
        throw sc.ioException();
    }
} finally {
    if (inputStream != null) {
        inputStream.close();
    }
    if (sc != null) {
        sc.close();
    }
}

Apache Commons IO stream: The library's LineIterator provides another streaming solution that also keeps memory consumption low (≈150 MB):
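As an aside on the Scanner approach above (not part of the original tutorial): on Java 7 and later, the same loop is usually written with try-with-resources, which closes the Scanner and the stream it wraps automatically. The countLines helper name is hypothetical, for illustration only:

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Scanner;

public class ScannerTryWithResources {

    // Same streaming loop as above, but try-with-resources closes the
    // Scanner (and its underlying FileInputStream) automatically.
    static long countLines(String path) throws IOException {
        long count = 0;
        try (Scanner sc = new Scanner(new FileInputStream(path), "UTF-8")) {
            while (sc.hasNextLine()) {
                sc.nextLine(); // process the line here
                count++;
            }
            if (sc.ioException() != null) {
                throw sc.ioException();
            }
        }
        return count;
    }
}
```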
LineIterator it = FileUtils.lineIterator(theFile, "UTF-8");
try {
    while (it.hasNext()) {
        String line = it.nextLine();
        // do something with line
    }
} finally {
    LineIterator.closeQuietly(it);
}

Conclusion: By iterating over a file's lines instead of loading the entire file into memory, large files can be processed efficiently without exhausting system memory.
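For completeness, here is a sketch of the same streaming idea using the JDK's own Files.lines (Java 8+), together with the kind of Runtime-based heap readout that the log figures above suggest (how CoreJavaIoUnitTest actually logs memory is an assumption on our part):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Stream;

public class StreamLinesDemo {

    // Total and free heap in Mb, in the style of the log output above.
    static String memoryReport() {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024 * 1024;
        return "Total Memory: " + rt.totalMemory() / mb + " Mb, "
             + "Free Memory: " + rt.freeMemory() / mb + " Mb";
    }

    // Files.lines streams lazily: only the current line is held in memory,
    // so heap usage stays flat regardless of file size.
    static long countLines(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, List.of("one", "two"));
        System.out.println(memoryReport());
        System.out.println(countLines(tmp) + " lines");
        Files.delete(tmp);
    }
}
```

The try-with-resources block matters here: Files.lines holds the file open until the stream is closed.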
Java Captain
Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.