
Performance Comparison of Java forEach, C‑Style Loop, and Stream API

This article examines Java's forEach syntax, C‑style for loops, and Stream API by benchmarking their execution times on large collections, explaining the underlying mechanisms that cause performance differences and identifying the most efficient traversal method for Sets.

Java Captain

The article explores Java's forEach syntax and compares it with the traditional C‑style for loop and the Stream API, especially when iterating over large collections where performance becomes critical.

Java developers frequently use containers such as ArrayList and HashSet. Since Java 8, lambda expressions and the Stream API have simplified working with these containers, but when processing millions of elements the overhead of different iteration styles becomes noticeable. The author uses JMH (the Java Microbenchmark Harness) to measure the execution time of each approach.

The Stream API versions, single‑threaded and parallel:

    public List<Integer> streamSingleThread(BenchMarkState state) {
        List<Integer> result = new ArrayList<>(state.testData.size());
        state.testData.stream().forEach(item -> {
            result.add(item);
        });
        return result;
    }

    public List<Integer> streamMultiThread(BenchMarkState state) {
        List<Integer> result = new ArrayList<>(state.testData.size());
        // Caution: ArrayList is not thread-safe, so concurrent add() calls
        // from a parallel stream can lose or corrupt elements.
        state.testData.stream().parallel().forEach(item -> {
            result.add(item);
        });
        return result;
    }
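One caveat about the parallel variant: it adds to a shared ArrayList from several threads, and ArrayList is not thread‑safe. A minimal sketch of a safer alternative, assuming the goal is only to copy the elements into a new list, is to let the stream build the result with Collectors.toList(). The class and method names here are hypothetical, and this variant was not part of the measured benchmark, so the timings reported in this article do not apply to it.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class ParallelCopySketch {
    // Copies the input with a parallel stream. collect() gives each worker
    // its own partial list and merges them, so no list is shared between
    // threads while elements are being added.
    static List<Integer> copyInParallel(List<Integer> source) {
        return source.stream().parallel().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical test data: 0 .. 99,999
        List<Integer> data = IntStream.range(0, 100_000)
                .boxed()
                .collect(Collectors.toList());
        List<Integer> copy = copyInParallel(data);
        System.out.println("same contents: " + copy.equals(data));
    }
}
```

Because the source is an ordered list, the collected result keeps the original encounter order even though the work is split across threads.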

The forEach version, written as an enhanced for statement:

    public List<Integer> forEach(BenchMarkState state) {
        List<Integer> result = new ArrayList<>(state.testData.size());
        for (Integer item : state.testData) {
            result.add(item);
        }
        return result;
    }

The C‑style loop version:

    public List<Integer> forCStyle(BenchMarkState state) {
        int size = state.testData.size();
        List<Integer> result = new ArrayList<>(size);
        for (int j = 0; j < size; j++) {
            result.add(state.testData.get(j));
        }
        return result;
    }

Benchmark results (average time per operation):

    Benchmark                               Mode  Cnt   Score   Error  Units
    TestLoopPerformance.forCStyle           avgt  200  18.068 ± 0.074  ms/op
    TestLoopPerformance.forEach             avgt  200  30.566 ± 0.165  ms/op
    TestLoopPerformance.streamMultiThread   avgt  200  79.433 ± 0.747  ms/op
    TestLoopPerformance.streamSingleThread  avgt  200  37.779 ± 0.485  ms/op

The enhanced for loop is slower than the C‑style loop here because the Java compiler translates it into an Iterator‑based loop that calls hasNext() and next() for every element, as described in a StackOverflow answer and Oracle's documentation.
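The translation described above can be sketched by writing the same loop both ways. The class and method names are hypothetical; the second method spells out roughly what the compiler generates for the first, following the enhanced for statement rules in the Java Language Specification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class EnhancedForDesugarSketch {
    // The form developers normally write.
    static List<Integer> withEnhancedFor(List<Integer> source) {
        List<Integer> result = new ArrayList<>(source.size());
        for (Integer item : source) {
            result.add(item);
        }
        return result;
    }

    // Roughly the equivalent explicit form: one hasNext() call and one
    // next() call per element, both dispatched through the Iterator.
    static List<Integer> withExplicitIterator(List<Integer> source) {
        List<Integer> result = new ArrayList<>(source.size());
        for (Iterator<Integer> it = source.iterator(); it.hasNext(); ) {
            Integer item = it.next();
            result.add(item);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(withEnhancedFor(data).equals(withExplicitIterator(data)));
    }
}
```

Both methods visit the same elements in the same order; the difference the benchmark measures comes from the per‑element iterator calls, not from any change in behavior.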

To find the most efficient way to traverse a Set, the author defines a benchmark state with 500,000 integers and tests several strategies: converting the set to an array, driving an iterator from a C‑style loop, and a simple enhanced for loop. The benchmark state is defined as:

    @State(Scope.Benchmark)
    public static class BenchMarkState {
        public Set<Integer> testData = new HashSet<>(500000);

        @Setup(Level.Trial)
        public void doSetup() {
            for (int i = 0; i < 500000; i++) {
                testData.add(Integer.valueOf(i));
            }
        }

        @TearDown(Level.Trial)
        public void doTearDown() {
            testData = new HashSet<>(500000);
        }
    }

Additional traversal implementations include converting the set to an array before looping:

    public List<Integer> forCStyle(BenchMarkState state) {
        int size = state.testData.size();
        List<Integer> result = new ArrayList<>(size);
        // Copy the set into an array first, then index into the array.
        Integer[] temp = state.testData.toArray(new Integer[size]);
        for (int j = 0; j < size; j++) {
            result.add(temp[j]);
        }
        return result;
    }

Combining an iterator with a C‑style loop:

    public List<Integer> forCStyleWithIteration(BenchMarkState state) {
        int size = state.testData.size();
        List<Integer> result = new ArrayList<>(size);
        Iterator<Integer> iteration = state.testData.iterator();
        // The counter replaces hasNext(): next() is called exactly size times.
        for (int j = 0; j < size; j++) {
            result.add(iteration.next());
        }
        return result;
    }

And a simple enhanced for‑each loop:

    public List<Integer> forEach(BenchMarkState state) {
        List<Integer> result = new ArrayList<>(state.testData.size());
        for (Integer item : state.testData) {
            result.add(item);
        }
        return result;
    }

Final benchmark results for the Set traversals show that the C‑style loop driving an iterator is the fastest (≈4.28 ms/op), followed closely by the plain enhanced for loop (≈4.50 ms/op), while the C‑style loop over an array copied from the set is the slowest (≈6.01 ms/op):

    Benchmark                                   Mode  Cnt  Score   Error  Units
    TestLoopPerformance.forCStyle               avgt  200  6.013 ± 0.108  ms/op
    TestLoopPerformance.forCStyleWithIteration  avgt  200  4.281 ± 0.049  ms/op
    TestLoopPerformance.forEach                 avgt  200  4.498 ± 0.026  ms/op

The article concludes that while the enhanced for loop and the Stream API are convenient for handling collections, performance‑critical systems may benefit from hand‑written loops, especially when iterating over hash‑based collections.
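A hand‑written indexed loop only pays off when get(i) is cheap, which holds for ArrayList but not for every List implementation. The following is a minimal sketch, not taken from the original benchmark, of a helper that uses the C‑style loop only for lists marked with the standard java.util.RandomAccess interface and falls back to the enhanced for loop otherwise. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

final class LoopChoiceSketch {
    // Copies a list, choosing the loop style from the list's access cost.
    static <T> List<T> copy(List<T> source) {
        int size = source.size();
        List<T> result = new ArrayList<>(size);
        if (source instanceof RandomAccess) {
            // Constant-time get(i), e.g. ArrayList: the indexed loop is safe.
            for (int i = 0; i < size; i++) {
                result.add(source.get(i));
            }
        } else {
            // Sequential-access lists, e.g. LinkedList: get(i) walks the
            // list each time, so iterate instead.
            for (T item : source) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> arrayBacked = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linked = new LinkedList<>(arrayBacked);
        System.out.println(copy(arrayBacked).equals(copy(linked)));
    }
}
```

Both branches produce the same result; the check only decides which traversal is likely to be cheaper for the given list.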
