Can You Beat the One Billion Row Challenge? Inside Java Performance Secrets
This article explores the One Billion Row Challenge, a Java benchmark that requires parsing a 13 GB file of one billion temperature records, and walks through baseline code, top‑ranked solutions, and a step‑by‑step performance tuning journey that reduces execution time from minutes to under two seconds.
What is the One Billion Row Challenge?
The challenge, published by Gunnar Morling on January 1, 2024, asks participants to read a 13 GB text file containing one billion lines, each line holding a weather‑station name and a temperature value (one decimal place). For every station the minimum, maximum and average temperature must be computed, and the results must be output in dictionary order.
Baseline Java solution
The reference implementation creates a MeasurementAggregator that stores the minimum, maximum, sum and count for each station. It reads the file line‑by‑line, splits each line on the semicolon, parses the temperature as a double, updates the aggregator and finally writes the results using a TreeMap to keep dictionary order.
The baseline runs in under two minutes on a high‑end benchmark server (32‑core AMD EPYC 7502P, 128 GB RAM), but on a typical laptop it takes about 14 minutes.
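The baseline approach can be sketched as follows. This is a minimal illustration of the technique described above, not the reference code itself; class and method names (`BaselineSketch`, `Agg`, `addLine`) are illustrative.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Locale;
import java.util.TreeMap;

public class BaselineSketch {
    // Per-station aggregate: min, max, sum and count, as in the reference code.
    static final class Agg {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY, sum;
        long count;
        void add(double t) { min = Math.min(min, t); max = Math.max(max, t); sum += t; count++; }
        @Override public String toString() {
            return String.format(Locale.ROOT, "%.1f/%.1f/%.1f", min, sum / count, max);
        }
    }

    // Parse one "Station;temperature" line and fold it into the per-station aggregate.
    static void addLine(TreeMap<String, Agg> stats, String line) {
        int sep = line.indexOf(';');
        double temp = Double.parseDouble(line.substring(sep + 1));
        stats.computeIfAbsent(line.substring(0, sep), k -> new Agg()).add(temp);
    }

    static TreeMap<String, Agg> aggregate(Iterable<String> lines) {
        TreeMap<String, Agg> stats = new TreeMap<>(); // TreeMap keeps dictionary order
        for (String line : lines) addLine(stats, line);
        return stats;
    }

    public static void main(String[] args) throws Exception {
        TreeMap<String, Agg> stats = new TreeMap<>();
        try (BufferedReader r = new BufferedReader(new FileReader(args[0]))) {
            for (String line; (line = r.readLine()) != null; ) addLine(stats, line);
        }
        System.out.println(stats);
    }
}
```

Every later optimisation in the article attacks one of the costs visible here: line-by-line I/O, `String` allocation, `Double.parseDouble`, and the generic map lookup.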
Top‑ranked solutions and their evolution
Version 0 – Switching JVM
Simply running the parallel‑streams code (see Version 1) on GraalVM instead of OpenJDK reduces the runtime from 71 s to 66 s, a modest 5‑second gain.
Version 1 – Parallel I/O
The first real speed‑up uses Java parallel streams to read and process the file concurrently, fully utilizing all CPU cores. On a Hetzner AX161 server (32 cores) the execution time drops to 71 seconds.
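The shape of the parallel‑streams approach can be sketched with `Files.lines()` feeding a parallel pipeline. This is a hedged illustration only: the `ParallelSketch` name is assumed, and the top‑ranked code splits the file into chunks manually rather than relying on the stream splitting shown here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.DoubleSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class ParallelSketch {
    // Aggregate per-station statistics across all cores with a parallel stream.
    static Map<String, DoubleSummaryStatistics> run(Path file) {
        try (var lines = Files.lines(file)) {
            return lines.parallel()
                    .map(line -> line.split(";", 2))
                    .collect(Collectors.groupingBy(
                            parts -> parts[0],
                            TreeMap::new, // keep dictionary order in the result
                            Collectors.summarizingDouble(parts -> Double.parseDouble(parts[1]))));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

`DoubleSummaryStatistics` conveniently carries min, max, sum and count, so the downstream collector matches the aggregate the challenge asks for.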
Version 2 – Faster temperature parsing
Parsing the temperature as an int directly from the byte buffer avoids the overhead of Double.parseDouble. The custom method extracts the sign, integer and fractional digits in a few integer operations.
private int parseTemperature(long semicolonPos) {
    long off = semicolonPos + 1;
    int sign = 1;
    byte b = chunk.get(JAVA_BYTE, off++);
    if (b == '-') {
        sign = -1;
        b = chunk.get(JAVA_BYTE, off++);
    }
    int temp = b - '0';
    b = chunk.get(JAVA_BYTE, off++);
    if (b != '.') {
        temp = 10 * temp + b - '0';
        // skip the decimal point
        off++;
    }
    b = chunk.get(JAVA_BYTE, off);
    temp = 10 * temp + b - '0';
    return sign * temp;
}

This change cuts the runtime by another 6 seconds, bringing it down to 11 seconds.
Version 3 – Custom hash table
Because the number of distinct stations (≈ 413) is known, the author replaces the generic HashMap with an open‑addressing hash table tailored to the fixed key size. The new findAcc routine directly computes a hash, probes for collisions, and stores StationStats objects without the overhead of Java’s map abstractions.
After this optimisation the execution time falls to 6.6 seconds.
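The core of the open‑addressing idea can be sketched as a fixed‑capacity table with linear probing, sized well above the ~413 distinct stations so probe chains stay short. This is a simplified illustration under assumed names (`OpenTableSketch`, `slotFor`): the real code keys on raw name bytes and stores full `StationStats`, while this sketch uses `String` keys and tracks only the minimum for brevity.

```java
public class OpenTableSketch {
    static final int CAPACITY = 2048; // power of two, far above ~413 stations

    final String[] keys = new String[CAPACITY];
    final double[] mins = new double[CAPACITY];

    // Find (or claim) the slot for a station name via linear probing.
    int slotFor(String name) {
        int slot = name.hashCode() & (CAPACITY - 1); // cheap modulo for power-of-two size
        while (keys[slot] != null && !keys[slot].equals(name)) {
            slot = (slot + 1) & (CAPACITY - 1);      // probe the next slot on collision
        }
        if (keys[slot] == null) {                    // first time we see this station
            keys[slot] = name;
            mins[slot] = Double.POSITIVE_INFINITY;
        }
        return slot;
    }

    void record(String name, double temp) {
        int slot = slotFor(name);
        mins[slot] = Math.min(mins[slot], temp);
    }
}
```

Because the station count is fixed and small, the table never needs resizing, and lookups avoid the boxing, node allocation and virtual dispatch of `HashMap`.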
Version 4 – Unsafe and SWAR
The fourth iteration drops the safe Java APIs in favour of sun.misc.Unsafe to read memory without bounds checks and applies SWAR (SIMD‑within‑a‑register) techniques to locate semicolons and parse temperatures eight bytes at a time. The code also reuses loaded bytes for hashing, eliminating redundant reads.
These low‑level tricks push the runtime down to 2.4 seconds.
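The SWAR semicolon search rests on a classic bit trick: XOR an 8‑byte word against a broadcast of `';'` so the matching byte becomes zero, then use the standard zero‑byte detector `(x - 0x01…01) & ~x & 0x80…80` to raise the high bit of that byte. The sketch below shows that trick in isolation; the class and method names are assumed, not the author's.

```java
public class SwarSketch {
    static final long BROADCAST_SEMICOLON = 0x3B3B3B3B3B3B3B3BL; // ';' in every byte

    // Returns the index (0-7, little-endian byte order) of the first ';'
    // within the 8 bytes of 'word', or 8 if the word contains none.
    static int semicolonIndex(long word) {
        long diff = word ^ BROADCAST_SEMICOLON;               // matching byte becomes 0x00
        long matchBits = (diff - 0x0101010101010101L)         // borrows out of a zero byte...
                         & ~diff
                         & 0x8080808080808080L;               // ...setting its high bit
        return Long.numberOfTrailingZeros(matchBits) >>> 3;   // bit position -> byte index
    }
}
```

One load plus a handful of register operations replaces up to eight byte compares and branches, which is exactly why it pairs well with the `Unsafe` reads described above.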
Version 5 – Statistics‑driven tweaks
Profiling revealed that half of the station names are ≤ 8 bytes, causing frequent branch mispredictions in the nameEquals method. A small helper program analyses name‑length distribution and shows that moving the length check to > 16 bytes reduces misprediction from 50 % to 2.5 %.
The author then rewrites the comparison routine to avoid the if when possible, shaving another 0.1 seconds.
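The flavour of this rewrite can be illustrated as follows. Assume both the stored key and the candidate name are held as two 8‑byte words with bytes past the name's end zero‑masked; then names up to 16 bytes compare in one branch‑free expression instead of a length check that mispredicts half the time. The helper below is a hypothetical reconstruction, not the author's exact code, which reads the words via Unsafe.

```java
public class NameCmpSketch {
    // Branch-free equality for names <= 16 bytes: XOR each word pair and OR the
    // results; the names match exactly when every difference bit is zero.
    static boolean nameEquals(long storedWord1, long storedWord2,
                              long candWord1, long candWord2) {
        return ((storedWord1 ^ candWord1) | (storedWord2 ^ candWord2)) == 0;
    }
}
```

Only the rare names longer than 16 bytes fall back to a loop, which is why the misprediction rate drops so sharply.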
Final results
Combining all the above optimisations, the author’s implementation processes the 13 GB, one‑billion‑line dataset in just 1.7 seconds on the same benchmark server, a 45 % improvement over the previous best OpenJDK result and a dramatic speed‑up compared with the initial 14‑minute run on a regular laptop.
All source code referenced in the article is publicly available on GitHub:
Baseline: https://github.com/gunnarmorling/1brc/blob/main/src/main/java/dev/morling/onebrc/CalculateAverage_baseline.java
Top‑ranked solutions: https://github.com/gunnarmorling/1brc, https://github.com/mtopolnik/billion-row-challenge/blob/main/src/Blog1.java, https://github.com/mtopolnik/billion-row-challenge/blob/main/src/Blog2.java, https://github.com/mtopolnik/billion-row-challenge/blob/main/src/Blog3.java, https://github.com/mtopolnik/billion-row-challenge/blob/main/src/Blog4.java, https://github.com/mtopolnik/billion-row-challenge/blob/main/src/Blog5.java
The article also includes profiling links, flame‑graph images and a brief discussion on the trade‑off between readability and raw performance.
dbaplus Community
