How to Crush the One Billion Row Java Challenge: From 14 Minutes to Sub‑2‑Second Runtime

This article walks through the One Billion Row Challenge, explaining the problem, baseline solution, and a series of performance optimizations—from JVM selection and parallel I/O to custom hash tables, unsafe memory access, and SIMD techniques—that shrink execution time from minutes to under two seconds.

Su San Talks Tech

Challenge Overview

On January 1, 2024, Gunnar Morling announced the "One Billion Row Challenge" (1BRC). Participants must parse a 13 GB file of weather-station temperature records, compute the minimum, maximum, and average temperature for each station, and print the results sorted alphabetically by station name.

Each line in the file follows the format station;temperature, where temperature has one decimal place.

https://www.morling.dev/blog/one-billion-row-challenge/

Baseline Solution

The reference implementation uses a MeasurementAggregator to track min, max, sum, and count per station. It reads the file line‑by‑line with a BufferedReader, splits each line on the semicolon, parses the temperature as a double, and stores the data in a TreeMap. On a high‑end server the baseline finishes in about 2 minutes; on a modest machine it takes roughly 14 minutes.
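The approach described above can be sketched roughly as follows. This is a minimal illustration, not the reference implementation itself: the class name `BaselineSketch`, the `aggregate` method, and the sample input are assumptions made for the example, and it reads from an in-memory string instead of the real measurements file.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name; illustrates the baseline approach only.
final class BaselineSketch {

    // Running min, max, sum, and count for one station.
    static final class MeasurementAggregator {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum;
        long count;

        void add(double value) {
            min = Math.min(min, value);
            max = Math.max(max, value);
            sum += value;
            count++;
        }

        double mean() {
            return sum / count;
        }
    }

    // Reads "station;temperature" lines one by one and aggregates per station.
    // The TreeMap keeps the stations sorted by name for the final output.
    static Map<String, MeasurementAggregator> aggregate(BufferedReader reader) {
        Map<String, MeasurementAggregator> stats = new TreeMap<>();
        reader.lines().forEach(line -> {
            String[] parts = line.split(";");
            stats.computeIfAbsent(parts[0], k -> new MeasurementAggregator())
                 .add(Double.parseDouble(parts[1]));
        });
        return stats;
    }

    public static void main(String[] args) {
        // Assumed sample data in the challenge's line format.
        String sample = "Hamburg;12.0\nBulawayo;8.9\nHamburg;34.2\n";
        Map<String, MeasurementAggregator> stats =
                aggregate(new BufferedReader(new StringReader(sample)));
        stats.forEach((station, a) ->
                System.out.printf("%s=%.1f/%.1f/%.1f%n", station, a.min, a.mean(), a.max));
    }
}
```

Every line costs a String allocation, a split, and a floating-point parse, all on a single thread, which is what the later steps remove one by one.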

https://github.com/gunnarmorling/1brc/blob/main/src/main/java/dev/morling/onebrc/CalculateAverage_baseline.java

First‑Place Code

The top solution (by a GraalVM contributor) is highly optimized and difficult to read. It employs custom parsing, low‑level memory access, and aggressive inlining to achieve sub‑2‑second runtimes.

https://github.com/gunnarmorling/1brc/blob/main/src/main/java/dev/morling/onebrc/CalculateAverage_thomaswue.java

Optimization Journey

0️⃣ Switch to GraalVM

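Running the baseline on GraalVM instead of a stock JDK reduces execution time from 71 s to 66 s. A gain of that size most likely comes from GraalVM's JIT compiler. Eliminating JVM startup overhead (by compiling to a native image) saves only a fraction of a second, which starts to matter once the total runtime is down to a couple of seconds.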

1️⃣ Parallel I/O

The file is split into as many chunks as there are CPU cores, and each chunk is processed by its own thread. This avoids the bottleneck of reading sequentially through a single BufferedReader.
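One detail the splitting has to handle is that a chunk boundary must not fall in the middle of a line. The sketch below shows one way to do that, assuming the data is available as a `ByteBuffer` (for example a memory-mapped region; a single mapped `ByteBuffer` is limited to 2 GB, so a real solution would need several mappings or a `MemorySegment`). The names `ChunkSplitSketch`, `chunkBoundaries`, and `countLinesInParallel` are hypothetical, and line counting stands in for the real per-chunk work.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.stream.IntStream;

// Hypothetical class name; a sketch of newline-aligned chunking.
final class ChunkSplitSketch {

    // Returns chunkCount + 1 offsets; chunk i spans [bounds[i], bounds[i + 1]).
    // Each inner boundary is pushed forward to just past the next '\n',
    // so no line is split between two chunks.
    static int[] chunkBoundaries(ByteBuffer data, int chunkCount) {
        int size = data.limit();
        int[] bounds = new int[chunkCount + 1];
        for (int i = 1; i < chunkCount; i++) {
            int pos = Math.max(bounds[i - 1], (int) ((long) size * i / chunkCount));
            while (pos < size && data.get(pos) != '\n') {
                pos++;
            }
            bounds[i] = Math.min(size, pos + 1);
        }
        bounds[chunkCount] = size;
        return bounds;
    }

    // Stand-in for the real per-chunk work: counts the lines in one chunk.
    static long countLines(ByteBuffer data, int start, int end) {
        long lines = 0;
        for (int i = start; i < end; i++) {
            if (data.get(i) == '\n') {
                lines++;
            }
        }
        return lines;
    }

    // Processes all chunks in parallel and combines the partial results.
    static long countLinesInParallel(ByteBuffer data, int chunkCount) {
        int[] bounds = chunkBoundaries(data, chunkCount);
        return IntStream.range(0, chunkCount)
                .parallel()
                .mapToLong(i -> countLines(data, bounds[i], bounds[i + 1]))
                .sum();
    }

    public static void main(String[] args) {
        byte[] bytes = "Hamburg;12.0\nBulawayo;8.9\nPalembang;38.8\nHamburg;34.2\n"
                .getBytes(StandardCharsets.UTF_8);
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(countLinesInParallel(ByteBuffer.wrap(bytes), cores));
    }
}
```

Only absolute `get(int)` reads are used, so the threads never touch the buffer's shared position. In a real solution each chunk would feed its own per-thread aggregation table, merged at the end.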

2️⃣ Faster Temperature Parsing

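Instead of parsing a double, the temperature is read byte by byte as an integer number of tenths of a degree. This works because every value has exactly one decimal place and one or two integer digits:

// `chunk` is a java.lang.foreign.MemorySegment field of the enclosing class;
// JAVA_BYTE is a static import from java.lang.foreign.ValueLayout.
// Returns the temperature in tenths of a degree, e.g. "-12.3" yields -123.
private int parseTemperature(long semicolonPos) {
    long off = semicolonPos + 1;
    int sign = 1;
    byte b = chunk.get(JAVA_BYTE, off++);
    if (b == '-') { // optional minus sign
        sign = -1;
        b = chunk.get(JAVA_BYTE, off++);
    }
    int temp = b - '0'; // first integer digit
    b = chunk.get(JAVA_BYTE, off++);
    if (b != '.') { // optional second integer digit
        temp = 10 * temp + b - '0';
        off++; // skip the decimal point
    }
    b = chunk.get(JAVA_BYTE, off); // the single fractional digit
    temp = 10 * temp + b - '0';
    return sign * temp;
}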

With parallel I/O and integer parsing in place, the runtime drops from 66 s to 11 s.

3️⃣ Custom Hash Table

Because only about 413 stations exist, a handcrafted open‑addressing hash table replaces HashMap, reducing allocation and lookup costs. Runtime drops to 6.6 s.
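A minimal sketch of such a table is shown below, using linear probing over parallel arrays. It is an illustration under stated assumptions, not the code from any submitted solution: the class name `StationTableSketch`, the capacity of 4096, and the use of `Arrays.hashCode` as the hash function are all choices made for the example. The table never resizes, so it assumes the number of distinct stations stays well below the capacity.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical class name; a fixed-size open-addressing table keyed by raw name bytes.
final class StationTableSketch {
    // Power-of-two capacity far above the station count keeps probe sequences short.
    private static final int CAPACITY = 1 << 12;
    private static final int MASK = CAPACITY - 1;

    private final byte[][] names = new byte[CAPACITY][];
    private final int[] min = new int[CAPACITY];
    private final int[] max = new int[CAPACITY];
    private final long[] sum = new long[CAPACITY];
    private final int[] count = new int[CAPACITY];

    // Records one measurement; temperature is in tenths of a degree.
    void add(byte[] name, int temperature) {
        int slot = findSlot(name);
        if (names[slot] == null) {
            names[slot] = name.clone(); // first sighting: claim the empty slot
            min[slot] = temperature;
            max[slot] = temperature;
        } else {
            min[slot] = Math.min(min[slot], temperature);
            max[slot] = Math.max(max[slot], temperature);
        }
        sum[slot] += temperature;
        count[slot]++;
    }

    // Linear probing: start at the hashed slot and walk forward until
    // the name or an empty slot is found. Assumes the table is never full.
    private int findSlot(byte[] name) {
        int slot = Arrays.hashCode(name) & MASK;
        while (names[slot] != null && !Arrays.equals(names[slot], name)) {
            slot = (slot + 1) & MASK;
        }
        return slot;
    }

    int count(byte[] name) { return count[findSlot(name)]; }
    int min(byte[] name) { return min[findSlot(name)]; }
    int max(byte[] name) { return max[findSlot(name)]; }

    // Mean in degrees; NaN for a station that was never added.
    double mean(byte[] name) {
        int slot = findSlot(name);
        return sum[slot] / 10.0 / count[slot];
    }

    public static void main(String[] args) {
        StationTableSketch table = new StationTableSketch();
        byte[] hamburg = "Hamburg".getBytes(StandardCharsets.UTF_8);
        table.add(hamburg, 120);
        table.add(hamburg, 342);
        System.out.println(table.count(hamburg) + " measurements, mean " + table.mean(hamburg));
    }
}
```

Keeping the statistics in primitive arrays avoids allocating an entry object and a String per lookup, which is where the savings over a general-purpose map come from.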

4️⃣ Unsafe & SWAR

sun.misc.Unsafe removes the bounds checks on memory reads, and SIMD-within-a-register (SWAR) techniques treat a 64-bit long as eight byte lanes, so eight bytes are processed per iteration. Together they push the runtime down to 2.4 s.
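To make the SWAR idea concrete, here is a sketch that finds the semicolon separator eight bytes at a time. It deliberately swaps `sun.misc.Unsafe` for a little-endian `ByteBuffer` so that it runs anywhere, which means the bounds checks are still present; only the SWAR part is illustrated. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical class name; a sketch of SWAR byte search.
final class SwarSketch {
    private static final long SEMICOLONS = 0x3B3B3B3B3B3B3B3BL; // ';' in every byte lane
    private static final long LOW_BITS = 0x0101010101010101L;
    private static final long HIGH_BITS = 0x8080808080808080L;

    // Index (0..7) of the first ';' in a little-endian word, or 8 if there is none.
    static int firstSemicolon(long word) {
        long diff = word ^ SEMICOLONS;                      // matching lanes become 0x00
        long match = (diff - LOW_BITS) & ~diff & HIGH_BITS; // lowest set bit marks the first zero lane
        return Long.numberOfTrailingZeros(match) >>> 3;     // 64 >>> 3 == 8 when nothing matched
    }

    // Returns the index of the first ';' at or after `from`, or -1 if there is none.
    static int indexOfSemicolon(byte[] data, int from) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int pos = from;
        for (; pos + Long.BYTES <= data.length; pos += Long.BYTES) {
            int idx = firstSemicolon(buf.getLong(pos));
            if (idx < Long.BYTES) {
                return pos + idx;
            }
        }
        for (; pos < data.length; pos++) { // tail shorter than 8 bytes
            if (data[pos] == ';') {
                return pos;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] line = "Palembang;38.8\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(indexOfSemicolon(line, 0));
    }
}
```

The little-endian byte order matters: it places the first byte in memory in the lowest lane of the long, so counting trailing zeros yields the earliest match.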

5️⃣ Statistical‑Driven Branch Prediction

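Analyzing the distribution of station-name lengths shows that roughly half of the names are 8 bytes or shorter. A branch that tests whether a name fits in 8 bytes therefore goes either way about equally often, and the CPU frequently mispredicts it. Changing the check to test for names longer than 16 bytes reduces the mispredictions and shaves off another 0.6 s, bringing the runtime to 1.8 s.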
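The kind of measurement behind this decision can be sketched in a few lines. The example below computes, for a list of names, the share whose UTF-8 encoding fits within a given number of bytes. The class name `NameLengthStats` and the sample names are assumptions for illustration and do not reproduce the real station list.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical class name; measures how often a length check would succeed.
final class NameLengthStats {

    // Fraction of names whose UTF-8 encoding is at most maxBytes long.
    static double shareWithin(List<String> names, int maxBytes) {
        long fits = names.stream()
                .filter(n -> n.getBytes(StandardCharsets.UTF_8).length <= maxBytes)
                .count();
        return (double) fits / names.size();
    }

    public static void main(String[] args) {
        // Assumed sample, not the actual challenge data.
        List<String> names = List.of(
                "Oslo", "Hamburg", "Palembang", "Ouagadougou", "Las Palmas de Gran Canaria");
        System.out.println("fits in 8 bytes:  " + shareWithin(names, 8));
        System.out.println("fits in 16 bytes: " + shareWithin(names, 16));
    }
}
```

A share close to 0.5 means the corresponding branch behaves like a coin flip, while a share close to 0 or 1 means the branch predictor will be right most of the time.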

Final Tweaks

Further micro-optimizations, such as removing startup/cleanup costs and using smaller work-stealing chunks, bring the final runtime to about 1.7 s.
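One simple way to get the load-balancing effect of small chunks is a shared atomic counter from which every idle thread claims the next unprocessed chunk, so a thread that finishes early keeps picking up work instead of waiting. The sketch below assumes this scheme; the names `ChunkDispenserSketch` and `processAll` are hypothetical, and the per-chunk work is passed in as a function that returns a partial result.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.IntToLongFunction;

// Hypothetical class name; a sketch of dynamic chunk distribution.
final class ChunkDispenserSketch {

    // Runs `work` once for every chunk index in [0, chunkCount) on
    // `threadCount` threads and returns the sum of the partial results.
    static long processAll(int chunkCount, int threadCount, IntToLongFunction work) {
        AtomicInteger nextChunk = new AtomicInteger();
        AtomicLong total = new AtomicLong();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                long local = 0;
                int chunk;
                // Claim chunks until none are left; faster threads simply claim more.
                while ((chunk = nextChunk.getAndIncrement()) < chunkCount) {
                    local += work.applyAsLong(chunk);
                }
                total.addAndGet(local);
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        // Stand-in work: each chunk contributes its own index.
        System.out.println(processAll(1_000, threads, chunk -> (long) chunk));
    }
}
```

With one large chunk per core, the slowest thread determines the total runtime. Many small chunks keep all cores busy until the very end, at the cost of one atomic increment per chunk.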

Takeaways

The challenge demonstrates how low‑level memory handling, custom data structures, and careful profiling can transform a data‑processing task from minutes to seconds. While the code becomes harder to read, the techniques are valuable for any performance‑critical Java backend dealing with massive datasets.

Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
