How I Slashed PHP 1B‑Row Processing Time from 25 min to 27 s

This article walks through the One Billion Row Challenge (1brc), shows a naïve PHP implementation, profiles its bottlenecks, and incrementally applies optimizations (switching from fgetcsv() to fgets(), using references, adding type casts, enabling JIT, and processing in parallel) to reduce the runtime from 25 minutes to under 30 seconds.

21CTO

Today I dive into the "1 billion‑row" (1brc) challenge on GitHub and attempt to solve it with PHP, even though PHP is not known for raw speed.

Naïve implementation

I cloned the repository, downloaded measurements.txt (a semicolon-delimited text file with one station;temperature pair per line), and wrote the simplest possible parser using fgetcsv() to build an associative array of stations and compute min, max, and average temperatures.

<?php
// Each station maps to [min, max, sum, count].
$stations = [];
$fp = fopen('measurements.txt', 'r');
while ($data = fgetcsv($fp, null, ';')) {
    // $data[0] is the station name, $data[1] the temperature (still a string here).
    if (!isset($stations[$data[0]])) {
        $stations[$data[0]] = [$data[1], $data[1], $data[1], 1];
    } else {
        $stations[$data[0]][3]++;
        $stations[$data[0]][2] += $data[1];
        if ($data[1] < $stations[$data[0]][0]) $stations[$data[0]][0] = $data[1];
        if ($data[1] > $stations[$data[0]][1]) $stations[$data[0]][1] = $data[1];
    }
}
// Sort by station name, then turn each sum into an average and print min/avg/max.
ksort($stations);
echo '{';
foreach($stations as $k=>&$station) {
    $station[2] = $station[2]/$station[3];
    echo $k, '=', $station[0], '/', $station[2], '/', $station[1], ', ';
}
echo '}';

This version took about 25 minutes on my laptop.

Replacing fgetcsv() with fgets()

Reading each line manually and splitting on the semicolon removed the overhead of fgetcsv(). The core change looks like:

// ...
while ($data = fgets($fp, 999)) {
    $pos = strpos($data, ';');
    $city = substr($data, 0, $pos);
    $temp = substr($data, $pos+1, -1);
    // ...
}

Runtime dropped to 19 minutes 49 seconds (≈21 % faster).
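
As a minimal sketch of the splitting step above, the helper below isolates the strpos()/substr() logic so it can be tried on a single line. The function name parse_line() is hypothetical and does not appear in the original script, which performs the split inline in the loop.

```php
<?php
// Hypothetical helper illustrating the strpos()/substr() split shown above.
// Assumes each line looks like "Station;12.3\n" with exactly one ';'.
function parse_line(string $line): array
{
    $pos = strpos($line, ';');            // position of the delimiter
    $city = substr($line, 0, $pos);       // everything before ';'
    $temp = substr($line, $pos + 1, -1);  // everything after ';', minus the trailing "\n"
    return [$city, $temp];
}

[$city, $temp] = parse_line("Hamburg;12.3\n");
echo $city, ' ', $temp, PHP_EOL; // prints "Hamburg 12.3"
```

One caveat of the -1 length: if the last line of the file has no trailing newline, the final digit of its temperature would be cut off, so the input is assumed to end with "\n".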

Using references

Assigning the station entry to a reference avoids repeated hash look‑ups:

$station = &$stations[$city];
$station[3]++;
$station[2] += $temp;

This cut another 10 % (≈17 minutes 48 seconds).
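
The reference trick can be sketched in a self-contained form as follows. The helper name add_measurement() is an assumption made for illustration; in the hot loop the same statements would be written inline, since a function call per row would add its own overhead. Note that taking a reference to a missing key creates it with a null value, which the sketch uses to detect a new station.

```php
<?php
// Hypothetical helper: folds one measurement into $stations via a reference.
// Entry layout follows the article: [min, max, sum, count].
function add_measurement(array &$stations, string $city, float $temp): void
{
    $station = &$stations[$city];   // single lookup; creates a null entry for a new key
    if ($station === null) {
        $station = [$temp, $temp, $temp, 1];
        return;
    }
    $station[3]++;
    $station[2] += $temp;
    if ($temp < $station[0]) $station[0] = $temp;
    if ($temp > $station[1]) $station[1] = $temp;
}

$stations = [];
add_measurement($stations, 'Hamburg', 12.0);
add_measurement($stations, 'Hamburg', 8.0);
// $stations['Hamburg'] now holds min 8.0, max 12.0, sum 20.0, count 2.
```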

Adding type casting

Explicitly casting the temperature string to (float) once per line means the later additions and comparisons operate on floats instead of converting numeric strings on every operation, bringing the total down to 13 minutes 32 seconds (≈21 % improvement).
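
A small sketch of what the cast changes, using a made-up sample line: substr() always yields a string, and the cast converts it once so that every later operation sees a float.

```php
<?php
// Sketch: the same value before and after an explicit cast.
$line = "Hamburg;12.3\n";                 // sample input, not from the real data set
$raw  = substr($line, strpos($line, ';') + 1, -1);
$temp = (float) $raw;

var_dump($raw);   // string(4) "12.3"
var_dump($temp);  // float(12.3)

// Two numeric strings are still compared numerically, but PHP has to
// interpret both strings each time; floats are compared directly.
var_dump('9.5' < '10.1');                  // bool(true)
var_dump((float) '9.5' < (float) '10.1');  // bool(true)
```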

Enabling JIT

PHP's OPcache, and with it the JIT, is disabled for the CLI by default. Enabling it and allocating a 10 MiB JIT buffer reduced the runtime to 7 minutes 19 seconds, a 45.9 % gain.
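
To confirm that the JIT is really active in a CLI run, a check along the following lines can help. It relies on opcache_get_status(), which is only available when the OPcache extension is loaded. The helper name jit_status() and the command line in the comment are assumptions; the article does not show the exact settings beyond the 10 MiB buffer.

```php
<?php
// Assumed invocation (not taken from the article):
//   php -d opcache.enable_cli=1 -d opcache.jit_buffer_size=10M script.php

// Hypothetical helper: reports whether the OPcache JIT is active in this process.
function jit_status(): array
{
    if (!function_exists('opcache_get_status')) {
        // OPcache extension is not loaded at all.
        return ['loaded' => false, 'enabled' => false, 'buffer_size' => 0];
    }
    $status = opcache_get_status(false);   // false when OPcache is disabled for the CLI
    $jit = is_array($status) ? ($status['jit'] ?? []) : [];
    return [
        'loaded'      => true,
        'enabled'     => (bool) ($jit['enabled'] ?? false),
        'buffer_size' => (int) ($jit['buffer_size'] ?? 0),
    ];
}

$jit = jit_status();
echo 'JIT enabled: ', $jit['enabled'] ? 'yes' : 'no',
     ', buffer: ', $jit['buffer_size'], ' bytes', PHP_EOL;
```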

Parallel processing with ext-parallel

To exploit multiple cores, the file is split into $threads_cnt chunks aligned on newline boundaries. Each thread reads its chunk with fgets(), processes the data, and returns a partial result. The main thread merges, sorts, and prints the final output.
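
The chunking step described above might look roughly like the sketch below. This is not the author's get_file_chunks(), which the article does not show; it is one possible implementation that jumps to an approximate byte offset and then reads to the end of the current line so that every chunk ends right after a newline.

```php
<?php
// Hypothetical sketch of get_file_chunks(); the author's version is not shown.
// Returns a list of [start, end] byte offsets, each chunk ending right after "\n".
function get_file_chunks(string $file, int $chunks_cnt): array
{
    $fp = fopen($file, 'rb');
    $size = fstat($fp)['size'];
    $approx = (int) ceil($size / $chunks_cnt);   // target chunk size in bytes
    $chunks = [];
    $start = 0;
    while ($start < $size) {
        $end = min($start + $approx, $size);
        if ($end < $size) {
            fseek($fp, $end);
            fgets($fp);           // advance to the end of the current line
            $end = ftell($fp);
        }
        $chunks[] = [$start, $end];
        $start = $end;
    }
    fclose($fp);
    return $chunks;
}
```

For very small files this sketch can return fewer chunks than requested, so a caller would be safer looping over count($chunks) than over the requested thread count.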

<?php
$file = 'measurements.txt';
$threads_cnt = 16;
// Functions get_file_chunks() and $process_chunk omitted for brevity
$chunks = get_file_chunks($file, $threads_cnt);
$futures = [];
for ($i = 0; $i < $threads_cnt; $i++) {
    $runtime = new \parallel\Runtime();
    $futures[$i] = $runtime->run($process_chunk, [$file, $chunks[$i][0], $chunks[$i][1]]);
}
$results = [];
for ($i = 0; $i < $threads_cnt; $i++) {
    $chunk_result = $futures[$i]->value();
    foreach ($chunk_result as $city => $measurement) {
        if (isset($results[$city])) {
            $result = &$results[$city];
            $result[2] += $measurement[2];
            $result[3] += $measurement[3];
            if ($measurement[0] < $result[0]) $result[0] = $measurement[0];
            if ($measurement[1] > $result[1]) $result[1] = $measurement[1];
        } else {
            $results[$city] = $measurement;
        }
    }
}
ksort($results);
echo "{\n";
foreach($results as $k=>&$station) {
    echo "\t$k=", $station[0], '/', ($station[2]/$station[3]), '/', $station[1], ",\n";
}
echo "}\n";

With 16 threads the whole pipeline finishes in about 1 minute 35 seconds.
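
The listing above omits $process_chunk. A minimal sketch of such a closure, assuming it receives the file name plus a start and end byte offset as in the run() call, could combine the earlier steps (fgets(), references, float cast). It is an illustration under those assumptions, not the author's code; it uses no ext-parallel API itself, so it can also be called directly for testing.

```php
<?php
// Hypothetical sketch of $process_chunk; the article omits the author's version.
// Aggregates the lines in the byte range [$start, $end) into
// [min, max, sum, count] per station.
$process_chunk = function (string $file, int $start, int $end): array {
    $stations = [];
    $fp = fopen($file, 'rb');
    fseek($fp, $start);
    $offset = $start;   // tracks how far into the file we have read
    while ($offset < $end && ($line = fgets($fp)) !== false) {
        $offset += strlen($line);
        $pos = strpos($line, ';');
        $city = substr($line, 0, $pos);
        $temp = (float) substr($line, $pos + 1, -1);
        $station = &$stations[$city];
        if ($station === null) {
            $station = [$temp, $temp, $temp, 1];
        } else {
            $station[3]++;
            $station[2] += $temp;
            if ($temp < $station[0]) $station[0] = $temp;
            if ($temp > $station[1]) $station[1] = $temp;
        }
        unset($station);  // drop the reference before the next iteration
    }
    fclose($fp);
    return $stations;
};
```

Each thread opens its own file handle, so no state is shared between chunks and the partial arrays can be merged exactly as in the listing above.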

Final results and take‑aways

After compiling PHP 8.3 with optimized CFLAGS and running the 16‑thread version, the runtime dropped to an astonishing 27.7 seconds.

Key lessons:

High‑level helpers like fgetcsv() can hide costly work; low‑level line reading gives more control.

Using references and explicit type casts reduces hash look‑ups and conversion overhead.

Enabling JIT provides a massive boost for CPU‑bound code.

Parallelism via ext‑parallel can turn a multi‑minute job into a sub‑minute one.

Images illustrating the profiling results are omitted for brevity.
