How I Slashed PHP 1B‑Row Processing Time from 25 min to 27 s
This article walks through the One Billion Row Challenge (1BRC): it starts from a naïve PHP implementation, profiles its bottlenecks, and applies optimizations step by step (switching from fgetcsv() to fgets(), using references, adding type casts, enabling the JIT, and processing in parallel) to cut the runtime from 25 minutes to under 30 seconds.
Today I take on the One Billion Row Challenge (1BRC) from GitHub and try to solve it in PHP, even though PHP is not known for raw speed.
Naïve implementation
I cloned the repository, downloaded measurements.txt (a text file with one semicolon-separated station and temperature per line), and wrote the simplest possible parser: fgetcsv() reads each row, and an associative array keyed by station collects the minimum, maximum, and average temperature.
<?php
// Each entry holds [min, max, sum, count] for one station.
$stations = [];
$fp = fopen('measurements.txt', 'r');
while ($data = fgetcsv($fp, null, ';')) {
    if (!isset($stations[$data[0]])) {
        $stations[$data[0]] = [$data[1], $data[1], $data[1], 1];
    } else {
        $stations[$data[0]][3]++;
        $stations[$data[0]][2] += $data[1];
        if ($data[1] < $stations[$data[0]][0]) $stations[$data[0]][0] = $data[1];
        if ($data[1] > $stations[$data[0]][1]) $stations[$data[0]][1] = $data[1];
    }
}
ksort($stations);
echo '{';
foreach ($stations as $k => &$station) {
    $station[2] = $station[2] / $station[3]; // the sum becomes the mean
    echo $k, '=', $station[0], '/', $station[2], '/', $station[1], ', ';
}
echo '}';
This version took about 25 minutes on my laptop.
Replacing fgetcsv() with fgets()
Reading each line manually and splitting on the semicolon removed the overhead of fgetcsv(). The core change looks like:
// ...
while ($data = fgets($fp, 999)) {
    $pos = strpos($data, ';');
    $city = substr($data, 0, $pos);
    $temp = substr($data, $pos + 1, -1); // -1 drops the trailing newline
    // ...
}
Runtime dropped to 19 minutes 49 seconds (≈21 % faster).
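As a self-contained illustration of that split, the sketch below wraps the same strpos()/substr() logic in a small helper. The name parse_line() is an assumption made for this example and does not come from the original script. It also assumes every line ends in "\n", because substr(..., -1) drops the last character unconditionally.

```php
<?php
// Hypothetical helper (not from the article): split "station;temperature\n"
// by hand instead of going through fgetcsv().
function parse_line(string $line): array
{
    $pos  = strpos($line, ';');            // position of the delimiter
    $city = substr($line, 0, $pos);        // everything before ';'
    $temp = substr($line, $pos + 1, -1);   // everything after ';', minus "\n"
    return [$city, $temp];
}

[$city, $temp] = parse_line("Hamburg;12.3\n");
echo $city, ' => ', $temp, "\n"; // prints "Hamburg => 12.3"
```

The temperature is still a string at this point; the helper only does the splitting.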
Using references
Assigning the station entry to a reference avoids repeated hash look‑ups:
$station = &$stations[$city];
$station[3]++;
$station[2] += $temp;
This cut another 10 % (≈17 minutes 48 seconds).
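The snippet above shows only the update path. A fuller sketch of the same idea, including the first-seen case, might look like the following. The helper name add_measurement() and the null check are assumptions for illustration; the entry layout [min, max, sum, count] follows the naïve listing.

```php
<?php
// Hypothetical helper: resolve the station entry once per row through a
// reference, then update it without further array lookups.
function add_measurement(array &$stations, string $city, float $temp): void
{
    $station = &$stations[$city];   // creates a null slot if the key is new
    if ($station === null) {
        $station = [$temp, $temp, $temp, 1];
        return;
    }
    $station[3]++;
    $station[2] += $temp;
    if ($temp < $station[0]) $station[0] = $temp;
    if ($temp > $station[1]) $station[1] = $temp;
}

$stations = [];
add_measurement($stations, 'Oslo', 4.5);
add_measurement($stations, 'Oslo', -1.0);
print_r($stations['Oslo']); // min -1, max 4.5, sum 3.5, count 2
```

Binding a reference to a missing key creates it with a null value and raises no warning, which is why a single lookup can serve both the insert and the update path here.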
Adding type casting
Explicitly casting the temperature string to (float) means comparisons and additions operate on floats instead of numeric strings that PHP has to convert each time. This brought the total down to 13 minutes 32 seconds (≈24 % faster than the previous step).
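The source does not show where exactly the cast was added. A minimal sketch, assuming it is applied once at the point where the temperature is cut out of the line:

```php
<?php
// Cast once while parsing so that later comparisons and additions
// operate on a float rather than on a numeric string.
$line = "Lima;19.7\n";
$pos  = strpos($line, ';');

$raw  = substr($line, $pos + 1, -1);          // still a string
$temp = (float) substr($line, $pos + 1, -1);  // converted exactly once

var_dump($raw);   // string(4) "19.7"
var_dump($temp);  // float(19.7)
```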
Enabling JIT
PHP's OPcache, and with it the JIT, is disabled for the CLI by default. Enabling it and allocating a 10 MiB JIT buffer reduced the runtime to 7 minutes 19 seconds, a 45.9 % gain.
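To confirm that the JIT is actually active for a CLI run, a small check along these lines can help. The function name jit_is_enabled() is an assumption for this sketch, and the exact INI values the author used are not given in the source; the command in the comment is one plausible combination.

```php
<?php
// Report whether the JIT is active in the current process.
// One possible invocation (assumed settings, not taken from the article):
//   php -d opcache.enable_cli=1 -d opcache.jit=tracing \
//       -d opcache.jit_buffer_size=10M check_jit.php
function jit_is_enabled(): bool
{
    if (!function_exists('opcache_get_status')) {
        return false; // OPcache extension is not loaded
    }
    $status = opcache_get_status(false); // false: skip per-script details
    // opcache_get_status() returns false when OPcache is disabled
    return is_array($status) && !empty($status['jit']['enabled']);
}

echo jit_is_enabled() ? "JIT enabled\n" : "JIT disabled\n";
```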
Parallel processing with ext-parallel
To exploit multiple cores, the file is split into $threads_cnt chunks aligned on newline boundaries. Each thread reads its chunk with fgets(), processes the data, and returns a partial result. The main thread merges, sorts, and prints the final output.
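The source does not show either helper, so the following is only a sketch of how newline-aligned splitting and per-chunk aggregation could work. The names split_on_newlines() and $process_chunk_sketch, and the convention that a chunk is a [start, end) pair of byte offsets, are assumptions for this example. The demo calls the closure directly rather than through ext-parallel.

```php
<?php
// Hypothetical helper: split $file into at most $parts byte ranges
// [start, end) whose boundaries fall right after a "\n" (or at end of file).
function split_on_newlines(string $file, int $parts): array
{
    $size = filesize($file);
    $fp = fopen($file, 'rb');
    $chunks = [];
    $start = 0;
    for ($i = 1; $i <= $parts && $start < $size; $i++) {
        if ($i === $parts) {
            $end = $size;                  // last range takes the remainder
        } else {
            fseek($fp, max($start, intdiv($size * $i, $parts)));
            fgets($fp);                    // move to the end of the current line
            $end = ftell($fp);
        }
        if ($end > $start) {
            $chunks[] = [$start, $end];
        }
        $start = $end;
    }
    fclose($fp);
    return $chunks;
}

// Hypothetical worker: aggregate the lines in bytes [$start, $end) of $file
// into city => [min, max, sum, count].
$process_chunk_sketch = static function (string $file, int $start, int $end): array {
    $stations = [];
    $fp = fopen($file, 'rb');
    fseek($fp, $start);
    while (ftell($fp) < $end && ($line = fgets($fp)) !== false) {
        $pos  = strpos($line, ';');
        $city = substr($line, 0, $pos);
        $temp = (float) substr($line, $pos + 1); // cast reads the leading number only
        $station = &$stations[$city];
        if ($station === null) {
            $station = [$temp, $temp, $temp, 1];
        } else {
            $station[3]++;
            $station[2] += $temp;
            if ($temp < $station[0]) $station[0] = $temp;
            if ($temp > $station[1]) $station[1] = $temp;
        }
        unset($station); // drop the reference before the next row
    }
    fclose($fp);
    return $stations;
};

// Demo on a tiny temporary file instead of measurements.txt.
$tmp = tempnam(sys_get_temp_dir(), 'brc');
file_put_contents($tmp, "a;1.0\nbb;2.0\nccc;3.0\ndddd;4.0\n");
$chunks = split_on_newlines($tmp, 3);
$first  = $process_chunk_sketch($tmp, $chunks[0][0], $chunks[0][1]);
unlink($tmp);
echo json_encode($chunks), "\n"; // prints [[0,13],[13,21],[21,30]]
```

Note that this sketch can return fewer ranges than requested for very small files, whereas the main-thread code in the article indexes exactly $threads_cnt chunks.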
<?php
$file = 'measurements.txt';
$threads_cnt = 16;
// Functions get_file_chunks() and $process_chunk omitted for brevity
$chunks = get_file_chunks($file, $threads_cnt);
// Start one runtime (thread) per chunk
$futures = [];
for ($i = 0; $i < $threads_cnt; $i++) {
    $runtime = new \parallel\Runtime();
    $futures[$i] = $runtime->run($process_chunk, [$file, $chunks[$i][0], $chunks[$i][1]]);
}
// Collect each thread's partial result and merge it into $results
$results = [];
for ($i = 0; $i < $threads_cnt; $i++) {
    $chunk_result = $futures[$i]->value();
    foreach ($chunk_result as $city => $measurement) {
        if (isset($results[$city])) {
            $result = &$results[$city];
            $result[2] += $measurement[2];
            $result[3] += $measurement[3];
            if ($measurement[0] < $result[0]) $result[0] = $measurement[0];
            if ($measurement[1] > $result[1]) $result[1] = $measurement[1];
        } else {
            $results[$city] = $measurement;
        }
    }
}
ksort($results);
echo "{\n";
foreach ($results as $k => &$station) {
    echo "\t$k=", $station[0], '/', ($station[2] / $station[3]), '/', $station[1], ",\n";
}
echo "}\n";
With 16 threads the whole pipeline finishes in about 1 minute 35 seconds.
Final results and take‑aways
After compiling PHP 8.3 with optimized CFLAGS and running the 16‑thread version, the runtime dropped to an astonishing 27.7 seconds.
Key lessons:
High‑level helpers like fgetcsv() can hide costly work; low‑level line reading gives more control.
Using references and explicit type casts reduces hash look‑ups and conversion overhead.
Enabling JIT provides a massive boost for CPU‑bound code.
Parallelism via ext‑parallel can turn a multi‑minute job into a sub‑minute one.
Images illustrating the profiling results are omitted for brevity.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
