How I Cut a 1‑Billion‑Row PHP Parser from 25 Minutes to 27 Seconds
This article walks through a step‑by‑step performance‑engineering journey for the One Billion Row Challenge in PHP: starting from a naïve CSV parser, then replacing fgetcsv() with fgets(), applying the reference operator, casting types explicitly, enabling the JIT, and finally using multi‑threaded parallel processing to shrink the runtime from 25 minutes to under 30 seconds.
Overview
The 1 billion‑row challenge (1brc) on GitHub (https://github.com/gunnarmorling/1brc) requires computing the minimum, maximum and average temperature for each city from a 13 GB CSV file. This summary describes a step‑by‑step optimisation of a PHP script that solves the problem.
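For reference, each input line holds a station name and a temperature reading separated by a semicolon, and the expected output lists min/mean/max per station in alphabetical order, along these lines (sample values for illustration only):

```
Hamburg;12.0
Bulawayo;8.9
Hamburg;-3.4
```

which would yield `{Bulawayo=8.9/8.9/8.9, Hamburg=-3.4/4.3/12.0}`.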
Naïve implementation
The initial version reads the file with fgetcsv(), stores per‑city data in an associative array $stations (min, max, sum, count) and sorts the result with ksort(). On a typical laptop the runtime is about 25 minutes.
<?php
$stations = [];
$fp = fopen('measurements.txt', 'r');
while ($data = fgetcsv($fp, null, ';')) {
    if (!isset($stations[$data[0]])) {
        $stations[$data[0]] = [$data[1], $data[1], $data[1], 1];
    } else {
        $stations[$data[0]][3]++;
        $stations[$data[0]][2] += $data[1];
        if ($data[1] < $stations[$data[0]][0]) $stations[$data[0]][0] = $data[1];
        if ($data[1] > $stations[$data[0]][1]) $stations[$data[0]][1] = $data[1];
    }
}
ksort($stations);
echo '{';
foreach ($stations as $k => &$station) {
    $station[2] = $station[2] / $station[3];
    echo $k, '=', $station[0], '/', $station[2], '/', $station[1], ', ';
}
echo '}';
?>

Replace fgetcsv() with fgets()
Reading each line with fgets() and splitting on the semicolon removes the CSV parser overhead.
while ($data = fgets($fp, 999)) {
    $pos = strpos($data, ';');
    $city = substr($data, 0, $pos);
    $temp = substr($data, $pos + 1, -1);
    // … update $stations …
}

This change reduces the runtime to 19 minutes 49 seconds (≈21 % faster).
Use reference operator
Assigning a reference to the city entry avoids repeated hash look‑ups.
$station = &$stations[$city];
$station[3]++;
$station[2] += $temp;
// instead of $stations[$city][3]++ etc.

Runtime improves to 17 minutes 48 seconds (≈10 % faster).
Combine min/max checks
Merge the two separate comparisons into a single conditional block, shaving another ~2 %.
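A minimal sketch of the combined check, using the [min, max, sum, count] layout from above:

```php
<?php
// Combined min/max update: a value that sets a new minimum cannot also
// set a new maximum, so elseif skips the second comparison in that case.
$stations = ['Hamburg' => [12.0, 12.0, 12.0, 1]];
$temp = -3.4;

$station = &$stations['Hamburg'];
$station[3]++;
$station[2] += $temp;
if ($temp < $station[0]) {
    $station[0] = $temp;
} elseif ($temp > $station[1]) {
    $station[1] = $temp;
}

echo $station[0], '/', $station[1], "\n"; // -3.4/12
```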
Explicit type casting
Convert the temperature string to a float once, rather than on each use.
$temp = (float)substr($data, $pos + 1, -1);

Runtime drops to 13 minutes 32 seconds (≈21 % faster).
Enable PHP JIT
In CLI mode OPCache is disabled by default. Enable it and allocate a JIT buffer:
opcache.enable_cli=1
opcache.jit_buffer_size=10M

With JIT the script finishes in 7 minutes 19 seconds (≈45.9 % reduction).
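The same directives can also be supplied per run with -d, without editing php.ini. A sketch, assuming a PHP 8 CLI with the OPcache extension available (`measurements.php` is a placeholder script name):

```shell
php -d opcache.enable_cli=1 \
    -d opcache.jit=tracing \
    -d opcache.jit_buffer_size=10M \
    measurements.php
```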
Multithreaded parallel processing
Because each line can be processed independently, the file is split into $threads_cnt chunks aligned on newline boundaries. Each chunk is processed by a separate \parallel\Runtime worker. The workers return partial $stations maps, which the main thread merges.
<?php
$file = 'measurements.txt';
$threads_cnt = 16;

function get_file_chunks(string $file, int $cpu_count): array {
    $size = filesize($file);
    $chunk_size = $cpu_count === 1 ? $size : (int)($size / $cpu_count);
    $fp = fopen($file, 'rb');
    $chunks = [];
    $chunk_start = 0;
    while ($chunk_start < $size) {
        $chunk_end = min($size, $chunk_start + $chunk_size);
        if ($chunk_end < $size) {
            fseek($fp, $chunk_end);
            fgets($fp); // advance to the next newline boundary
            $chunk_end = ftell($fp);
        }
        $chunks[] = [$chunk_start, $chunk_end];
        $chunk_start = $chunk_end;
    }
    fclose($fp);
    return $chunks;
}
$process_chunk = function (string $file, int $chunk_start, int $chunk_end): array {
    $stations = [];
    $fp = fopen($file, 'rb');
    fseek($fp, $chunk_start);
    while ($data = fgets($fp)) {
        $chunk_start += strlen($data);
        if ($chunk_start > $chunk_end) break;
        $pos = strpos($data, ';');
        $city = substr($data, 0, $pos);
        $temp = (float)substr($data, $pos + 1, -1);
        if (isset($stations[$city])) {
            $station = &$stations[$city];
            $station[3]++;
            $station[2] += $temp;
            if ($temp < $station[0]) $station[0] = $temp;
            elseif ($temp > $station[1]) $station[1] = $temp;
        } else {
            $stations[$city] = [$temp, $temp, $temp, 1];
        }
    }
    return $stations;
};
$chunks = get_file_chunks($file, $threads_cnt);

$futures = [];
for ($i = 0; $i < $threads_cnt; $i++) {
    $runtime = new \parallel\Runtime();
    $futures[$i] = $runtime->run($process_chunk, [$file, $chunks[$i][0], $chunks[$i][1]]);
}

$results = [];
for ($i = 0; $i < $threads_cnt; $i++) {
    $chunk_result = $futures[$i]->value();
    foreach ($chunk_result as $city => $measurement) {
        if (isset($results[$city])) {
            $result = &$results[$city];
            $result[2] += $measurement[2];
            $result[3] += $measurement[3];
            if ($measurement[0] < $result[0]) $result[0] = $measurement[0];
            if ($measurement[1] > $result[1]) $result[1] = $measurement[1];
        } else {
            $results[$city] = $measurement;
        }
    }
}
ksort($results);
echo "{\n";
foreach ($results as $k => &$station) {
    echo "\t", $k, '=', $station[0], '/', ($station[2] / $station[3]), '/', $station[1], ",\n";
}
echo "}\n";
?>

With 16 threads the total runtime is 1 minute 35 seconds. After recompiling PHP 8.3 with optimized CFLAGS and using 10 threads, the final runtime reaches 27.7 seconds.
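A note for anyone reproducing this: the parallel extension only loads on a thread‑safe (ZTS) build of PHP. A setup sketch, assuming PECL is available (`measurements.php` is a placeholder script name):

```shell
# parallel requires a ZTS (thread-safe) PHP build; check with:
php -i | grep -i 'thread safety'
# then install the extension and run the script:
pecl install parallel
php -d extension=parallel measurements.php
```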
Key take‑aways
High‑level helpers such as fgetcsv() hide costly operations; low‑level line reading (fgets()) gives full control.
Reference variables and explicit type casting reduce hash look‑ups and repeated conversions.
Enabling JIT in CLI can dramatically speed up CPU‑bound workloads.
The parallel extension allows near‑linear scaling when the work is cleanly split.
Performance tuning is layered: algorithmic changes, language‑level options, and hardware‑aware parallelism all contribute.