How Generators Slash PHP Memory Usage When Processing Massive CSV Files
This article explains how PHP lazy evaluation using generators and the Iterator API can dramatically reduce memory consumption when loading huge CSV files, provides side‑by‑side code examples and benchmarks, and offers guidance on when to choose each approach for real‑world data processing tasks.
What Is Lazy Evaluation?
Lazy evaluation is a technique that generates and processes data only when it is actually needed, avoiding the upfront loading of entire data sets into memory.
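To make the definition concrete, here is a minimal sketch of lazy evaluation with a PHP generator. The function name lazySquares and the $counter object are hypothetical, introduced only to make visible when the work actually happens; they are not part of the original article.

```php
<?php
// Hypothetical helper: yields squares and counts how many were actually computed.
function lazySquares(int $limit, stdClass $counter): Generator
{
    for ($i = 1; $i <= $limit; $i++) {
        $counter->computed++;   // work happens only when a value is requested
        yield $i * $i;
    }
}

$counter = new stdClass();
$counter->computed = 0;

// Creating the generator runs none of the function body yet.
$squares = lazySquares(1_000_000, $counter);
$afterCreate = $counter->computed;

$firstThree = [];
foreach ($squares as $value) {
    $firstThree[] = $value;
    if (count($firstThree) === 3) {
        break; // stop early; the remaining values are never computed
    }
}

echo implode(', ', $firstThree) . PHP_EOL;
echo "Values computed: {$counter->computed}" . PHP_EOL;
```

Although the generator is declared with a limit of one million, only the three values that the loop consumed are ever computed.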
Reading Large CSV Files Easily
Instead of calling file(), or collecting every fgetcsv() row into an array, to hold the whole file in memory, a generator can yield one row at a time. Memory usage then stays roughly constant, typically a few kilobytes, even for a 2 GB CSV file.
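function readCsv(string $filename): Generator {
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open file $filename");
    }
    try {
        // Yield one parsed row at a time instead of building an array
        while (($row = fgetcsv($handle)) !== false) {
            yield $row;
        }
    } finally {
        // Runs even if the consumer stops iterating early
        fclose($handle);
    }
}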
foreach (readCsv('data.csv') as $row) {
    // process $row, e.g. echo implode(', ', $row) . PHP_EOL;
}
Benchmark: Array vs. Generator
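When generating one million numbers, the array version holds every value in memory at once. The source reports about 120 MB for it, though the exact figure depends on the PHP version and is considerably lower on PHP 7 and later. The generator version holds only the current value, so its footprint is negligible, reported as less than 1 KB.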
// Array benchmark
$startMemory = memory_get_usage();
$array = range(1, 1_000_000);
echo "Array: " . (memory_get_usage() - $startMemory) / 1024 / 1024 . " MB" . PHP_EOL;
unset($array);
// Generator benchmark
function bigGenerator(): Generator {
    for ($i = 1; $i <= 1_000_000; $i++) {
        yield $i;
    }
}
$startMemory = memory_get_usage();
$peak = 0;
foreach (bigGenerator() as $n) {
    // Sample inside the loop: once the loop ends the generator is already freed
    $peak = max($peak, memory_get_usage() - $startMemory);
}
echo "Generator: " . $peak / 1024 / 1024 . " MB" . PHP_EOL;
When to Use Which Method?
Need on‑demand data transmission → Generator
Need complex logic or internal state → Iterator API
Reading large files or database streams → Generator
Repeated iteration while preserving state → Iterator API
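The article's code samples cover only the generator side, so the following is a minimal sketch of what the Iterator API alternative might look like for the same CSV task. It assumes PHP 8.0 or later (for the mixed return type and union types), and the class name CsvRowIterator is hypothetical. The sample file is created on the fly purely so that the example is self-contained.

```php
<?php
// Hypothetical class: reads a CSV file row by row through the Iterator interface.
final class CsvRowIterator implements Iterator
{
    /** @var resource */
    private $handle;
    private array|false $row = false;
    private int $index = 0;

    public function __construct(string $filename)
    {
        $handle = fopen($filename, 'r');
        if ($handle === false) {
            throw new RuntimeException("Cannot open file $filename");
        }
        $this->handle = $handle;
    }

    public function rewind(): void
    {
        // Unlike a generator, this iterator can be restarted from the top.
        rewind($this->handle);
        $this->index = 0;
        $this->row = $this->readRow();
    }

    public function valid(): bool
    {
        return $this->row !== false;
    }

    public function current(): mixed
    {
        return $this->row;
    }

    public function key(): mixed
    {
        return $this->index;
    }

    public function next(): void
    {
        $this->row = $this->readRow();
        $this->index++;
    }

    private function readRow(): array|false
    {
        // Arguments spelled out explicitly; these match fgetcsv()'s long-standing defaults.
        return fgetcsv($this->handle, null, ',', '"', '\\');
    }

    public function __destruct()
    {
        fclose($this->handle);
    }
}

// Usage: write a tiny sample file, then iterate over it twice.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,name\n1,Ada\n2,Linus\n");

$rows = new CsvRowIterator($path);

$names = [];
foreach ($rows as $i => $row) {
    if ($i > 0) {               // skip the header row
        $names[] = $row[1];
    }
}

$count = 0;
foreach ($rows as $row) {       // second pass works because rewind() resets the file pointer
    $count++;
}

unset($rows);                   // triggers __destruct(), closing the file
unlink($path);
```

The class is noticeably longer than the generator version, which is the trade-off behind the guidance above: the explicit rewind(), key(), and stored state buy repeatable iteration and finer control, while a generator can only be traversed once.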
Real‑World Production Case
A car‑sales API returned hundreds of thousands of records. Loading all of them into an array consumed 1–2 GB of memory. Switching to a generator, which fetches one page at a time and yields its records individually, reduced memory usage to about 10 MB without a noticeable impact on execution time.
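function fetchCars(): Generator {
    $page = 1;
    do {
        // apiRequest() is not defined in this article; it stands for the application's API client
        $data = apiRequest('cars', ['page' => $page]);
        $items = $data['items'] ?? [];
        foreach ($items as $car) {
            yield $car; // hand records to the caller one at a time
        }
        $page++;
    } while (!empty($items)); // stop at the first empty page
}
Conclusion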
Generators and the Iterator API are essential tools for modern PHP backend development. They enable processing of millions of records with minimal memory footprint. Use generators for simple streaming scenarios and the Iterator API when you need finer control over iteration state.
Author: 场长
21CTO
