Master Efficient PHP Data Pipelines with the Low‑Memory Flow Framework

This article introduces the Flow PHP data‑processing framework, highlights its ultra‑low memory footprint and extensible pipeline capabilities, and provides step‑by‑step installation and code examples for handling in‑memory arrays and CSV files in ETL workflows.

Open Source Tech Hub

Introduction

Flow is a strongly‑typed PHP library for building large‑scale, low‑memory data‑processing pipelines. It can read from and write to various sources such as databases, CSV, JSON, or native arrays.

Core Features

Strongly‑typed PHP core with minimal memory footprint.

Supports construction of highly scalable pipelines.

Allows data movement between multiple source and target types.

Runs on Linux and macOS, with experimental support for Windows.
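
To make the low-memory claim concrete, here is a minimal plain-PHP sketch of the general technique that streaming pipelines rely on: rows are produced lazily and processed in small batches, so only one batch is held in memory at a time. This is not Flow's internal implementation. The helper names (generate_rows, in_batches, total_amount) and the sample "amount" column are hypothetical and exist only for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Lazily yields rows one at a time instead of building a full array.
 * Hypothetical helper for illustration; not part of Flow.
 *
 * @return Generator<int, array{id: int, amount: int}>
 */
function generate_rows(int $count): Generator
{
    for ($i = 1; $i <= $count; $i++) {
        yield ['id' => $i, 'amount' => $i * 10];
    }
}

/**
 * Groups any iterable of rows into fixed-size batches, still lazily.
 *
 * @param iterable<array<string, mixed>> $rows
 * @return Generator<int, list<array<string, mixed>>>
 */
function in_batches(iterable $rows, int $size): Generator
{
    $batch = [];
    foreach ($rows as $row) {
        $batch[] = $row;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }
    if ($batch !== []) {
        yield $batch; // flush the final, possibly smaller batch
    }
}

/**
 * Sums the integer "amount" column while holding at most one batch in memory.
 *
 * @param iterable<array<string, mixed>> $rows
 */
function total_amount(iterable $rows, int $batchSize = 1000): int
{
    $total = 0;
    foreach (in_batches($rows, $batchSize) as $batch) {
        $total += array_sum(array_column($batch, 'amount'));
    }
    return $total;
}

// 100,000 rows are processed, but never more than 1,000 exist at once.
echo total_amount(generate_rows(100_000)), PHP_EOL;
```

Because both helpers are generators, swapping generate_rows() for a function that reads a file or a database cursor would leave peak memory roughly proportional to the batch size rather than to the size of the data set.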

Quick Start

Installation

composer require flow-php/etl

Array Example

Read data directly from a PHP array. This is useful when the data is already in memory, for example API responses or test fixtures.

<?php
declare(strict_types=1);

use function Flow\ETL\DSL\{data_frame, from_array, to_output};

require __DIR__ . '/vendor/autoload.php';

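data_frame()
    // Each inner array becomes one row with a single "id" entry.
    ->read(from_array([
        ['id' => 1],
        ['id' => 2],
        ['id' => 3],
        ['id' => 4],
        ['id' => 5],
    ]))
    // Merge all rows into a single batch so they are printed as one table.
    ->collect()
    // Print the rows to standard output without truncating long values.
    ->write(to_output(truncate: false))
    ->run();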

CSV File Processing

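Read a CSV file, compute revenue (total price minus discount) per month, and write the aggregated result to a new CSV file. Although the output column is named daily_revenue, the dates are formatted as Y/m, so rows are grouped by month rather than by day.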

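<?php
declare(strict_types=1);

// from_csv() and to_csv() come from the CSV adapter, which is distributed
// as a separate package (flow-php/etl-adapter-csv).
use function Flow\ETL\Adapter\CSV\{from_csv, to_csv};
use function Flow\ETL\DSL\{data_frame, lit, ref, sum, to_output};
use Flow\ETL\Filesystem\SaveMode;

require __DIR__ . '/vendor/autoload.php';

data_frame()
    ->read(from_csv(__DIR__ . '/orders_flow.csv'))
    // Keep only the columns needed for the calculation.
    ->select('created_at', 'total_price', 'discount')
    // Reduce each timestamp to a year/month key.
    ->withEntry('created_at', ref('created_at')->cast('date')->dateFormat('Y/m'))
    // revenue = total_price - discount
    ->withEntry('revenue', ref('total_price')->minus(ref('discount')))
    ->select('created_at', 'revenue')
    // Sum revenue per year/month key; the result is stored in revenue_sum.
    ->groupBy('created_at')
    ->aggregate(sum(ref('revenue')))
    ->sortBy(ref('created_at')->desc())
    // Round to two decimals and format the sum for display.
    ->withEntry('daily_revenue', ref('revenue_sum')->round(lit(2))->numberFormat(lit(2)))
    ->drop('revenue_sum')
    // Print the aggregated table to standard output.
    ->write(to_output(truncate: false))
    // Convert the Y/m string back to a date, then overwrite any existing output file.
    ->withEntry('created_at', ref('created_at')->toDate('Y/m'))
    ->mode(SaveMode::Overwrite)
    ->write(to_csv(__DIR__ . '/daily_revenue.csv'))
    ->run();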
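
For readers who want to check what the pipeline above computes, the following plain-PHP sketch reproduces the same aggregation by hand: it streams CSV rows, groups them by a Y/m key, sums total_price minus discount, sorts the newest month first, and formats each sum to two decimals. It does not use Flow and is not how Flow works internally. The function names (csv_stream, monthly_revenue) and the sample rows are assumptions made for illustration; only the column names come from the example above.

```php
<?php
declare(strict_types=1);

/**
 * Wraps a CSV string in an in-memory stream (hypothetical test helper).
 *
 * @return resource
 */
function csv_stream(string $contents)
{
    $handle = fopen('php://memory', 'r+');
    fwrite($handle, $contents);
    rewind($handle);
    return $handle;
}

/**
 * Streams a CSV handle row by row and sums (total_price - discount) per Y/m key.
 *
 * @param resource $handle readable stream whose first line is the header
 * @return array<string, string> month key => revenue formatted to 2 decimals
 */
function monthly_revenue($handle): array
{
    $header = fgetcsv($handle, null, ',', '"', '\\');
    if (!is_array($header)) {
        return []; // empty input
    }

    $totals = [];
    while (($fields = fgetcsv($handle, null, ',', '"', '\\')) !== false) {
        if ($fields === [null]) {
            continue; // skip blank lines
        }
        $row = array_combine($header, $fields);
        $month = (new DateTimeImmutable($row['created_at']))->format('Y/m');
        $totals[$month] = ($totals[$month] ?? 0.0)
            + (float) $row['total_price']
            - (float) $row['discount'];
    }

    krsort($totals, SORT_STRING); // newest month first

    return array_map(
        static fn (float $sum): string => number_format(round($sum, 2), 2),
        $totals
    );
}

// Made-up sample data using the column names from the Flow example.
$csv = csv_stream(
    "created_at,total_price,discount\n"
    . "2024-01-15 10:00:00,100.00,10.00\n"
    . "2024-01-20 12:30:00,50.50,0.50\n"
    . "2024-02-01 09:00:00,20.00,0.00\n"
);
print_r(monthly_revenue($csv));
fclose($csv);
```

The hand-written version needs explicit parsing, grouping, sorting, and formatting code. The Flow pipeline expresses the same steps declaratively, and also handles reading, writing, and overwriting the output file.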

Reference

Project homepage: https://flow-php.com

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
