
Introducing DataFusion: A High‑Performance Rust‑Based Query Engine Powered by Apache Arrow

This article introduces DataFusion, a query engine written in Rust and built on Apache Arrow that offers high performance, extensibility, and seamless integration with various data sources. It covers the engine's architecture and execution model, the advantages of Rust, and practical usage examples for building modern data‑warehouse solutions.

360 Tech Engineering

DataFusion is a query engine without its own storage, designed to be flexible and extensible by leveraging Apache Arrow's columnar memory model. It natively supports CSV, Parquet, Avro, JSON and can read from local files, AWS S3, Azure Blob, and Google Cloud Storage, while providing rich extension points for custom data formats and sources.

Key Features

High performance: built in Rust, garbage‑collection‑free, offering C++‑level speed with Java/Golang‑like development productivity; uses Arrow’s columnar storage for vectorized computation.

Simple integration: part of the Apache Arrow ecosystem, it works smoothly with other big‑data components.

Easy customization: users can add scalar/aggregate/window functions, data sources, SQL extensions, custom plans, and optimizer rules.

The Qilin data‑warehouse adopts DataFusion, which aims to be the preferred query engine for databases, data‑frame libraries, and machine‑learning workloads, to support custom data sources, inverted indexes, and accelerated query execution.

Why Rust? Rust is a systems language that provides memory safety without a garbage collector, using an ownership model, borrow checker, and the Option<T> type to eliminate null‑pointer errors. Its "zero‑cost abstractions" deliver high‑level expressiveness with performance comparable to C++.
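As a plain standard‑library illustration (not DataFusion code), Option<T> makes the absence of a value a compile‑time concern rather than a runtime null dereference:

```rust
// Option<T> makes the "no value" case explicit: the compiler rejects any
// attempt to use the inner value without first handling None.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

fn main() {
    // `match` forces both cases to be handled; there is no null to dereference.
    match first_even(&[1, 3, 4, 7]) {
        Some(v) => println!("first even: {v}"),
        None => println!("no even number"),
    }
}
```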

Apache Arrow Overview

Arrow is a cross‑language, in‑memory columnar format created to solve data‑transfer inefficiencies across systems. It enables zero‑copy shared memory, supports many file formats (CSV, ORC, Parquet), and provides in‑memory analytics and query processing.

Arrow’s memory layout stores each column contiguously, reducing cache misses and allowing SIMD acceleration, unlike traditional row‑oriented layouts.

Core Arrow data structures include:

Buffer: the lowest‑level contiguous memory region.

DataType and Array: describe column types; each Array holds column data backed by one or more Buffers.

RecordBatch and Schema: a RecordBatch is a collection of same‑length Arrays; a Schema defines the structure of those Arrays.

Table: built on top of RecordBatches.
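The row‑versus‑column distinction can be sketched in plain Rust (illustrative structs, not the arrow crate's actual types):

```rust
// Row-oriented: each record is stored together, so scanning one column
// touches every field of every row.
#[allow(dead_code)]
struct RowRecord {
    id: u64,
    price: f64,
}

// Column-oriented (Arrow-style): each column is one contiguous buffer,
// so a scan over `prices` is cache-friendly and SIMD-friendly.
struct ColumnBatch {
    ids: Vec<u64>,
    prices: Vec<f64>,
}

fn sum_prices(batch: &ColumnBatch) -> f64 {
    // Tight loop over a single contiguous buffer; the compiler can auto-vectorize.
    batch.prices.iter().sum()
}

fn main() {
    let batch = ColumnBatch {
        ids: vec![1, 2, 3],
        prices: vec![9.5, 20.0, 0.5],
    };
    println!("total = {}", sum_prices(&batch)); // total = 30
}
```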

DataFusion Architecture

The engine processes a query through the following stages:

SQL parsing (using sqlparser) to produce an abstract syntax tree, which is then converted into a logical plan.

Logical plan optimization via AnalyzerRules and OptimizerRules (e.g., projection push‑down, filter push‑down).

Physical planning that converts the logical plan into an ExecutionPlan.

Physical optimization (e.g., sort order, join selection) and execution using Arrow arrays.

Execution plans produce one or more partitions of SendableRecordBatchStream, supporting streaming, parallelism (via Tokio tasks), memory pooling (GreedyPool or FairPool), and caching of metadata.

Execution Engine Features

Streaming execution: operators emit Arrow RecordBatches incrementally.

Parallel execution: each ExecutionPlan runs on multiple streams, coordinated by Tokio.

Thread scheduling with Tokio async runtime.

Memory management via a shared MemoryPool.

Cache management for object‑store listings and file metadata.

Extensibility: custom data sources, catalogs, query languages, UDFs, optimizer rules, and physical plans.

Projects built on DataFusion include blaze‑rs (Spark‑to‑DataFusion optimizer), LakeSoul (cloud‑native lake‑warehouse), and Ballista (distributed query engine).

DataFusion CLI

The command‑line tool allows interactive SQL queries over local or remote files (CSV, Parquet, JSON, Arrow, Avro) and supports reading from S3.

Rust Programming Example

[dependencies]
datafusion = "40.0.0"
tokio = { version = "1.0", features = ["rt-multi-thread"] }

Execute a SQL query:

use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    let ctx = SessionContext::new();
    ctx.register_csv("example", "tests/data/example.csv", CsvReadOptions::new()).await?;
    let df = ctx.sql("SELECT a, MIN(b) FROM example WHERE a <= b GROUP BY a LIMIT 100").await?;
    df.show().await?;
    Ok(())
}

Or use the DataFrame API:

use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    let ctx = SessionContext::new();
    let df = ctx.read_csv("tests/data/example.csv", CsvReadOptions::new()).await?;
    let df = df.filter(col("a").lt_eq(col("b")))?
               .aggregate(vec![col("a")], vec![min(col("b"))])?
               .limit(0, Some(100))?;
    df.show().await?;
    Ok(())
}

A custom data source is added by implementing the TableProvider trait, whose scan method returns an ExecutionPlan; supports_filters_pushdown can optionally be overridden to push predicates down to the source.

#[async_trait]
pub trait TableProvider: Sync + Send {
    async fn scan(
        &self,
        state: &SessionState,
        projection: Option<&Vec<usize>>,
        filters: &[Expr],
        limit: Option<usize>,
    ) -> Result<Arc<dyn ExecutionPlan>>;
    // ... other methods ...
}

impl ExecutionPlan for CustomExec {
    fn execute(
        &self,
        _partition: usize,
        _context: Arc<TaskContext>,
    ) -> Result<SendableRecordBatchStream> {
        // custom execution logic
    }
}
