Big Data · 24 min read

Velox Memory Management and Execution Engine Overview

This article presents a comprehensive overview of Meta's open‑source Velox query execution engine, detailing its architecture, vectorized execution model, memory‑pool hierarchy, arbitrator and allocator designs, spilling techniques, and future development plans for large‑scale data processing.


The presentation introduces Velox, an open‑source high‑performance C++ query execution engine developed by Meta, which focuses on vectorized processing, columnar memory layout, and a push‑based pipeline model to efficiently execute physical plans on local resources.

It explains the vision of using Velox to unify execution across diverse data workloads (transactional, analytical, streaming, and time‑series) within Tencent's big‑data platform, aiming to reduce engineering fragmentation and improve the user experience.

The execution model is described as a pipeline of operators (e.g., table scan, projection, hash join) that operate on columnar data, with drivers executing operators in parallel threads, enabling fine‑grained scheduling and efficient CPU utilization.
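The driver-and-operator model described above can be sketched as follows. This is a minimal illustration, not Velox's actual interfaces: the names `Operator`, `Project`, `Driver`, and `Batch` are assumptions, and a real driver also handles blocking, yielding, and multi-threaded scheduling.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical sketch of a push-style pipeline: each Operator consumes a
// batch of column vectors from its upstream and produces one for its
// downstream. Names are illustrative, not the Velox API.
using Batch = std::vector<std::vector<int64_t>>;  // columns of values

struct Operator {
  virtual ~Operator() = default;
  virtual Batch process(Batch input) = 0;
};

// Doubles the first column, standing in for a projection expression.
struct Project : Operator {
  Batch process(Batch input) override {
    for (auto& v : input[0]) v *= 2;
    return input;
  }
};

// A Driver advances one batch at a time through a whole pipeline of
// operators; each driver runs on its own thread in a real engine.
struct Driver {
  std::vector<std::unique_ptr<Operator>> ops;
  Batch run(Batch batch) {
    for (auto& op : ops) batch = op->process(batch);
    return batch;
  }
};
```

Because a batch stays columnar as it moves through the pipeline, each operator loops tightly over contiguous values, which is what makes the vectorized model CPU-efficient.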

Memory management is a core component, consisting of three parts: a memory pool that tracks usage per query, an arbitrator that coordinates and reclaims memory across queries, and an allocator that handles physical memory allocation using size‑class and quantization techniques.

Memory pools are organized hierarchically into leaf pools (actual allocation) and aggregate/root pools (capacity control), allowing precise tracking of memory at the operator level and supporting fair sharing among concurrent queries.
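The leaf/root split can be illustrated with a toy pool that charges every allocation up the parent chain, so a per-query root cap bounds the sum of all its operators' usage. The `Pool` class and its methods are assumptions for illustration, not the actual Velox memory-pool interface.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Illustrative sketch of hierarchical usage tracking: leaf pools account
// allocations, and every byte is also charged to each ancestor so the
// root (per-query) pool can enforce a capacity cap.
class Pool {
 public:
  Pool(Pool* parent, int64_t capacity) : parent_(parent), capacity_(capacity) {}

  // Charge `bytes` on this pool and all ancestors; fail if any cap is hit.
  void reserve(int64_t bytes) {
    for (Pool* p = this; p != nullptr; p = p->parent_) {
      if (p->used_ + bytes > p->capacity_) {
        throw std::runtime_error("capacity exceeded");
      }
    }
    for (Pool* p = this; p != nullptr; p = p->parent_) p->used_ += bytes;
  }

  void release(int64_t bytes) {
    for (Pool* p = this; p != nullptr; p = p->parent_) p->used_ -= bytes;
  }

  int64_t used() const { return used_; }

 private:
  Pool* parent_;
  int64_t capacity_;
  int64_t used_ = 0;
};
```

With one leaf pool per operator, the engine can report exactly which operator holds how much memory while still enforcing a single query-level limit at the root.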

The arbitrator performs local and global arbitration: local arbitration reclaims unused capacity within the same query, while global arbitration can reclaim memory from other queries, triggering spill operations when necessary.
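A simplified model of the two arbitration levels: a grow request first draws on free headroom in the shared budget, and only then shrinks the query holding the most unused capacity, which in a real engine is the point where spilling would be triggered. `Arbitrator`, `QueryPool`, and `grow` are illustrative names, and the policy here is a deliberate simplification of the local/global distinction in the talk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hedged sketch of capacity arbitration across queries.
struct QueryPool {
  int64_t capacity = 0;  // granted budget
  int64_t used = 0;      // bytes actually in use
  int64_t reclaimable() const { return capacity - used; }
};

class Arbitrator {
 public:
  explicit Arbitrator(int64_t total) : free_(total) {}

  bool grow(QueryPool& q, int64_t bytes, std::vector<QueryPool*>& all) {
    if (free_ >= bytes) {  // cheap path: unused shared budget
      free_ -= bytes;
      q.capacity += bytes;
      return true;
    }
    // Shrink the victim with the most unused (reclaimable) capacity.
    auto victim = *std::max_element(all.begin(), all.end(),
        [](auto* a, auto* b) { return a->reclaimable() < b->reclaimable(); });
    int64_t got = std::min(bytes - free_, victim->reclaimable());
    victim->capacity -= got;
    free_ += got;
    // Still short: a real arbitrator would now spill the victim's used bytes.
    if (free_ < bytes) return false;
    free_ -= bytes;
    q.capacity += bytes;
    return true;
  }

 private:
  int64_t free_;
};
```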

The allocator adopts concepts from the Umbra paper, using lazy size‑class buffers and mmap‑based virtual memory to minimize fragmentation and efficiently serve both small and large allocation requests.
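The quantization idea can be shown in a few lines: request sizes are rounded up to a small set of size classes (powers of two here), so the allocator keeps one free list per class instead of tracking arbitrary sizes, trading bounded internal fragmentation for fast reuse. The minimum class size and the power-of-two scheme are assumptions for illustration, not the exact class layout Velox or Umbra uses.

```cpp
#include <cassert>
#include <cstddef>

// Round a request up to the next power-of-two size class.
constexpr size_t kMinClass = 64;  // smallest class, in bytes (illustrative)

size_t quantize(size_t bytes) {
  size_t cls = kMinClass;
  while (cls < bytes) cls <<= 1;
  return cls;
}
```

Large requests beyond the biggest class would bypass the class lists entirely and be served directly from mmap-backed virtual memory.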

Spilling techniques are detailed, including the handling of RowContainers, hash aggregation, and row‑number operators, with recursive spill strategies to manage extreme memory pressure without query failure.
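The recursive idea can be sketched with plain integers standing in for rows: data that exceeds the memory budget is hash-partitioned, and any partition that is still too large is re-partitioned at the next level with a re-seeded hash. The fan-out, the budget in row counts, and all function names are illustrative; real spilling writes sorted runs to disk and merges them back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using Rows = std::vector<uint64_t>;

// splitmix64-style finalizer, re-seeded per level so each recursion
// round splits the data differently.
uint64_t mix(uint64_t x) {
  x ^= x >> 33; x *= 0xff51afd7ed558ccdULL;
  x ^= x >> 33; x *= 0xc4ceb9fe1a85ec53ULL;
  x ^= x >> 33;
  return x;
}

std::vector<Rows> partitionRows(const Rows& rows, int level, int fanout) {
  std::vector<Rows> parts(fanout);
  for (uint64_t r : rows) {
    parts[mix(r + 0x9e3779b97f4a7c15ULL * (uint64_t)level) % fanout]
        .push_back(r);
  }
  return parts;
}

// How many recursive partitioning rounds until every partition fits in
// `budget` rows (assumes mostly distinct keys, so recursion terminates).
int spillDepth(const Rows& rows, size_t budget, int level = 0) {
  if (rows.size() <= budget) return level;
  int depth = level + 1;
  for (const auto& part : partitionRows(rows, level, 8)) {
    if (part.size() > budget) {
      depth = std::max(depth, spillDepth(part, budget, level + 1));
    }
  }
  return depth;
}
```

Bounding each level's fan-out keeps the per-level memory footprint fixed, which is why recursion handles extreme pressure without failing the query.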

Future plans include deeper integration with Tencent's data platforms, extending Velox to support more file formats (e.g., Parquet, Iceberg), improving shuffle performance, and exploring GPU‑CPU co‑processing for structured data workloads.

The article concludes with a Q&A session addressing query‑submission resource estimation and the distinction between local and global arbitrators.

Tags: Memory Management, Query Execution, Velox, Spilling, Vectorized Engine
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
