LightPool: A Cloud‑Native NVMe‑oF Based High‑Performance Storage Pool Architecture for Distributed Databases
This article introduces LightPool, an open‑source, cloud‑native storage‑pool architecture built on NVMe‑over‑Fabrics (NVMe‑oF) that delivers high performance, low cost, and high availability for large‑scale distributed databases. It covers the system's design, its scheduling and storage engine, and its hot‑upgrade and hot‑migration capabilities, as presented at the 30th IEEE HPCA conference.
At the 30th IEEE International Symposium on High‑Performance Computer Architecture (HPCA) in Edinburgh, a paper titled "LightPool: A NVMe‑oF‑based High‑performance and Lightweight Storage Pool Architecture for Cloud‑Native Distributed Database" was accepted, showcasing an open‑source storage solution (LiteIO) developed by Alibaba Cloud Server R&D and Ant Data Infrastructure teams.
Paper Background – Modern databases face intense pressure on storage performance, cost, and stability. To address these challenges in a cloud‑native environment, the authors propose a novel storage‑pool architecture that combines the efficiency of local SSDs with the elasticity of cloud‑native designs, reducing storage costs while maintaining high performance.
Problem Statement – Traditional storage choices for databases include compute‑storage coupling, compute‑storage separation (ECS + EBS/S3), and shared‑storage solutions, each with drawbacks such as resource fragmentation, high latency, or limited portability. The authors aim to eliminate these issues by pooling idle storage resources across heterogeneous ECS nodes.
LightPool Architecture – The system consists of control nodes and worker nodes. Control nodes manage the global SSD pool and interact with the Kubernetes master, but are bypassed during normal I/O. Worker nodes run containers, host the LightPool storage engine and CSI plugin, and expose pooled storage to containers via NVMe‑over‑Fabrics.
Cloud‑Native Scheduling Design – LightPool integrates with Kubernetes scheduling. Controllers maintain a view of all storage pools, allocate volumes, and use a two‑stage scheduler (filter + priority) similar to pod scheduling. Filters consider pool health, remaining capacity, and affinity tags; priorities favor nodes with less used storage. Custom filters and priorities can be configured.
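The filter‑and‑priority flow described above can be sketched as follows. This is a minimal illustration, not LiteIO's actual data model: the `StoragePool` fields, the capacity units, and the least‑used‑storage heuristic are simplifying assumptions chosen to mirror the two stages named in the paper summary.

```python
from dataclasses import dataclass, field

@dataclass
class StoragePool:
    """Hypothetical view of one worker node's storage pool."""
    node: str
    healthy: bool
    capacity_gib: int
    used_gib: int
    tags: set = field(default_factory=set)

def schedule_volume(pools, request_gib, required_tags=frozenset()):
    """Pick a node for a new volume with a two-stage filter + priority pass."""
    # Stage 1 (filter): drop pools that are unhealthy, too full,
    # or missing a required affinity tag.
    candidates = [
        p for p in pools
        if p.healthy
        and p.capacity_gib - p.used_gib >= request_gib
        and required_tags <= p.tags
    ]
    if not candidates:
        return None  # no schedulable pool
    # Stage 2 (priority): prefer the node with the least used storage.
    return min(candidates, key=lambda p: p.used_gib).node
```

In a real deployment these stages would run inside the Kubernetes scheduling pipeline, and both filters and priorities would be pluggable, as the paper describes.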
Storage Engine Design – The engine is a user‑space, lightweight component that supports zero‑copy local storage protocols, multiple media types (including QLC SSDs and ZNS), and features such as snapshots and RAID. By bypassing TCP for local paths, it reduces CPU overhead and latency.
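The local‑path optimization can be illustrated with a small sketch. The path names and the function below are hypothetical, used only to show the decision the engine makes: same‑node volumes take a zero‑copy local protocol instead of going through the TCP stack, while remote volumes are served over NVMe‑oF.

```python
def choose_data_path(volume_node: str, client_node: str) -> str:
    """Select an I/O path for a volume (illustrative, not LiteIO's API)."""
    if volume_node == client_node:
        # Same node: bypass the network stack with a zero-copy local
        # protocol, saving CPU cycles and latency.
        return "local-zero-copy"
    # Different node: export the volume over NVMe-over-Fabrics.
    return "nvme-of"
```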
High‑Availability Features – LightPool implements hot‑upgrade (sub‑second upgrade time without downtime) and hot‑migration (seamless data movement between worker nodes when a node fails or for load balancing). Agents on each node report health via Kubernetes Lease objects, enabling rapid detection of failures.
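The lease‑based failure detection can be sketched with a toy model. In the real system the agent renews an actual Kubernetes Lease object through the API server; the class, the lease duration, and the expiry check below are illustrative assumptions, not LiteIO code.

```python
import time

LEASE_DURATION_S = 15.0  # hypothetical lease duration

class NodeLease:
    """Toy stand-in for a Kubernetes Lease used as a node heartbeat."""
    def __init__(self, holder: str):
        self.holder = holder
        self.renew_time = time.monotonic()

    def renew(self) -> None:
        # The node agent calls this periodically while healthy.
        self.renew_time = time.monotonic()

def is_node_failed(lease: NodeLease, now: float) -> bool:
    """A node is presumed failed once its lease expires unrenewed."""
    # At that point the controller can trigger hot-migration of the
    # node's volumes to healthy workers.
    return now - lease.renew_time > LEASE_DURATION_S
```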
Open‑Source Availability – The LiteIO project, which implements LightPool, is publicly available at https://github.com/eosphoros-ai/liteio, inviting community contributions and adoption.
AntData
Ant Data builds on Ant Group's technological innovation in big data, databases, and multimedia, backed by years of industry practice. Through long‑term technology planning and continuous innovation, we strive to build world‑class data technology and products.