
How Alibaba Cloud’s CSI Layered Storage Delivers SSD Speed with Cloud‑Disk Reliability

In the cloud‑native era, Alibaba Cloud’s CSI‑based hierarchical storage combines local NVMe SSD performance with cloud‑disk durability, offering a three‑layer design, operational simplicity, and more than 100× random‑read IOPS gains in the reported test, aimed at database and AI workloads.

Alibaba Cloud Infrastructure

Background

In cloud‑native environments, database and AI workloads require both the ultra‑low latency of local NVMe SSDs and the durability of cloud block storage. A hierarchical storage solution built on the Kubernetes Container Storage Interface (CSI) combines these properties.

Design Trade‑offs

Local NVMe SSD : provides millions of IOPS and sub‑millisecond latency but data is lost if the node fails.

Cloud block storage (e.g., Alibaba Cloud ESSD/EBS) : offers persistence, snapshots and elastic scaling, but performance is limited by network bandwidth.

Architecture – Three‑Layer CSI Driver

The driver runs as a standard CSI plugin on each node and uses Linux dm‑cache to present a single virtual block device that merges three layers:

Origin layer : the remote cloud block device.

Cache layer : a fast local SSD (or RAID‑0 of multiple NVMe disks).

Metadata layer : a small loop device that stores dm‑cache metadata.
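To make the layering concrete, here is a minimal sketch that builds the table line dm‑cache expects when the three devices are combined. The field order (metadata, cache, origin) and the block‑size limits follow the Linux kernel's dm‑cache documentation. The device paths, the 512‑sector block size, and the helper name `dm_cache_table` are assumptions for illustration, not values taken from the driver.

```python
def dm_cache_table(origin_sectors, metadata_dev, cache_dev, origin_dev,
                   block_sectors=512, mode="writeback"):
    """Return a dm-cache table line for use with `dmsetup create --table`.

    origin_sectors: size of the origin device in 512-byte sectors.
    block_sectors:  cache block size in sectors (assumed default: 256 KiB).
    mode:           one of the dm-cache I/O modes.
    """
    # dm-cache requires a block size between 64 and 2097152 sectors,
    # in multiples of 64 sectors (32 KiB).
    if block_sectors % 64 != 0 or not 64 <= block_sectors <= 2097152:
        raise ValueError("block size must be a multiple of 64 sectors "
                         "between 64 and 2097152")
    if mode not in ("writeback", "writethrough", "passthrough"):
        raise ValueError(f"unknown cache mode: {mode}")
    # Layout: start length "cache" metadata cache origin block_size
    #         #feature_args feature policy #policy_args
    return (f"0 {origin_sectors} cache {metadata_dev} {cache_dev} "
            f"{origin_dev} {block_sectors} 1 {mode} default 0")


# Hypothetical devices: a 120 GiB origin is 251658240 sectors.
print(dm_cache_table(251658240, "/dev/loop0", "/dev/loop1", "/dev/vdb"))
```

The resulting string could be passed to `dmsetup create <name> --table '<line>'`; the cache mode shown here is only a starting point and should match the durability requirements of the workload.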

Key implementation steps:

When multiple NVMe disks are available, aggregate them into a RAID‑0 array with mdadm to increase bandwidth.

Format the RAID device with XFS and tune the number of allocation groups (agcount) for high‑concurrency I/O.

Pre‑allocate a file on the cloud disk using fallocate, then expose it as a block device with losetup. This preserves the original cloud‑disk format.

Create three block devices (metadata, cache, origin) and combine them with dm‑cache, exposing the resulting device‑mapper device (for example /dev/dm-0) to containers.
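The steps above can be sketched as a small command planner. It only assembles the command lines and does not run them, since the real commands need root privileges and will destroy data on the named devices. Every device name, path, size, and allocation‑group count below is a hypothetical placeholder, and the driver's actual invocations may differ.

```python
import shlex


def raid0_cmd(md_dev, disks):
    """Step 1: stripe several NVMe disks into one RAID-0 array."""
    if len(disks) < 2:
        raise ValueError("RAID-0 striping needs at least two disks")
    return ["mdadm", "--create", md_dev, "--level=0",
            f"--raid-devices={len(disks)}", *disks]


def mkfs_xfs_cmd(dev, agcount):
    """Step 2: format with XFS and an explicit allocation-group count."""
    return ["mkfs.xfs", "-d", f"agcount={agcount}", dev]


def loop_file_cmds(path, size, loop_dev):
    """Step 3: pre-allocate a file and expose it as a loop block device."""
    return [["fallocate", "-l", size, path],
            ["losetup", loop_dev, path]]


def dm_cache_cmd(name, table):
    """Step 4: combine the devices described by a dm-cache table line."""
    return ["dmsetup", "create", name, "--table", table]


if __name__ == "__main__":
    # All values here are assumptions chosen for illustration only.
    plan = [
        raid0_cmd("/dev/md0", ["/dev/nvme0n1", "/dev/nvme1n1"]),
        mkfs_xfs_cmd("/dev/md0", 16),
        *loop_file_cmds("/mnt/cloud-disk/volume.img", "100G", "/dev/loop0"),
        dm_cache_cmd("layered-vol", "<dm-cache table line>"),
    ]
    for cmd in plan:
        print(shlex.join(cmd))
```

Keeping the plan as data makes it easy to review or log before anything touches a disk, which is useful for operations that cannot be undone.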

Operational Benefits

Online elastic scaling : both cache size and cloud‑disk capacity can be expanded without pod disruption.

Multi‑attach support : because metadata resides locally, the same cloud disk can be attached read‑only to multiple nodes.

Fast cloud‑backup expansion : snapshots and capacity growth are performed on the origin layer while the cache continues serving I/O.

Automatic failover : if no node with local SSD is available, workloads are scheduled on instances that use only the cloud disk.

Zero‑intrusion migration : the CSI driver presents a standard block‑storage interface; existing applications can mount it without modification.

Minimal daemon footprint : no additional user‑space daemons are required, reducing CPU and memory overhead.
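The automatic failover behaviour listed above can be pictured as a simple placement rule: prefer a node with local SSD, otherwise fall back to a node that serves the volume from the cloud disk alone. The sketch below is a toy model of that rule, not the driver's or the Kubernetes scheduler's actual logic. The node fields `has_local_ssd` and `schedulable` and the mode names are hypothetical.

```python
def choose_placement(nodes):
    """Pick a node and storage mode for a workload.

    nodes: list of dicts with hypothetical keys
           "name", "has_local_ssd", and "schedulable".
    Returns (node_name, mode), where mode is "cached" when a local SSD
    can front the cloud disk and "cloud-disk-only" otherwise.
    """
    ready = [n for n in nodes if n["schedulable"]]
    if not ready:
        raise RuntimeError("no schedulable node available")
    # Prefer any schedulable node that can provide the local cache layer.
    for node in ready:
        if node["has_local_ssd"]:
            return node["name"], "cached"
    # Fallback: the data is still reachable because it lives on the cloud disk.
    return ready[0]["name"], "cloud-disk-only"


nodes = [
    {"name": "node-a", "has_local_ssd": True, "schedulable": False},
    {"name": "node-b", "has_local_ssd": False, "schedulable": True},
]
print(choose_placement(nodes))
```

In this example the only SSD node is unschedulable, so the workload lands on `node-b` without a cache. The fallback works because the origin layer holds the data on the cloud disk rather than on the node.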

Performance Evaluation

The test used a 120 GB Alibaba Cloud ESSD volume as the origin layer and a 100 GB local SSD (RAID‑0) as the cache. Results:

Random read : 1,620 IOPS (baseline) → 224,000 IOPS with dm‑cache.

Sequential write (write‑back) : 132 MiB/s → 3,500 MiB/s.

Note: dm‑cache may bypass the cache for pure sequential reads to protect SSD endurance, but the observed gains in random reads and sequential writes are significant for latency‑sensitive workloads.
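To put the reported figures in proportion, the short calculation below derives the speedup factors from the numbers quoted above. It uses only those four values; the helper name `speedup` is introduced here for illustration.

```python
def speedup(baseline, cached):
    """Return how many times faster the cached result is than the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return cached / baseline


# Figures as reported in the evaluation: (baseline, with dm-cache).
results = {
    "random read (IOPS)": (1_620, 224_000),
    "sequential write (MiB/s)": (132, 3_500),
}

for name, (baseline, cached) in results.items():
    print(f"{name}: {speedup(baseline, cached):.1f}x")
# random read (IOPS): 138.3x
# sequential write (MiB/s): 26.5x
```

The random‑read gain of roughly 138× is consistent with the headline claim of a two‑orders‑of‑magnitude IOPS improvement, while the sequential‑write gain is about 26×.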

Comparison with Alternative Approaches

LVM : embeds metadata in the cloud‑disk header, breaking portability and requiring manual cleanup during pod migration.

Self‑managed Ceph or similar clusters :

Reliability : larger fault domain; risk of cascade failures.

Complexity : self‑managed clusters require dedicated storage operators to manage components such as OSDs and monitors, whereas the CSI driver automates storage setup.

Performance : local‑bus access avoids 10 Gb/25 Gb network bottlenecks.

Functionality : retains cloud‑disk features such as snapshots and elastic scaling.

Reference Implementation

The driver source code and deployment manifests are available at:

https://github.com/kubernetes-sigs/alibaba-cloud-csi-driver
