How Alibaba’s DADI Transforms Cloud‑Native Container Storage and Image Acceleration

This article examines Alibaba Cloud's cloud‑native storage innovations, including the DADI block‑layer image accelerator, high‑density ESSD cloud disks, and the CNFS container network file system, detailing their architectures, performance benefits, and large‑scale deployment results.

Alibaba Cloud Native

Storage Trends in Cloud‑Native Environments

Container density, elasticity, and rapid mount/unmount are critical as workloads shift from VMs to serverless functions.

High density: Serverless splits applications into many small containers, requiring higher storage density.

Elasticity: Massive concurrent container startups need instantly scalable storage.

Speed: Short‑lived containers demand fast mount/unmount and on‑demand data access.

Challenges of Large‑Scale Container Deployment at Alibaba

Alibaba operates clusters of up to 100,000 nodes. Container images often exceed tens of gigabytes, and rapid scaling events (e.g., Double‑11) expose bottlenecks:

Time cost proportional to image size × node count.

CPU overhead from serial gzip decompression.

I/O pressure from simultaneous download and write.

Memory pressure on host page cache.

Only ~6.4% of image data is needed at startup.
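To make the first and last points concrete, here is a rough back-of-the-envelope sketch. The image size and node count are hypothetical values chosen for illustration; only the 6.4% startup fraction comes from the article. It ignores compression, caching, and registry throughput limits.

```python
def gb_moved(image_gb: float, nodes: int, startup_fraction: float = 1.0) -> float:
    """Total data (GB) pulled across the cluster before containers can start."""
    return image_gb * nodes * startup_fraction

# Hypothetical cluster, for illustration only.
IMAGE_GB = 10.0
NODES = 1_000

# Traditional pull: every node downloads the whole image.
full_pull = gb_moved(IMAGE_GB, NODES)

# On-demand pull: each node fetches only the ~6.4% needed at startup.
on_demand = gb_moved(IMAGE_GB, NODES, startup_fraction=0.064)

print(f"full pull: {full_pull:,.0f} GB")
print(f"on-demand: {on_demand:,.0f} GB")
```

Under these assumptions, fetching only the startup data cuts the transferred volume by more than an order of magnitude, which is the motivation for the requirements that follow.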

Key Requirements for Scalable Deployment

On‑demand: Fast download/decompression and selective data transfer.

Incremental layering: Use OCI‑Artifacts overlayfs to transmit only changed blocks.

Remote image: Adopt remote‑image technology to avoid storing full images locally.

Remote Image Technical Options

Two implementations exist:

File‑system based: Provides a native file‑system interface but is complex, less portable, and has a larger attack surface (e.g., Google CRFS, Azure Project Teleport, AWS SparseFS).

Block‑device based: Works with standard file systems (ext4, NTFS), supports containers and VMs, offers better stability and a smaller attack surface.

Alibaba selected the block‑device approach, naming it Data Accelerator for Disaggregated Infrastructure (DADI).

DADI Architecture and Innovations

DADI treats an image as a virtual block device. Containers mount a regular file system (e.g., ext4) on this device. Reads are intercepted, translated into block reads, and served by a user‑space DADI module that fetches required layers on demand.
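The read path can be pictured with a minimal sketch, assuming each layer is a sparse map from sector number to data and that a read is resolved from the top layer down. The `Layer` class and `read_sectors` function are hypothetical names for illustration; this is not DADI's actual index structure, and it omits the on-demand fetch of layer data.

```python
SECTOR = 512  # bytes per sector (illustrative granularity)

class Layer:
    """One image layer: a sparse map from sector number to sector data."""
    def __init__(self, sectors: dict[int, bytes]):
        for data in sectors.values():
            assert len(data) == SECTOR
        self.sectors = sectors

def read_sectors(layers: list[Layer], start: int, count: int) -> bytes:
    """Read `count` sectors from `start`; `layers` is ordered bottom to top."""
    out = bytearray()
    for lba in range(start, start + count):
        for layer in reversed(layers):   # the newest layer wins
            if lba in layer.sectors:
                out += layer.sectors[lba]
                break
        else:
            out += bytes(SECTOR)         # never written: reads as zeros
    return bytes(out)

base = Layer({0: b"A" * SECTOR, 1: b"B" * SECTOR})
patch = Layer({1: b"C" * SECTOR})        # upper layer overrides sector 1
data = read_sectors([base, patch], 0, 3)
```

The file system mounted on the device never sees layers at all; it only issues block reads, which is why a standard file system such as ext4 can be used unchanged.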

Overlay Block Device: Each layer records only changed variable‑length blocks (minimum 512 B), enabling fast indexing and low memory usage.

Writable Layer Support: Provides append‑only and random‑write sparse files for read‑write layers.

ZFile compression: Fixed‑size block compression allowing random access without full decompression; supports lz4, zstd, gzip.

Trace‑based prefetch: Records read locations during cold starts; on subsequent starts DADI prefetches needed blocks using high‑concurrency reads.

On‑demand P2P transfer: Recent blocks are cached locally; missing blocks are fetched from peers in a tree topology, reducing load on central registries.
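The idea behind fixed-size block compression can be sketched as follows. This is a simplified illustration, not the actual ZFile on-disk format: the 4 KiB block size, the in-memory offset index, and the function names are assumptions, and Python's standard `zlib` stands in for the lz4, zstd, and gzip codecs mentioned above.

```python
import zlib

BLOCK = 4096  # hypothetical uncompressed block size

def compress_blocks(raw: bytes) -> tuple[bytes, list[int]]:
    """Compress each fixed-size block independently; return blob and offset index."""
    blob = bytearray()
    offsets = [0]  # offsets[i] is where compressed block i starts in blob
    for i in range(0, len(raw), BLOCK):
        blob += zlib.compress(raw[i:i + BLOCK])
        offsets.append(len(blob))
    return bytes(blob), offsets

def read_range(blob: bytes, offsets: list[int], pos: int, length: int) -> bytes:
    """Return raw[pos:pos+length], decompressing only the overlapping blocks."""
    if length <= 0:
        return b""
    first = pos // BLOCK
    last = min((pos + length - 1) // BLOCK, len(offsets) - 2)
    out = bytearray()
    for b in range(first, last + 1):
        out += zlib.decompress(blob[offsets[b]:offsets[b + 1]])
    skip = pos - first * BLOCK
    return bytes(out[skip:skip + length])

raw = bytes(range(256)) * 64              # 16 KiB of sample data, 4 blocks
blob, offsets = compress_blocks(raw)
chunk = read_range(blob, offsets, 5000, 100)   # touches only block 1
```

Because every block is compressed on its own, a read at an arbitrary offset costs one index lookup plus the decompression of the blocks it overlaps, in contrast to a .tgz stream that must be decompressed from the beginning.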

Performance Evaluation

Using a WordPress image, DADI reduced cold‑start latency compared with traditional .tgz images, Slacker, CRFS, LVM, and P2P downloads. In a test with 1,000 VMs each running 10 containers (10,000 containers total), cold starts were under 3 seconds.

Production Deployment at Alibaba

DADI runs on tens of thousands of hosts, launching 10,000 containers in 3–4 seconds during peak traffic. It now supports over 100,000 servers across Alibaba’s business lines, improving deployment agility and elasticity.

Related Cloud‑Native Storage Innovations

ESSD Cloud Disk

High‑density block storage offering up to 1 million IOPS and 4 GB/s throughput per instance, with auto‑scaling performance (ESSD Auto PL) and nine nines (99.9999999%) of data reliability.

CNFS (Container Network File System)

Integrated into Alibaba Cloud Container Service for Kubernetes (ACK), CNFS abstracts Alibaba Cloud NAS as a Kubernetes CRD, providing declarative lifecycle management, online/auto expansion, snapshots, encryption, and fine‑grained access control.

Best‑Practice Cases

Database containers on ESSD achieve 4× higher disk‑mount density, up to 100 TB IOPS, with instant snapshot‑based cloning.

Prometheus monitoring stores TSDB on shared NAS via CNFS, delivering high availability, zero code changes, and elastic scaling.

References

For technical details, see the open‑source repositories:

https://github.com/alibaba/accelerated-container-image

https://github.com/alibaba/overlaybd
