How Alibaba’s overlaybd Revolutionizes Container Startup with On‑Demand Image Reads

Alibaba’s open‑source DADI project introduces the overlaybd image format, enabling on‑demand block‑level reads that dramatically cut container cold‑start times, with a layered architecture, iSCSI‑backed storage, ZFile compression, and a roadmap toward OCI standardization.

Alibaba Cloud Developer

Background

Alibaba recently open‑sourced its cloud‑native container image acceleration technology (accelerated‑container‑image) featuring the overlaybd image format. Unlike traditional layered tar files, overlaybd supports network‑based on‑demand reads, allowing containers to start much faster.

The technology originated from Alibaba Cloud’s internal DADI project (Data Accelerator for Disaggregated Infrastructure), aimed at accelerating data access for disaggregated compute‑storage architectures. Since its production launch in 2019, DADI has been deployed on millions of machines, powering over a billion container starts across Alibaba Group and Alibaba Cloud, dramatically improving application deployment and scaling efficiency. The team published a USENIX ATC'20 paper and subsequently open‑sourced the project to foster a community ecosystem.

Problem Statement

With the explosion of Kubernetes and cloud‑native workloads, container cold start has become a bottleneck. A cold start must download the image from a registry before launch, which can take minutes for images of several hundred MB or several GB. Under high concurrency the downloads also congest the network and degrade user experience.

Real‑world incidents, such as a Double‑11 sales event where an internal Alibaba app faced prolonged scaling due to image download latency, demonstrated the need for faster image access. After DADI’s deployment, the total time for “image pull + container start” became five times shorter, and the p99 tail latency improved by 17×.

Related Work

Previous attempts include on‑demand reads from block storage or NAS, P2P distribution, and pre‑warming. One study found that image pulling accounts for 76 % of container start time, yet only 6.4 % of the pulled data is actually read. On‑demand formats such as Google’s stargz (seekable tar.gz) exploit this gap with lazy pulling, fetching only the parts a container needs, and a containerd snapshotter plugin is available for further I/O optimization.

New Image Format: overlaybd

overlaybd is a block‑device‑based layered format inspired by overlayfs and union filesystems but implemented as a virtual block device. It offers several advantages over overlayfs:

Avoids performance degradation from many layers, such as costly copy‑up operations for large file updates.

Enables block‑level I/O tracing, recording, and replay for data pre‑fetching.

Supports a flexible choice of file system on the block device, including NTFS for Windows.

Allows online decompression with efficient codecs.

Can be stored on distributed cloud storage (e.g., EBS) using the same storage solution for system and data disks.

Provides native read‑write layer support, with read‑only mounts that can serve as historical snapshots.

overlaybd Principle

Container images consist of incremental layers that are stacked at runtime. overlaybd abstracts an image as a virtual block device; when an application reads data, the request is translated by a regular filesystem into block reads, which are then forwarded to the overlaybd runtime to fetch the corresponding layer blocks on demand.

Each layer’s data is stored as a series of data blocks. For any address, the block from the newest layer that wrote it wins, and addresses that no layer has written read as zeros. Segments spanning multiple layers are mapped to the appropriate layer files, which can reside in a registry or object storage. To maintain compatibility, overlaybd wraps each layer file in a tar header and footer, allowing existing Docker, containerd, or buildkit pipelines to handle the image without code changes.
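
The lookup rule can be illustrated with a small Python sketch. It is not the overlaybd implementation: it models each layer as a dictionary of the blocks that layer wrote, and the 4 KiB block size and the names read_block, base, and upper are assumptions made for the example.

```python
BLOCK_SIZE = 4096  # assumed block size for this sketch


def read_block(layers, index):
    """Return the block at `index` as seen through the stacked layers.

    `layers` is ordered bottom (oldest) to top (newest); each layer is a
    dict mapping block index -> bytes for the blocks that layer wrote.
    """
    for layer in reversed(layers):   # search the newest layer first
        if index in layer:
            return layer[index]
    return bytes(BLOCK_SIZE)         # written by no layer: reads as zeros


# Two toy layers: the upper layer overwrites block 1 of the base layer.
base = {0: b"A" * BLOCK_SIZE, 1: b"B" * BLOCK_SIZE}
upper = {1: b"C" * BLOCK_SIZE}
layers = [base, upper]
```

Reading block 0 falls through to the base layer, block 1 is served by the upper layer, and block 2 comes back as zeros. A real implementation would index byte ranges rather than whole blocks and would merge the per-layer indexes ahead of time instead of scanning layers on every read.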

Overall Architecture

The DADI architecture consists of four main components:

containerd snapshotter

Starting with containerd 1.4, remote image support is available. DADI provides an overlaybd‑snapshotter that enables container engines to mount overlaybd images as virtual block devices while remaining compatible with traditional OCI tar images.

iSCSI target

overlaybd uses iSCSI as a stable, mature remote block protocol. Two target implementations are offered: one based on the open‑source tgt project (user‑space) and another using the Linux kernel LIO SCSI target (kernel‑space), both delivering reliable block device exposure.

ZFile

ZFile is a block‑level compressed format that splits files into fixed‑size blocks, compresses each independently, and maintains a jump table for random access. Supported codecs include LZ4 and ZSTD, enabling fast on‑demand decompression and reducing storage and transfer overhead.
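
As a rough illustration of the idea, the following Python sketch compresses fixed-size blocks independently and keeps a jump table of offsets so that any single block can be decompressed without touching the others. It does not reproduce the actual ZFile on-disk layout. It uses zlib from the standard library as a stand-in for LZ4 or ZSTD, and the block size and function names are assumptions.

```python
import zlib

BLOCK_SIZE = 4096  # assumed uncompressed block size


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each fixed-size block independently.

    Returns (blob, jump_table). jump_table[i] is the offset of compressed
    block i inside blob, and one extra final entry marks the end of blob.
    """
    jump_table = [0]
    parts = []
    for start in range(0, len(data), block_size):
        chunk = zlib.compress(data[start:start + block_size])
        parts.append(chunk)
        jump_table.append(jump_table[-1] + len(chunk))
    return b"".join(parts), jump_table


def read_block(blob, jump_table, index):
    """Decompress only block `index`, located through the jump table."""
    begin, end = jump_table[index], jump_table[index + 1]
    return zlib.decompress(blob[begin:end])


data = bytes(range(256)) * 64   # 16 KiB of sample data, i.e. 4 blocks
blob, table = compress_blocks(data)
```

Because every block is compressed on its own, a random read costs one table lookup and one small decompression, which is what makes on-demand reads over the network practical for compressed layers.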

Cache

Layer files are stored in a registry and accessed via HTTP partial content. A local cache progressively downloads layers after container start, persisting them locally to avoid repeated remote reads.
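
A minimal sketch of the read path, under stated assumptions: reads are served from a local cache, and misses are fetched from the remote layer by byte range. The class CachedLayer, the fetch callable, and the in-memory dictionary standing in for on-disk persistence are all hypothetical. The background download of whole layers described above is left out.

```python
def range_header(offset, length):
    """Build the HTTP Range header for `length` bytes starting at `offset`."""
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


class CachedLayer:
    """Serve block reads from a local cache, fetching misses remotely."""

    def __init__(self, fetch, block_size=4096):
        self.fetch = fetch        # hypothetical callable: (offset, length) -> bytes
        self.block_size = block_size
        self.cache = {}           # block index -> bytes; a dict stands in for local disk
        self.remote_reads = 0

    def read_block(self, index):
        if index not in self.cache:
            self.remote_reads += 1
            self.cache[index] = self.fetch(index * self.block_size, self.block_size)
        return self.cache[index]


# Stand-in for a registry blob; a real fetch would send range_header(...) over HTTP.
blob = bytes(range(256)) * 64
layer = CachedLayer(lambda off, n: blob[off:off + n])
first = layer.read_block(1)
again = layer.read_block(1)   # second read is served from the cache
```

In this sketch the second read of block 1 never reaches the remote side, which is the effect the local cache is meant to achieve for repeated reads.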

Industry Leadership

According to Forrester’s Q1 2021 FaaS report, Alibaba Cloud ranked first globally, matching AWS. Since containers underpin FaaS platforms, DADI’s acceleration reduces container start latency by 50‑80 %, delivering a markedly better serverless experience.

Conclusion and Outlook

Alibaba’s open‑source DADI project and overlaybd format address the critical need for rapid container startup in modern cloud‑native environments. Future work includes integrating with mainstream toolchains, contributing to OCI remote image standards, and expanding support for additional filesystems.

Future Work

1. Artifacts Manifest – leverage OCI Artifacts Manifest to describe raw data with additional descriptors, improving compatibility.

2. Multi‑filesystem support – expose interfaces to allow users to choose filesystems (default ext4) for image construction.

3. BuildKit integration – refine the snapshotter plugin for seamless BuildKit workflows.

4. Data prefetch – record I/O patterns during a container run and replay them on subsequent starts to halve cold‑start latency.
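
The record-and-replay idea in item 4 could look roughly like the Python sketch below. It is an illustration only, not the planned design: TracingDevice, prefetch, and the toy eight-block device are assumed names, and the trace records just the order in which blocks are first read.

```python
class TracingDevice:
    """Wrap a block reader and record the order of first-time block reads."""

    def __init__(self, read_block):
        self.read_block = read_block   # hypothetical callable: index -> bytes
        self.trace = []

    def read(self, index):
        if index not in self.trace:
            self.trace.append(index)
        return self.read_block(index)


def prefetch(trace, read_block, cache):
    """Replay a recorded trace, filling `cache` before the application asks."""
    for index in trace:
        if index not in cache:
            cache[index] = read_block(index)


backing = {i: bytes([i]) * 4 for i in range(8)}   # toy 8-block device
dev = TracingDevice(backing.__getitem__)
for i in (5, 2, 5, 7):                            # first run: record accesses
    dev.read(i)

cache = {}
prefetch(dev.trace, backing.__getitem__, cache)   # later start: replay the trace
```

On the later start, the blocks the application touched the first time are already in the cache, so its own reads no longer wait on the network.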

References

[1] https://www.usenix.org/conference/atc20/presentation/li-huiba
[2] https://www.usenix.org/conference/fast16/technical-sessions/presentation/harter
[3] https://github.com/containerd/stargz-snapshotter
[4] http://stgt.sourceforge.net/
[5] http://linux-iscsi.org
[6] https://developer.aliyun.com/article/781992