How Alibaba’s OverlayBD Revolutionizes Container Startup with On‑Demand Image Reads
Alibaba’s open‑source DADI project introduces the overlaybd image format, which combines block‑level on‑demand reads, iSCSI targets, and the ZFile compression scheme to make container cold starts up to five times faster, offering a scalable solution for cloud‑native workloads.
Background
With the rapid growth of Kubernetes and cloud‑native adoption, containers are deployed at massive scale, and fast startup is a key advantage. While hot‑starts are quick, cold‑starts require pulling full images from a registry, which can take minutes for large multi‑gigabyte images and cause network congestion.
Problem Statement
Large, layered container images lead to slow cold‑start times because the entire image must be downloaded before the container can be instantiated. In high‑traffic events such as Alibaba’s Double‑11 sale, this delay impacted user experience.
Existing Approaches
Prior attempts include storing images on block devices or NAS for on‑demand reads, and using network distribution techniques like P2P or pre‑warming. Research by Harter et al. showed that pulling images accounts for 76% of container startup time, while only 6.4% of the pulled data is actually read. That gap motivates on‑demand read technologies, of which Google’s stargz format is one example.
OverlayBD Design
OverlayBD, part of the DADI (Data Accelerator for Disaggregated Infrastructure) project, replaces traditional layered tar images with a block‑device‑based format that supports on‑demand reads. It builds on concepts from overlayfs and union filesystems but introduces a new block‑level stacking mechanism.
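The idea of stacking layers at block level can be illustrated with a minimal sketch. This is not overlaybd's actual on‑disk format or code: the `Layer` class, the `BLOCK_SIZE` value, and both functions are hypothetical names chosen for illustration. Each layer is modeled as a sparse map of block index to contents, a read resolves to the topmost layer holding the block, and a merged index shows one way lookup cost could be kept independent of stack depth.

```python
from typing import Dict, List, Optional

BLOCK_SIZE = 4096  # hypothetical block size, chosen for illustration


class Layer:
    """One image layer: a sparse map from block index to block contents."""

    def __init__(self, blocks: Optional[Dict[int, bytes]] = None) -> None:
        self.blocks: Dict[int, bytes] = dict(blocks or {})


def read_block(layers: List[Layer], index: int) -> bytes:
    """Return the block from the topmost layer that contains it.

    `layers` is ordered bottom to top. Blocks that no layer has written
    read back as zeros.
    """
    for layer in reversed(layers):
        if index in layer.blocks:
            return layer.blocks[index]
    return bytes(BLOCK_SIZE)


def merge_index(layers: List[Layer]) -> Dict[int, int]:
    """Flatten the stack into one map: block index -> position of owning layer.

    With this map a read needs a single lookup, however many layers exist.
    """
    merged: Dict[int, int] = {}
    for position, layer in enumerate(layers):
        for index in layer.blocks:
            merged[index] = position  # upper layers overwrite lower ones
    return merged
```

Because an upper layer overrides individual blocks rather than whole files, changing a small part of a large file in this model only adds the touched blocks to the new layer.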
Key Advantages
Avoids performance penalties of deep layer stacks, such as costly copy‑up operations for large files.
Enables block‑level I/O recording and replay for pre‑fetching, further accelerating startup.
Supports a flexible choice of file systems on top of the block device, including Windows NTFS.
Allows efficient online decompression via codecs.
Can be stored on distributed cloud storage (e.g., EBS) using the same storage pool for system and data disks.
Provides native read‑write layer support, making read‑only mounts optional.
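The I/O recording and replay idea from the list above can be sketched as follows. This is an illustration under stated assumptions, not DADI's implementation: `TraceRecorder`, `replay`, and the `fetch` callable are hypothetical names. The recorder notes which blocks a container touches during one startup, and the replay step fetches the same blocks ahead of demand on a later startup.

```python
from typing import Callable, Dict, List


class TraceRecorder:
    """Records the order in which blocks are first read during a startup."""

    def __init__(self) -> None:
        self.trace: List[int] = []
        self._seen = set()

    def on_read(self, index: int) -> None:
        # Only the first access to a block matters for prefetching.
        if index not in self._seen:
            self._seen.add(index)
            self.trace.append(index)


def replay(trace: List[int], fetch: Callable[[int], bytes]) -> Dict[int, bytes]:
    """Prefetch every block in a recorded trace, in the recorded order.

    `fetch` is a hypothetical callable that returns the contents of one block.
    """
    return {index: fetch(index) for index in trace}
```

In this sketch the trace is a plain list of block indexes, so it could be stored alongside the image and reused by every host that starts the same container.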
Architecture Overview
The DADI architecture consists of four main components:
1. containerd Snapshotter
containerd 1.4 introduced support for remote snapshotters. The overlaybd‑snapshotter implements the containerd snapshotter interface, allowing containers to mount overlaybd images while remaining compatible with standard OCI tar images.
2. iSCSI Target
OverlayBD uses iSCSI as a reliable remote block device protocol. Two target implementations are provided: one based on the open‑source tgt project and another using the Linux kernel LIO (TCMU) target, both exposing virtual block devices to the container runtime.
3. ZFile
ZFile is a block‑oriented compressed format that splits files into fixed‑size blocks, compresses each independently, and stores a jump table for random access. OverlayBD can export layer files as ZFiles, enabling fast on‑demand decompression.
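The block‑plus‑jump‑table scheme described above can be sketched in a few lines. This is not the real ZFile layout: `compress_blocks`, `read_range`, and the 4096‑byte `BLOCK_SIZE` are hypothetical, and zlib stands in for whatever codec ZFile actually uses. The point is that a read at an arbitrary offset only decompresses the blocks overlapping the requested range.

```python
import zlib
from typing import List, Tuple

BLOCK_SIZE = 4096  # hypothetical; uncompressed bytes per block


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> Tuple[bytes, List[int]]:
    """Compress each fixed-size block independently.

    Returns the concatenated compressed blocks and a jump table whose entry i
    is the byte offset where compressed block i starts. A final extra entry
    marks the end of the last block.
    """
    body = bytearray()
    jump_table = [0]
    for start in range(0, len(data), block_size):
        body += zlib.compress(data[start:start + block_size])
        jump_table.append(len(body))
    return bytes(body), jump_table


def read_range(body: bytes, jump_table: List[int], offset: int, length: int,
               block_size: int = BLOCK_SIZE) -> bytes:
    """Read `length` uncompressed bytes at `offset`, decompressing only the
    blocks that overlap the requested range."""
    if length <= 0:
        return b""
    first = offset // block_size
    last = min((offset + length - 1) // block_size, len(jump_table) - 2)
    out = bytearray()
    for i in range(first, last + 1):
        out += zlib.decompress(body[jump_table[i]:jump_table[i + 1]])
    skip = offset - first * block_size
    return bytes(out[skip:skip + length])
```

A read that spans a block boundary, such as `read_range(body, table, 4000, 200)`, touches two compressed blocks and leaves the rest of the file untouched.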
4. Cache Layer
After a container starts, a background cache downloads layer files via HTTP partial content requests and stores them locally. Subsequent reads hit the cache, eliminating further registry traffic.
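A minimal sketch of the caching idea, assuming a block‑granular cache that is filled on first access (the source describes a background download, which this sketch does not model). `range_header`, `BlockCache`, and the `fetch` callable are hypothetical names, and an in‑memory dictionary stands in for local storage. The only protocol detail relied on is the standard HTTP `Range: bytes=start-end` form, where the end offset is inclusive.

```python
from typing import Callable, Dict

BLOCK_SIZE = 4096  # hypothetical cache granularity


def range_header(index: int, block_size: int = BLOCK_SIZE) -> str:
    """Build the HTTP Range header value for one block (end is inclusive)."""
    start = index * block_size
    return f"bytes={start}-{start + block_size - 1}"


class BlockCache:
    """Serves blocks locally, fetching each one remotely at most once."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        # Hypothetical callable: takes a Range header value, returns body bytes.
        self._fetch = fetch
        self._blocks: Dict[int, bytes] = {}
        self.remote_requests = 0

    def read(self, index: int) -> bytes:
        if index not in self._blocks:
            self.remote_requests += 1
            self._blocks[index] = self._fetch(range_header(index))
        return self._blocks[index]
```

In practice `fetch` would wrap an HTTP client that sends the header to the registry and expects a 206 Partial Content response; it is left abstract here so the sketch stays self‑contained.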
Industry Impact
Forrester’s 2021 Q1 FaaS report ranked Alibaba Cloud first in product capability, comparable to AWS. DADI’s overlaybd contributed to a 50‑80% reduction in container startup time for Alibaba Cloud Function Compute, improving serverless performance.
Future Directions
The project plans to:
Adopt the OCI Artifacts Manifest to describe remote image data with additional descriptors.
Expose support for multiple file systems beyond the default ext4.
Complete the BuildKit snapshotter integration for end‑to‑end image building.
Implement I/O recording and replay to further halve cold‑start latency.
Community contributions are welcome to help standardize overlaybd as an OCI remote image format.
Alibaba Cloud Native
