How Baidu’s Aries Cloud Storage Leverages Tape Libraries for Massive Cold Data Archiving
This article explains Baidu Intelligent Cloud’s tape-library-based cold-data storage architecture, covering tape media basics, the Aries cloud storage system and its modular design, data flow, write and retrieval processes, and a real-world deployment case that demonstrates cost-effective petabyte-scale archival.
1. Tape and Tape Library Overview
In the era of massive data growth, enterprises rely on tape libraries to store historical data cost‑effectively. Baidu Intelligent Cloud has built a cold‑data storage solution based on tape libraries.
1.1 Tape Media
Two images illustrate the evolution of tape media: a Chinese music cassette from 2008 (now obsolete) and an enterprise LTO-9 cartridge from 2021, showing that tape technology remains very much alive in the enterprise market.
1.2 Enterprise Tape Library
Typical front and side views of a tape library show the cabinet doors, drive slots, cartridge bays, and the robotic arm that shuttles tapes between storage slots and drives, highlighting how different a tape deployment is from disk-based servers.
1.3 Main Characteristics of Tape and Tape Libraries
Key tape traits include:
Sequential read/write medium with high per-drive throughput (360-400 MB/s) but high latency for random access.
High reliability and low bit‑error rate, with real‑time verification during writes.
High portability; tapes are offline media and easy to transport.
Low cost per terabyte (e.g., LTO‑9 18 TB tape).
Requires specialized hardware and software stacks.
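A quick back-of-envelope check of the figures above, using nominal uncompressed LTO-9 values (real-world rates vary with workload):

```python
# Back-of-envelope math for the LTO-9 figures quoted above
# (native, uncompressed values; actual throughput varies with workload).

TAPE_CAPACITY_TB = 18        # LTO-9 native capacity
DRIVE_SPEED_MBPS = 400       # LTO-9 native throughput, MB/s

# Time for one drive to fill one cartridge end to end.
fill_seconds = TAPE_CAPACITY_TB * 1_000_000 / DRIVE_SPEED_MBPS
print(f"Hours to fill one tape: {fill_seconds / 3600:.1f}")  # ~12.5 h

# Cartridges needed per petabyte of logical data (single copy).
print(f"Tapes per PB: {1_000 / TAPE_CAPACITY_TB:.0f}")       # ~56
```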
Key tape‑library traits include:
Large capacity (tens to hundreds of petabytes per library).
Relatively low aggregate bandwidth due to limited drive count.
Lower total cost of ownership compared with high‑density disk arrays.
Low power consumption when idle.
Need to integrate vendor‑specific management software.
1.4 Tape Library Software Stack
The stack comprises LTFS (the open-source Linear Tape File System) with commercial extensions, proprietary on-tape formats, distributed file systems (e.g., IBM GPFS, Quantum StorNext), and library management software.
1.5 Application Scenarios
Backup (logs, databases).
Archival of historical data.
Cold data storage for large, infrequently accessed datasets.
Innovative uses such as low‑cost recycle bins.
2. Aries Cloud Storage System Overview
Aries (A Reliable and Integrated Exabytes Storage) serves as the unified data‑plane storage foundation for Baidu’s "Canghai" storage suite.
2.1 Baidu Canghai Storage Stack
The stack consists of a foundation layer (Aries and TafDB) and a product layer (various cloud storage products). Aries provides three data models: Slice, Volume, and Stream.
2.2 Aries Architecture
Aries is divided into four subsystems: resource management, user access, tape-library storage, and repair/validation. Modules include Master, DataNode, DataAgent, VolumeService, TapeService, TapeNode, and others, enabling a microservice design at massive scale (over 10 EB of total data, with a single cluster holding up to 1.7 EB).
2.3 Data & Access Models
Slices are ≤16 MB immutable units; Volumes aggregate slices without order; Streams maintain ordered slices. Access patterns differ: slices are written once, volumes support parallel writes and point reads, streams allow a single writer with ordered writes.
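A minimal sketch of the three models as described here; the class and method names are illustrative, not Aries’ actual interfaces:

```python
from dataclasses import dataclass, field
from typing import Dict, List

MAX_SLICE_BYTES = 16 * 1024 * 1024  # slices are immutable and at most 16 MB

@dataclass(frozen=True)
class Slice:
    """Write-once, immutable unit of data."""
    slice_id: str
    data: bytes

    def __post_init__(self):
        if len(self.data) > MAX_SLICE_BYTES:
            raise ValueError("slice exceeds the 16 MB limit")

@dataclass
class Volume:
    """Unordered collection of slices; parallel writes, point reads."""
    slices: Dict[str, Slice] = field(default_factory=dict)

    def put(self, s: Slice) -> None:        # insertion order is not preserved
        self.slices[s.slice_id] = s

    def get(self, slice_id: str) -> Slice:  # point read by id
        return self.slices[slice_id]

@dataclass
class Stream:
    """Ordered sequence of slices; a single writer appends in order."""
    slices: List[Slice] = field(default_factory=list)

    def append(self, s: Slice) -> None:
        self.slices.append(s)
```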
2.4 Aries Concepts
Key concepts include Table Space (a group of volumes), Volume & Volumelet (logical and physical containers), and Slice & Shard (a slice’s erasure-coded fragments). An example 2+1 EC scheme illustrates the relationship between slices, shards, volumelets, and table spaces.
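For intuition, the simplest possible 2+1 code is XOR parity: two data shards plus one parity shard, where any two of the three suffice to reconstruct the slice. The toy sketch below illustrates the idea (the article does not specify Aries’ actual code); in Aries, each of the three shards would land in a different volumelet:

```python
def ec_2_plus_1(slice_data: bytes):
    """Split a slice into 2 data shards + 1 XOR parity shard.
    Any 2 of the 3 shards can reconstruct the original slice."""
    if len(slice_data) % 2:
        slice_data += b"\x00"              # pad to an even length
    half = len(slice_data) // 2
    d0, d1 = slice_data[:half], slice_data[half:]
    parity = bytes(a ^ b for a, b in zip(d0, d1))
    return d0, d1, parity                  # one shard per volumelet

def recover_d0(d1: bytes, parity: bytes) -> bytes:
    """Rebuild a lost data shard from the survivor and the parity."""
    return bytes(a ^ b for a, b in zip(d1, parity))

d0, d1, p = ec_2_plus_1(b"hello tape world")
assert recover_d0(d1, p) == d0             # survives the loss of one shard
```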
3. Aries Tape‑Library Architecture
3.1 Data Flow
Business data is first written to Aries’ disk pool; a dump scheduler then moves it to the tape library. Retrieval follows the reverse path, with a cache layer decoupling business reads from direct tape access.
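Conceptually, reads are cache-first so that business traffic never touches tape directly; the names below are illustrative assumptions, not real Aries interfaces:

```python
def read_slice(slice_id: str, cache, disk_pool, tape_scheduler):
    """Conceptual read path: business reads never hit tape directly."""
    data = cache.get(slice_id)       # 1) hot or recently recalled data
    if data is not None:
        return data
    data = disk_pool.get(slice_id)   # 2) written but not yet dumped to tape
    if data is not None:
        return data
    # 3) cold path: schedule an asynchronous tape recall; the recalled
    #    data is staged into the cache layer and served from there.
    return tape_scheduler.recall(slice_id)
```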
3.2 Design Principles
Physically grouped writes (related data placed together on tape) to improve retrieval efficiency.
Decoupling of user writes from tape dump via disk pool.
Location‑aware retrieval scheduling.
Reuse of existing tape‑library software capabilities.
3.3 Architecture Diagram
Six modules interact: DataAgent (API entry), StateService (volume APIs), TapeService (retrieval), TapeNode (task execution), Master (metadata), and DataNode (disk pool).
3.4 Aggregated Write Process
Volumes (8-16 GB) are allocated, filled with related slices, sealed, and handed to the dump scheduler, which converts the EC-encoded volumes into linear files on GPFS/StorNext and migrates them to tape via LTFS-EE.
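A simplified sketch of the aggregation step; the seal threshold comes from the text, while the names and single-buffer design are simplifying assumptions:

```python
VOLUME_SEAL_BYTES = 8 * 2**30  # volumes are sealed in the 8-16 GB range

class VolumeWriter:
    """Fill a volume with related slices, then seal it for the dump scheduler."""

    def __init__(self, dump_scheduler):
        self.dump_scheduler = dump_scheduler
        self.current: list[bytes] = []   # slices in the currently open volume
        self.size = 0

    def write(self, slice_data: bytes) -> None:
        self.current.append(slice_data)
        self.size += len(slice_data)
        if self.size >= VOLUME_SEAL_BYTES:
            self.seal()

    def seal(self) -> None:
        # A sealed volume is immutable; the dump scheduler later turns it
        # into a linear file on GPFS/StorNext and migrates it to tape.
        self.dump_scheduler.enqueue(self.current)
        self.current, self.size = [], 0
```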
3.5 Dump Process
Step 1 reads sealed EC volumes, creates a linear file on the tape‑library file system, and appends all slices. Step 2 invokes LTFS‑EE to migrate the file to tape, storing two replicas on separate tapes. Metadata (size, path, tape IDs) is recorded in Master.
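A condensed sketch of the two steps. The `ltfsee migrate` call is only a placeholder modeled on IBM Spectrum Archive EE tooling; the actual command, flags, pool names, and metadata schema are deployment-specific and not given in the article:

```python
import subprocess
from pathlib import Path

def dump_volume(volume_id: str, slices, fs_root: Path, master) -> None:
    # Step 1: serialize the sealed EC volume into a single linear file
    # on the tape-library file system (GPFS/StorNext).
    linear_path = fs_root / f"{volume_id}.vol"
    with open(linear_path, "wb") as f:
        for s in slices:                 # append every slice back to back
            f.write(s)

    # Step 2: ask the HSM layer to migrate the file to tape. Placeholder
    # command; two pools => two replicas on separate cartridges.
    subprocess.run(
        ["ltfsee", "migrate", str(linear_path), "-p", "poolA", "poolB"],
        check=True,
    )

    # Record size, path, and tape IDs in Master so retrieval can find the
    # file later (tape IDs would come from the migrate result; placeholders here).
    master.record(volume_id, size=linear_path.stat().st_size,
                  path=str(linear_path), tapes=["TAPE01", "TAPE02"])
```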
3.6 Retrieval Process
The nine‑step retrieval mirrors the dump flow: DataAgent submits a request, TapeService persists and schedules it, TapeNode recalls the file via LTFS‑EE, extracts slices, writes them back to the EC space, reports status, and finally returns data to the business.
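The same flow, compressed into illustrative Python (all interfaces are assumptions; the comments map roughly onto the steps above):

```python
def retrieve_volume(volume_id: str, tape_service, tape_node, master, ec_space):
    # 1-2) DataAgent submits the request; TapeService persists and schedules it.
    task = tape_service.persist_and_schedule(volume_id)

    # 3-5) TapeNode looks up the linear file's location in Master and
    #      recalls it from tape via LTFS-EE (mount + sequential read).
    meta = master.lookup(volume_id)
    linear_file = tape_node.recall(meta.path)      # blocking tape recall

    # 6-7) Slices are extracted from the linear file and written back
    #      into the disk-pool EC space.
    for slice_id, data in tape_node.extract_slices(linear_file):
        ec_space.write(slice_id, data)

    # 8-9) Status is reported; the business reads the data from disk,
    #      never directly from tape.
    tape_service.mark_done(task)
    return ec_space.read_volume(volume_id)
```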
4. Business Practice Case
4.1 Scenario & Requirements
Hundreds of petabytes of cold data with a low probability of deletion, requiring cost-effective, reliable, and recoverable storage with 2-replica redundancy.
4.2 Tape Library Deployment
Two parallel LTO-8 libraries (102 PB each) with 44 drives per library, organized into four logical pools (A-D) plus a test pool, exposing logical resource pools to Aries.
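Assuming LTO-8’s nominal 12 TB native capacity and 360 MB/s drive speed (the article gives only the totals), each library’s headline numbers work out roughly as follows:

```python
# Rough implications of one library's specs (nominal LTO-8 values assumed).
LTO8_TB = 12            # native capacity per LTO-8 cartridge
LTO8_MBPS = 360         # nominal LTO-8 native throughput
LIBRARY_PB = 102
DRIVES = 44

print(f"Cartridges per library: {LIBRARY_PB * 1000 / LTO8_TB:.0f}")        # ~8500
print(f"Aggregate drive bandwidth: {DRIVES * LTO8_MBPS / 1000:.1f} GB/s")  # ~15.8
```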
4.3 Head‑Server Hardware & Software
Each head server runs LTFS-EE, GPFS, and TapeNode. Hardware includes dual HBA cards, dual NICs, dual 1.6 TB Optane drives in RAID 1, and system disks.
4.4 Write Performance
From October 2022 to February 2023 the system sustained an ingest rate of roughly 0.8-0.9 PB per day until the libraries were full.
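That rate is plausible against the hardware: with 2-replica redundancy, 0.8-0.9 PB of logical data per day implies roughly 1.6-1.8 PB of physical tape writes, comfortably under the two libraries’ theoretical ceiling. A rough sanity check, again assuming LTO-8’s nominal 360 MB/s:

```python
DRIVES_TOTAL = 2 * 44   # two libraries, 44 drives each
LTO8_MBPS = 360         # nominal native drive speed (assumed)
REPLICAS = 2

ceiling_pb = DRIVES_TOTAL * LTO8_MBPS * 86_400 / 1e9   # MB/day -> PB/day
physical_pb = 0.9 * REPLICAS                           # peak logical x replicas

print(f"Theoretical ceiling: {ceiling_pb:.1f} PB/day")          # ~2.7
print(f"Peak physical writes: {physical_pb:.1f} PB/day "
      f"(~{physical_pb / ceiling_pb:.0%} of ceiling)")          # ~66%
```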
4.5 Retrieval Performance
A benchmark retrieving 124 volumes (~1 TB in total) completed the fastest retrieval in 3 minutes and the slowest in 24 minutes, with an average of 14 minutes, countering the common assumption that tape retrieval takes many hours.
Baidu Intelligent Cloud Tech Hub
We share the cloud tech topics you care about. Feel free to leave a message and tell us what you'd like to learn.