How Alibaba Cloud Achieved Ultra‑Low Latency with High‑Performance Local Storage
This article describes the high‑performance local storage that Alibaba Cloud built for the Alipay Red Packet service. It covers the business requirements, the limits of the existing block storage products, the redesigned architecture, the key components (Virtio‑blk, SPDK, and NVMe SSD), and benchmarks showing lower latency and higher IOPS.
Red Packet Business Characteristics
Alipay's Red Packet service peaked at 900,000 requests per second on New Year’s Eve. To keep up, the database had to process millions of transactions at sub‑100 µs latency, sustain more than 200,000 IOPS, and provide built‑in disaster recovery.
Existing Block Storage Products
Alibaba Cloud offers SSD cloud disks, ultra cloud disks, and basic cloud disks, all with 99.9999999% data reliability. However, they top out at 20,000 IOPS with 500 µs latency, well short of the Red Packet requirements.
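To see why the gap matters, the figures above can be plugged into Little's law (requests in flight = throughput × latency). The sketch below is only a back-of-the-envelope estimate: it treats the quoted latencies as average per-request service times under steady load, which the source does not state, and the function names are made up for illustration.

```python
def required_queue_depth(target_iops: float, latency_us: float) -> float:
    """Little's law: I/Os that must be in flight to sustain target_iops."""
    return target_iops * latency_us / 1_000_000


def max_iops(queue_depth: int, latency_us: float) -> float:
    """Upper bound on IOPS when at most queue_depth I/Os are in flight."""
    return queue_depth * 1_000_000 / latency_us


if __name__ == "__main__":
    # Figures taken from the text: 200,000 IOPS target,
    # 500 us cloud-disk latency, sub-100 us latency goal.
    print(required_queue_depth(200_000, 500))  # in-flight I/Os needed at 500 us
    print(required_queue_depth(200_000, 100))  # in-flight I/Os needed at 100 us
    print(max_iops(1, 500))                    # one synchronous writer at 500 us
```

Under these assumptions, a single synchronous writer at 500 µs is limited to 2,000 IOPS, and reaching 200,000 IOPS would take 100 concurrent requests; at 100 µs the same target needs only 20 in flight.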
High‑Performance Local Storage
Designed to meet high‑performance database demands, this storage targets ultra‑high IOPS and ultra‑low latency.
Typical Cloud‑Local Storage Architecture
In a standard cloud‑local setup, a MySQL request passes through seven modules from the database to the hardware and back, causing high latency and poor performance.
High‑Performance Storage Architecture
Removing layers shortens the request path by 2‑3 modules, which improves performance.
Key Components
The standard file system remains unchanged; optimization focuses on the block device layer, driver, and hardware, using Virtio‑blk, SPDK, and NVMe SSD.
Virtio‑blk
Virtio‑blk provides a paravirtualized block device interface, enabling high‑speed data exchange between the virtual machine and the physical host while remaining transparent to the database.
SPDK
The Storage Performance Development Kit (SPDK) provides a user‑space NVMe driver with a lock‑free design. It polls for completions instead of waiting for interrupts, which removes interrupt overhead and delivers high throughput.
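The polling idea can be illustrated with a toy model. The sketch below is not SPDK's API (SPDK is a C library); `CompletionRing`, `post`, and `poll` are hypothetical names. It models a single-producer, single-consumer completion ring: the device side only advances `tail`, the application side only advances `head`, so in this model neither side needs a lock, and the application finds finished I/Os by checking the ring rather than being woken by an interrupt.

```python
class CompletionRing:
    """Toy single-producer/single-consumer completion ring (illustration only)."""

    def __init__(self, size: int):
        self.slots = [None] * size
        self.size = size
        self.head = 0  # next entry the consumer (application) will read
        self.tail = 0  # next entry the producer (device) will write

    def post(self, completion) -> bool:
        """Producer side: record a finished I/O. Returns False if the ring is full."""
        if self.tail - self.head == self.size:
            return False
        self.slots[self.tail % self.size] = completion
        self.tail += 1
        return True

    def poll(self, max_completions: int) -> list:
        """Consumer side: drain up to max_completions entries without blocking."""
        done = []
        while self.head != self.tail and len(done) < max_completions:
            done.append(self.slots[self.head % self.size])
            self.head += 1
        return done


if __name__ == "__main__":
    ring = CompletionRing(4)
    for io_id in ("read-1", "read-2", "write-3"):
        ring.post(io_id)
    print(ring.poll(2))  # first two completions, in submission order
    print(ring.poll(2))  # the remaining completion
    print(ring.poll(2))  # nothing pending: returns immediately with an empty list
```

The trade-off of polling is that the consumer spends CPU cycles checking the ring even when it is empty, in exchange for avoiding interrupt and context-switch cost on every completion.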
NVMe SSD
NVMe SSDs attach over PCIe and use the NVMe protocol, providing higher bandwidth and lower latency than traditional SCSI‑based storage.
Data Path
The optimized path keeps the core database, POSIX API, and standard file system unchanged; Virtio‑blk and SPDK drivers interact directly with NVMe SSDs, shortening the data path and boosting I/O performance.
Latency Distribution
Fio tests on CentOS 7 show the high‑performance local disk achieving ~70 µs read and ~30 µs write latency, compared to ~130 µs read and ~60 µs write for generic virtualized disks.
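The source does not list the fio parameters behind these numbers. As a starting point for a comparable latency test, the sketch below assembles one plausible fio command line; the 4 KiB block size, queue depth of 1, `libaio` engine, 60-second runtime, and the `/dev/vdb` device path are all assumptions, not the authors' settings. The helper only builds the argument list and does not run fio.

```python
from typing import List


def build_fio_cmd(device: str, rw: str, runtime_s: int = 60) -> List[str]:
    """Build a fio argument list for a single-job random-I/O latency test.

    All parameter values are assumed for illustration; the article does not
    state the settings used in its benchmark.
    """
    if rw not in ("randread", "randwrite"):
        raise ValueError("rw must be 'randread' or 'randwrite'")
    return [
        "fio",
        f"--name=latency-{rw}",
        f"--filename={device}",
        f"--rw={rw}",
        "--bs=4k",            # small blocks, typical for latency tests
        "--ioengine=libaio",  # Linux native asynchronous I/O
        "--direct=1",         # bypass the page cache
        "--iodepth=1",        # one I/O in flight, so latency is not queueing delay
        "--numjobs=1",
        f"--runtime={runtime_s}",
        "--time_based",
        "--output-format=json",
    ]


if __name__ == "__main__":
    # /dev/vdb is a hypothetical virtio-blk device name.
    print(" ".join(build_fio_cmd("/dev/vdb", "randread")))
```

Note that running the `randwrite` variant against a raw device overwrites its contents, so it should only be pointed at a scratch disk.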
Database Performance Comparison
New database instances on the high‑performance storage reach 26,969 TPS with 1.7 ms response time, versus 14,242 TPS and 8.21 ms on the older setup.
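The relative improvement implied by these figures is simple arithmetic. The snippet below uses only the TPS and response-time numbers quoted above; the function names are illustrative.

```python
def throughput_gain(old_tps: float, new_tps: float) -> float:
    """How many times higher the new throughput is."""
    return new_tps / old_tps


def response_time_reduction(old_ms: float, new_ms: float) -> float:
    """Fraction by which the response time dropped."""
    return 1 - new_ms / old_ms


if __name__ == "__main__":
    gain = throughput_gain(14_242, 26_969)
    cut = response_time_reduction(8.21, 1.7)
    print(f"TPS gain: {gain:.2f}x")
    print(f"Response time reduction: {cut:.1%}")
```

With the quoted numbers this works out to roughly 1.89 times the throughput and a response time about 79% lower.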
Public Cloud Release
In February, Alibaba Cloud launched the high‑performance local storage publicly, using NVMe SSD + SPDK, becoming the world’s first provider of such a local disk.
Local Disk 2.0 Specs
Local Disk 2.0 offers 3 TB capacity, 500,000 IOPS, 50 µs latency, and 4 GB/s bandwidth. A single disk delivers up to 24,000 read IOPS, 2 GB/s read bandwidth, and 1.2 GB/s write bandwidth.
High‑IO Local Storage Instance
The instance is built on Intel Xeon E5‑2682 v4 CPUs and DDR4 memory. NVMe SSDs combined with SPDK deliver tens of thousands of random IOPS at microsecond‑level latency, and network performance scales with the instance's compute size.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.