How UCloud’s Low‑Cost Archival Storage Cuts Power Use with Smart Disk Management
This article explains how UCloud’s archival storage leverages inexpensive SMR disks, JBOD hardware, erasure‑coding blobs, and a power‑aware scheduling strategy to store massive cold data cheaply while still providing flexible, near‑real‑time retrieval when needed.
Cloud computing’s massive distributed capabilities make it ideal for handling huge data volumes, but most stored data is "cold" and accessed infrequently. Archival storage moves such data to low‑cost, long‑term media, accepting longer retrieval times.
Traditional tape or optical libraries stored data in rarely accessed corners of data centers. Modern demands for flexible, near‑real‑time retrieval have driven new solutions.
Hardware Architecture
UCloud’s system uses two head nodes connected to multiple JBOD enclosures, each holding more than a hundred disks. A JBOD (Just a Bunch Of Disks) exposes every disk individually, with no RAID logic in between, so each disk can be addressed directly.
The chosen disks are primarily HM‑SMR (Host‑Managed SMR), with optional CMR compatibility. SMR zones must be written sequentially, so the disks are cheap per terabyte but unsuitable for random writes. That trade‑off suits large, rarely changed data.
Power consumption is a major cost. The design keeps most disks powered down and powers up only a small subset for I/O, which reduces electricity usage. Power cycling in turn has to stay within each disk’s budget of 50,000 power cycles over its 5‑year warranty period.
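As a rough check on that budget, the sketch below converts the 50,000‑cycle figure into an average daily allowance. Spreading the cycles evenly over 5 × 365 days is an assumption made for illustration, not a figure from UCloud.

```python
# Back-of-the-envelope power-cycle budget (illustrative assumptions only).
RATED_POWER_CYCLES = 50_000   # figure quoted in the article
WARRANTY_YEARS = 5            # figure quoted in the article
DAYS_PER_YEAR = 365           # assumption: leap days ignored


def cycles_per_day(rated_cycles: int, years: int) -> float:
    """Average number of power cycles a disk may spend per day."""
    return rated_cycles / (years * DAYS_PER_YEAR)


budget = cycles_per_day(RATED_POWER_CYCLES, WARRANTY_YEARS)
print(f"{budget:.1f} power cycles per disk per day")  # prints 27.4 ...
```

A schedule that powers a disk up at most once per hour uses at most 24 cycles per day, which sits inside this average; any additional on‑demand power‑ups draw on the small remainder.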
Software Architecture
To save cost, UCloud employs erasure‑coding (EC) rather than replication. Data is split into EC fragments and stored across logical disk groups. A "Blob" groups EC fragments and SMR zones, enabling efficient deletion or compression at the Blob level.
Each disk group contains a fixed number of disks that matches the EC scheme (e.g., 26 disks for a 23+3 scheme: 23 data fragments plus 3 parity fragments). Writes stay on the current group until it reaches capacity, then the system rotates to the next group, so that groups which are no longer being written can power down when idle.
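The rotation rule and the cost of a 23+3 scheme can be sketched as follows. This is a minimal illustration, not UCloud’s implementation: `DiskGroup`, `GroupRotator`, and the byte‑level capacities are hypothetical names and values.

```python
from __future__ import annotations

from dataclasses import dataclass

DATA_FRAGMENTS = 23     # from the article's 23+3 example
PARITY_FRAGMENTS = 3


def storage_overhead(data: int, parity: int) -> float:
    """Raw bytes stored per byte of user data under a data+parity EC scheme."""
    return (data + parity) / data


@dataclass
class DiskGroup:
    group_id: int
    capacity_bytes: int          # hypothetical usable capacity of the group
    used_bytes: int = 0

    def has_room(self, size: int) -> bool:
        return self.used_bytes + size <= self.capacity_bytes


class GroupRotator:
    """Keeps writes on one group until it is full, then moves to the next."""

    def __init__(self, groups: list[DiskGroup]) -> None:
        self.groups = groups
        self.current = 0

    def write(self, size: int) -> int:
        """Record a write and return the id of the group that took it."""
        while not self.groups[self.current].has_room(size):
            if self.current + 1 >= len(self.groups):
                raise RuntimeError("all disk groups are full")
            self.current += 1    # the previous group is now idle for writes
        group = self.groups[self.current]
        group.used_bytes += size
        return group.group_id


rotator = GroupRotator([DiskGroup(0, 100), DiskGroup(1, 100)])
placements = [rotator.write(60), rotator.write(60), rotator.write(30)]
print(placements)                                   # [0, 1, 1]
print(round(storage_overhead(DATA_FRAGMENTS, PARITY_FRAGMENTS), 3))  # 1.13
```

With 23+3 the system stores about 1.13 raw bytes per user byte, compared with 3 raw bytes per user byte for three‑way replication.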
Metadata is stored in the roughly 1 % of each disk’s zones that are conventional (CMR) zones and therefore support random writes. Two kinds of records live there: Disk Meta (disk ID, JBOD location, zone layout) and Zone Meta (zone index, usage flags).
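One way to picture the two records is as small serializable structures. The field names, the JSON encoding, and the `to_bytes`/`from_bytes` helpers below are assumptions for illustration; the article only lists which kinds of information each record holds.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class ZoneMeta:
    zone_index: int
    in_use: bool = False     # hypothetical usage flag
    full: bool = False       # hypothetical usage flag


@dataclass
class DiskMeta:
    disk_id: str
    jbod_id: int             # which JBOD enclosure the disk sits in
    slot: int                # position inside that enclosure
    zones: list[ZoneMeta] = field(default_factory=list)

    def to_bytes(self) -> bytes:
        """Serialize for storage in a CMR zone (JSON chosen for readability)."""
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def from_bytes(cls, raw: bytes) -> "DiskMeta":
        doc = json.loads(raw.decode("utf-8"))
        zones = [ZoneMeta(**z) for z in doc.pop("zones")]
        return cls(zones=zones, **doc)


meta = DiskMeta("disk-0001", jbod_id=2, slot=17,
                zones=[ZoneMeta(0, in_use=True, full=True), ZoneMeta(1, in_use=True)])
restored = DiskMeta.from_bytes(meta.to_bytes())
print(restored == meta)  # True
```

Because these records change in place as zones fill up, they need a region that tolerates overwrites, which the sequential‑only SMR zones do not.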
Power‑Scheduling Strategy
Only the disk group handling current writes remains powered on. When a group reaches a write threshold, it powers down and the next group powers up. For reads, non‑urgent requests trigger a read‑mark; groups with marks are powered on during hourly scans. Urgent reads cause immediate power‑up, with a one‑hour grace period where the group stays on.
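The rules above can be sketched as a small state machine. This is a simplified model under stated assumptions: the class and method names are hypothetical, write groups wrap around with a modulo for brevity, and a group powered on by a scan is assumed to stay on until the next scan.

```python
from __future__ import annotations

from dataclasses import dataclass

GRACE_PERIOD_S = 3600    # one-hour grace period, from the article


@dataclass
class GroupPower:
    powered_on: bool = False
    read_marked: bool = False
    keep_on_until: float = 0.0   # end of the grace period, in seconds


class PowerScheduler:
    def __init__(self, group_count: int) -> None:
        self.groups = [GroupPower() for _ in range(group_count)]
        self.write_group = 0
        self.groups[0].powered_on = True     # only the write group starts on

    def _power_down_if_idle(self, idx: int, now: float) -> None:
        g = self.groups[idx]
        if idx != self.write_group and not g.read_marked and now >= g.keep_on_until:
            g.powered_on = False

    def rotate_write_group(self, now: float) -> None:
        """Called when the current write group reaches its write threshold."""
        old = self.write_group
        self.write_group = (old + 1) % len(self.groups)  # wrap-around: simplification
        self.groups[self.write_group].powered_on = True
        self._power_down_if_idle(old, now)

    def request_read(self, idx: int, now: float, urgent: bool) -> None:
        g = self.groups[idx]
        if urgent:
            g.powered_on = True                      # immediate power-up
            g.keep_on_until = now + GRACE_PERIOD_S
        else:
            g.read_marked = True                     # wait for the next scan

    def hourly_scan(self, now: float) -> list[int]:
        """Power down idle groups, power up marked ones; return groups to read."""
        for idx in range(len(self.groups)):
            self._power_down_if_idle(idx, now)
        served = []
        for idx, g in enumerate(self.groups):
            if g.read_marked:
                g.powered_on = True
                g.read_marked = False
                served.append(idx)
        return served


sched = PowerScheduler(3)
sched.request_read(2, now=0, urgent=False)
sched.request_read(1, now=0, urgent=True)
print(sched.hourly_scan(now=1800))                 # [2]
print([g.powered_on for g in sched.groups])        # [True, True, True]
sched.hourly_scan(now=5400)
print([g.powered_on for g in sched.groups])        # [True, False, False]
```

Batching non‑urgent reads into the scan bounds how often a group can be woken, while the grace period avoids cycling a group repeatedly when urgent reads arrive in bursts.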
I/O Flow
Incoming I/O is split into EC fragments and dispatched to the appropriate disks. Writes that fill a Blob cause a zone switch and start a new Blob. The system returns the Blob ID and offset for upper‑layer metadata.
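A minimal sketch of this flow follows, with several assumptions: `BlobWriter` and `split_fragments` are hypothetical names, a write that does not fit is placed whole into a new Blob, fragments are plain equal‑size slices, and parity computation (for example Reed‑Solomon) is left out.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def split_fragments(payload: bytes, k: int) -> list[bytes]:
    """Split payload into k equal-size data fragments, zero-padding the tail."""
    size = -(-len(payload) // k)                 # ceiling division
    padded = payload.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


@dataclass
class Blob:
    blob_id: int
    capacity: int
    data: bytearray = field(default_factory=bytearray)


class BlobWriter:
    def __init__(self, blob_capacity: int) -> None:
        self.blob_capacity = blob_capacity
        self.blobs = [Blob(0, blob_capacity)]

    def append(self, payload: bytes) -> tuple[int, int]:
        """Append payload and return (blob_id, offset) for the upper layer."""
        if len(payload) > self.blob_capacity:
            raise ValueError("payload larger than a Blob")
        blob = self.blobs[-1]
        if len(blob.data) + len(payload) > blob.capacity:
            # Current Blob is full: switch zones and start a new Blob.
            blob = Blob(blob.blob_id + 1, self.blob_capacity)
            self.blobs.append(blob)
        offset = len(blob.data)
        blob.data += payload
        return blob.blob_id, offset


writer = BlobWriter(blob_capacity=10)
print(writer.append(b"abcdef"))            # (0, 0)
print(writer.append(b"ghi"))               # (0, 6)
print(writer.append(b"jk"))                # (1, 0)
print(split_fragments(b"abcdefg", 3))      # [b'abc', b'def', b'g\x00\x00']
```

Returning only a Blob ID and an offset keeps the storage layer free of object names; the upper layer owns the mapping from user keys to locations.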
Data Persistence
Data is written in 4 KB sectors. Each sector ends with a Sector Meta record containing the zone ID and a CRC, so that silent data corruption on disk can be detected when the sector is read back.
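The sector layout can be sketched as below. The exact on‑disk format is not given in the article, so the 8‑byte trailer (32‑bit zone ID followed by a CRC‑32, little‑endian) and the function names are assumptions.

```python
import struct
import zlib

SECTOR_SIZE = 4096
META_FORMAT = "<II"                          # hypothetical: zone ID, CRC-32
META_SIZE = struct.calcsize(META_FORMAT)     # 8 bytes
PAYLOAD_SIZE = SECTOR_SIZE - META_SIZE       # 4088 bytes of user data


def pack_sector(payload: bytes, zone_id: int) -> bytes:
    """Build one 4 KB sector: zero-padded payload followed by Sector Meta."""
    if len(payload) > PAYLOAD_SIZE:
        raise ValueError("payload does not fit in one sector")
    body = payload.ljust(PAYLOAD_SIZE, b"\0")
    return body + struct.pack(META_FORMAT, zone_id, zlib.crc32(body))


def unpack_sector(sector: bytes, expected_zone: int) -> bytes:
    """Return the payload area, raising ValueError if the sector is damaged."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("wrong sector size")
    body, meta = sector[:PAYLOAD_SIZE], sector[PAYLOAD_SIZE:]
    zone_id, crc = struct.unpack(META_FORMAT, meta)
    if zone_id != expected_zone:
        raise ValueError("sector belongs to a different zone")
    if zlib.crc32(body) != crc:
        raise ValueError("CRC mismatch: data corrupted")
    return body


sector = pack_sector(b"hello", zone_id=7)
print(len(sector))                                   # 4096
print(unpack_sector(sector, expected_zone=7)[:5])    # b'hello'
```

Storing the zone ID next to the CRC also lets a reader catch a sector that is internally intact but was written to, or read from, the wrong zone.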
Periodic Data Checks
After startup, the service periodically scans fully written Blobs, verifying each sector’s CRC via the stored metadata. Failures generate alerts for operations staff.
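A scan of this kind could look like the sketch below. It assumes each sector ends with a hypothetical 8‑byte trailer (zone ID plus CRC‑32) and represents Blobs as in‑memory lists of sectors; `make_sector` exists only to build demo data, and a real service would read from disk and raise alerts instead of returning a list.

```python
from __future__ import annotations

import struct
import zlib
from typing import Iterable, Iterator

META = struct.Struct("<II")      # hypothetical trailer: zone ID, CRC-32


def make_sector(body: bytes, zone_id: int) -> bytes:
    """Demo helper: append the trailer to a sector body."""
    return body + META.pack(zone_id, zlib.crc32(body))


def sector_ok(sector: bytes) -> bool:
    """Recompute the CRC over the body and compare it with the stored one."""
    if len(sector) < META.size:
        return False
    body, trailer = sector[:-META.size], sector[-META.size:]
    _zone_id, crc = META.unpack(trailer)
    return zlib.crc32(body) == crc


def scan_blob(blob_id: int, sectors: Iterable[bytes]) -> Iterator[tuple[int, int]]:
    """Yield (blob_id, sector_index) for every sector that fails its CRC."""
    for index, sector in enumerate(sectors):
        if not sector_ok(sector):
            yield blob_id, index


def scan_all(full_blobs: dict[int, list[bytes]]) -> list[tuple[int, int]]:
    """Check every fully written Blob and collect failures for alerting."""
    failures: list[tuple[int, int]] = []
    for blob_id, sectors in full_blobs.items():
        failures.extend(scan_blob(blob_id, sectors))
    return failures


good = make_sector(b"a" * 16, zone_id=1)
corrupted = bytearray(good)
corrupted[0] ^= 0xFF             # flip bits to simulate silent corruption
print(scan_all({5: [good, bytes(corrupted), good]}))   # [(5, 1)]
```

Restricting the scan to fully written Blobs means the data being checked no longer changes, so a CRC mismatch points to corruption and not to a write in progress.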
Conclusion
The UCloud archival storage solution delivers high performance and security while dramatically lowering cost, making it suitable for large, infrequently accessed datasets such as backup media, logs, and user archives. Launched in 2019, it has operated stably for years and is expected to further reduce storage expenses as adoption grows.
UCloud Tech
UCloud is a leading neutral cloud provider in China, developing its own IaaS, PaaS, AI service platform, and big data exchange platform, and delivering comprehensive industry solutions for public, private, hybrid, and dedicated clouds.
