Tencent Cloud Block Service: Evolution from CBS1.0 to Two‑Layer CBS3.0
This article traces the evolution of Tencent Cloud Block Service (CBS), detailing the transition from the initial CBS1.0 built on three distributed systems, through the streamlined CBS2.0, to the cost‑effective two‑layer CBS3.0 architecture, and discusses the technical challenges, solutions, and operational outcomes.
What Is Cloud Disk?
Cloud Disk, also called Cloud Block Service (CBS), is a virtual hard drive in the cloud that provides all the capabilities of a physical disk: creating file systems, storing media, reading and writing files, and advanced features such as snapshots that allow users to roll back data to a previous state.
Users can simply purchase a cloud VM on Tencent Cloud and attach a cloud disk to it.
How Was the Cloud Disk System Implemented?
Initially, CBS relied on three existing Tencent distributed systems: TFS for cold data storage, TSSD for hot data storage, and CKV for distributed locking. By integrating these, the CBS backend was created.
CBS1.0 combined the three systems and exposed an iSCSI block storage service at the front end. However, the product suffered from a long I/O chain, heavy operational overhead, and poor availability.
CBS2.0 simplified the integration at the code level, reducing complexity and improving reliability.
CBS2.0 Architecture
The front‑end consists of an access cluster with Client (iSCSI initiator), Proxy (iSCSI target), and Master (cluster controller). The back‑end is a distributed storage cluster with Access, Chunk (data storage), and its own Master.
Operational Status of CBS2.0
The system has been running stably in production for a long time, serving hundreds of thousands of commercial customers, with hundreds of thousands of cloud disks and a storage scale of hundreds of petabytes. Tencent Cloud Disk offers three product specifications (HDD, HDD+SSD hybrid, and SSD) with reliability of eight nines (99.999999%).
Operational Challenges
Two issues stand out: cost, and latency for high‑performance workloads. Each request passes through multiple layers (client, proxy, storage, etc.), so the accumulated network latency becomes comparable to the latency of the SSD operation itself.
Solution: Two‑Layer Architecture (CBS3.0)
To address these problems, the access layer was removed, resulting in a two‑layer design: the Client remains on the host, the Chunk module provides data storage, and the Master node manages the cluster.
The design follows a “do‑less, do‑later, do‑only‑when‑necessary” principle, leading to a lazy routing synchronization algorithm.
CBS3.0 Software Logic Architecture
Driver: corresponds to the Client module on the host.
Chunk: provides a three‑replica storage engine.
Master: the high‑availability control node.
Technical Challenges of the Two‑Layer Design
Key challenges are data organization, data routing, and routing synchronization.
Data Organization
CBS introduces a virtual partition concept. Physical disks are divided into fixed‑size blocks; logically, multiple blocks form a Partition. This abstraction allows flexible management while keeping the physical layout stable.
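The block-to-Partition relationship can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4 MiB block size, the round-robin grouping, and the names `Block`, `Partition`, `carve_blocks`, and `build_partitions` are all hypothetical and do not come from CBS.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size in bytes; the article gives no figure


@dataclass(frozen=True)
class Block:
    disk: str   # hypothetical physical disk name
    index: int  # position of the block on that disk


@dataclass
class Partition:
    partition_id: int
    blocks: list = field(default_factory=list)


def carve_blocks(disk, disk_bytes):
    """Divide one physical disk into fixed-size blocks."""
    return [Block(disk, i) for i in range(disk_bytes // BLOCK_SIZE)]


def build_partitions(blocks, num_partitions):
    """Group blocks into logical Partitions (round-robin, an assumed policy)."""
    partitions = [Partition(p) for p in range(num_partitions)]
    for n, block in enumerate(blocks):
        partitions[n % num_partitions].blocks.append(block)
    return partitions


# Two 64 MiB disks yield 16 blocks each, spread over 4 Partitions.
blocks = carve_blocks("disk-0", 64 * 1024 * 1024) + carve_blocks("disk-1", 64 * 1024 * 1024)
partitions = build_partitions(blocks, 4)
print([len(p.blocks) for p in partitions])
```

Because a Partition is only a logical grouping, it can span several physical disks, and the grouping can change without touching the block layout on any disk.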
Data Routing
When a client accesses data, it sends three parameters: disk‑id, block‑id (or LBA), and snapshot‑id. A consistent hash over these three parameters determines the target Partition, and the mapping from Partitions to physical servers is configured during cluster initialization.
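A minimal sketch of this two-step lookup follows. It assumes a stable SHA-256 hash reduced modulo a fixed Partition count as a stand-in for whatever hash scheme CBS actually uses. The function names, the partition count, and the replica placement rule are hypothetical.

```python
import hashlib

NUM_PARTITIONS = 1024  # assumed; fixed when the cluster is initialized


def build_partition_map(servers, num_partitions=NUM_PARTITIONS, replicas=3):
    """Assign each Partition to `replicas` distinct servers (assumed placement rule)."""
    if len(servers) < replicas:
        raise ValueError("need at least as many servers as replicas")
    return {
        p: [servers[(p + r) % len(servers)] for r in range(replicas)]
        for p in range(num_partitions)
    }


def partition_for(disk_id, block_id, snapshot_id, num_partitions=NUM_PARTITIONS):
    """Hash the three request parameters to a Partition number."""
    key = f"{disk_id}/{block_id}/{snapshot_id}".encode()
    digest = hashlib.sha256(key).digest()
    # The same key always produces the same digest, so every client
    # computes the same Partition without consulting anyone.
    return int.from_bytes(digest[:8], "big") % num_partitions


def route(partition_map, disk_id, block_id, snapshot_id):
    """Resolve a request to the servers holding its Partition."""
    partition = partition_for(disk_id, block_id, snapshot_id, len(partition_map))
    return partition_map[partition]


partition_map = build_partition_map(["s1", "s2", "s3", "s4"], num_partitions=8)
print(route(partition_map, "disk-1", 42, 0))
```

The client only needs the Partition-to-server table to route a request; the hash step is pure computation, which is what makes removing the access layer feasible.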
Routing Synchronization (Lazy Sync)
In the two‑layer architecture, every Client needs routing information, and pushing each update eagerly to every Client would overwhelm the Master. Instead, the Master pushes routing updates only to the Chunk modules, which store them locally. When a Driver detects a version mismatch, it requests an update from Chunk, which forwards the request to the Master only if necessary. This approach, called lazy routing synchronization, reduces unnecessary work while still ensuring consistency.
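The exchange can be simulated with three small classes. This is a sketch under assumptions, not the CBS protocol: it supposes that each I/O request carries the Driver's routing version, that a Chunk compares it with its own, and that a Chunk asks the Master only when it is itself behind the Driver. All class and method names are hypothetical.

```python
class Master:
    """Holds the authoritative routing table and pushes it to Chunks only."""

    def __init__(self, table):
        self.version = 1
        self.table = dict(table)
        self.chunks = []
        self.pulls = 0  # number of times a Chunk had to ask the Master

    def register(self, chunk):
        self.chunks.append(chunk)
        chunk.install(self.version, self.table)

    def update(self, table, skip=()):
        """Publish a new table; `skip` simulates Chunks that miss the push."""
        self.version += 1
        self.table = dict(table)
        for chunk in self.chunks:
            if chunk not in skip:
                chunk.install(self.version, self.table)

    def fetch(self):
        self.pulls += 1
        return self.version, dict(self.table)


class Chunk:
    def __init__(self, master):
        self.master = master
        self.version, self.table = 0, {}
        master.register(self)

    def install(self, version, table):
        self.version, self.table = version, dict(table)

    def check(self, driver_version):
        """Report whether the version carried by an I/O request matches."""
        return driver_version == self.version

    def get_routes(self, driver_version):
        # Forward to the Master only if this Chunk is behind the Driver.
        if driver_version > self.version:
            self.install(*self.master.fetch())
        return self.version, dict(self.table)


class Driver:
    def __init__(self, chunk):
        self.version, self.table = chunk.get_routes(0)

    def io(self, chunk):
        if chunk.check(self.version):
            return "ok"
        # Version mismatch: refresh lazily from the Chunk, not the Master.
        self.version, self.table = chunk.get_routes(self.version)
        return "refreshed"


master = Master({0: "chunk-a"})
a, b = Chunk(master), Chunk(master)
driver = Driver(a)
master.update({0: "chunk-b"}, skip=(b,))   # Chunk b misses the push
print(driver.io(a), master.pulls)          # Driver catches up from Chunk a
print(driver.io(b), master.pulls)          # Chunk b must ask the Master
```

In this simulation the Master is contacted once, by the Chunk that missed a push, and never by the Driver, which is the load reduction the lazy design aims for.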
Impact of CBS3.0
After deployment, the cost of ordinary cloud disks dropped by 46%. CBS3.0 supports all three product types (standard, high‑efficiency, and SSD) on a unified platform.
Industry Comparison
Compared with Ceph, CBS delivers higher performance, finer‑grained operations, and stronger data safety guarantees, though Ceph offers features like erasure coding that are advantageous for cost‑sensitive scenarios.
Conclusion
The evolution of CBS demonstrates a “less is more” philosophy: simplifying architecture while delivering high performance, reliability, and ease of use, ultimately creating a win‑win for both the service and its users.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
