
Technical Overview and Optimizations of Tencent Cloud CynosDB for MySQL

Tencent Cloud’s CynosDB for MySQL separates compute from distributed storage in a log-driven, stateless architecture that eliminates redundant local I/O. The design delivers fast failover, roughly 2.5× the write throughput of cloud-native MySQL, lock-free data structures, asynchronous group commit, compressed logs, fast parallel recovery, and scalable replication, with external buffer pools and multi-master support on the roadmap.

Tencent Cloud Developer

On March 16, Tencent Cloud+ Community hosted a CynosDB exchange meeting in Beijing, offering a comprehensive technical walkthrough of CynosDB: its one-primary-multiple-read architecture, high-availability design, fast recovery, compute-aware intelligent storage, and distributed storage layer.

Author: Shang Bo, senior CDB engineer at Tencent Cloud with extensive experience in database kernel development, including transactions, logging, storage, performance tuning, and SQL compatibility.

This article shares the implementation and optimization of CynosDB for MySQL’s compute‑storage separation architecture. Decoupling compute and storage brings significant gains in performance, scalability, and high availability, while also opening optimization space for both layers.

The overall architecture is introduced first, followed by detailed implementation and related optimizations, and finally the future evolution roadmap of CynosDB.

Traditional databases have historically fought IO bottlenecks. In classic MySQL, the primary node stores data files, log files, and binlog files locally and replicates them to replicas, causing duplicated IO and lengthy backup/recovery times, especially for TB-scale databases.

To address these pain points, many vendors switched to shared storage, but network-based IO still requires expensive hardware. CynosDB instead adopts a compute-storage separation architecture: the storage layer uses shared distributed block storage, while the compute layer offloads all unnecessary IO, yielding a log-driven architecture.

Key characteristics of the architecture: (1) "the log is the database" – the engine emits only a log stream; (2) IO offload – all non-log IO (data files, binlog, etc.) is removed; (3) stateless compute – with no local files, the compute layer becomes stateless. Together these yield high availability inherited from the underlying distributed storage, failover within 5–15 seconds, and substantial performance headroom.

Log processing is central. Logs are physically split into many small shards stored across different cells. The storage engine pushes logs to storage nodes, which place them in a log queue, persist them, and immediately acknowledge the engine. The engine can then commit part of the transaction. Storage nodes asynchronously replay logs, advance page versions, and perform log recycling.
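The ack-then-replay flow described above can be sketched as follows. This is a minimal model, not CynosDB's implementation; the class and method names (`StorageCell`, `append_log`, `replay`) are illustrative, and durable persistence is modeled by an in-memory queue:

```python
from collections import deque

class StorageCell:
    """Sketch of a storage cell: acknowledge a log record as soon as it
    is persisted, then replay it into page versions asynchronously."""
    def __init__(self):
        self.log_queue = deque()   # persisted-but-not-yet-replayed records
        self.pages = {}            # page_id -> (version_lsn, payload)

    def append_log(self, lsn, page_id, payload):
        # Persist (modeled by enqueueing) and ack immediately; the
        # engine can commit without waiting for replay.
        self.log_queue.append((lsn, page_id, payload))
        return lsn                 # ack: durable up to this LSN

    def replay(self):
        # Runs asynchronously in the real system: advances page
        # versions and recycles replayed log records.
        while self.log_queue:
            lsn, page_id, payload = self.log_queue.popleft()
            self.pages[page_id] = (lsn, payload)

cell = StorageCell()
ack = cell.append_log(100, page_id=7, payload=b"row v1")
assert ack == 100 and 7 not in cell.pages   # committed before replay
cell.replay()
assert cell.pages[7] == (100, b"row v1")
```

The key property is that the commit path depends only on log persistence, never on page materialization.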

Performance preview focuses on write throughput. A cloud-native MySQL instance on Tencent Cloud shows limited performance because it is IO-bound. Two on-premise MySQL setups with NVMe SSDs perform better, especially when innodb_flush_log_at_trx_commit is relaxed from its durable default (at the cost of durability). CynosDB, without touching innodb_flush_log_at_trx_commit, delivers 2.5× the throughput of cloud-native MySQL and even surpasses the best on-premise configuration.

Although network-based log writes are slower than local SSD writes, CynosDB achieves higher overall performance through several optimizations: offloading unnecessary engine modules, applying classic async/pipeline/batch techniques, using lock-free structures extensively, and bypassing many engine code paths.

Traditional thread‑per‑connection models cause heavy context‑switch overhead. CynosDB adopts a thread pool for short‑lived tasks, but long‑waiting IO can still block pool threads. To mitigate this, CynosDB adds an asynchronous group‑commit mechanism: worker threads enqueue transaction requests, immediately return, while a dedicated log‑writer thread persists logs and wakes waiting requests once durability is confirmed.

InnoDB’s MTR (mini‑transaction) creates multiple locks on the public log buffer, which can become a bottleneck. CynosDB partitions the log on the storage side, eliminating the need for a global log‑system lock and reducing contention.
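The contention-reduction idea can be illustrated with a partitioned buffer sketch. The partitioning key (here, page id) and class name are assumptions for illustration; the point is simply that each partition carries its own lock, so mini-transactions touching different partitions never serialize on a global one:

```python
import threading

class PartitionedLogBuffer:
    """Shard the log buffer so appends to different partitions
    contend only on a per-partition lock, not a global one."""
    def __init__(self, n_parts=8):
        self.parts = [([], threading.Lock()) for _ in range(n_parts)]

    def append(self, page_id, record):
        buf, lock = self.parts[page_id % len(self.parts)]
        with lock:                 # per-partition critical section
            buf.append((page_id, record))

buf = PartitionedLogBuffer(4)
for pid in range(8):
    buf.append(pid, f"redo-{pid}")
```

With N partitions, up to N appenders proceed in parallel where a single global log-system lock would have serialized them all.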

Because all IO is network-based, log metadata can become a bandwidth bottleneck at high concurrency. CynosDB reduces log overhead by eliminating the 4% block-header/footer space, applying lightweight compression, and eliding certain log record types (e.g., FIL_NAME, CHECKPOINT), cutting log volume by 20–30%.
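As a rough back-of-the-envelope check on those numbers (the compression ratio here is an assumed value, not a published figure):

```python
def shipped_log_bytes(raw_bytes, header_overhead=0.04, compress_ratio=0.80):
    """Strip the ~4% block header/footer overhead, then apply
    lightweight compression at an assumed 0.80 ratio."""
    return raw_bytes * (1 - header_overhead) * compress_ratio

saved = 1 - shipped_log_bytes(1_000_000) / 1_000_000
print(f"{saved:.1%}")   # 23.2% less log traffic on the wire
```

A modest per-record compression ratio, compounded with the header savings, lands squarely in the stated 20–30% reduction range.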

Checkpoint and recovery differ markedly from traditional databases. In classic MySQL, recovery starts from the last checkpoint and scans logs sequentially, often taking minutes. CynosDB treats the VDL (volume durable LSN) itself as the checkpoint; once a log record is persisted, the storage layer can reconstruct pages from it, eliminating the need for a separate flush list and allowing parallel, asynchronous recovery on each shard. This reduces system startup and recovery time to under five seconds.
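Per-shard parallel recovery can be sketched as below. The shard layout and LSN values are illustrative; each shard independently replays only the records beyond the durable point, and no global sequential scan is needed:

```python
from concurrent.futures import ThreadPoolExecutor

def recover_shard(shard_id, lsns, vdl):
    """Replay only records beyond the checkpoint (the VDL) for one shard."""
    return shard_id, [lsn for lsn in lsns if lsn > vdl]

# Illustrative shards: each holds the LSNs of its pending log records.
shards = {0: [5, 12, 20], 1: [8, 15], 2: [3, 30]}
vdl = 10

with ThreadPoolExecutor() as ex:   # shards recover concurrently
    results = dict(ex.map(lambda kv: recover_shard(kv[0], kv[1], vdl),
                          shards.items()))
# results == {0: [12, 20], 1: [15], 2: [30]}
```

Because each shard's work is small and independent, total recovery time tracks the slowest shard rather than the total log volume.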

By removing the flush list, CynosDB also eliminates the heavy global flush‑order mutex. Traditional MySQL’s flush‑order lock becomes a hotspot after partitioning the log‑system lock, but CynosDB’s architecture sidesteps this entirely.

Page eviction in the buffer pool is improved by relaxing the LSN condition to VDL‑Δ, allowing faster eviction and effectively “over‑selling” the buffer pool (e.g., 100 GB memory can behave like 110 GB). The Δ value is tuned based on log persistence speed.
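One plausible reading of the relaxed eviction rule (the exact inequality form is an assumption; the article gives only "VDL-Δ") is that a page becomes evictable once its modification LSN is within Δ of the durable LSN, instead of waiting for full persistence:

```python
def can_evict(page_lsn, vdl, delta):
    """Relaxed check (assumed form): evict once the page's newest
    change is within delta of the durable LSN, trusting log
    persistence to catch up before the page could be needed."""
    return page_lsn <= vdl + delta

assert can_evict(page_lsn=105, vdl=100, delta=10)       # evictable early
assert not can_evict(page_lsn=120, vdl=100, delta=10)   # still too fresh
```

Tuning Δ against the measured log-persistence speed is what lets the buffer pool be safely "over-sold".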

CynosDB introduces a new RIO (Remote IO) mechanism: separate synchronous and asynchronous IO queues, lock‑free queue structures, and page‑adjacent merging, fully leveraging the parallelism of distributed block storage.
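Page-adjacent merging, one of the RIO techniques named above, can be sketched as coalescing contiguous requests into one larger remote IO (function name and fixed 16 KB page size are assumptions for illustration):

```python
def merge_adjacent(requests):
    """Coalesce (offset, length) page requests whose byte ranges are
    contiguous into a single larger remote IO."""
    merged = []
    for off, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == off:
            merged[-1][1] += length        # extend the previous request
        else:
            merged.append([off, length])
    return [tuple(m) for m in merged]

PAGE = 16 * 1024
reqs = [(0, PAGE), (PAGE, PAGE), (4 * PAGE, PAGE)]
assert merge_adjacent(reqs) == [(0, 2 * PAGE), (4 * PAGE, PAGE)]
```

Fewer, larger requests amortize network round-trips while the separate sync/async lock-free queues keep independent IOs flowing in parallel.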

Space allocation no longer requires pre‑reservation; CynosDB relies on the storage layer’s asynchronous expansion and introduces a fine‑grained segment latch to reduce the critical section of the space lock, speeding up table growth.

Metadata is also offloaded: server‑layer metadata is stored as system tables within InnoDB, avoiding a separate data‑dictionary implementation.

Physical replication uses the log stream rather than binlog. The primary node streams logs to all replicas, which replay them in parallel across multiple threads with a read-view cache, achieving sub-millisecond replication latency and allowing a new replica to be started in 5–10 seconds (up to 15 replicas, limited mainly by network bandwidth).
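The parallel replay described above can be modeled as partitioning the log stream by page so partitions replay concurrently while records for any single page stay ordered. This is a sketch under that assumption; the partitioning scheme and names are illustrative:

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def parallel_replay(log_records, n_threads=4):
    """Partition incoming log by page, replay partitions concurrently;
    within a partition, records apply in log order."""
    parts = defaultdict(list)
    for lsn, page_id, change in log_records:
        parts[page_id % n_threads].append((lsn, page_id, change))

    pages = {}
    def apply(records):
        for lsn, page_id, change in records:
            pages[page_id] = (lsn, change)   # each page lives in one partition

    with ThreadPoolExecutor(max_workers=n_threads) as ex:
        list(ex.map(apply, parts.values()))
    return pages

log = [(1, 7, "a"), (2, 7, "b"), (3, 9, "c")]
assert parallel_replay(log) == {7: (2, "b"), 9: (3, "c")}
```

Since a given page is always owned by exactly one partition, replay threads never conflict, which is what keeps replica lag low as throughput grows.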

To ensure data correctness despite extensive IO offload, CynosDB provides multi‑layer verification—from transaction‑level checks to physical log and page validation.

Future plans include externalizing the buffer pool from the DB process to further unload state, additional lock optimizations, multi‑master support, and cross‑AZ high‑availability capabilities.

Q&A highlights:

- The underlying distributed storage handles node scaling.
- Asynchronous replay uses versioned reads, so commits are never missed.
- Storage is block-based.
- Log streaming replaces binlog for replication.
- The system maintains strong consistency.

Tags: performance optimization, MySQL, distributed storage, cloud database, CynosDB, compute-storage separation
Written by Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.