CynosDB Architecture and Optimization: A PostgreSQL-Compatible NewSQL Database
CynosDB, Tencent's PostgreSQL-compatible NewSQL service, separates compute from storage and builds on CynosStore, a log-based distributed storage system with idempotent logs. Log idempotency lets it eliminate full-page writes and dirty-page flushing; it also offloads CRC checks to the storage layer and extends tables asynchronously. Together these deliver scalable, cost-effective performance while preserving PostgreSQL features.
This article introduces CynosDB, a database product developed in-house at Tencent that is available in PostgreSQL-compatible and MySQL-compatible versions. Using the PostgreSQL-compatible version as an example, it describes the architecture design and the main optimizations.
Overview: PostgreSQL describes itself as the world's most advanced open-source database and has been developed by its community for more than 30 years. Its architecture, reliability, and rich feature set are widely recognized in the industry. The PostgreSQL-compatible CynosDB is a NewSQL product designed for scalability. Its resource-pooling architecture lets users reach equivalent performance at lower cost without giving up PostgreSQL's native functionality.
Basic Architecture: Traditional cloud databases have two main shortcomings. First, network I/O is heavy: WAL logs, dirty pages, and Double Write or Full Page Write data all travel over the network. Second, master and slave instances do not share data, which wastes storage and further increases network I/O. CynosDB addresses both problems by pushing log processing down to the storage layer (log sinking) and by using shared storage. The architecture consists of: the master (the primary instance, which handles read/write transactions), slaves (read-only instances, which handle read requests), the CynosStore Client (which provides access to the distributed storage system CynosStore), CynosStore itself (the distributed storage system), cluster management services, and cold backup storage.
Compute Layer Architecture: CynosDB separates compute from storage, dividing the system into a compute layer and a storage layer. The compute layer is responsible for SQL parsing and log generation, while the storage layer handles data storage, log archiving, and log merging. The compute layer includes: the SQL engine (lexical/semantic analysis, query rewriting/optimization, and execution), the Access layer (table and index implementations, including Heap, btree/gin/gist/spgist/hash/brin, and CLOG/MultiXACT), storage/buffer (buffer pool and storage management), and the CynosStore Client.
Architecture Optimization:
4.1 Log System: CynosDB's storage system, CynosStore, is a log-based distributed block device that supports multi-version reads. Each log record has the format <page number, page offset, modification content, modification length>. Because a record overwrites a fixed byte range with fixed content, replaying it any number of times yields the same page, so the logs are idempotent. This enables three optimizations. Removing Full Page Write (FPW): since logs are idempotent, a torn page can be repaired by replaying logs onto it, so FPW is no longer needed. Removing dirty-page flushing: the logs already record every page modification, and the latest version of a page can be obtained by merging logs onto the base page. Log header merging and log merging: multiple log records that modify the same page share one log header, and adjacent log records can be merged into one.
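The following is a minimal sketch of why records in this format are idempotent and how adjacent records might be merged. It is not CynosStore's implementation: the names LogRecord, apply, materialize, and merge_adjacent are hypothetical, the 8 KB page size is PostgreSQL's default block size, and the merge rule (same page, contiguous byte ranges) is an assumption about what "adjacent" means.

```python
from dataclasses import dataclass

PAGE_SIZE = 8192  # PostgreSQL's default block size


@dataclass(frozen=True)
class LogRecord:
    """Hypothetical record: <page number, page offset, content>; length is len(data)."""
    page_no: int
    offset: int
    data: bytes


def apply(page: bytearray, rec: LogRecord) -> None:
    """Overwrite a fixed byte range; applying the same record twice changes nothing."""
    end = rec.offset + len(rec.data)
    assert end <= len(page), "record must fit inside the page"
    page[rec.offset:end] = rec.data


def materialize(base: bytes, records: list) -> bytes:
    """Build the latest page version by replaying records, in order, onto a base page."""
    page = bytearray(base)
    for rec in records:
        apply(page, rec)
    return bytes(page)


def merge_adjacent(records: list) -> list:
    """Merge consecutive records on the same page whose byte ranges are contiguous."""
    merged = []
    for rec in records:
        if merged:
            last = merged[-1]
            if last.page_no == rec.page_no and last.offset + len(last.data) == rec.offset:
                merged[-1] = LogRecord(last.page_no, last.offset, last.data + rec.data)
                continue
        merged.append(rec)
    return merged


base = bytes(PAGE_SIZE)
records = [LogRecord(1, 0, b"ab"), LogRecord(1, 2, b"cd"), LogRecord(1, 100, b"zz")]
latest = materialize(base, records)

# A "torn" page where only the last record reached disk is repaired by replay.
torn = materialize(base, records[2:])
repaired = materialize(torn, records)
```

In this sketch, replaying the full record list onto the torn page produces the same bytes as replaying it onto the clean base page, which is the property that makes a full-page image unnecessary. Merging the first two records into one also leaves the materialized page unchanged.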
4.2 Page CRC: In PostgreSQL, a page's CRC is calculated just before the page is flushed to disk. In CynosDB, CRC calculation is offloaded to the storage layer, which reduces CPU load on the compute node and the number of log entries it produces.
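As a rough illustration of the offloading idea, the sketch below has a storage-side object compute and verify the checksum, so the compute side never touches it. The StorageNode class and its methods are hypothetical, and zlib.crc32 stands in for whatever checksum CynosStore actually uses.

```python
import zlib


class StorageNode:
    """Hypothetical storage-side component that owns page checksums."""

    def __init__(self):
        self._pages = {}  # page_no -> (page bytes, crc computed at persist time)

    def persist(self, page_no: int, page: bytes) -> None:
        # The CRC is computed here, on the storage node, not on the compute node.
        self._pages[page_no] = (page, zlib.crc32(page))

    def read(self, page_no: int) -> bytes:
        page, crc = self._pages[page_no]
        if zlib.crc32(page) != crc:
            raise IOError(f"checksum mismatch on page {page_no}")
        return page


node = StorageNode()
node.persist(7, b"hello page")
page = node.read(7)
```

The compute side in this sketch only hands over page contents (or, in CynosDB's case, logs) and reads pages back; checksum generation and verification stay inside the storage node.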
4.3 Async Table Extension: Native PostgreSQL extends table files on disk synchronously. CynosDB extends files asynchronously: extension logs are first kept in the system's log buffer and are flushed to storage only at transaction commit, which significantly improves bulk data import performance. In addition, a single extension operation can add multiple pages at once, reducing the number of extension calls.
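The behavior described above can be sketched as follows. This is an illustrative model only: LogBuffer, Relation, and the ("extend", first_page, count) record shape are hypothetical names chosen for the example, and a plain Python list stands in for the storage layer.

```python
class LogBuffer:
    """Hypothetical in-memory log buffer that is flushed at transaction commit."""

    def __init__(self, storage: list):
        self.pending = []
        self.storage = storage  # stands in for the storage layer's log
        self.flush_count = 0

    def append(self, record: tuple) -> None:
        self.pending.append(record)

    def flush(self) -> None:
        if self.pending:
            self.storage.extend(self.pending)
            self.pending.clear()
            self.flush_count += 1


class Relation:
    """Hypothetical table whose extension only writes a log record to the buffer."""

    def __init__(self, log_buffer: LogBuffer):
        self.n_pages = 0
        self.log = log_buffer

    def extend(self, count: int = 1) -> int:
        """Add `count` pages in one call; no storage round trip happens here."""
        first_new_page = self.n_pages
        self.n_pages += count
        self.log.append(("extend", first_new_page, count))
        return first_new_page


storage_log = []
buf = LogBuffer(storage_log)
rel = Relation(buf)

rel.extend(8)  # two batched extensions during a bulk load
rel.extend(8)
before_commit = list(storage_log)  # nothing has reached storage yet

buf.flush()  # transaction commit
```

In this model a bulk load that adds 16 pages issues two extension calls and one flush to storage, instead of 16 synchronous single-page extensions.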
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.