Why Cloud‑Native Databases Are Redefining Elasticity and Resilience

Cloud‑native databases meet the elasticity, resilience, and high‑availability demands of modern cloud computing by separating compute from storage and relying on log‑based persistence and multi‑replica consensus. Distributed designs such as Spanner, Aurora, and TiDB show how this approach delivers higher performance, lower cost, and better resource utilization.

Qingyun Technology Community

Sep 16, 2021

Background

With the rapid growth of cloud computing, IT applications are moving to the cloud, and cloud services exhibit several characteristics:

Services are provided on demand.

Users prefer paying operating expenses (pay as you go) over capital expenses (buying assets).

Service provider clusters are increasingly large, often reaching cloud‑scale.

These traits require cloud products to be elastic and to possess self‑healing (resilience) capabilities.

Challenges of Traditional RDS

Initially, databases were simply lifted onto IaaS and offered as relational database services (RDS). This provides some elasticity and resilience, but it suffers from low resource utilization, high maintenance cost, and limited availability, which is what cloud‑native databases set out to fix.

MySQL Replication Example

High‑availability or read/write‑split clusters for MySQL require a binlog replication setup.

The original diagram illustrates how many log writes this involves: the write‑ahead redo log and the binlog on the primary, plus the relay log on each replica.

Introduction to Cloud‑Native Databases

To solve the problems above, a new generation of cloud databases is designed around decoupling, minimal state, and lightweight node expansion.

1. Spanner‑Based Solutions

Google Spanner pioneered cloud‑native databases, inspiring CockroachDB, TiDB, YugabyteDB, etc.

1.1 Architecture

Using TiDB as an example:

These products wrap a distributed SQL execution engine over a key‑value store, employing 2PC or its variants for transaction processing. The compute nodes act as stateless SQL engines, forming a fully distributed database.
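The transaction flow described above can be sketched roughly as follows. This is a minimal, single‑process illustration of plain two‑phase commit over in‑memory key‑value shards, not the protocol any of these products ships; real systems use optimized variants and persist every step. `KVShard` and `two_phase_commit` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class KVShard:
    """Hypothetical storage node: committed data plus writes staged by open transactions."""
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)  # txn_id -> {key: value}
    locks: dict = field(default_factory=dict)   # key -> txn_id holding the lock

    def prepare(self, txn_id, writes):
        # Phase 1: vote "yes" only if no other transaction locks any of these keys.
        if any(self.locks.get(k, txn_id) != txn_id for k in writes):
            return False
        for k in writes:
            self.locks[k] = txn_id
        self.staged[txn_id] = dict(writes)
        return True

    def commit(self, txn_id):
        # Phase 2 (commit): make staged writes visible and release their locks.
        for k, v in self.staged.pop(txn_id, {}).items():
            self.data[k] = v
            self.locks.pop(k, None)

    def abort(self, txn_id):
        # Phase 2 (abort): drop staged writes and release only this transaction's locks.
        for k in self.staged.pop(txn_id, {}):
            self.locks.pop(k, None)


def two_phase_commit(txn_id, shards, writes_by_shard):
    """Stateless coordinator: True if the transaction committed on every involved shard."""
    involved = [(shards[name], writes) for name, writes in writes_by_shard.items()]
    votes = [shard.prepare(txn_id, writes) for shard, writes in involved]
    if all(votes):
        for shard, _ in involved:
            shard.commit(txn_id)
        return True
    for shard, _ in involved:
        shard.abort(txn_id)
    return False
```

Because the coordinator keeps no state of its own between calls, any SQL node could play that role for a given transaction, which is the property that lets the compute layer scale by simply adding nodes.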

1.2 Storage High Availability

Spanner splits tables into tablets and replicates each tablet with Paxos; TiDB splits data into Regions, each replicated by its own Raft group (Multi‑Raft); CockroachDB runs one Raft group per range.
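The common pattern here is that data is cut into key ranges and each range has its own replica group. The sketch below shows only that routing and the majority‑acknowledgement rule; it is not Raft or Paxos (there is no leader election, terms, or log repair), and `ReplicaGroup` and `RangeRouter` are hypothetical names used for illustration.

```python
import bisect


class ReplicaGroup:
    """Hypothetical replica group for one key range; each replica's log is a plain list."""

    def __init__(self, replica_count=3):
        self.logs = [[] for _ in range(replica_count)]
        self.up = [True] * replica_count

    def replicate(self, entry):
        # Append to every reachable replica and count acknowledgements.
        acks = 0
        for log, alive in zip(self.logs, self.up):
            if alive:
                log.append(entry)
                acks += 1
        # The entry counts as committed only when a strict majority holds it.
        # (A real protocol would also clean up entries that never reach a majority.)
        return acks > len(self.logs) // 2


class RangeRouter:
    """Maps a key to the group owning its range; split_keys are the range boundaries."""

    def __init__(self, split_keys, replica_count=3):
        self.split_keys = sorted(split_keys)
        self.groups = [ReplicaGroup(replica_count)
                       for _ in range(len(self.split_keys) + 1)]

    def group_for(self, key):
        # bisect_right picks the range [start, end) that contains the key.
        return self.groups[bisect.bisect_right(self.split_keys, key)]

    def put(self, key, value):
        return self.group_for(key).replicate((key, value))
```

Since each range votes independently, losing a replica affects only the ranges stored on it, and a hot range can be split or moved without touching the others.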

1.3 Pros and Cons

Limited SQL support (e.g., YugabyteDB's Cassandra‑compatible YCQL API does not support JOIN).

2. Aurora‑Based Solutions

Aurora, from Amazon, separates compute and storage for MySQL/PostgreSQL, but it remains a single‑writer cluster with read/write splitting rather than a fully distributed database.

It adopts Spanner’s log‑persistence idea, treating the log as the database and pushing log processing down to the storage layer.

2.1 Architecture

The green part shows log flow.

The primary instance sends only redo log records to storage; the storage layer applies them to build pages, which removes page flushing and checkpointing from the compute node.
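A toy model can make the "log is the database" idea more concrete. The sketch below is an assumption‑laden illustration, not Aurora's storage engine: `LogStructuredStorage` and its methods are hypothetical, pages are plain dictionaries, and a page is never rolled back to an older version once later records have been applied.

```python
from collections import defaultdict


class LogStructuredStorage:
    """Hypothetical storage node that persists redo records and builds pages from them."""

    def __init__(self):
        self.redo = defaultdict(list)   # page_id -> pending (lsn, key, value) records
        self.pages = defaultdict(dict)  # page_id -> materialized page contents

    def append(self, lsn, page_id, key, value):
        # The only write the compute node sends: one small redo record, no page image.
        self.redo[page_id].append((lsn, key, value))

    def read_page(self, page_id, read_lsn):
        # Apply pending records up to read_lsn in LSN order, then serve the page.
        remaining = []
        for lsn, key, value in sorted(self.redo[page_id], key=lambda r: r[0]):
            if lsn <= read_lsn:
                self.pages[page_id][key] = value
            else:
                remaining.append((lsn, key, value))
        self.redo[page_id] = remaining
        return dict(self.pages[page_id])
```

The compute node never ships a dirty page: materializing pages happens inside storage, either lazily on read as here or in the background, so there is nothing for the database instance to flush or checkpoint.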

2.2 High Availability

Aurora replicates data across three availability zones and uses a quorum voting protocol for reads and writes, so the database keeps operating even if one zone fails.
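The quorum arithmetic behind that claim is easy to check. The numbers below follow the configuration described in the Aurora paper (six copies, two per zone, a write quorum of four and a read quorum of three); the function names are hypothetical and the sketch covers only the counting, not the protocol itself.

```python
def quorum_ok(total, write_q, read_q):
    """Check the two classic quorum rules for `total` copies."""
    reads_see_latest_write = read_q + write_q > total  # read and write sets overlap
    writes_cannot_conflict = 2 * write_q > total       # two write sets overlap
    return reads_see_latest_write and writes_cannot_conflict


def survives(total, write_q, read_q, lost):
    """Return (can_write, can_read) after `lost` copies become unavailable."""
    alive = total - lost
    return alive >= write_q, alive >= read_q


V, VW, VR = 6, 4, 3     # six copies, write quorum 4, read quorum 3
COPIES_PER_ZONE = 2     # two copies in each of three zones

print(quorum_ok(V, VW, VR))                       # both rules hold
print(survives(V, VW, VR, COPIES_PER_ZONE))       # one whole zone lost
print(survives(V, VW, VR, COPIES_PER_ZONE + 1))   # one zone plus one more copy
```

With these values, losing an entire zone leaves four copies, enough for both reads and writes; losing one further copy leaves three, which still satisfies the read quorum but not the write quorum.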

3. CynosDB

CynosDB largely mirrors Aurora’s design, with its own features such as Raft‑based multi‑replica storage and log‑driven buffer cache synchronization.

4. PolarDB

PolarDB also separates compute and storage but keeps redo log handling on the compute side, using existing distributed file systems.

It focuses on storage‑layer optimizations (PolarFS) and query acceleration (FPGA).

5. Socrates

Socrates, Microsoft’s DBaaS architecture, reuses SQL Server components, separates log and page storage, and introduces XLogService for log handling.

Leverages SQL Server’s page version store for snapshot isolation.

Uses SSD‑based resilient cache for fast crash recovery.

Implements RBIO protocol for remote page reads.

6. TaurusDB

TaurusDB inherits Aurora’s log‑sink storage and Socrates’ log‑page separation, adding a storage abstraction layer (SAL) and using quorum algorithms for high availability.

Core Functions of Cloud‑Native Databases

Compute‑storage separation with stateless or minimal‑state compute nodes.

Log‑based persistence.

Storage sharding for easy scaling.

Multi‑replica storage with consensus algorithms.

Backup, restore, and snapshot capabilities delegated to the storage layer.

Non‑Core Features of Popular Solutions

Global deployment considerations include multi‑region availability, distributed transactions, and GDPR compliance.

Core Value of Cloud‑Native Databases

Higher performance due to lightweight log‑based replication.

Better elasticity with stateless compute nodes.

Improved availability via fine‑grained replication and consensus.

Higher resource utilization through on‑demand scaling.

Reduced cost from lower resource waste and maintenance overhead.

