
Architecture Upgrade Challenges and Atomic Write Solutions for Cloud-native Databases

This article, a collaboration between the TencentOS and database kernel teams, details how architecture upgrades (moving to TKE HouseKeeper, switching to AMD CPUs, and adding a portable 16 KB atomic‑write feature) combine with kernel optimizations such as huge‑page support, NUMA‑aware qspinlocks, speculative page‑fault handling, and ORC unwinding. Together they deliver gains of up to 30 % on mixed workloads and over 100 % on write‑only workloads while reducing memory usage.

Tencent Cloud Developer

This article presents the collaborative work of the TencentOS kernel team and the database kernel team to improve cloud-native database performance without changing the underlying architecture.

It first outlines the challenges of upgrading the architecture, including the shift from traditional physical machines to TKE HouseKeeper for resource decoupling, and the migration from Intel to AMD CPUs to increase single‑core performance.

The core technical contribution is a 16 KB atomic‑write capability that matches AWS’s atomic‑write feature while remaining portable. It builds on XFS copy‑on‑write (COW), so a 16 KB page is never left half‑written and MySQL’s doublewrite buffer, which writes every page twice, is no longer needed. Transparent huge pages are used alongside it to reduce iTLB misses.
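To make the torn-page problem concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not the XFS or MySQL implementation: the `Disk` class, the 4 KB atomic sector size, the block names, and the dictionary used as a block mapping are all hypothetical. It compares an in-place write, a doublewrite-style write, and a COW-style write that publishes the new page with a single pointer swap.

```python
SECTOR = 4096        # assumed atomic unit of the device (hypothetical)
PAGE = 16 * 1024     # 16 KB database page
OLD = bytes([1]) * PAGE
NEW = bytes([2]) * PAGE


class Crash(Exception):
    """Simulated power loss in the middle of a write."""


class Disk:
    def __init__(self):
        self.blocks = {}          # block name -> bytearray
        self.sector_writes = 0    # sectors physically written
        self.crash_after = None   # crash once this many sectors are written

    def write(self, name, data):
        buf = self.blocks.setdefault(name, bytearray(len(data)))
        for off in range(0, len(data), SECTOR):
            if self.crash_after == 0:
                raise Crash()
            buf[off:off + SECTOR] = data[off:off + SECTOR]
            self.sector_writes += 1
            if self.crash_after is not None:
                self.crash_after -= 1


def is_torn(page):
    # Each page version is filled with one byte value, so mixed values
    # across sectors mean the page holds parts of two versions.
    return len(set(page[::SECTOR])) > 1


def fresh_disk(crash_after=None):
    disk = Disk()
    disk.write("data", OLD)
    disk.sector_writes = 0
    disk.crash_after = crash_after
    return disk


def in_place(disk):
    disk.write("data", NEW)


def doublewrite(disk):
    disk.write("dblwr", NEW)   # first copy to a scratch area
    disk.write("data", NEW)    # then the real location


def doublewrite_recover(disk):
    copy = disk.blocks.get("dblwr")
    if is_torn(disk.blocks["data"]) and copy is not None and not is_torn(copy):
        disk.crash_after = None
        disk.write("data", bytes(copy))


def cow(disk, mapping):
    disk.write("data.new", NEW)    # write the new page out of place
    mapping["data"] = "data.new"   # assumed-atomic pointer swap


def crash_during(action, disk, *args):
    try:
        action(disk, *args)
    except Crash:
        pass


if __name__ == "__main__":
    d = fresh_disk(crash_after=2)
    crash_during(in_place, d)
    print("in-place torn after crash:", is_torn(d.blocks["data"]))

    d = fresh_disk()
    doublewrite(d)
    print("doublewrite sectors written:", d.sector_writes)

    d = fresh_disk()
    cow(d, {"data": "data"})
    print("COW sectors written:", d.sector_writes)
```

In this model the doublewrite path stays recoverable at the cost of writing every sector twice, while the COW path writes each sector once and a crash before the pointer swap simply leaves the old page in place.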

Benchmark results show performance gains of up to 30 % in mixed read/write workloads and more than 100 % in write‑only scenarios on instances with 4 vCPUs and 16 GB of memory. Memory compression ("悟净", Wujing) further reduces memory usage by 3‑11 % on small‑spec nodes.

Additional kernel optimizations are described:

Code‑segment huge‑page support reduces iTLB misses and improves QPS by 5‑8 %.

NUMA‑aware qspinlock redesign separates primary and secondary queues to keep lock hand‑off within the same NUMA node, yielding up to 7 % QPS improvement.

Speculative page‑fault handling (SPF) uses seqlocks and SRCU to avoid holding mmap_sem, decreasing page‑fault latency.

The ORC unwinder replaces frame‑pointer and DWARF‑based unwinding: the kernel no longer needs to be built with frame pointers, and stack traces become faster to generate.
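Of the optimizations listed above, the NUMA‑aware lock hand‑off is the easiest to illustrate in isolation. The following Python simulation is a sketch of the primary/secondary queue idea only, not the kernel's qspinlock code: the function names and the `(thread_id, numa_node)` waiter tuples are hypothetical, and it omits the fairness bound that a production lock needs so the secondary queue cannot starve.

```python
from collections import deque


def numa_aware_order(waiters, holder_node):
    """Return the order in which waiters acquire the lock.

    waiters: iterable of (thread_id, numa_node) in arrival order.
    holder_node: NUMA node of the thread currently holding the lock.
    """
    primary, secondary, order = deque(waiters), deque(), []
    node = holder_node
    while primary or secondary:
        # Prefer the first waiter on the same node as the current holder.
        local = next((w for w in primary if w[1] == node), None)
        if local is None:
            # Nobody local is waiting: remote waiters parked earlier go
            # back to the front, and the lock moves to their node.
            primary = secondary + primary
            secondary = deque()
            local = primary[0]
            node = local[1]
        # Park the remote waiters that were ahead of the chosen one.
        while primary[0] is not local:
            secondary.append(primary.popleft())
        order.append(primary.popleft())
    return order


def remote_handoffs(order, holder_node):
    """Count hand-offs that cross a NUMA node boundary."""
    count, node = 0, holder_node
    for _, n in order:
        if n != node:
            count += 1
        node = n
    return count


if __name__ == "__main__":
    # Waiters alternate between node 0 and node 1.
    waiters = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1)]
    fifo = list(waiters)
    numa = numa_aware_order(waiters, holder_node=0)
    print("FIFO cross-node hand-offs:", remote_handoffs(fifo, 0))
    print("NUMA-aware order:", [tid for tid, _ in numa])
    print("NUMA-aware cross-node hand-offs:", remote_handoffs(numa, 0))
```

With strictly alternating waiters, plain FIFO hand‑off crosses nodes on almost every transfer, while the two‑queue scheme drains one node before moving to the other, which is the locality effect the article credits for the QPS improvement.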

The article concludes that these mature techniques significantly boost the cost‑performance of cloud database services and will continue to evolve with future feature releases.
