
TencentOS "Wujing": Server Memory Multi-Level Offloading Solution for Cloud Data Centers

TencentOS "Wujing" is a multi-level server-memory offloading framework that combines kernel-side reclamation, heat-aware page classification, SWAP balancing, and CXL promotion/demotion to shift cold pages to cheaper storage. In production it cuts memory use by about 30% on average, and by more than 50% in some scenarios, while preserving performance.

Tencent Cloud Developer

May 31, 2023


As memory demand in cloud data centers keeps growing, TencentOS "Wujing" offers a multi-level server-memory offloading solution built on memory optimizations inside the OS kernel. This article outlines Wujing's architecture and implementation. Its goal is to reduce overall memory consumption without hurting business performance by offloading colder memory pages to cheaper storage devices.

Wujing targets the high cost of memory in data centers: server hardware accounts for approximately 80% of total data center costs, and DRAM procurement is a major part of that expense. Two factors drive memory use up. Applications typically trade memory for speed by caching aggressively, and "data center tax" overhead leaves servers running numerous resident applications that occupy memory long-term.

Wujing's architecture is built from several self-developed modules, plus a set of kernel optimizations:

UMRD (Userspace Memory Reclaim Daemon): reclaims memory proactively and asynchronously, driven by memory pressure information.

DAMON core submodule: actively detects memory heat and supplies the data that tiered reclamation relies on.

SWAP hinting framework: at pageout time, balances writes across multiple SWAP levels according to page temperature.

SWAP balancer module: asynchronously rebalances pages across multiple SWAP devices so that cold memory settles precisely onto the intended tier.

CXL support: uses the kernel's Promote/Demote framework, which avoids the page-fault and I/O overhead of the SWAP path.

Performance optimizations: extensive tuning of core kernel memory management code, Cgroup V1 PSI, the SWAP paths, and Working Set statistics.
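To make the pressure-driven reclaim and temperature-based tiering ideas concrete, here is a minimal Python sketch. It is not Wujing's code. The only real-world detail it relies on is the line format of Linux PSI files such as /proc/pressure/memory. Everything else is a hypothetical choice made for illustration: the names ReclaimPolicy, reclaim_step_bytes, and pick_tier, the thresholds, the tier names, and the proportional back-off rule.

```python
from dataclasses import dataclass


def parse_psi(text: str) -> dict:
    """Parse the contents of a PSI file such as /proc/pressure/memory.

    Each line has the form:
        some avg10=0.12 avg60=0.05 avg300=0.01 total=12345
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        result[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return result


@dataclass(frozen=True)
class ReclaimPolicy:
    # Illustrative assumptions, not Wujing defaults.
    target_some_avg10: float = 0.10  # tolerated stall percentage (10 s average)
    max_step_ratio: float = 0.01     # reclaim at most 1% of usage per step


def reclaim_step_bytes(psi_text: str, usage_bytes: int,
                       policy: ReclaimPolicy = ReclaimPolicy()) -> int:
    """Return how many bytes to ask the kernel to reclaim in this step.

    Proportional back-off: the closer measured pressure gets to the
    target, the smaller the step; at or above the target, reclaim stops.
    """
    some_avg10 = parse_psi(psi_text)["some"]["avg10"]
    headroom = max(0.0, 1.0 - some_avg10 / policy.target_some_avg10)
    return int(usage_bytes * policy.max_step_ratio * headroom)


# Hypothetical tiers: (name, minimum idle seconds before a page qualifies),
# ordered from hottest to coldest.
DEFAULT_TIERS = (("dram", 0), ("fast_swap", 120), ("slow_swap", 3600))


def pick_tier(idle_seconds: float, tiers=DEFAULT_TIERS) -> str:
    """Pick the coldest tier whose idle threshold the page has reached."""
    chosen = tiers[0][0]
    for name, min_idle in tiers:
        if idle_seconds >= min_idle:
            chosen = name
    return chosen


if __name__ == "__main__":
    sample = ("some avg10=0.05 avg60=0.00 avg300=0.00 total=100\n"
              "full avg10=0.00 avg60=0.00 avg300=0.00 total=0\n")
    print(reclaim_step_bytes(sample, usage_bytes=1_000_000_000))
    print(pick_tier(300))
```

A real daemon would read the pressure file periodically, pass the computed step to whatever reclaim interface its kernel exposes, and take page idle times from a heat source such as DAMON. Those integration points are deliberately left out here because the article does not describe them.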

Benchmark results show that Wujing performs comparably to Meta's TMO (Transparent Memory Offloading) solution while offering superior reclamation strategies. In production deployments, memory usage falls by 30% on average, and some scenarios save more than 50%. The solution supports several cost-saving scenarios, including smooth configuration reduction, memory oversubscription, and adaptive load pressure regulation.


