TencentOS Server: Cloud‑Native OS Architecture and Scheduling Innovations
This article introduces TencentOS Server, a cloud‑native Linux distribution developed by Tencent, detailing its general OS architecture, IaaS usage, cloud‑native scheduling (TCNS, BT, VMF, ECFS), resource QoS (RUE), quality monitoring, service‑level indicators, and integration with Kubernetes to enhance performance, isolation, and resource utilization.
Jiang Biao, a senior engineer at Tencent Cloud with over ten years of experience in operating systems and a Linux kernel enthusiast, leads the development of Tencent Cloud Native OS and works on OS/virtualization performance optimization.
TencentOS Server (also known as Tencent Linux or Tlinux) is a Linux‑based operating system designed for cloud scenarios, offering specialized features and performance optimizations that provide a high‑performance, secure, and reliable runtime environment for applications on cloud server instances. It is free to use, compatible with CentOS, and receives continuous updates and technical support from Tencent Cloud.
Having undergone more than ten years of iteration within Tencent, TencentOS now supports all of Tencent's business workloads, with over 3 million commercial deployment nodes that have proven its robustness under extreme and complex workloads.
General OS Architecture
The traditional OS is defined as the interface between applications and hardware, aiming to be convenient, efficient, and extensible. Its architecture typically consists of two major parts: the kernel, which abstracts hardware and provides services via system calls, and user‑space libraries and services that create the runtime environment for applications.
OS in IaaS Scenarios
In IaaS, the OS provides the execution environment for virtual machines. Typical workloads include VM‑related threads (QEMU and vCPU threads), various control‑plane agents, and essential OS control threads (e.g., per‑CPU workers). To make VM performance approach or exceed that of bare metal, the OS adopts a “thin” approach, reducing virtualization and OS overhead through CPU pinning, memory pre‑allocation, and I/O bypass techniques.
The trend in this scenario is toward an ever thinner OS, one that may eventually disappear altogether.
Cloud‑Native Perspective on OS
With the rise of cloud‑native workloads, the OS is re‑examined from a new angle. It now supports diverse workloads such as containers, functions, and sandboxes, which moves the application‑system boundary upward: everything below the application is the OS. Consequently, the OS becomes “thicker,” opening up new possibilities.
TencentOS for Cloud‑Native
To meet the challenges of containerization, micro‑services, and serverless architectures, TencentOS has been extensively refactored to embrace cloud‑native principles. Its kernel layer implements several cloud‑native features:
Tencent Cloud Native Scheduler (TCNS)
Resource Utilization Enhancement (RUE)
Quality Monitor
Cloud‑Native SLI
Cgroupfs
Tencent Cloud Native Scheduler (TCNS)
TCNS is a comprehensive kernel scheduling solution for cloud‑native scenarios, covering containers, secure containers, and general workloads. It addresses CPU isolation in mixed‑priority environments and provides real‑time guarantees for latency‑sensitive services.
TCNS consists of three modules: BT Scheduler, VMF Scheduler, and ECFS.
BT Scheduler
Designed for CPU isolation when online and offline workloads share a machine, BT Scheduler introduces a new scheduling class with lower priority than the default CFS; its tasks run only when no higher‑priority tasks are runnable. This lets online workloads preempt offline tasks absolutely, achieving near‑perfect CPU isolation in mixed environments.
Key features include a full‑featured scheduling class comparable to CFS, support for offline task execution, and the ability to implement many custom scheduling behaviors.
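The class ordering described above can be illustrated with a toy user‑space model. This is a minimal sketch, not kernel code: the class names, the RunQueue type, and the pick_next and should_preempt helpers are all hypothetical and only show that a BT task is picked when no CFS task is runnable, and that a waking CFS task always preempts a running BT task.

```python
from collections import deque
from dataclasses import dataclass, field

# Hypothetical class order, highest priority first; BT sits below CFS.
CLASS_ORDER = ("cfs", "bt")


@dataclass
class RunQueue:
    # One FIFO per scheduling class (real CFS orders tasks by vruntime instead).
    queues: dict = field(default_factory=lambda: {c: deque() for c in CLASS_ORDER})

    def enqueue(self, task: str, sched_class: str) -> None:
        self.queues[sched_class].append(task)

    def pick_next(self):
        # Walk classes from highest to lowest priority and take the first
        # runnable task, so BT tasks run only when the CFS queue is empty.
        for sched_class in CLASS_ORDER:
            if self.queues[sched_class]:
                return sched_class, self.queues[sched_class].popleft()
        return None  # nothing runnable: the CPU would go idle


def should_preempt(running_class: str, waking_class: str) -> bool:
    # A waking task preempts only if its class ranks strictly higher.
    return CLASS_ORDER.index(waking_class) < CLASS_ORDER.index(running_class)
```

In this model an offline task enqueued first still runs after any online task, which is the isolation property the section describes.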
VMF Scheduler
VMF (VM First) Scheduler targets secure‑container and virtual‑machine scenarios where the standard CFS cannot guarantee real‑time performance. It adopts an “unfair” scheduling policy that biases CPU resources toward VM processes, classifies tasks by type rather than fine‑grained priority, and implements aggressive one‑way preemption to ensure VM threads receive timely scheduling without compromising overall throughput.
Benefits include microsecond‑level scheduling latency, a lightweight codebase (less than one‑third of CFS), and robust real‑time guarantees for VM threads.
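Classifying by type and preempting in one direction only can be sketched as follows. The TaskType values, the TypeQueue class, and can_preempt are assumptions made for illustration; the actual VMF data structures are not described in this article.

```python
import heapq
import itertools
from enum import IntEnum


class TaskType(IntEnum):
    # Hypothetical coarse task types; a lower value is served first.
    VM = 0     # vCPU and other VM threads
    OTHER = 1  # agents, kernel workers, everything else


def can_preempt(waking: TaskType, running: TaskType) -> bool:
    # One-way preemption: a VM thread may preempt a non-VM thread,
    # but a non-VM thread never preempts a VM thread.
    return waking is TaskType.VM and running is not TaskType.VM


class TypeQueue:
    """Run queue ordered by task type first, then by arrival order."""

    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that keeps FIFO order

    def enqueue(self, name: str, task_type: TaskType) -> None:
        heapq.heappush(self._heap, (task_type, next(self._seq), name))

    def pick_next(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

Because there are only two coarse types and no per‑task weights, the selection logic stays small, which is consistent with the lightweight design the article attributes to VMF.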
ECFS Scheduler
ECFS optimizes the upstream CFS for general workloads by introducing new task types to distinguish online and offline tasks, improving preemption logic, providing absolute preemption, and mitigating hyper‑threading interference.
Cloud‑Native Resource QoS – RUE
RUE (Resource Utilization Enhancement) is a TencentOS product designed to improve resource QoS in cloud‑native environments, increasing utilization while reducing operational costs. It introduces a global Pod priority concept that spans CPU, memory, I/O, and network resources.
Key modules include:
Cgroup Priority – unified Pod priority across all resource stacks.
CPU QoS – absolute preemption and isolation via TCNS.
Memory QoS – priority‑aware memory allocation, ensuring high‑priority containers receive timely memory while sacrificing low‑priority containers when necessary.
IO QoS – priority‑based I/O bandwidth allocation with guarantees for high‑priority workloads.
Net QoS – priority‑based network bandwidth allocation with minimum bandwidth guarantees.
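As an illustration of priority‑based allocation with minimum guarantees, the sketch below splits a link between groups in two passes. The Group fields, the "smaller number means higher priority" convention, and the two‑pass policy are assumptions chosen for the example; the article does not specify how RUE computes allocations.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    priority: int     # hypothetical convention: smaller number = higher priority
    min_mbps: int     # guaranteed minimum bandwidth
    demand_mbps: int  # bandwidth the group currently wants


def allocate(groups, link_mbps):
    # Pass 1: every group gets its guaranteed minimum, capped by its demand.
    alloc = {g.name: min(g.min_mbps, g.demand_mbps) for g in groups}
    remaining = link_mbps - sum(alloc.values())
    if remaining < 0:
        raise ValueError("minimum guarantees exceed link capacity")
    # Pass 2: hand out the rest in strict priority order.
    for g in sorted(groups, key=lambda g: g.priority):
        extra = min(g.demand_mbps - alloc[g.name], remaining)
        alloc[g.name] += extra
        remaining -= extra
    return alloc
```

With a 1000 Mbps link, an online group (priority 0, minimum 100, demand 700) and an offline group (priority 1, minimum 50, demand 900) end up with 700 and 300 Mbps respectively: the high‑priority demand is met in full, and the low‑priority group keeps at least its minimum.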
Quality Monitor
Quality Monitor evaluates container service quality (QoS) and provides low‑overhead, event‑driven monitoring. It offers two components:
Score – calculates per‑priority and per‑cgroup QoS scores based on stall time caused by interference.
Monitor Buffer – captures contextual information when QoS thresholds are breached.
It enables fine‑grained QoS scoring and rapid diagnosis of performance degradation.
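The relationship between stall time, score, and the Monitor Buffer can be sketched in a few lines. The scoring formula (percentage of the window not spent stalled), the threshold semantics, and the MonitorBuffer class are hypothetical; the article only states that scores derive from interference‑induced stall time and that context is captured on a breach.

```python
from collections import deque


def qos_score(stall_us: int, window_us: int) -> float:
    # Hypothetical formula: 100 means no stall, 0 means stalled all window long.
    if window_us <= 0:
        raise ValueError("window_us must be positive")
    stalled = min(max(stall_us, 0), window_us)
    return 100.0 * (1.0 - stalled / window_us)


class MonitorBuffer:
    """Bounded buffer that records context only when a score breaches the threshold."""

    def __init__(self, threshold: float, capacity: int = 128) -> None:
        self.threshold = threshold
        self.events = deque(maxlen=capacity)  # oldest events are dropped first

    def observe(self, cgroup: str, stall_us: int, window_us: int, context: dict) -> float:
        score = qos_score(stall_us, window_us)
        if score < self.threshold:
            # Event-driven: nothing is stored while the score stays healthy.
            self.events.append({"cgroup": cgroup, "score": score, **context})
        return score
```

Recording only on a breach is what keeps the overhead low in this model: the common, healthy path does a single comparison and stores nothing.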
Cloud‑Native SLI
SLI (Service Level Indicator) metrics are collected directly in the kernel for CPU, memory, I/O, and network dimensions, providing container‑level visibility for latency, throughput, error rates, and other critical performance indicators.
Cgroupfs
Cgroupfs is a kernel‑space virtual file system that presents container‑specific views of /proc and /sys, overcoming the limitations of user‑space solutions like lxcfs. It offers accurate container‑level statistics for common tools (top, free, iotop, vmstat) and is designed for Cgroup v2.
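To show what a container‑scoped view means in practice, the sketch below derives MemTotal and MemFree values from the cgroup v2 files memory.max and memory.current. It is a user‑space illustration of the idea only, not how TencentOS implements Cgroupfs; the function name, the returned keys, and the simplified "free = limit - current" accounting (which ignores reclaimable page cache) are assumptions.

```python
def container_meminfo(memory_max: str, memory_current: str, host_total_bytes: int) -> dict:
    """Build a container-scoped MemTotal/MemFree view from cgroup v2 values.

    memory_max and memory_current are the raw contents of the cgroup v2
    files memory.max and memory.current.
    """
    limit_text = memory_max.strip()
    if limit_text == "max":
        # "max" means the cgroup has no limit, so the host total applies.
        limit = host_total_bytes
    else:
        limit = min(int(limit_text), host_total_bytes)
    used = min(int(memory_current.strip()), limit)
    return {"MemTotal_kB": limit // 1024, "MemFree_kB": (limit - used) // 1024}
```

A tool such as free running inside a container limited to 1 GiB would then report about 1 GiB of total memory instead of the host's total, which is the behavior the section describes.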
TencentOS for Kubernetes
By aligning the Kubernetes Pod QoS classes (Guaranteed, Burstable, BestEffort) with TencentOS priority, the kernel natively perceives priority, delivering strong isolation across the cgroup subsystems and improving resource utilization for mixed workloads.
The feature is open‑sourced in the TKE distribution and will be available in upcoming releases.
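The alignment between Kubernetes QoS classes and a kernel priority can be sketched as a classification step followed by a lookup. The classification follows the standard Kubernetes rules in simplified form (quantities are compared as plain strings, so "500m" and "0.5" would not match); the numeric priority values in QOS_TO_PRIORITY are purely hypothetical, since the article does not state how TencentOS encodes priority.

```python
# Hypothetical mapping: smaller number = higher kernel priority.
QOS_TO_PRIORITY = {"Guaranteed": 0, "Burstable": 1, "BestEffort": 2}


def pod_qos_class(containers):
    """Classify a Pod from its containers' cpu/memory requests and limits.

    Each container is a dict like {"requests": {...}, "limits": {...}}.
    """
    resources = ("cpu", "memory")
    # BestEffort: no container sets any request or limit.
    if all(not c.get("requests") and not c.get("limits") for c in containers):
        return "BestEffort"
    # Guaranteed: every container sets cpu and memory limits, and each
    # request equals its limit (a missing request defaults to the limit).
    guaranteed = all(
        r in c.get("limits", {})
        and c.get("requests", {}).get(r, c["limits"][r]) == c["limits"][r]
        for c in containers
        for r in resources
    )
    return "Guaranteed" if guaranteed else "Burstable"


def pod_priority(containers):
    # Look up the (hypothetical) kernel priority for the Pod's QoS class.
    return QOS_TO_PRIORITY[pod_qos_class(containers)]
```

In a setup like this, a component on the node would write the resulting priority into the Pod's cgroup so that the CPU, memory, I/O, and network paths all see the same value.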
Conclusion
TencentOS continues to explore its cloud‑native journey, with ongoing innovations in scheduling, QoS, monitoring, and container‑aware file systems to meet the evolving demands of modern cloud workloads.
High Availability Architecture
Official account for High Availability Architecture.