
How Alibaba Cloud’s Differential SLO Boosts Kubernetes Resource Utilization

This article explains Alibaba Cloud Container Service for Kubernetes's differential SLO approach, detailing the reclaimed‑resource model, CPU burst and topology‑aware scheduling, kernel group identity, memory watermark tiering, and real‑world case studies that demonstrate significant improvements in cluster efficiency and latency‑sensitive workload performance.

Alibaba Cloud Native

Background

Alibaba Cloud has accumulated years of experience with "differential SLO" mixed deployment (co-location) and now offers a solution that runs heterogeneous workloads, both latency-sensitive (LS) and best-effort (BE), on the same node, exploiting their distinct resource-SLO characteristics to improve overall cluster utilization.

Resource Model

The reclaimed-resource model defines three zones on a node: usage (actual consumption), buffered (a portion held in reserve), and reclaimed (the excess that can be over-committed). The node's reclaimed amount is the sum of the resources reclaimed from its Guaranteed and Burstable pods.

# Node
status:
  allocatable:
    # milli-core
    alibabacloud.com/reclaimed-cpu: 50000
    # bytes
    alibabacloud.com/reclaimed-memory: 50000
  capacity:
    alibabacloud.com/reclaimed-cpu: 50000
    alibabacloud.com/reclaimed-memory: 100000

ACK exposes these reclaimed amounts as standard extended resources in the Node status.
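
As a rough illustration of the model, the sketch below sums the reclaimable CPU across LS pods. The article does not give ACK's exact calculation, so the per-pod formula (request minus usage minus a buffer), the 10% buffer, and every name here are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class PodCpu:
    """CPU figures for one Guaranteed/Burstable pod, in milli-cores."""
    request: int
    usage: int


def reclaimed_cpu(pods, buffer_percent=10):
    """Sum the reclaimable CPU across pods (hypothetical formula).

    Per pod: reclaimed = request - usage - buffer, floored at zero,
    where buffer = request * buffer_percent / 100 is held in reserve.
    """
    total = 0
    for pod in pods:
        buffer = pod.request * buffer_percent // 100
        total += max(0, pod.request - pod.usage - buffer)
    return total


pods = [PodCpu(request=4000, usage=1000), PodCpu(request=2000, usage=1900)]
print(reclaimed_cpu(pods))  # only the first, mostly idle pod contributes
```

A pod running close to its request contributes nothing, which is why the reclaimed pool shrinks as LS load rises.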

Using Reclaimed Resources in Pods

Low-priority pods can request reclaimed resources by setting the alibabacloud.com/qos label to BE (the label accepts BE or LS) and specifying alibabacloud.com/reclaimed-cpu and alibabacloud.com/reclaimed-memory in their resources section.

# Pod
metadata:
  labels:
    alibabacloud.com/qos: BE # {BE, LS}
spec:
  containers:
  - resources:
      limits:
        alibabacloud.com/reclaimed-cpu: 1000
        alibabacloud.com/reclaimed-memory: 2048
      requests:
        alibabacloud.com/reclaimed-cpu: 1000
        alibabacloud.com/reclaimed-memory: 2048
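
To make the two snippets concrete, here is a minimal sketch that checks a BE pod's reclaimed-resource section against a node. Kubernetes requires requests and limits to be equal for extended resources. The helper names are hypothetical, and the fit check is simplified: it ignores reclaimed resources already requested by other pods.

```python
CPU = "alibabacloud.com/reclaimed-cpu"
MEM = "alibabacloud.com/reclaimed-memory"


def validate_extended(resources):
    """Extended resources cannot be over-committed: requests == limits."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    return all(requests.get(key) == limits.get(key) for key in (CPU, MEM))


def fits(node_allocatable, resources):
    """True if each reclaimed request fits the node's allocatable amount."""
    requests = resources["requests"]
    return all(requests[key] <= node_allocatable.get(key, 0) for key in (CPU, MEM))


# Values copied from the Node and Pod examples in the article.
node_allocatable = {CPU: 50000, MEM: 50000}
pod_resources = {
    "limits": {CPU: 1000, MEM: 2048},
    "requests": {CPU: 1000, MEM: 2048},
}
print(validate_extended(pod_resources), fits(node_allocatable, pod_resources))
```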

Technical Details

CPU Burst

Kubernetes enforces CPU limits through the kernel's CFS bandwidth controller, which grants each container a quota of CPU time per 100 ms period. With a CPU limit of 2 cores, the kernel caps usage at 200 ms of CPU time per period; a container that exhausts its quota early is throttled until the next period, which causes latency spikes for LS workloads. CPU Burst lets a container bank unused quota from idle periods and spend it during bursts, reducing tail latency. ACK fully supports CPU Burst and, on kernels without native support, emulates the behavior by monitoring throttling and dynamically adjusting limits.
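
The effect can be sketched with a toy model, under stated assumptions: unused quota from a period is banked up to a burst cap and spent later, and throttled demand is simply counted rather than carried over. The function names and the demand trace are invented for illustration and do not reproduce the kernel's accounting.

```python
def cfs_quota_us(cpu_limit_cores, period_us=100_000):
    """CPU time (microseconds) a container may use per period."""
    return int(cpu_limit_cores * period_us)


def throttled_us(demand_per_period, quota_us, burst_us=0):
    """Total demand (microseconds) cut off by throttling in the toy model.

    Unused quota is banked, capped at burst_us, and added to the next
    period's budget. burst_us=0 models a plain quota without burst.
    """
    banked = 0
    throttled = 0
    for demand in demand_per_period:
        budget = quota_us + banked
        used = min(demand, budget)
        throttled += demand - used
        banked = min(burst_us, budget - used)
    return throttled


quota = cfs_quota_us(2)                      # limit of 2 cores
demand = [50_000, 50_000, 350_000, 50_000]   # one spike in the third period
print(throttled_us(demand, quota))                    # without burst
print(throttled_us(demand, quota, burst_us=200_000))  # with burst
```

In this trace the container averages well under its limit, yet the single spike is throttled without burst and absorbed with it.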

CPU Burst illustration

CPU Topology‑Aware Scheduling

High pod density on modern multi-core nodes leads to CPU contention and NUMA effects. The kubelet's static CPU Manager policy only works for Guaranteed QoS pods and is applied uniformly rather than per pod, so it lacks fine-grained control. ACK implements a topology-aware scheduler, built on the Kubernetes scheduling framework, that supports all QoS classes, enables per-pod core pinning, and selects the best node and CPU topology across the cluster.
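
One way to picture per-pod core pinning is a best-fit choice that keeps a pod's CPUs inside a single NUMA node. This is a simplified sketch, not ACK's scheduler: the data layout, the best-fit rule, and the function name are all assumptions.

```python
def pick_cpus(free_by_numa, wanted):
    """Pick `wanted` CPUs from a single NUMA node, or return None.

    free_by_numa maps a NUMA node id to its list of free CPU ids.
    Best fit: prefer the node that is left with the fewest free CPUs,
    which keeps larger nodes intact for larger pods.
    """
    candidates = [
        (len(cpus) - wanted, numa, cpus)
        for numa, cpus in free_by_numa.items()
        if len(cpus) >= wanted
    ]
    if not candidates:
        return None
    _, numa, cpus = min(candidates)
    return numa, sorted(cpus)[:wanted]


free = {0: [0, 1, 2, 3], 1: [8, 9]}
print(pick_cpus(free, 2))  # fits exactly into NUMA node 1
print(pick_cpus(free, 3))  # only NUMA node 0 is large enough
print(pick_cpus(free, 5))  # no single node can hold the pod
```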

Elastic Resource Limits (Reclaimed‑Resource)

The reclaimed‑resource pool varies dynamically with LS pod usage. BE pods consume reclaimed CPU only when LS pods leave sufficient headroom; otherwise, their effective CPU share shrinks.
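
The dynamic can be approximated as a ceiling on node CPU minus whatever LS pods are currently using. The 65% ceiling, the 64-core node, and the function name below are illustrative assumptions, not ACK defaults.

```python
def be_cpu_budget(node_cpu_milli, ls_usage_milli, threshold_percent=65):
    """CPU (milli-cores) that BE pods may use right now (hypothetical rule).

    BE pods get what is left under a node-wide utilization ceiling
    after subtracting current LS usage, never less than zero.
    """
    ceiling = node_cpu_milli * threshold_percent // 100
    return max(0, ceiling - ls_usage_milli)


# As LS usage grows on a 64-core node, the BE budget shrinks to zero.
for ls_usage in (10_000, 30_000, 45_000):
    print(ls_usage, be_cpu_budget(64_000, ls_usage))
```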

Reclaimed resource dynamics

Kernel Group Identity

Starting with kernel‑4.19.91‑24.al7, Alibaba Cloud Linux introduces Group Identity, adding a second red‑black tree for low‑priority tasks. This separates scheduling of high‑ and low‑priority tasks, minimizing wake‑up latency for high‑priority workloads and preventing low‑priority tasks from affecting them, even under SMT.
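
The two-tree idea can be illustrated with a toy run queue. This is a strict-priority simplification under stated assumptions: heaps ordered by virtual runtime stand in for the red-black trees, and the class and task names are invented. The real scheduler is considerably more nuanced.

```python
import heapq


class TwoTreeRunQueue:
    """Toy run queue with separate high- and low-priority queues."""

    def __init__(self):
        self._high = []  # stands in for the high-priority tree
        self._low = []   # stands in for the low-priority tree

    def enqueue(self, name, vruntime, high_priority):
        queue = self._high if high_priority else self._low
        heapq.heappush(queue, (vruntime, name))

    def pick_next(self):
        """Low-priority tasks run only when no high-priority task waits."""
        for queue in (self._high, self._low):
            if queue:
                return heapq.heappop(queue)[1]
        return None


rq = TwoTreeRunQueue()
rq.enqueue("spark-task", vruntime=5, high_priority=False)
rq.enqueue("nginx-worker", vruntime=90, high_priority=True)
print(rq.pick_next())  # the high-priority task, despite its larger vruntime
print(rq.pick_next())
```

With a single tree ordered by virtual runtime, the low-priority task would have run first; separating the trees removes that competition.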

Group Identity diagram

LLC and MBA Isolation

On bare‑metal nodes, ACK can dynamically adjust Last‑Level Cache (LLC) and Memory Bandwidth Allocation (MBA) for BE pods, reducing interference with LS pods.
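
On Linux, cache and memory-bandwidth partitioning of this kind is commonly expressed as schemata lines in the resctrl filesystem. The sketch below only formats such lines for a BE control group. Whether ACK uses exactly this path is an assumption, as are the 20-way cache, the percentages, and the helper names. Real hardware may also impose a minimum mask width and a bandwidth granularity.

```python
def contiguous_mask(total_ways, percent):
    """Hex bitmask covering `percent` of the cache ways (at least one way)."""
    ways = max(1, total_ways * percent // 100)
    return format((1 << ways) - 1, "x")


def be_schemata(cache_ids, total_ways, llc_percent, mba_percent):
    """Build resctrl-style schemata lines for a BE control group."""
    mask = contiguous_mask(total_ways, llc_percent)
    l3 = ";".join(f"{cid}={mask}" for cid in cache_ids)
    mb = ";".join(f"{cid}={mba_percent}" for cid in cache_ids)
    return f"L3:{l3}\nMB:{mb}"


# Two sockets, 20 cache ways each; BE gets 30% of LLC and 50% of bandwidth.
print(be_schemata([0, 1], total_ways=20, llc_percent=30, mba_percent=50))
```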

Global Memory Watermark Tiering

When BE tasks suddenly allocate large memory, the system may hit the global wmark_min, triggering direct memory reclamation and hurting LS latency. Alibaba Cloud Linux adds a tiered global wmark_min: BE’s watermark is raised (earlier reclamation) while LS’s is lowered (delayed reclamation), preventing LS from entering the slow reclamation path.
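
A toy model shows why tiering protects LS tasks. The percentage adjustments, the page counts, and both function names are assumptions for illustration. The kernel's actual formula is not given in the article.

```python
def tiered_wmark_min(global_wmark_min, adjust_percent):
    """Effective minimum watermark for a QoS class (toy formula).

    A positive adjust_percent raises the watermark (BE reclaims earlier);
    a negative one lowers it (LS reclaims later).
    """
    return global_wmark_min * (100 + adjust_percent) // 100


def must_direct_reclaim(free_pages, effective_wmark_min):
    """A task enters direct reclaim when free memory is below its watermark."""
    return free_pages < effective_wmark_min


GLOBAL_MIN = 10_000                          # pages, illustrative
be_min = tiered_wmark_min(GLOBAL_MIN, 50)    # raised for BE
ls_min = tiered_wmark_min(GLOBAL_MIN, -25)   # lowered for LS
free = 9_000                                 # below the untiered watermark
print(must_direct_reclaim(free, be_min), must_direct_reclaim(free, ls_min))
```

At this level of free memory an untiered system would push every allocating task into direct reclaim. With tiering, only BE tasks pay that cost.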

Asynchronous Background Reclamation

ACK introduces a container-level asynchronous reclamation mechanism using a workqueue and the memory.wmark_ratio control file (available in both cgroup v1 and v2). When a container's memory usage exceeds the threshold set by this ratio, the kernel reclaims memory in the background, before the container reaches the point where synchronous reclamation would occur.
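
A minimal sketch of the threshold logic follows. It assumes that memory.wmark_ratio is a percentage of the container's memory limit and that a value of 0 disables background reclamation. The function names and the byte values are hypothetical.

```python
def async_reclaim_threshold(limit_bytes, wmark_ratio):
    """Usage level above which background reclamation starts, or None.

    Assumes wmark_ratio is a percentage of the memory limit and that
    0 disables the mechanism.
    """
    if wmark_ratio == 0:
        return None
    return limit_bytes * wmark_ratio // 100


def reclaim_mode(usage_bytes, limit_bytes, wmark_ratio):
    """Classify which reclamation path a container would be on."""
    if usage_bytes >= limit_bytes:
        return "synchronous"
    threshold = async_reclaim_threshold(limit_bytes, wmark_ratio)
    if threshold is not None and usage_bytes > threshold:
        return "asynchronous"
    return "none"


for usage in (500, 950, 1000):
    print(usage, reclaim_mode(usage, limit_bytes=1000, wmark_ratio=90))
```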

Async reclamation workflow

Case Studies

CPU Burst Performance

Using Apache HTTP Server as an LS workload, enabling CPU Burst on Alibaba Cloud Linux 2 reduced the 99th‑percentile response time (RT‑p99) compared with CentOS 7, eliminated CPU throttling, and kept overall pod utilization stable.

RT‑p99 improvement
CPU throttling elimination

Mixed‑Workload Resource Efficiency

In a "Web + Big Data" scenario, nginx (LS) and Spark benchmark (BE) were co‑located on the same ACK node. Compared with non‑mixed baselines, the differential SLO suite kept nginx latency degradation under 5 % while increasing overall cluster CPU utilization from 49 % to 58 % and reducing Spark job total runtime by 8 %.

Mixed workload performance

Conclusion

Alibaba Cloud Container Service for Kubernetes (ACK) now offers a suite of differential SLO features—reclaimed resources, CPU burst, topology‑aware scheduling, kernel group identity, memory watermark tiering, and asynchronous reclamation—that can be used independently or together. Real‑world experiments show up to 30 % higher cluster utilization and latency‑sensitive performance impact limited to less than 5 % in mixed deployments.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
