Boosting Cluster Utilization with Alibaba's K8s Mixed Deployment and QoS Priorities
This article explains Alibaba's seven‑year experience with mixed deployment on Kubernetes, detailing how priority classes and QoS models are used to reclaim idle resources for low‑SLO workloads, improve overall cluster utilization, and maintain service‑level objectives for both online and offline pods.
Introduction
Since 2014 Alibaba has been developing a mixed-deployment technique that co-locates online and offline workloads on the same nodes. It has been validated through multiple Double-11 events and is now deployed at massive scale across the group. The approach saves billions of yuan annually and raises overall cluster resource utilization to around 70%.
The technology is packaged as a plug‑in that can be installed on any standard native Kubernetes cluster, providing mixed‑deployment control and operational capabilities to improve both resource usage and user experience.
Kubernetes Native Model
Practitioners often conflate Priority (scheduling order) with QoS (runtime resource guarantees). The native Kubernetes model keeps the two separate: Priority determines which pending pod the scheduler considers first and which pods may be pre-empted to make room, while the QoS classes (Guaranteed, Burstable, BestEffort) are derived from each pod's resource requests and limits and determine how the pod is treated at runtime, for example under node resource pressure.
Understanding this distinction is essential before introducing mixed‑deployment semantics.
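To make the native QoS rule concrete, here is a minimal Python sketch of how a pod's QoS class follows from its containers' requests and limits. It is a simplification for illustration, not the kubelet's code: the function name qos_class and the dict layout are assumptions, and quantities are compared as plain strings. Priority does not appear anywhere in it, because it is set separately through the pod's PriorityClass.

```python
from typing import Dict, List

Container = Dict[str, Dict[str, str]]  # {"requests": {...}, "limits": {...}}


def qos_class(containers: List[Container]) -> str:
    """Approximate the QoS class Kubernetes assigns to a pod.

    Simplification: quantities are compared as strings, so "1" and "1000m"
    count as different values here.
    """
    any_set = False
    guaranteed = True
    for c in containers:
        limits = c.get("limits", {})
        # A missing request defaults to the limit of the same resource.
        requests = {**limits, **c.get("requests", {})}
        for res in ("cpu", "memory"):
            if res in requests or res in limits:
                any_set = True
            # Guaranteed needs a limit for both resources, equal to the request.
            if res not in limits or requests.get(res) != limits[res]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


if __name__ == "__main__":
    print(qos_class([{"limits": {"cpu": "1", "memory": "1Gi"}}]))
    print(qos_class([{"requests": {"cpu": "500m"}}]))
    print(qos_class([{}]))
```

Two pods with the same QoS class can still carry very different priorities, which is why the two concepts have to be reasoned about independently.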
Problems Addressed by Mixed Deployment
The primary goal is to maximise cluster utilisation while preserving the service‑level objectives (SLO) of all deployed applications. Online services are typically provisioned with peak resource specifications, leaving a large portion of the allocated CPU and memory idle. Mixed deployment oversells this idle capacity to low‑SLO offline jobs, requiring SLO‑aware scheduling and real‑time resource awareness to avoid hotspots.
When node‑level utilisation becomes high, offline jobs can be pre‑empted to protect online SLOs. This pre‑emption leverages kernel‑level cgroup isolation to enforce strict resource boundaries.
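The overselling idea can be illustrated with a small calculation. The sketch below is an assumption about how reclaimable capacity might be estimated, not the formula used in Alibaba's implementation: the function name, the parameters, and the 10% safety margin are all hypothetical.

```python
def reclaimable_millicores(allocatable: int,
                           online_usage: int,
                           system_usage: int = 0,
                           safety_margin_pct: int = 10) -> int:
    """Estimate CPU (milli-cores) that could be offered to offline jobs.

    Hypothetical formula: node allocatable, minus what online pods and the
    system actually use (not what they requested), minus a reserved margin
    expressed as a percentage of allocatable.
    """
    reserve = allocatable * safety_margin_pct // 100
    return max(0, allocatable - online_usage - system_usage - reserve)


if __name__ == "__main__":
    # A 32-core node whose online pods requested 24 cores but use only 8.
    print(reclaimable_millicores(32000, online_usage=8000, system_usage=1000))
```

The point of the sketch is that the estimate is driven by measured usage rather than by requests, so it has to be recomputed continuously as online load changes.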
Application Level Model
Pods are classified into three custom QoS classes:
LSR – Latency Sensitive Reserved
LS – Latency Sensitive
BE – Best‑Effort (used for reclaimed resources)
The class is declared explicitly via pod annotations and labels, which are then mapped to both scheduling Priority and runtime QoS.
apiVersion: v1
kind: Pod
metadata:
  annotations:
    alibabacloud.com/qosClass: BE  # {LSR, LS, BE}
  labels:
    alibabacloud.com/qos: BE  # {LSR, LS, BE}
spec:
  containers:
  - resources:
      limits:
        alibabacloud.com/reclaimed-cpu: 1000  # milli-core, 1000 = 1 core
        alibabacloud.com/reclaimed-memory: 2048  # bytes (Gi, Mi, Ki, GB, MB, KB supported)
      requests:
        alibabacloud.com/reclaimed-cpu: 1000
        alibabacloud.com/reclaimed-memory: 2048

The BE class uses the extended-resource mechanism (alibabacloud.com/reclaimed-cpu and alibabacloud.com/reclaimed-memory) to request reclaimed capacity, while LSR and LS follow the standard CPU/memory fields.
These classes also influence network QoS, ensuring that low‑priority offline tasks do not monopolise bandwidth.
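The mapping from QoS class to resource fields described above can be sketched as a small manifest builder. The annotation, label, and extended-resource keys come from the example manifest; the function name build_pod, its parameters, and the container name and image used in the demo are hypothetical.

```python
RECLAIMED_CPU = "alibabacloud.com/reclaimed-cpu"
RECLAIMED_MEMORY = "alibabacloud.com/reclaimed-memory"
QOS_CLASSES = ("LSR", "LS", "BE")


def build_pod(name: str, image: str, qos: str,
              cpu_milli: int, memory_bytes: int) -> dict:
    """Build a pod manifest (as a dict) for one of the three QoS classes."""
    if qos not in QOS_CLASSES:
        raise ValueError(f"unknown QoS class: {qos}")
    if qos == "BE":
        # BE pods request reclaimed capacity through extended resources.
        res = {RECLAIMED_CPU: str(cpu_milli), RECLAIMED_MEMORY: str(memory_bytes)}
    else:
        # LSR and LS pods use the standard cpu/memory fields.
        res = {"cpu": f"{cpu_milli}m", "memory": str(memory_bytes)}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            "annotations": {"alibabacloud.com/qosClass": qos},
            "labels": {"alibabacloud.com/qos": qos},
        },
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                # Requests equal limits, as in the example manifest.
                "resources": {"limits": dict(res), "requests": dict(res)},
            }]
        },
    }


if __name__ == "__main__":
    import json
    print(json.dumps(build_pod("batch-job", "busybox", "BE", 1000, 2048), indent=2))
```

Keeping the class-to-field mapping in one place like this avoids a BE pod accidentally requesting standard CPU, which would count against the capacity reserved for online services.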
Scheduling Behavior
Both Priority and QoS classes affect the scheduler and the kubelet runtime. High‑SLO workloads (typically LSR or LS) receive higher scheduling priority and stronger QoS guarantees, while BE pods are scheduled later and can be pre‑empted when node pressure rises.
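As a rough illustration of this behavior, the following sketch orders pending pods by priority and selects BE pods to evict once node usage exceeds a threshold. It is a toy model under stated assumptions, not the scheduler's or kubelet's logic: the Pod dataclass, the priority values, and the 70% threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pod:
    name: str
    qos: str        # "LSR", "LS" or "BE"
    priority: int   # higher value is scheduled first
    cpu_usage: int  # milli-cores currently used


def scheduling_order(pending: List[Pod]) -> List[Pod]:
    """Return pending pods with the highest priority first (stable sort)."""
    return sorted(pending, key=lambda p: -p.priority)


def pick_victims(running: List[Pod], allocatable: int,
                 threshold_pct: int = 70) -> List[Pod]:
    """Pick BE pods to evict until node CPU usage is at or below the threshold."""
    limit = allocatable * threshold_pct // 100
    usage = sum(p.cpu_usage for p in running)
    victims = []
    # Only BE pods are candidates: lowest priority first, then largest usage.
    candidates = sorted((p for p in running if p.qos == "BE"),
                        key=lambda p: (p.priority, -p.cpu_usage))
    for pod in candidates:
        if usage <= limit:
            break
        victims.append(pod)
        usage -= pod.cpu_usage
    return victims


if __name__ == "__main__":
    running = [Pod("web", "LS", 9000, 5000),
               Pod("batch-a", "BE", 5000, 2000),
               Pod("batch-b", "BE", 5000, 3000)]
    print([p.name for p in pick_victims(running, allocatable=10000)])
```

Online pods are never candidates in this model, so if evicting every BE pod still leaves the node above the threshold, the function simply returns all BE pods and the remaining pressure has to be handled by other means.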
Quota, Waterline, and Multi‑Tenant Isolation
Beyond per‑pod priority, production deployments also enforce node‑level waterline thresholds, tenant‑specific quotas, and OS‑level isolation (cgroup, memory‑waterline, etc.) to guarantee SLOs across multiple tenants. These mechanisms are mentioned for completeness and will be detailed in future articles.
Related Solutions and References
Alibaba Cloud exposes the mixed‑deployment capabilities through the ACK Agile edition and the CloudNative Stack (CNStack) family, combined with the OpenAnolis operating system, forming an end‑to‑end cloud‑native data‑center solution.
Technical reference documents:
https://kubernetes.io/docs/concepts/scheduling-eviction/
https://kubernetes.io/docs/concepts/workloads/pods/disruptions/
https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/
https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass
https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/
https://kubernetes.io/docs/tasks/configure-pod-container/extended-resource/
Alibaba Cloud Native
We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.
