How Mixed Workloads Boost Kubernetes CPU Utilization to Over 40%
This article explains how Youzan transformed its Kubernetes clusters from static over‑commit scheduling to load‑balanced mixed workloads using Koordinator and the Longxi kernel, achieving higher CPU utilization, lower costs, and better resource management for both online and offline services.
Background
As Youzan's business rapidly grows, demand for compute resources increases, putting pressure on cost control and supply. Meanwhile overall cluster resource utilization remains low, indicating room for improvement.
Main reasons:
Online services concentrate during daytime, leading to low CPU usage at night.
Offline jobs run at night, causing high CPU usage then and idle during day.
Burst traffic is over‑provisioned, reserving extra resources and lowering average CPU usage.
Static scheduling creates uneven node water levels (per-node utilization), preventing further CPU gains.
With cloud-native adoption, the Kubernetes community has introduced many mixed-workload projects, and resource management has shifted from static over-commit to load-balanced scheduling plus online/offline mixing. Since 2022, Youzan's mixed-workload clusters have sustained roughly 40% average CPU utilization.
Solution Design
Based on Koordinator and the Longxi Linux kernel mixed‑workload architecture:
Scheduling layer:
Load‑balanced scheduling ensures new Pods land on idle nodes.
Rescheduling continuously evens node water‑levels as traffic changes.
Big-data tasks (Spark ThriftServer) actively sense the size of the offline resource pool to avoid over-committing it.
QoS guarantees:
Separate resource pools for online (LS) and offline (BE) levels.
Online services auto-scale; a timed eviction controller releases idle offline resources.
Offline CPU satisfaction eviction ensures task quality; memory pressure eviction protects nodes.
CPU suppression of offline workloads safeguards online service quality.
Asynchronous container memory reclamation with protection thresholds.
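Separating the two pools comes down to tagging Pods with a QoS class. A minimal sketch of an offline (BE) Pod following Koordinator's conventions (the `koordinator.sh/qosClass` label and `kubernetes.io/batch-*` extended resources come from Koordinator's documentation; the Pod name, image, and quantities here are illustrative):

```yaml
# Illustrative BE (best-effort / offline) Pod in a Koordinator-managed cluster.
apiVersion: v1
kind: Pod
metadata:
  name: spark-executor-example          # hypothetical name
  labels:
    koordinator.sh/qosClass: BE         # offline pool; online services use LS
spec:
  schedulerName: koord-scheduler
  containers:
    - name: executor
      image: spark:3.5.0                # illustrative image
      resources:
        requests:
          kubernetes.io/batch-cpu: "4000"    # reclaimed CPU, milli-cores
          kubernetes.io/batch-memory: 8Gi
        limits:
          kubernetes.io/batch-cpu: "4000"
          kubernetes.io/batch-memory: 8Gi
```

Because BE Pods request only the reclaimed `batch-*` resources, they consume spare capacity without shrinking the allocatable resources seen by online (LS) Pods.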
Cluster resource assurance:
Load‑based node scaling.
Longxi kernel asynchronous memory reclamation.
Strategy Evolution
Mixed workloads for online and offline services have progressed through node time‑sharing, load‑balanced scheduling, and steady‑state mixing.
Node Time‑Sharing
Cluster A mounts nodes from Cluster B through Virtual Kubelet (VK); the policy is driven by controlling node and VK scheduling states.
Periodically evicting Pods on shared nodes and adjusting scheduling labels lets the two clusters reuse the same nodes at different times of day, improving offline resource utilization.
Load‑Balanced Scheduling
Using Koordinator’s load‑aware scheduling and precise application profiling, the online cluster shifts from static to dynamic water‑level‑based scheduling, raising average CPU water‑level from ~10% to ~25%.
Scheduling
Koordinator's load-aware scheduling and rescheduling move the cluster from static allocation to scheduling on real node load, unlike the native Kubernetes scheduler, which bases decisions on allocated (requested) resources. By considering each node's historical load and estimating that of incoming Pods, the scheduler places Pods on less loaded nodes, balancing node water levels and avoiding bottleneck nodes.
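The placement idea can be sketched in a few lines of Python (a simplified model for illustration, not Koordinator's actual scoring plugin): each node is scored by its predicted utilization after adding the Pod's estimated usage, and the least loaded feasible node wins.

```python
def score_node(node_usage_cores, node_capacity_cores, pod_estimated_cores):
    """Lower predicted utilization => higher score (0-100).

    Returns None when placing the Pod would exceed node capacity.
    """
    predicted = (node_usage_cores + pod_estimated_cores) / node_capacity_cores
    if predicted > 1.0:
        return None
    return (1.0 - predicted) * 100

def pick_node(nodes, pod_estimated_cores):
    """nodes: {name: (current_usage_cores, capacity_cores)}.

    Picks the feasible node with the highest score, i.e. the one that
    stays least loaded after placement.
    """
    scored = {}
    for name, (usage, capacity) in nodes.items():
        score = score_node(usage, capacity, pod_estimated_cores)
        if score is not None:
            scored[name] = score
    return max(scored, key=scored.get) if scored else None
```

With two nodes at 28/32 and 8/32 cores, a 4-core Pod lands on the idle node rather than the nearly full one, which is exactly the behavior static request-based scheduling cannot guarantee.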
Rescheduling
Even with load-aware placement, running Pods may need to move, for three main reasons:
Hotspot nodes become overloaded, hurting performance.
Under-utilized nodes should be drained and shut down to cut costs.
Fragmentation prevents large Pods from scheduling despite sufficient total resources.
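The first two triggers reduce to classifying nodes against high and low watermarks, roughly as a descheduler would. A minimal sketch (the 65%/20% thresholds are illustrative assumptions, not Youzan's published values):

```python
def classify_nodes(node_util, high=0.65, low=0.20):
    """node_util: {node_name: CPU utilization in [0, 1]}.

    Returns (hotspot nodes to drain Pods from,
             under-utilized nodes that are scale-down candidates).
    """
    hotspots = [n for n, u in node_util.items() if u > high]
    cold = [n for n, u in node_util.items() if u < low]
    return hotspots, cold
```

A rescheduler loop would then evict movable Pods from the hotspot list and cordon the cold list, letting load-aware scheduling redistribute them.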
Application Profiling
After load-balanced scheduling and rescheduling are in place, node water-level gaps narrow, but peak loads can still create hotspots. Profiling identifies high-load applications and pre-emptively spreads them across nodes, mitigating hotspot formation.
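One standard way to pre-emptively spread a profiled high-load application is the upstream Kubernetes `topologySpreadConstraints` field, shown here as an illustrative fragment (Youzan's actual profiling pipeline and labels are not public; the `app: hot-service` selector is hypothetical):

```yaml
# Illustrative: keep replicas of a profiled high-load app on distinct
# nodes so their peaks do not stack on the same machine.
spec:
  topologySpreadConstraints:
    - maxSkew: 1
      topologyKey: kubernetes.io/hostname
      whenUnsatisfiable: DoNotSchedule   # hard spread across nodes
      labelSelector:
        matchLabels:
          app: hot-service               # hypothetical label
```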
Steady‑State Mixing
To reduce resource fragmentation and holding costs, workloads from dedicated Kubernetes clusters are merged into a large mixed‑workload cluster.
Offline Resource Pool Awareness
The big-data task scheduler is refactored to dynamically sense the Koordinator BE resource pool and cap the scale of offline Pod creation accordingly.
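The quantity the scheduler senses is the reclaimable BE capacity per node. A simplified model of the colocation arithmetic (the 65% threshold ratio is an illustrative assumption; Koordinator's real calculation also accounts for reservations and safety margins):

```python
def be_allocatable_cores(node_allocatable, online_usage, system_usage,
                         threshold_ratio=0.65):
    """Reclaimable (BE) CPU on one node, simplified:
    a capped share of the node minus what online (LS) Pods and
    system daemons are actually using. Never negative.
    """
    reclaimable = node_allocatable * threshold_ratio - online_usage - system_usage
    return max(reclaimable, 0.0)
```

Summing this over nodes gives the BE pool size; the big-data scheduler then launches only as many offline executors as that pool can hold, instead of over-creating Pods that would be throttled or evicted.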
Eviction
Active and passive eviction manage resource levels across time, ensuring service quality during peak periods.
Active Eviction
Online services auto‑scale; during low traffic, memory is released based on usage profiles.
Offline services use timed eviction controllers to remove idle Pods, freeing resources for online workloads.
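The timed controller's core decision is simple: among offline Pods, pick those idle longer than a TTL. A sketch under assumed inputs (the 30-minute TTL and the Pod record fields are illustrative, not Youzan's actual controller API):

```python
import time

def idle_eviction_candidates(pods, idle_ttl_s=1800, now=None):
    """pods: list of dicts with 'name', 'qos' ('LS' or 'BE'),
    and 'last_active_ts' (epoch seconds).

    Returns names of offline (BE) Pods idle longer than the TTL;
    online (LS) Pods are never selected here.
    """
    now = time.time() if now is None else now
    return [p["name"] for p in pods
            if p["qos"] == "BE" and now - p["last_active_ts"] > idle_ttl_s]
```

Running this on a schedule during low-traffic windows frees BE capacity before online morning peaks arrive.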
Passive Eviction
In extreme cases (high memory usage with OOM risk, or sustained offline CPU shortage), passive eviction policies rank Pods by defined priority, resource consumption, and runtime, and evict in that order to restore headroom.
Offline CPU satisfaction eviction removes starved tasks so they can be rescheduled onto nodes with sufficient resources.
Memory pressure eviction protects nodes from OOM.
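The ranking above can be sketched as a sort key. This is one plausible reading of the stated criteria (lowest priority first, then highest usage, then shortest runtime so the least completed work is lost); the field names and tie-break order are assumptions, not Youzan's exact policy:

```python
def eviction_order(pods):
    """pods: list of dicts with 'name', 'priority' (lower = evict first),
    'cpu_usage' (cores), 'runtime_s'.

    Sorts candidates so that low-priority, resource-hungry,
    recently started Pods are evicted first.
    """
    ranked = sorted(pods, key=lambda p: (p["priority"],
                                         -p["cpu_usage"],
                                         p["runtime_s"]))
    return [p["name"] for p in ranked]
```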
Results
To date, Youzan’s mixed‑workload capability covers major business clusters, handling both online and offline scenarios. Large‑scale container mixing has yielded significant gains:
CPU utilization: online mixed clusters achieve daily average CPU utilization above 40% while maintaining service quality.
Resource cost: mixed clusters reduce costs by roughly 20% without compromising offline stability.
Event cost savings: during shopping festivals, online workloads temporarily use offline resources, avoiding extra scaling expenses.
Youzan Coder
Official Youzan tech channel, delivering technical insights and occasional daily updates from the Youzan tech team.