
Mixed-Workload Scheduling and Resource Utilization Optimization in Xiaohongshu's Cloud-Native Platform

Xiaohongshu’s cloud‑native platform adopted a four‑stage mixed‑workload scheduling strategy—reusing idle nodes, whole‑machine time‑sharing, normal mixed pools, and a unified scheduler (Tusker) that coordinates CPU, GPU and memory across Kubernetes and YARN—boosting average cluster CPU utilization from under 20 % to over 45 % and delivering millions of low‑cost core‑hours while preserving QoS for latency‑sensitive, mid, and batch jobs.

Xiaohongshu Tech REDtech

According to Gartner, global IT spending in 2024 is expected to reach $5.1 trillion, an 8 % increase over 2023, while average CPU utilization in data‑center servers remains below 20 %.

Google’s 2015 Borg paper first demonstrated mixed‑workload (mixed‑tenant) scheduling to improve utilization; many Chinese internet companies have followed suit.

At Xiaohongshu, rapid business growth has led to low daily CPU utilization in many clusters. Main causes: tidal usage patterns, fragmented exclusive resource pools, and over‑provisioning for stability.

Since 2022 the Xiaohongshu container team has deployed mixed‑workload techniques at scale, raising average cluster CPU utilization to over 45 % and delivering millions of core‑hours of cost‑effective compute.

Technical evolution is described in four stages:

Stage 1 – Reuse of idle resources: Consolidate idle nodes from exclusive pools and allocate them to transcoding workloads via a “metadata” cluster that aggregates resources using Virtual‑Kubelet.

Stage 2 – Whole‑machine time‑sharing: During low‑traffic periods, scale down online services with HPA, release whole machines, and run offline jobs (transcoding, training) on the freed capacity.

Stage 3 – Normal mixed‑workload: Merge online and offline services into a shared pool, applying pool‑level over‑commit, resource oversell, and fine‑grained scheduling to raise CPU allocation rates.

Stage 4 – Unified scheduling: Introduce a unified scheduler that coordinates heterogeneous resources (CPU, GPU, memory) across K8s and YARN, supporting QoS‑aware placement, interference detection, and resource‑selling models.
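The Stage 2 mechanism hinges on knowing when whole machines may be lent out. A minimal sketch of that decision, assuming illustrative low‑traffic windows (the actual schedule and the `offline_window_open` helper are hypothetical, not from the article):

```python
from datetime import time

# Hypothetical low-traffic windows during which HPA has scaled online
# services down and whole machines can be lent to offline jobs.
OFFLINE_WINDOWS = [(time(1, 0), time(7, 0))]

def offline_window_open(now: time) -> bool:
    """Return True if offline jobs may borrow whole machines right now."""
    return any(start <= now < end for start, end in OFFLINE_WINDOWS)
```

A time-sharing controller would gate offline-job admission on this check and drain the borrowed machines before the window closes.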

The unified scheduler, named Tusker (The Unified Scheduling system based on Kubernetes for Efficiency and Reliability), receives workload submissions from multiple publishing platforms, performs QoS‑aware dispatch, and integrates with a “resource view” that aggregates offline‑available capacity.

Offline resource view is calculated as:

OfflineAvailable = TotalMachineResources – ReservedResources – OnlineServiceActualUsage
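The formula translates directly into code. A minimal sketch (the clamp at zero is my assumption, so a momentary online usage spike never reports negative capacity to the offline side):

```python
def offline_available(total: float, reserved: float, online_actual: float) -> float:
    """Offline-visible capacity = total machine resources
    minus reserved resources minus actual online-service usage.
    Clamped at zero as a defensive assumption."""
    return max(0.0, total - reserved - online_actual)

# e.g. a 96-core node with 8 cores reserved and 30 cores of actual online usage
# exposes 58 cores to offline jobs.
```

Because the third term is *actual* usage rather than requests, the offline view shrinks and grows with online traffic, which is what makes dynamic oversell safe.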

QoS levels are defined as:

Latency‑Sensitive (highest QoS, e.g., search/recommendation/ads serving)

Mid (default QoS, tolerant of some interference, e.g., Java micro‑services)

Batch (lowest QoS, non‑latency‑critical, e.g., batch transcoding, Spark/Flink jobs)
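The three tiers form a strict ordering that placement and eviction decisions key off. A sketch of how they might be encoded (the enum and `eviction_order` helper are illustrative, not Tusker's actual API):

```python
from enum import IntEnum

class QoS(IntEnum):
    # Higher value = higher service priority; names follow the article's tiers.
    BATCH = 0              # lowest: batch transcoding, Spark/Flink jobs
    MID = 1                # default: Java micro-services, tolerate some interference
    LATENCY_SENSITIVE = 2  # highest: latency-critical serving

def eviction_order(pods):
    """Order pods lowest-QoS-first; under pressure, Batch goes before Mid."""
    return sorted(pods, key=lambda p: p["qos"])
```

An `IntEnum` keeps the tiers comparable, so "evict the lowest tier first" is a plain sort.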

Scheduling strategies include:

Dynamic oversell to expose spare capacity to offline jobs.

Two‑scheduler resource synchronization (K8s ↔ YARN) via Koord‑Yarn‑Operator.

Offline eviction based on priority, resource consumption, and runtime when resources become scarce.
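The eviction strategy above can be sketched as a ranking over the three stated criteria. The tie-breaking directions are my assumptions (reclaim more per eviction by preferring heavy consumers; among equals, prefer evicting younger jobs, which have less work to lose):

```python
def pick_victims(pods, cores_needed):
    """Pick offline pods to evict until enough CPU is freed.

    Rank ascending by priority, then descending by CPU consumption,
    then ascending by runtime -- an illustrative ordering of the
    article's three criteria, not Tusker's exact policy.
    """
    ranked = sorted(
        pods,
        key=lambda p: (p["priority"], -p["cpu_used"], p["runtime_s"]),
    )
    victims, freed = [], 0.0
    for pod in ranked:
        if freed >= cores_needed:
            break
        victims.append(pod["name"])
        freed += pod["cpu_used"]
    return victims
```

Evicting greedily down the ranked list stops as soon as the shortfall is covered, so high-priority offline jobs survive unless the squeeze is severe.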

Operational results to date:

Average CPU utilization of mixed clusters exceeds 45 % (some clusters reach 55 %).

Online clusters see an 8–15 % utilization uplift; storage clusters, up to 20 %.

CPU allocation rates above 125 %, dramatically reducing fragmentation.

Provision of millions of core‑hours of low‑cost compute for offline workloads.

Future work focuses on expanding mixed‑workload scheduling for big‑data and AI tasks, further improving resource efficiency through larger resource pools and quota‑based delivery, and strengthening QoS‑aware and interference‑detection capabilities.

Cloud Native · Big Data · Kubernetes · Resource Scheduling · QoS · CPU Utilization · Mixed Workloads
Written by

Xiaohongshu Tech REDtech

Official account of the Xiaohongshu tech team, sharing tech innovations and problem insights, advancing together.
