
Mixed Workload Co-location of Big Data and Online Services at iQIYI: Design, Implementation, and Results

iQIYI’s mixed-workload system colocates Spark/Hive big-data jobs with online video services by running YARN NodeManagers inside Kubernetes, using an Elastic YARN Operator, Koordinator-driven CPU oversubscription, and remote shuffle, boosting online CPU utilization from ~9% to over 40% and saving tens of millions of RMB annually.

iQIYI Technical Product Team

iQIYI adopts mixed-workload co-location (混部) as a cost-effective way to improve resource utilization by running offline and real-time big-data jobs alongside online video services. This article describes the end-to-end practice of building the mixed-workload system, using big data as the primary example.

Background: iQIYI’s big-data platform (Spark, Hive) supports critical business scenarios such as recommendation, search, and advertising. Offline jobs peak between 0 and 8 am, causing resource shortages, while the same resources sit largely idle during the day. Online services show the opposite pattern, with high usage in the daytime. Static CPU oversubscription on the previous container platform caused resource contention and unstable service quality.

Mixing Strategy: The team evaluated two approaches: (1) run big-data jobs directly on Kubernetes, or (2) run YARN NodeManagers (NMs) inside Kubernetes pods while keeping YARN scheduling. They chose option 2 because YARN already provides mature multi-tenant scheduling and security, and supports all major big-data frameworks.

Key Technical Components:

Containerized YARN NM pods managed by an Elastic YARN Operator for elastic node lifecycle.

Node Labels to separate elastic and fixed resources, ensuring real‑time Flink jobs run on stable nodes.

Graceful decommission of NM pods combined with Spark‑specific decommission handling (SPARK‑20624).

Remote shuffle service (Apache Uniffle) to decouple shuffle data from short‑lived NM pods.

Koordinator (and its Koordlet) for dynamic CPU oversubscription, providing per‑node CPU‑suppress metrics (1 s, 1 min, 5 min, 10 min averages) via Prometheus.

Dynamic scaling logic in the YARN Operator that adjusts NM resources based on observed offline CPU availability, with a guarder sidecar handling graceful container termination.
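The article does not show the Operator's scaling code, but the core decision it describes, growing or shrinking an elastic NM's vcores toward the node's observed reclaimable CPU, can be sketched roughly as below. The function name, the step size, and the idea of reading reclaimable capacity as a Koordinator-style batch-cpu quantity in millicores are all illustrative assumptions, not iQIYI's actual implementation.

```python
def next_nm_vcores(batch_cpu_millicores: int,
                   current_vcores: int,
                   min_vcores: int = 1,
                   max_vcores: int = 32,
                   step: int = 2) -> int:
    """Decide the next vcore count for an elastic NodeManager.

    batch_cpu_millicores: the node's reclaimable CPU, assumed to be
    reported as a Koordinator batch-cpu quantity in millicores.
    Moves toward the target by at most `step` cores per cycle, so
    scaling is gradual rather than oscillating.
    """
    target = batch_cpu_millicores // 1000  # whole reclaimable cores
    if target > current_vcores:
        # Spare capacity exists: grow, bounded by target and the cap.
        return min(current_vcores + step, target, max_vcores)
    if target < current_vcores:
        # Capacity shrank: shrink, but never below target or the floor.
        return max(current_vcores - step, target, min_vcores)
    return current_vcores
```

Stepping rather than jumping straight to the target is one plausible way to avoid thrashing the NM when the online load (and hence batch-cpu) fluctuates between reconcile cycles.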

Evolution Stages:

Stage 1 – Night-time time-sharing: Elastic NM pods run only from 0 to 9 am. This stage delivered 20+ improvements, including a fixed IP pool, the Elastic YARN Operator, node-label isolation, graceful decommission, and Uniffle integration.
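The time-sharing rule in Stage 1 reduces to a simple window check that the Operator can apply on each reconcile. The function and window constant below are a minimal sketch based on the 0–9 am schedule stated above; the real Operator's logic is not shown in the article.

```python
from datetime import datetime, time

# Nightly window during which elastic NM pods may exist (assumed
# bounds, taken from the 0-9 am schedule described in Stage 1).
ELASTIC_WINDOW = (time(0, 0), time(9, 0))

def elastic_nm_allowed(now: datetime) -> bool:
    """True if elastic NodeManager pods may run at this moment.

    Half-open interval: pods may start at 00:00 and must be gone
    by 09:00, when online daytime traffic ramps up.
    """
    start, end = ELASTIC_WINDOW
    return start <= now.time() < end
```

In practice such a check would be paired with the graceful-decommission path, so that pods reaching the window's end drain running containers instead of being killed outright.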

Stage 2 – Resource oversubscription: Koordinator allocates reclaimed CPU (batch-cpu) and memory to elastic NMs, with CPU-suppress thresholds driving dynamic eviction.

Stage 3 – 24 h real-time elasticity: NM pods dynamically sense CPU-suppress metrics and scale vertically; the guarder sidecar performs graceful kills when resources become scarce.
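The guarder sidecar's job, as described, is to terminate offline containers gracefully when CPU must be returned to online services. A generic graceful-kill sequence looks roughly like the sketch below; the `container` interface (`request_stop`/`is_running`/`kill`) is hypothetical, standing in for whatever signal or decommission RPC the real sidecar uses.

```python
import time

def graceful_stop_container(container, grace_seconds: float = 30.0,
                            poll_interval: float = 1.0) -> str:
    """Guarder-style graceful kill (illustrative sketch).

    Asks the container to drain, waits up to grace_seconds for it to
    exit on its own, then force-kills. `container` is a hypothetical
    object exposing request_stop(), is_running(), and kill().
    """
    container.request_stop()          # e.g. SIGTERM / decommission request
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not container.is_running():
            return "exited-gracefully"
        time.sleep(poll_interval)
    container.kill()                  # hard kill after the grace period
    return "force-killed"
```

Combined with Spark's executor-decommission handling (SPARK-20624) and Uniffle's remote shuffle, a graceful stop lets in-flight tasks hand off state instead of failing, which is what makes aggressive reclamation safe for offline jobs.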

Stage 4 – Higher oversubscription ratios: Memory oversubscription is introduced, adding eviction handling and yielding further CPU-utilization gains.

Results: The mixed-workload solution raised online-service CPU utilization from ~9% to over 40% without additional hardware, saving tens of millions of RMB annually. It also enabled elastic OLAP (Impala/Trino) on Kubernetes, supporting peak events such as holidays and major advertising campaigns.

Future Plans: Improve observability, broaden mixed-workload coverage, extend to multi-cloud environments, and explore native Kubernetes batch schedulers for big-data jobs.

Tags: Cloud Native, Big Data, Kubernetes, Resource Scheduling, YARN, Mixed Workload