How Koordinator Revolutionizes Cloud‑Native Mixed‑Workload Scheduling

Koordinator, an open‑source cloud‑native scheduler launched in April 2022, unifies heterogeneous workloads on Kubernetes through zero‑intrusion plugins, fine‑grained resource oversubscription, QoS‑aware scheduling, and a flexible descheduler framework, dramatically improving resource utilization and latency‑sensitive service performance.

Alibaba Cloud Native

Nov 4, 2022

Project Origin and Evolution

Since its first release in April 2022, Koordinator has undergone eight version iterations, attracting engineers from Alibaba, Xiaomi, Xiaohongshu, iQIYI, and 360. On November 3, 2022, at the Hangzhou Cloud Native Conference, Alibaba Cloud announced the official launch of Koordinator 1.0.

Motivation and Vision

Koordinator addresses the need to coordinate diverse Kubernetes workloads—batch, real‑time, AI, and big‑data jobs—so they can share nodes efficiently. It builds on the mixed‑workload concepts pioneered by Google’s Borg system and the widespread adoption of Kubernetes as the industry standard.

Key Challenges in Mixed‑Workload Deployment

Application Integration – How to onboard workloads onto a mixed‑workload platform.

Stable and Efficient Execution – How to keep applications running reliably and with high performance.

Design Principles

Koordinator delivers a zero‑intrusion solution:

No changes to native Kubernetes components; functionality is added via plugins.

No modifications to workload operators; policies are configured declaratively.

Seamless, lossless upgrade from the default scheduler to Koordinator.

Architecture Overview

The architecture provides complete mixed‑workload orchestration, resource scheduling, isolation, and performance tuning. It reclaims resources that latency‑sensitive services have been allocated but are not using and offers them to lower‑priority workloads, boosting overall utilization without sacrificing the latency‑sensitive services.

Large‑Scale Production Experience

Alibaba has run mixed‑workload scheduling at scale since 2016, achieving over 50% CPU utilization and cutting Double‑11 2022 compute costs dramatically. Koordinator inherits three generations of internal architecture and aims to provide a neutral, community‑driven solution for enterprises.

Supported Workload Types

Fine‑grained resource orchestration for performance‑critical and tail‑latency requirements.

Intelligent oversubscription that runs low‑priority tasks on allocated but unused resources.

The oversubscription model distinguishes allocated, used, and reclaimable resources on each node, which lets batch (MapReduce‑like) and real‑time jobs coexist.
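
The relationship between what is allocated, what is actually used, and what can be oversubscribed comes down to a little arithmetic. The sketch below is an illustration only, not Koordinator's implementation: the function name reclaimable_cpu, the safety_margin parameter, and the sample numbers are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PodCPU:
    """CPU figures for one latency-sensitive pod, in millicores."""
    requested: int  # what the scheduler has allocated to the pod
    used: int       # what the pod is consuming right now


def reclaimable_cpu(node_capacity: int, system_used: int,
                    pods: list[PodCPU], safety_margin: float = 0.1) -> int:
    """Estimate the CPU (millicores) that could be lent to low-priority tasks.

    Assumed formula: capacity minus a safety margin, minus system usage,
    minus what the high-priority pods really use (not what they requested).
    """
    reserved = int(node_capacity * safety_margin)
    high_priority_used = sum(p.used for p in pods)
    return max(0, node_capacity - reserved - system_used - high_priority_used)


pods = [PodCPU(requested=8000, used=2000), PodCPU(requested=6000, used=1500)]
allocated = sum(p.requested for p in pods)
spare = reclaimable_cpu(node_capacity=16000, system_used=500, pods=pods)
print(f"allocated={allocated}m reclaimable={spare}m")
```

In this example the node looks almost full when judged by requests, yet most of its CPU is still available to low-priority work because actual usage is far below the allocation.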

Zero‑Intrusion Integration Details

Plugin‑based extensions keep the Kubernetes core untouched.

Configuration‑driven policies avoid changes to operators.

Gradual scheduler upgrade preserves existing Pod scheduling semantics.

Feature Deep Dives

Enhanced Coscheduling

Supports strict and non‑strict modes and multi‑role AI jobs (e.g., TensorFlow with PS and Worker) to satisfy All‑or‑Nothing requirements.
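
The All‑or‑Nothing requirement for a multi‑role job can be sketched as follows. This is a simplified model under stated assumptions, not Koordinator's plugin: place_gang, the first‑fit node choice, and the per‑role minimums are hypothetical, and the strict and non‑strict modes are not modeled.

```python
from typing import Optional


def place_gang(pods: list[tuple[str, str, int]],
               min_members: dict[str, int],
               node_free_cpu: dict[str, int]) -> Optional[dict[str, str]]:
    """Return a pod -> node plan, or None if any role misses its minimum.

    pods: (pod name, role, CPU request in millicores).
    The plan is built on a copy of the node state, so a failed attempt
    leaves no partial placement behind (the All-or-Nothing property).
    """
    free = dict(node_free_cpu)
    plan: dict[str, str] = {}
    placed_per_role = {role: 0 for role in min_members}
    for name, role, cpu in pods:
        # First fit: take the first node with enough free CPU.
        node = next((n for n, f in free.items() if f >= cpu), None)
        if node is None:
            continue
        free[node] -= cpu
        plan[name] = node
        placed_per_role[role] = placed_per_role.get(role, 0) + 1
    if all(placed_per_role[r] >= m for r, m in min_members.items()):
        return plan
    return None


job = [("ps-0", "ps", 2000), ("worker-0", "worker", 4000), ("worker-1", "worker", 4000)]
minimums = {"ps": 1, "worker": 2}
print(place_gang(job, minimums, {"node-a": 6000, "node-b": 4000}))
print(place_gang(job, minimums, {"node-a": 6000}))
```

The second call returns no plan at all: one worker cannot be placed, so the PS and the other worker are not started either.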

Enhanced ElasticQuota Scheduling

Compatible with the community ElasticQuota CRD.

Tree‑structured quota management aligns with organizational hierarchies.

Shared‑weight fairness and optional quota borrowing.
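
One way to picture weighted sharing with borrowing among sibling quotas is the sketch below. It is an assumption-laden illustration, not the ElasticQuota algorithm itself: the Quota fields, the runtime_quota function, and the rounding behavior are hypothetical, and a real tree would apply the same step at every level of the hierarchy.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    name: str
    minimum: int   # guaranteed share
    maximum: int   # hard cap, even when borrowing
    weight: int    # share of spare capacity relative to siblings (must be > 0)
    request: int   # what the group's pods currently ask for


def runtime_quota(total: int, quotas: list[Quota]) -> dict[str, int]:
    """Split `total` among sibling quotas: guarantees first, then borrowing."""
    # Step 1: each group gets its demand, up to its guaranteed minimum.
    grant = {q.name: min(q.request, q.minimum) for q in quotas}
    spare = total - sum(grant.values())
    # Step 2: hand out spare capacity by weight until nobody wants more.
    while spare > 0:
        hungry = [q for q in quotas
                  if grant[q.name] < min(q.request, q.maximum)]
        if not hungry:
            break
        weight_sum = sum(q.weight for q in hungry)
        handed_out = 0
        for q in hungry:
            share = spare * q.weight // weight_sum
            extra = min(share, min(q.request, q.maximum) - grant[q.name])
            grant[q.name] += extra
            handed_out += extra
        if handed_out == 0:
            break  # what remains is too small to split by integer weights
        spare -= handed_out
    return grant


teams = [
    Quota("team-a", minimum=40, maximum=80, weight=1, request=20),
    Quota("team-b", minimum=30, maximum=100, weight=3, request=90),
    Quota("team-c", minimum=30, maximum=40, weight=1, request=50),
]
print(runtime_quota(100, teams))
```

Here team-a uses only half of its guarantee, so the unused part is lent to team-b and team-c in proportion to their weights, and team-c stays under its maximum.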

Fine‑Grained Device Scheduling

GPU sharing, exclusive use, and oversubscription.

Percentage‑based GPU allocation.

Multi‑GPU scheduling and NVLink‑aware placement (in progress).
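
Percentage‑based allocation can be illustrated with a small allocator. The sketch assumes a convention in which a request of up to 100 is a slice of one GPU and a larger request must be a multiple of 100; allocate_gpu and its best‑fit policy are hypothetical and are not taken from Koordinator's code.

```python
def allocate_gpu(free_percent: list[int], request: int) -> list[int]:
    """Return the indices of the GPUs chosen for `request`, or [] if none fit.

    free_percent[i] is the unallocated share of GPU i (0 to 100) and is
    updated in place when the allocation succeeds.
    """
    if request <= 0:
        raise ValueError("request must be positive")
    if request <= 100:
        # Best fit: the tightest GPU that still fits keeps idle GPUs whole.
        fits = [i for i, free in enumerate(free_percent) if free >= request]
        if not fits:
            return []
        chosen = min(fits, key=lambda i: free_percent[i])
        free_percent[chosen] -= request
        return [chosen]
    if request % 100 != 0:
        raise ValueError("multi-GPU requests must be a multiple of 100")
    idle = [i for i, free in enumerate(free_percent) if free == 100]
    needed = request // 100
    if len(idle) < needed:
        return []
    for i in idle[:needed]:
        free_percent[i] = 0
    return idle[:needed]


gpus = [100, 100, 40, 100]
print(allocate_gpu(gpus, 30))   # a slice lands on the partly used GPU
print(allocate_gpu(gpus, 200))  # an exclusive request takes two idle GPUs
print(gpus)
```

Packing small slices onto GPUs that are already shared leaves whole devices free for exclusive and multi‑GPU requests.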

Differentiated SLOs

Implements Priority & QoS tiers, CPU Suppress, CPU Burst, memory‑threshold eviction, and CPU‑satisfaction‑based eviction to protect latency‑sensitive services while allowing best‑effort workloads to utilize spare capacity.
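
Two of these mechanisms lend themselves to a short sketch: a CPU Suppress budget for best‑effort pods and a memory‑threshold eviction pass. Both functions are illustrative assumptions (the names, the formula, the sort order, and the threshold values are hypothetical), not Koordinator's actual policies.

```python
def suppressed_be_cpu(node_capacity: int, threshold_percent: int,
                      ls_used: int, system_used: int) -> int:
    """CPU (millicores) that best-effort pods may use right now.

    Assumed rule: keep total usage under a percentage of node capacity by
    shrinking the best-effort budget as latency-sensitive usage grows.
    """
    budget = node_capacity * threshold_percent // 100 - ls_used - system_used
    return max(0, budget)


def pick_memory_evictions(be_pods: list[tuple[str, int, int]],
                          node_used: int, node_capacity: int,
                          threshold_percent: int, target_percent: int) -> list[str]:
    """Choose best-effort pods to evict once memory passes the threshold.

    be_pods: (name, priority, memory used in MiB). Lowest priority goes
    first; among equals, the largest consumer goes first. Eviction stops
    once usage would drop to the lower target.
    """
    if node_used * 100 < node_capacity * threshold_percent:
        return []
    target = node_capacity * target_percent // 100
    victims = []
    for name, _priority, mem in sorted(be_pods, key=lambda p: (p[1], -p[2])):
        if node_used <= target:
            break
        victims.append(name)
        node_used -= mem
    return victims


print(suppressed_be_cpu(32000, 65, ls_used=12000, system_used=1000))
batch = [("spark-1", 5000, 6000), ("spark-2", 5000, 8000), ("etl-1", 5500, 4000)]
print(pick_memory_evictions(batch, node_used=75000, node_capacity=100000,
                            threshold_percent=70, target_percent=65))
```

Using a lower target than the trigger threshold avoids evicting one pod at a time while usage hovers around the threshold.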

Resource Isolation with RDT

Uses Intel Resource Director Technology to partition L3 cache and memory bandwidth (MBA), preventing noisy‑neighbor interference across mixed workloads.
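
On Linux, RDT is exposed through the resctrl filesystem, where each control group has a schemata file with L3 cache‑way bitmasks and memory‑bandwidth percentages. The sketch below only builds such a string from percentage ranges; the helper names, the 12‑way cache, the cache ids, and the 30% figures are assumptions for illustration, and nothing is written to the system.

```python
def cache_way_mask(num_ways: int, start_percent: int, end_percent: int) -> int:
    """Contiguous bitmask covering [start_percent, end_percent) of the cache ways."""
    if not 0 <= start_percent < end_percent <= 100:
        raise ValueError("need 0 <= start < end <= 100")
    first = num_ways * start_percent // 100
    last = (num_ways * end_percent + 99) // 100  # round up to a whole way
    return ((1 << (last - first)) - 1) << first


def schemata(num_ways: int, cache_ids: list[int],
             l3_range: tuple[int, int], mba_percent: int) -> str:
    """Build resctrl-style schemata text for one group (not written anywhere)."""
    mask = cache_way_mask(num_ways, *l3_range)
    l3 = ";".join(f"{cid}={mask:x}" for cid in cache_ids)
    mb = ";".join(f"{cid}={mba_percent}" for cid in cache_ids)
    return f"L3:{l3}\nMB:{mb}"


# Latency-sensitive group: whole cache. Best-effort group: a 30% slice.
print(schemata(12, [0, 1], (0, 100), 100))
print(schemata(12, [0, 1], (0, 30), 30))
```

Confining best‑effort pods to a small slice of the cache and of the memory bandwidth limits how much they can disturb latency‑sensitive pods on the same socket.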

QoS‑Aware Scheduling and Rescheduling

Load‑aware scheduling filters out overloaded nodes. The descheduler framework offers plugin‑based custom strategies, safe Pod migration via the PodMigrationJob CRD, and an eviction flow that reserves replacement resources before removing a Pod, so a migrated Pod is not left without capacity.
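
The load‑aware part of this can be sketched as a filter followed by a ranking. The function filter_and_rank, the usage figures, and the thresholds are hypothetical and only illustrate the idea of scheduling on measured load rather than on requests.

```python
def filter_and_rank(node_usage: dict[str, dict[str, int]],
                    thresholds: dict[str, int]) -> list[str]:
    """Drop nodes above any usage threshold, then rank the rest.

    node_usage maps node -> resource -> measured utilization in percent.
    The result lists the surviving nodes, least loaded first.
    """
    if not thresholds:
        raise ValueError("at least one threshold is required")
    candidates = {
        node: usage for node, usage in node_usage.items()
        if all(usage.get(res, 0) <= limit for res, limit in thresholds.items())
    }

    def load(node: str) -> float:
        usage = candidates[node]
        return sum(usage.get(res, 0) for res in thresholds) / len(thresholds)

    return sorted(candidates, key=load)


usage = {
    "node-a": {"cpu": 70, "memory": 50},
    "node-b": {"cpu": 30, "memory": 90},
    "node-c": {"cpu": 40, "memory": 40},
}
print(filter_and_rank(usage, {"cpu": 65, "memory": 95}))
```

In this example node-a is excluded for exceeding the CPU threshold even though it may still have unrequested capacity, and node-c is preferred over node-b because its average utilization is lower.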

Reservation API

Allows pre‑allocation of resources without modifying Pod specs, supporting future workload needs, PaaS scaling, rolling updates, fragment consolidation, and safe rescheduling.
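
The idea of holding capacity for Pods that do not exist yet, without touching their specs, can be modeled as matching on labels the Pods already carry. The Reservation class and match_reservation function below are a hypothetical model for illustration, not the fields of the actual Reservation CRD.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reservation:
    name: str
    node: str
    cpu: int                       # millicores held on the node
    owner_labels: dict[str, str]   # pods carrying these labels may use it
    consumed_by: Optional[str] = None


def match_reservation(pod_name: str, pod_labels: dict[str, str], pod_cpu: int,
                      reservations: list[Reservation]) -> Optional[str]:
    """Bind the pod to the first free reservation it owns; return the node."""
    for r in reservations:
        owns = all(pod_labels.get(k) == v for k, v in r.owner_labels.items())
        if r.consumed_by is None and owns and pod_cpu <= r.cpu:
            r.consumed_by = pod_name
            return r.node
    return None  # no reservation applies; fall back to normal scheduling


held = [Reservation("web-scale-out", "node-a", 4000, {"app": "web"})]
print(match_reservation("db-0", {"app": "db"}, 2000, held))
print(match_reservation("web-7", {"app": "web", "tier": "frontend"}, 2000, held))
print(match_reservation("web-8", {"app": "web"}, 2000, held))
```

The Pod is matched by labels it already has, so the workload definition stays unchanged; once web-7 consumes the reservation, web-8 has to go through normal scheduling.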

Future Roadmap

The community plans to extend mixed‑workload support to Hadoop YARN and other big‑data frameworks, improve interference detection, and continue standardizing mixed‑workload capabilities across vendors.
