Koordinator vs Crane: Which Scheduler Optimizes Kubernetes Resource Usage?
The article examines how native Kubernetes scheduling based solely on resource requests leads to waste and imbalance, compares the open‑source crane‑scheduler and koord‑scheduler architectures, explains practical configuration of Koordinator, and provides step‑by‑step testing procedures to achieve load‑aware scheduling.
Background
The native Kubernetes scheduler considers only resource Requests, which in production often differ greatly from actual usage, causing resource waste and load imbalance across nodes.
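The gap between the two views can be made concrete with a toy calculation (all numbers below are hypothetical):

```python
# Toy illustration (hypothetical numbers): a node can look nearly full by
# Requests while being mostly idle by actual usage.
pods = [
    # (name, cpu_request_millicores, actual_cpu_usage_millicores)
    ("api", 2000, 300),
    ("worker", 2000, 1800),
    ("cache", 1000, 100),
]
node_allocatable_m = 8000  # an 8-core node

requested = sum(req for _, req, _ in pods)  # what the native scheduler sees
used = sum(use for _, _, use in pods)       # what the node actually does

print(f"by requests: {requested / node_allocatable_m:.1%} allocated")
print(f"by usage:    {used / node_allocatable_m:.1%} busy")
```

The native scheduler packs nodes by the first number; load-aware scheduling closes the gap by also looking at the second.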
Open‑source solution comparison: crane‑scheduler vs koord‑scheduler
crane‑scheduler architecture
Prerequisite: Prometheus must be installed; crane-scheduler pulls node utilization metrics from it.
koord‑scheduler architecture
Metrics are collected by koordlet, a DaemonSet component that stores the data in a local Prometheus TSDB on each node.
Comparison
Metric collection period: crane relies on an external Prometheus (default 30 s, coarse-grained); koordlet runs on every node with a local Prometheus TSDB (default 1 s).
Value types: crane provides avg and max; koord provides avg, p50, p90, p95, and p99.
Colocation (online/offline mixing) support: not supported by crane; koord supports online Pods (LSE/LSR/LS) and offline Pods (BE).
hotValue resource estimation: supported by both.
Utilization denominator: crane uses the host's total resources (unreasonable); koord uses the Node's allocatable resources (reasonable).
Overall, koord‑scheduler is chosen.
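The value-type difference matters in practice. A minimal nearest-rank percentile sketch over a made-up bursty CPU trace (koordlet's real aggregation may differ) shows how avg hides spikes that p99 captures:

```python
import math

# Hypothetical trace: 100 one-second CPU% samples, mostly idle with short bursts.
samples = [20] * 95 + [90] * 5

def percentile(data, p):
    """Nearest-rank percentile: the ceil(p * n / 100)-th smallest sample."""
    s = sorted(data)
    rank = math.ceil(p * len(s) / 100)
    return s[max(rank, 1) - 1]

avg = sum(samples) / len(samples)
print(f"avg={avg:.1f}  p50={percentile(samples, 50)}  p99={percentile(samples, 99)}")
```

The average (23.5%) suggests a quiet node, while p99 (90%) exposes the bursts — which is why koord's percentile values give the scheduler a more honest picture than crane's avg/max.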
Koordinator usage practice
Add a UsageAggregatedDuration of 18 h:
<code>kubectl -n koordinator-system edit cm slo-controller-config</code>
<code>data:
  colocation-config: |
    {
      "enable": true,
      "metricAggregatePolicy": {
        "durations": [
          "5m",
          "10m",
          "30m",
          "18h"
        ]
      }
    }
</code>
Modify the Prometheus TSDB storage (retention) duration in koordlet:
<code>kubectl -n koordinator-system edit ds koordlet</code>
<code>containers:
- args:
  - -addr=:9316
  - -cgroup-root-dir=/host-cgroup/
  - --logtostderr=true
  - --tsdb-retention-duration=18h
</code>
Use promtool inside the koordlet Pod to inspect the stored data:
<code>./promtool tsdb list /metric-data/</code>
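One consistency rule worth checking here (a sketch using the values from this article, not a built-in Koordinator check): the TSDB retention must be at least as long as the longest aggregation window, otherwise the 18 h aggregate has no data to cover it.

```python
# Sanity check: TSDB retention must cover the longest aggregation duration.
# Values taken from the configs above; the parser is a minimal sketch that
# only understands the "Nm"/"Nh" strings used in this article.
def to_minutes(d: str) -> int:
    """Parse simple duration strings like '30m' or '18h' into minutes."""
    value, unit = int(d[:-1]), d[-1]
    return value * 60 if unit == "h" else value

durations = ["5m", "10m", "30m", "18h"]   # metricAggregatePolicy.durations
retention = "18h"                          # --tsdb-retention-duration

assert to_minutes(retention) >= max(to_minutes(d) for d in durations), \
    "retention shorter than the longest aggregation window"
print("retention covers all aggregation windows")
```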
Update threshold trigger rules (requires restarting koord‑scheduler):
<code>kubectl -n koordinator-system edit cm koord-scheduler-config</code>
<code>aggregated:
  usageThresholds:
    cpu: 55
    memory: 85
  usageAggregationType: "p99"
  scoreAggregationType: "p99"
estimatedScalingFactors:
  cpu: 85
  memory: 70
</code>
<code>kubectl -n koordinator-system rollout restart deployment koord-scheduler</code>
Because public-cloud nodes may be handled by the cloud provider's own scheduler, only the IDC (data-center) scheduler configuration is modified, and a mutating webhook is added so the change can be rolled back quickly if needed.
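The webhook manifest itself is not shown in the article; a minimal sketch of what such a MutatingWebhookConfiguration could look like follows (the webhook name, backing service, and path are hypothetical — only the koordinator-injection=enabled namespace selector comes from the activation step below):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: koordinator-scheduler-injection        # hypothetical name
webhooks:
  - name: schedulername.koordinator.example.com  # hypothetical
    # Only mutate Pods in namespaces labeled koordinator-injection=enabled,
    # so removing the label is an immediate rollback.
    namespaceSelector:
      matchLabels:
        koordinator-injection: enabled
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
    clientConfig:
      service:
        name: koord-webhook                    # hypothetical service
        namespace: koordinator-system
        path: /mutate-scheduler-name           # hypothetical path
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Ignore  # fail open: Pods still schedule if the webhook is down
```

The webhook backend would patch spec.schedulerName to koord-scheduler for matching Pods; with failurePolicy: Ignore, unlabeling the namespace (or deleting the webhook) rolls everything back without blocking Pod creation.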
Activation method (label namespace):
<code>kubectl label ns ${NsName} koordinator-injection=enabled</code>
Rollback method:
<code>kubectl label ns ${NsName} koordinator-injection-</code>
Source code: https://github.com/koordinator-sh/koordinator
Customized code: https://github.com/clay-wangzhi/koordinator
Quick deployment of customized code:
<code>git clone https://github.com/clay-wangzhi/koordinator
cd koordinator/manifests
kubectl apply -f setup/
kubectl apply -f koordlet/
kubectl apply -f koord-scheduler/
kubectl apply -f koord-manager/
</code>
Testing
1) Identify high‑load Nodes:
<code>kubectl top node | sort -nk 3
kubectl get nodemetrics.slo.koordinator.sh</code>
2) Label a high-load Node and several normal Nodes:
<code>kubectl label node ${NodeName} test=true</code>
3) Label the application namespace to enable the mutating webhook (which sets the Pod's SchedulerName to koord-scheduler):
<code>kubectl label ns ${NsName} koordinator-injection=enabled</code>
4) Add node affinity and pod anti-affinity to an application, matching the number of replicas to the number of labeled Nodes:
<code>spec:
  replicas: 4
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: test
                    operator: In
                    values:
                      - "true"
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: appid
                    operator: In
                    values:
                      - ${AppidName}
              topologyKey: kubernetes.io/hostname
</code>
5) Verify the result: one Pod should stay in Pending with a scheduling-failure reason containing the expected load-threshold message, indicating the configuration works — the high-load Node is filtered out by the usage thresholds, and the hostname anti-affinity prevents that replica from landing on any of the remaining Nodes.
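Why exactly one Pod stays Pending can be traced with a small simulation (node readings are made up; the real decision is made by koord-scheduler's load-aware filter using the thresholds configured earlier):

```python
# Sketch of the load-aware filter decision with the thresholds configured
# above (cpu: 55, memory: 85, aggregated at p99). Node readings are hypothetical.
USAGE_THRESHOLDS = {"cpu": 55, "memory": 85}  # percent of Node allocatable

nodes = {
    # node: p99 utilization over the aggregation window, in percent
    "node-hot": {"cpu": 78, "memory": 60},  # the labeled high-load Node
    "node-a":   {"cpu": 30, "memory": 40},
    "node-b":   {"cpu": 25, "memory": 50},
    "node-c":   {"cpu": 35, "memory": 45},
}

def passes_filter(usage):
    """A node stays schedulable only if every resource is under its threshold."""
    return all(usage[r] < USAGE_THRESHOLDS[r] for r in USAGE_THRESHOLDS)

schedulable = [name for name, usage in nodes.items() if passes_filter(usage)]
print("schedulable:", schedulable)
# With 4 replicas, hostname anti-affinity, and only 3 schedulable Nodes,
# one replica necessarily remains Pending.
```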
Reference links:
Crane‑Scheduler: Real‑world workload‑aware scheduler design and implementation – https://cloud.tencent.com/developer/article/2296515?areaId=106005
Koordinator load‑aware scheduling – https://koordinator.sh/zh-Hans/docs/user-manuals/load-aware-scheduling
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.