How to Extend the Kubernetes Scheduler with Custom Plugins and Network Traffic Scoring
This article provides a step‑by‑step guide on extending the Kubernetes scheduler, covering configuration of scheduler profiles, implementing out‑of‑tree plugins, integrating Prometheus‑based network traffic scoring, and deploying the custom scheduler both inside and outside a cluster, complete with code samples and troubleshooting tips.
Overview
This article explains how to extend the Kubernetes scheduler by using its extension points, the principles behind scheduler extensions, and finishes with an experiment that records network‑traffic‑based scheduling decisions.
Kubernetes Scheduler Configuration
Kubernetes allows multiple schedulers in a cluster and lets you assign a specific scheduler to a Pod. The scheduler reads its configuration from the file passed with --config. Different Kubernetes versions use different configuration API versions:
Up to 1.21: v1beta1
1.22: v1beta2 (still supports v1beta1)
1.23‑1.25: v1beta3 (keeps v1beta2, removes v1beta1)
A simple KubeSchedulerConfiguration example:
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/srv/kubernetes/kube-scheduler/kubeconfig

Note: --kubeconfig and --config cannot be used together; once --config is specified, command-line flags that duplicate settings in the file, such as --kubeconfig, no longer take effect.
Scheduler Configuration Details
The configuration file can define multiple profiles, each with its own schedulerName and its own set of plugins. Profiles can be written against v1beta1, v1beta2, or v1beta3, depending on the cluster version.
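To make the role of profiles concrete, here is a minimal sketch of how a Pod's spec.schedulerName could be matched against the configured profiles. The Profile struct and findProfile function are hypothetical stand-ins written for this illustration; they are not the real KubeSchedulerConfiguration types.

```go
package main

import "fmt"

// Profile is a hypothetical, simplified stand-in for one entry under
// `profiles:` in the scheduler configuration. It is not the real API type.
type Profile struct {
	SchedulerName string
	ScorePlugins  []string
}

// findProfile returns the profile whose SchedulerName matches the value a
// Pod sets in spec.schedulerName, and false when no profile matches.
func findProfile(profiles []Profile, schedulerName string) (Profile, bool) {
	for _, p := range profiles {
		if p.SchedulerName == schedulerName {
			return p, true
		}
	}
	return Profile{}, false
}

func main() {
	profiles := []Profile{
		{SchedulerName: "default-scheduler", ScorePlugins: []string{"NodeAffinity"}},
		{SchedulerName: "custom-scheduler", ScorePlugins: []string{"NetworkTraffic"}},
	}
	if p, ok := findProfile(profiles, "custom-scheduler"); ok {
		fmt.Println(p.SchedulerName, p.ScorePlugins)
	}
}
```

A Pod whose schedulerName matches no profile is simply not handled by this scheduler, which is why the lookup returns a boolean rather than falling back to a default.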
How to Extend kube‑scheduler
Since Kubernetes 1.15, the scheduling framework has supported extensible plugins that can be registered out of tree. The framework exposes each extension point as a plugin interface, so custom logic can be injected without modifying the core scheduler source.
Define Entry Point
Use scheduler.NewSchedulerCommand with WithPlugin to register custom plugins:
package main

import (
	"fmt"
	"os"

	scheduler "k8s.io/kubernetes/cmd/kube-scheduler/app"
)

func main() {
	// Each WithPlugin call pairs a plugin name with the factory function that builds it.
	command := scheduler.NewSchedulerCommand(
		scheduler.WithPlugin("example-plugin1", ExamplePlugin1),
		scheduler.WithPlugin("example-plugin2", ExamplePlugin2),
	)
	if err := command.Execute(); err != nil {
		fmt.Fprintf(os.Stderr, "%v\n", err)
		os.Exit(1)
	}
}

NewSchedulerCommand allows out‑of‑tree plugins to be injected without changing the scheduler source.
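The idea behind this kind of registration is a mapping from plugin name to factory function. The sketch below models only that idea with hypothetical stand-in types (Plugin, PluginFactory, Registry, examplePlugin); it does not import or reproduce the scheduler's own code.

```go
package main

import (
	"fmt"
	"sort"
)

// Plugin and PluginFactory are hypothetical stand-ins for the framework's
// plugin and factory types; only the name-to-factory idea is modelled.
type Plugin interface {
	Name() string
}

type PluginFactory func() (Plugin, error)

// Registry maps a plugin name to the factory that builds the plugin.
type Registry map[string]PluginFactory

// Register adds a factory under name and rejects duplicate names.
func (r Registry) Register(name string, f PluginFactory) error {
	if _, exists := r[name]; exists {
		return fmt.Errorf("plugin %q is already registered", name)
	}
	r[name] = f
	return nil
}

// examplePlugin is a toy plugin that only knows its own name.
type examplePlugin struct{ name string }

func (p examplePlugin) Name() string { return p.name }

func main() {
	r := Registry{}
	for _, name := range []string{"example-plugin1", "example-plugin2"} {
		name := name // capture the loop variable for the closure
		factory := func() (Plugin, error) { return examplePlugin{name: name}, nil }
		if err := r.Register(name, factory); err != nil {
			fmt.Println("register:", err)
		}
	}
	names := make([]string, 0, len(r))
	for n := range r {
		names = append(names, n)
	}
	sort.Strings(names)
	fmt.Println(names)
}
```

Keeping factories rather than plugin instances in the map means a plugin is only constructed when a profile actually enables it.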
Plugin Implementation
Implement the required extension‑point interfaces. For example, the built‑in NodeAffinity plugin implements the Score interface:
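type NodeAffinity struct {
	handle framework.FrameworkHandle
}

func (pl *NodeAffinity) Score(ctx context.Context, state *framework.CycleState, pod *v1.Pod, nodeName string) (int64, *framework.Status) {
	// scoring logic here (elided in this excerpt)
	return 0, nil // placeholder so the excerpt compiles
}

A Score plugin must also implement Name, which returns the name the plugin is registered under, and ScoreExtensions, which returns the NormalizeScore implementation (or nil when no normalization is needed).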
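The relationship between Name, Score, ScoreExtensions, and NormalizeScore can be sketched without the Kubernetes dependencies. The interfaces below are simplified, hypothetical stand-ins (no context, CycleState, or Status), and nameLength is a toy plugin invented for this example.

```go
package main

import "fmt"

// NodeScore, ScoreExtensions and ScorePlugin are simplified, hypothetical
// stand-ins for the framework types of the same names.
type NodeScore struct {
	Name  string
	Score int64
}

type ScoreExtensions interface {
	NormalizeScore(scores []NodeScore) error
}

type ScorePlugin interface {
	Name() string
	Score(nodeName string) (int64, error)
	ScoreExtensions() ScoreExtensions
}

// nameLength is a toy plugin: the raw score is the length of the node name.
type nameLength struct{}

func (nameLength) Name() string { return "NameLength" }

func (nameLength) Score(nodeName string) (int64, error) {
	return int64(len(nodeName)), nil
}

// Returning the plugin itself signals that NormalizeScore should be called.
func (p nameLength) ScoreExtensions() ScoreExtensions { return p }

// NormalizeScore rescales raw scores into 0..100 relative to the maximum.
func (nameLength) NormalizeScore(scores []NodeScore) error {
	var maxRaw int64
	for _, s := range scores {
		if s.Score > maxRaw {
			maxRaw = s.Score
		}
	}
	if maxRaw == 0 {
		return nil // nothing to scale
	}
	for i := range scores {
		scores[i].Score = scores[i].Score * 100 / maxRaw
	}
	return nil
}

// Compile-time check that nameLength satisfies ScorePlugin.
var _ ScorePlugin = nameLength{}

// runScore scores every node, then normalizes if the plugin asks for it.
func runScore(p ScorePlugin, nodes []string) ([]NodeScore, error) {
	scores := make([]NodeScore, 0, len(nodes))
	for _, n := range nodes {
		s, err := p.Score(n)
		if err != nil {
			return nil, err
		}
		scores = append(scores, NodeScore{Name: n, Score: s})
	}
	if ext := p.ScoreExtensions(); ext != nil {
		if err := ext.NormalizeScore(scores); err != nil {
			return nil, err
		}
	}
	return scores, nil
}

func main() {
	scores, err := runScore(nameLength{}, []string{"node-a", "node-bb"})
	if err != nil {
		fmt.Println("scoring failed:", err)
		return
	}
	fmt.Println(scores)
}
```

Scoring each node independently and normalizing across all nodes afterwards is what lets a plugin return raw measurements from Score and defer the mapping onto the common range.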
Experiment: Network‑Traffic‑Based Scheduling
The experiment creates a custom plugin NetworkTraffic that scores nodes based on network bandwidth collected from Prometheus.
Experiment Environment
A Kubernetes cluster with at least two nodes.
Prometheus node_exporter installed on the nodes.
Familiarity with PromQL and the Go client library.
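As a reference point for the PromQL involved, the sketch below builds a per-node bandwidth query string. The template is an assumption made for illustration: node_network_receive_bytes_total is a node_exporter counter, but the label that identifies a node (instance here) and the choice of rate() depend on how Prometheus is set up in your cluster. buildQuery is a hypothetical helper, not part of the Prometheus client library.

```go
package main

import (
	"fmt"
	"time"
)

// queryTemplate is an assumed PromQL expression: the receive rate of one
// network device on one node, averaged over a time window. The `instance`
// label is an assumption and may differ in your Prometheus setup.
const queryTemplate = `sum(rate(node_network_receive_bytes_total{instance="%s",device="%s"}[%s]))`

// buildQuery fills the template. The window is rendered in whole seconds
// (for example "60s"), a duration form PromQL accepts.
func buildQuery(node, device string, window time.Duration) string {
	seconds := int64(window / time.Second)
	return fmt.Sprintf(queryTemplate, node, device, fmt.Sprintf("%ds", seconds))
}

func main() {
	fmt.Println(buildQuery("node1", "eth0", 60*time.Second))
}
```

The resulting string is what would be handed to the Prometheus query API by a method such as GetGauge.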
Plugin Definition
const Name = "NetworkTraffic"

// Compile-time check that NetworkTraffic satisfies framework.ScorePlugin.
var _ = framework.ScorePlugin(&NetworkTraffic{})

type NetworkTraffic struct {
	prometheus *PrometheusHandle
	handle     framework.FrameworkHandle
}

The Score method queries Prometheus for the node's network usage and returns the raw bandwidth as the score.
func (n *NetworkTraffic) Score(ctx context.Context, state *framework.CycleState, p *corev1.Pod, nodeName string) (int64, *framework.Status) {
	nodeBandwidth, err := n.prometheus.GetGauge(nodeName)
	if err != nil {
		return 0, framework.NewStatus(framework.Error, fmt.Sprintf("error getting node bandwidth measure: %s", err))
	}
	return int64(nodeBandwidth.Value), nil
}

The NormalizeScore method scales the raw bandwidth so that higher usage yields a lower final score:
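func (n *NetworkTraffic) NormalizeScore(ctx context.Context, state *framework.CycleState, pod *corev1.Pod, scores framework.NodeScoreList) *framework.Status {
	// Find the highest raw bandwidth among the candidate nodes.
	var maxScore int64
	for _, node := range scores {
		if maxScore < node.Score {
			maxScore = node.Score
		}
	}
	for i, node := range scores {
		// Guard against division by zero when no node reports any traffic.
		if maxScore == 0 {
			scores[i].Score = framework.MaxNodeScore
			continue
		}
		// Invert the scale: the busiest node gets 0, an idle node gets MaxNodeScore.
		scores[i].Score = framework.MaxNodeScore - (node.Score * 100 / maxScore)
	}
	return nil
}

Prometheus Handle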
// Imports assumed by this excerpt:
//   "github.com/prometheus/client_golang/api"
//   v1 "github.com/prometheus/client_golang/api/prometheus/v1"
//   "github.com/prometheus/common/model"

type PrometheusHandle struct {
	deviceName string
	timeRange  time.Duration
	ip         string
	client     v1.API
}

func NewProme(ip, deviceName string, timeRange time.Duration) *PrometheusHandle {
	client, err := api.NewClient(api.Config{Address: ip})
	if err != nil {
		klog.Fatalf("FatalError creating prometheus client: %s", err)
	}
	return &PrometheusHandle{deviceName: deviceName, ip: ip, timeRange: timeRange, client: v1.NewAPI(client)}
}

func (p *PrometheusHandle) GetGauge(node string) (*model.Sample, error) {
	// query Prometheus using a PromQL expression (body elided in the source)
}

Scheduler Configuration with Plugin Args
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /mnt/d/src/go_work/customScheduler/scheduler.conf
profiles:
- schedulerName: custom-scheduler
  plugins:
    score:
      enabled:
      - name: "NetworkTraffic"
      disabled:
      - name: "*"
  pluginConfig:
  - name: "NetworkTraffic"
    args:
      ip: "http://10.0.0.4:9090"
      deviceName: "eth0"
      timeRange: 60

Deployment
Build the custom scheduler as a static binary, package it into a Docker image, and deploy it with a ServiceAccount, ClusterRoleBinding, and Deployment in the kube-system namespace.
FROM golang:alpine AS builder
WORKDIR /scheduler
COPY ./ /scheduler
ENV GOPROXY https://goproxy.cn,direct
RUN apk add upx && \
GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -ldflags "-s -w" -o scheduler main.go && \
upx -1 scheduler && chmod +x scheduler
FROM alpine AS runner
WORKDIR /go/scheduler
COPY --from=builder /scheduler/scheduler .
COPY --from=builder /scheduler/scheduler.yaml /etc/
VOLUME ["./scheduler"]

Deploy the scheduler with a Deployment that runs the binary using the custom configuration file.
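When the binary starts with that configuration file, the NetworkTraffic args (ip, deviceName, timeRange) have to reach the plugin. The sketch below shows one way such args could be decoded and validated, using plain JSON from the standard library as a stand-in for the framework's own argument handling. NetworkTrafficArgs and decodeArgs are hypothetical names, and the unit of timeRange is not stated in the configuration, so it is kept as a plain integer here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NetworkTrafficArgs mirrors the keys under pluginConfig args. The struct
// name and the JSON decoding path are assumptions for illustration.
type NetworkTrafficArgs struct {
	IP         string `json:"ip"`
	DeviceName string `json:"deviceName"`
	TimeRange  int    `json:"timeRange"`
}

// decodeArgs parses the raw args and rejects missing or invalid values.
func decodeArgs(raw []byte) (NetworkTrafficArgs, error) {
	var args NetworkTrafficArgs
	if err := json.Unmarshal(raw, &args); err != nil {
		return args, fmt.Errorf("decoding NetworkTraffic args: %w", err)
	}
	if args.IP == "" {
		return args, fmt.Errorf("ip must be set")
	}
	if args.DeviceName == "" {
		return args, fmt.Errorf("deviceName must be set")
	}
	if args.TimeRange <= 0 {
		return args, fmt.Errorf("timeRange must be positive, got %d", args.TimeRange)
	}
	return args, nil
}

func main() {
	raw := []byte(`{"ip":"http://10.0.0.4:9090","deviceName":"eth0","timeRange":60}`)
	args, err := decodeArgs(raw)
	if err != nil {
		fmt.Println("invalid args:", err)
		return
	}
	fmt.Printf("%+v\n", args)
}
```

Validating the args once at plugin construction keeps configuration mistakes out of the scoring path, where an error would otherwise surface on every scheduling cycle.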
Verification
Create a Pod (or Deployment) that specifies schedulerName: custom-scheduler. In a two‑node cluster, the custom plugin prefers the node with lower network traffic, and the scheduler logs show the bandwidth values and final scores.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
      schedulerName: custom-scheduler

Running kubectl get pods -o wide shows both Pods scheduled on the node with lower network usage, confirming the plugin works as intended.
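The scoring behind this result can be traced by hand for a two-node cluster. The sketch below applies the same inversion formula as NormalizeScore to made-up bandwidth values; the node names and numbers are hypothetical, not measurements from the experiment, and pick is a simplification that ignores how the real scheduler combines plugins and breaks ties.

```go
package main

import "fmt"

// maxNodeScore has the same value as framework.MaxNodeScore.
const maxNodeScore int64 = 100

type nodeScore struct {
	Name  string
	Score int64
}

// normalize inverts raw bandwidth so the busiest node scores lowest.
func normalize(scores []nodeScore) {
	var maxRaw int64
	for _, s := range scores {
		if s.Score > maxRaw {
			maxRaw = s.Score
		}
	}
	for i := range scores {
		if maxRaw == 0 {
			scores[i].Score = maxNodeScore // no traffic anywhere: all nodes equal
			continue
		}
		scores[i].Score = maxNodeScore - scores[i].Score*100/maxRaw
	}
}

// pick returns the name of the highest-scoring node (first one wins ties).
func pick(scores []nodeScore) string {
	if len(scores) == 0 {
		return ""
	}
	best := scores[0]
	for _, s := range scores[1:] {
		if s.Score > best.Score {
			best = s
		}
	}
	return best.Name
}

func main() {
	// Hypothetical raw bandwidth readings.
	scores := []nodeScore{{Name: "node-1", Score: 4000}, {Name: "node-2", Score: 1000}}
	normalize(scores)
	fmt.Println(scores)
	fmt.Println("selected:", pick(scores))
}
```

With these inputs node-1 normalizes to 0 and node-2 to 75, so node-2, the node with less traffic, is selected.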
MaGe Linux Operations
