
Developing a Custom Kubernetes Controller for Flink Task Scheduling

This article provides a step‑by‑step guide to building a custom Kubernetes controller in Go that uses Prometheus metrics to intelligently schedule Flink TaskManager Pods, covering the underlying scheduler concepts, code implementation, Docker image creation, RBAC setup, deployment, testing, and advanced considerations.

Rare Earth Juejin Tech Community

The author, an experienced engineer, introduces the challenge of migrating thousands of Flink tasks from YARN to Kubernetes and the resulting resource‑allocation bottlenecks, motivating the need for a custom scheduler that can place TaskManager Pods on optimal nodes.

After reviewing the core functions of the Kubernetes scheduler—retrieving the list of unscheduled pods, filtering nodes, scoring the feasible ones, and binding the pod to the winner—the article explains how the scheduler framework can be extended with plugins. A custom plugin named custom is then written in Go, implementing the framework's FilterPlugin and ScorePlugin interfaces.

Key code snippets include:

$ mkdir custom-scheduler && cd custom-scheduler && go mod init custom-scheduler

go.mod declares dependencies on Kubernetes 1.23.17 libraries and Prometheus client packages.
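The go.mod might look roughly like the sketch below; versions are illustrative. Note that importing k8s.io/kubernetes directly also requires a replace directive for every k8s.io staging module (k8s.io/api, k8s.io/apimachinery, and so on), pinned to the matching v0.23.17 tags.

```
module custom-scheduler

go 1.17

require (
	github.com/prometheus/client_golang v1.11.0
	github.com/prometheus/common v0.26.0
	k8s.io/klog/v2 v2.9.0
	k8s.io/kubernetes v1.23.17
)

// plus replace directives pinning each k8s.io staging module to v0.23.17
```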

cmd/main.go launches the scheduler with the custom plugin:

```go
package main

import (
	"flag"
	"os"

	"k8s.io/klog/v2"
	"k8s.io/kubernetes/cmd/kube-scheduler/app"

	plugin "custom-scheduler/pkg"
)

func main() {
	klog.InitFlags(nil)
	flag.Parse()
	defer klog.Flush()

	// Register the custom plugin with the stock kube-scheduler command.
	command := app.NewSchedulerCommand(app.WithPlugin(plugin.PluginName, plugin.New))
	if err := command.Execute(); err != nil {
		klog.Errorf("Error executing scheduler command: %v", err)
		os.Exit(1)
	}
}
```

pkg/scheduler.go implements the plugin logic, querying Prometheus for real‑time CPU usage and converting the result into a score (lower CPU usage yields a higher score):

```go
func (cs *CustomScheduler) Score(ctx context.Context, state *framework.CycleState, pod *corev1.Pod, nodeName string) (int64, *framework.Status) {
	// CPU usage (%) = 100 minus the average idle percentage over the last 5 minutes.
	query := fmt.Sprintf("100 - (avg by(instance) (rate(node_cpu_seconds_total{mode=\"idle\",instance=\"%s\"}[5m])) * 100)", nodeName)
	result, err := cs.prometheusClient.Query(ctx, query, time.Now())
	if err != nil {
		klog.Errorf("Failed to query Prometheus: %v", err)
		return 0, framework.NewStatus(framework.Error, err.Error())
	}

	var cpuUsage float64
	if vector, ok := result.(model.Vector); ok && len(vector) > 0 {
		cpuUsage = float64(vector[0].Value)
	}

	// Lower CPU usage yields a higher score (0-100).
	score := int64(100 - cpuUsage)
	klog.Infof("Node %s CPU usage: %.2f%%, Score: %d", nodeName, cpuUsage, score)
	return score, framework.NewStatus(framework.Success, "")
}
```

The Prometheus client wrapper (pkg/prometheus.go) provides a simple Query method using the official Go client.

To build the binary, a multi‑stage Dockerfile is used, producing a minimal Alpine image that runs ./custom-scheduler with a custom configuration file.
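A multi-stage Dockerfile along those lines might look like this; the Go and Alpine versions, directory layout, and config filename are assumptions, not taken from the article:

```dockerfile
# Build stage: compile a static binary so it runs on plain Alpine.
FROM golang:1.17 AS builder
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o custom-scheduler ./cmd

# Runtime stage: minimal image.
FROM alpine:3.15
WORKDIR /app
COPY --from=builder /src/custom-scheduler .
COPY scheduler-config.yaml .
ENTRYPOINT ["./custom-scheduler", "--config=/app/scheduler-config.yaml"]
```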

RBAC resources (rbac.yaml) bind the scheduler's service account to the cluster-admin ClusterRole, while configmap.yaml defines a KubeSchedulerConfiguration that registers the custom plugin under the scheduler name custom-scheduler. The deployment.yaml runs the scheduler pod in the kube-system namespace.
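The KubeSchedulerConfiguration in that ConfigMap would look roughly like the sketch below (the v1beta2 API version matches Kubernetes 1.23; field values are illustrative). It enables the custom plugin at the filter and score extension points under its own scheduler profile:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta2   # v1beta2 for Kubernetes 1.23
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: false
profiles:
  - schedulerName: custom-scheduler
    plugins:
      filter:
        enabled:
          - name: custom
      score:
        enabled:
          - name: custom
```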

Verification steps include creating a test pod that specifies schedulerName: custom-scheduler, checking the pod's node assignment, and reviewing scheduler logs for scoring details.
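A minimal test pod for that check might look like this (the pod name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scheduler-test
spec:
  schedulerName: custom-scheduler   # route this pod to the custom scheduler
  containers:
    - name: pause
      image: registry.k8s.io/pause:3.6
```

After applying it, `kubectl get pod scheduler-test -o wide` shows which node was chosen, and the scheduler pod's logs in kube-system should contain the per-node CPU usage and score lines emitted by Score.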

Finally, the article presents a series of reflective questions on extensibility, performance, high‑availability, security, and monitoring, encouraging readers to evolve the scheduler with additional metrics, distributed designs, caching, and machine‑learning‑driven decisions.

Cloud Native · Flink · Kubernetes · Go · Prometheus · custom-scheduler
Written by

Rare Earth Juejin Tech Community

Juejin, a tech community that helps developers grow.
