Optimizing Flink Task Scheduling on a Kubernetes Standalone Cluster for Balanced Resource Utilization

This article analyzes the uneven task distribution problem in a Flink job running on a Kubernetes standalone cluster with 35 TaskManagers and 140 slots, proposes slot‑sharing‑group prioritization and delayed scheduling strategies, and demonstrates how these optimizations achieve more balanced CPU load and reduced data backlog.

Big Data Technology & Architecture

May 30, 2023

Background : The Flink job is deployed on a Kubernetes standalone cluster where the Flink cluster is first launched in containers and then jobs are submitted. Task submission and TaskManager registration happen concurrently.

Problem : With 35 TaskManagers providing 140 slots, any vertex whose parallelism is below 140 can be placed unevenly: its tasks end up concentrated on a few TaskManagers, which unbalances the load. The issue persists even when cluster.evenly-spread-out-slots=true is set.

Observed Topology : The job contains five vertices; two have parallelism 140, the others have parallelism 10, 30, and 35 respectively. The maximum parallelism is 140, and the cluster is configured with 35 TaskManagers each offering 4 cores and 8 GB.
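The numbers above can be checked with a short calculation. This is a standalone sketch, not Flink code: it assumes all vertices share one slot sharing group and that subtask i of every vertex lands in group i, which is a simplification of how Flink actually forms groups. The function name `group_sizes` is made up for illustration.

```python
from collections import Counter

def group_sizes(parallelisms):
    """Tasks per slot-sharing group, assuming subtask i of every vertex joins group i."""
    num_groups = max(parallelisms)
    return [sum(1 for p in parallelisms if i < p) for i in range(num_groups)]

parallelisms = [140, 140, 10, 30, 35]   # the five vertices described above
slots_per_tm = 140 // 35                # 140 slots over 35 TaskManagers = 4 slots each

sizes = group_sizes(parallelisms)
histogram = Counter(sizes)              # tasks per group -> number of groups

print(f"{len(sizes)} groups, {sum(sizes)} tasks, {slots_per_tm} slots per TaskManager")
for tasks, groups in sorted(histogram.items(), reverse=True):
    print(f"{groups:3d} groups with {tasks} tasks each")
```

Under these assumptions the groups are far from uniform in weight, so which groups end up together on a TaskManager matters even when every TaskManager has the same number of occupied slots.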

Optimization Analysis : The problem can be reduced to a small topology such as Vertex A(p=2) → Vertex B(p=4) → Vertex C(p=2). With slot sharing and the preference for local data transfer, its tasks fall into four ExecutionSlotSharingGroups: {A1,B1,C1}, {A2,B2,C2}, {B3}, {B4}. If each TaskManager offers two slots, both three-task groups can land on the same TaskManager. That TaskManager then runs six tasks while another runs only two, and it becomes the bottleneck.
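The imbalance in the simplified topology can be enumerated directly. The sketch below assumes a cluster of two TaskManagers with two slots each (the article only fixes the slots per TaskManager) and tries every order in which the four groups could fill the slots.

```python
from itertools import permutations

# Groups from the simplified topology A(p=2) -> B(p=4) -> C(p=2).
groups = {
    "G1": ["A1", "B1", "C1"],
    "G2": ["A2", "B2", "C2"],
    "G3": ["B3"],
    "G4": ["B4"],
}
# Assumed cluster: two TaskManagers with two slots each (four slots in total).
slots = ["tm1", "tm1", "tm2", "tm2"]

def tasks_per_tm(order):
    """Task count per TaskManager when the groups in `order` fill the slots in turn."""
    load = {"tm1": 0, "tm2": 0}
    for group, tm in zip(order, slots):
        load[tm] += len(groups[group])
    return load

# Collect every distinct (lighter, heavier) load pair over all placements.
loads = {tuple(sorted(tasks_per_tm(order).values())) for order in permutations(groups)}
print(sorted(loads))   # → [(2, 6), (4, 4)]
```

Only two outcomes exist: a 4/4 split when the three-task groups are separated, and a 2/6 split when they share a TaskManager. Both use the same number of slots per TaskManager, which is why spreading slots evenly does not by itself prevent the skew.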

Proposed Optimizations :

When requesting slots for an ExecutionSlotSharingGroup, sort groups by the number of contained tasks and schedule groups with more tasks first.

Delay task scheduling until enough TaskManagers are registered so that the groups can be evenly distributed before slot acquisition.

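Implementation snippets:

1. When requesting slots for ExecutionSlotSharingGroups, first sort the groups by the number of tasks they contain and schedule the groups with the most tasks first.

2. Delay task scheduling until enough TaskManagers have registered for the ExecutionSlotSharingGroups to be spread evenly, and only then request slots for them.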
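The two strategies can be sketched together as a small simulation. This is not the author's patch or Flink's scheduler API: the function names are hypothetical, and placing each group on the least-loaded TaskManager with a free slot is an assumption, since the article does not say how the target slot is chosen.

```python
import math

def required_task_managers(num_groups, slots_per_tm):
    """TaskManagers needed before every group can get its own slot."""
    return math.ceil(num_groups / slots_per_tm)

def schedule(groups, registered_tms, slots_per_tm):
    """Return {tm: [group, ...]}, or None while too few TaskManagers have registered."""
    if len(registered_tms) < required_task_managers(len(groups), slots_per_tm):
        return None  # delay: the caller retries after more TaskManagers register

    assignment = {tm: [] for tm in registered_tms}
    load = {tm: 0 for tm in registered_tms}
    # Largest groups first, each onto the least-loaded TaskManager with a free slot.
    for group in sorted(groups, key=len, reverse=True):
        free = [tm for tm in registered_tms if len(assignment[tm]) < slots_per_tm]
        target = min(free, key=lambda tm: load[tm])
        assignment[target].append(group)
        load[target] += len(group)
    return assignment

groups = [["B3"], ["A1", "B1", "C1"], ["B4"], ["A2", "B2", "C2"]]
print(schedule(groups, ["tm1"], slots_per_tm=2))         # None: still waiting for tm2
print(schedule(groups, ["tm1", "tm2"], slots_per_tm=2))  # four tasks on each TaskManager
```

Without the registration check, the first TaskManager to come up would receive whatever groups are scheduled first, which is the race described in the Background section. Sorting first means the heavy groups are placed while every TaskManager still has room.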

Effect : After applying the optimizations, the tasks of each vertex are scheduled evenly across different TaskManager nodes.

Performance Comparison :

CPU Load – Before optimization, some nodes stayed at 100% for long periods. After optimization, CPU load is spread more evenly and no node stays at 100% for a sustained period.

Data Backlog – The backlog after optimization is roughly half of the original, leading to higher throughput and lower latency.

Further Considerations :

Task Balancing – For a topology like Vertex A(p=3) → Vertex B(p=4) → Vertex C(p=1), building balanced groups up front, such as {A1,B1}, {A2,B2}, {A3,B3}, {B4,C1} with two tasks each, can mitigate cross‑node communication overhead.

Delayed Scheduling Improvement – Incorporate delay strategies during Flink’s execution plan generation to reduce perceived latency for users.
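One possible way to produce the balanced grouping from the Task Balancing item is a greedy pass that always adds a subtask to a currently smallest group. This is an illustrative sketch with a hypothetical function name. It balances task counts only and ignores data locality between connected subtasks, which a real strategy would also have to weigh.

```python
def balanced_groups(parallelisms):
    """Spread subtasks over groups so that group sizes stay as even as possible.

    `parallelisms` maps vertex name -> parallelism. No group receives two
    subtasks of the same vertex, since those must run in different slots.
    """
    num_groups = max(parallelisms.values())
    groups = [[] for _ in range(num_groups)]
    # Place wide vertices first; each subtask goes to a currently smallest group.
    for vertex, p in sorted(parallelisms.items(), key=lambda kv: -kv[1]):
        smallest_first = sorted(range(num_groups), key=lambda i: len(groups[i]))
        for subtask, i in enumerate(smallest_first[:p], start=1):
            groups[i].append(f"{vertex}{subtask}")
    return groups

print(balanced_groups({"A": 3, "B": 4, "C": 1}))
```

For A(p=3) → B(p=4) → C(p=1) this yields four groups of two tasks each, matching the grouping listed above, whereas index-aligned grouping would give one group of three tasks and one of a single task.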

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
