Understanding CPU Load Balancing and Scheduler Domains in Linux

This article explains the concept of CPU load balancing, the hierarchical scheduler domain and group structures in a multi‑core SoC, when and how the Linux kernel performs periodic, no‑hz, and idle load‑balancing, and outlines the step‑by‑step algorithm used to migrate tasks for balanced system performance.

Refining Core Development Skills

Mar 27, 2022

Load balancing spreads runnable tasks across CPUs by moving tasks from heavily loaded CPUs to lightly loaded ones, so that no CPU's run queue stays overloaded while another sits idle.

Before discussing load balancing, it is essential to understand the CPU topology on a System‑on‑Chip (SoC), which is described using scheduling domains that represent hierarchical relationships among CPUs.

In a multi‑core SoC, cores are grouped into clusters that share resources such as an L2 cache. Each cluster forms a multi‑core (MC) scheduling domain, and the whole chip forms a higher‑level DIE scheduling domain. A task migrated to another cluster leaves its cached data behind in the old cluster's L2 cache, so balancing across clusters costs more than balancing within one.

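CPU scheduling domains and groups can be inspected through the procfs directory /proc/sys/kernel/sched_domain on kernels built with CONFIG_SCHED_DEBUG. Newer kernels moved this information to debugfs, typically under /sys/kernel/debug/sched/domains.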
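As a rough illustration, the directory can be walked with a short script. This is a minimal sketch that assumes a <root>/cpuN/domainM/<attribute> layout; the function name dump_sched_domains is hypothetical, and the root path is a parameter because the location differs between kernel versions.

```python
from pathlib import Path


def dump_sched_domains(root="/proc/sys/kernel/sched_domain"):
    """Return {cpu: {domain: {attribute: value}}} for every readable file.

    Assumes a <root>/cpuN/domainM/<attribute> layout. Returns {} when the
    directory is missing (for example, when CONFIG_SCHED_DEBUG is off).
    """
    base = Path(root)
    result = {}
    if not base.is_dir():
        return result
    for cpu_dir in sorted(base.glob("cpu*")):
        domains = {}
        for dom_dir in sorted(cpu_dir.glob("domain*")):
            attrs = {}
            for entry in sorted(dom_dir.iterdir()):
                if not entry.is_file():
                    continue
                try:
                    attrs[entry.name] = entry.read_text().strip()
                except OSError:
                    # Some entries may need root privileges; skip them.
                    continue
            domains[dom_dir.name] = attrs
        result[cpu_dir.name] = domains
    return result
```

Calling dump_sched_domains() with no arguments reads the procfs location named above; passing a different root lets the same code read the debugfs location on kernels that use it.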

Sched_domain members

Member: Description
parent, child: Define the parent‑child hierarchy of scheduling domains. A base domain has a NULL child; a top‑level domain has a NULL parent.
groups: Points to the head of the circular linked list of scheduling groups in the domain.
min_interval, max_interval: Lower and upper bounds of the time interval between balance checks for the domain.
balance_interval: Current interval at which the domain performs balancing.
busy_factor: When the CPU is busy, the balancing interval is multiplied by this factor, so busy CPUs balance less often.
imbalance_pct: Threshold, expressed as a percentage, that the imbalance between the busiest and the local load must exceed before balancing is triggered.
level: The domain's level within the overall hierarchy.
span_weight: Number of CPUs contained in the domain.
span: CPU mask of the CPUs the domain covers.
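To make the roles of these fields concrete, the following is a toy model, not the kernel's struct sched_domain. The default values, the helper names (link, walk_up, effective_interval, exceeds_watermark), and the exact form of the threshold comparison are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional, Set


@dataclass
class SchedDomain:
    """Toy stand-in for the fields listed above (not the kernel struct)."""
    name: str
    level: int
    span: Set[int]                  # CPUs covered by this domain
    balance_interval: int = 8       # milliseconds, illustrative value
    busy_factor: int = 16           # illustrative value
    imbalance_pct: int = 125        # read here as "25% more loaded"
    parent: Optional["SchedDomain"] = field(default=None, repr=False, compare=False)
    child: Optional["SchedDomain"] = field(default=None, repr=False, compare=False)

    @property
    def span_weight(self) -> int:
        return len(self.span)


def link(child: SchedDomain, parent: SchedDomain) -> None:
    """Connect two levels; base keeps child=None, top keeps parent=None."""
    child.parent = parent
    parent.child = child


def walk_up(sd: Optional[SchedDomain]) -> Iterator[SchedDomain]:
    """Yield the domain and then each ancestor up to the top level."""
    while sd is not None:
        yield sd
        sd = sd.parent


def effective_interval(sd: SchedDomain, cpu_busy: bool) -> int:
    """A busy CPU stretches the interval by busy_factor."""
    return sd.balance_interval * (sd.busy_factor if cpu_busy else 1)


def exceeds_watermark(sd: SchedDomain, busiest_load: int, local_load: int) -> bool:
    """True when the busiest load is above imbalance_pct percent of local."""
    return 100 * busiest_load > sd.imbalance_pct * local_load
```

For example, linking an MC domain spanning CPUs 0 to 3 under a DIE domain spanning CPUs 0 to 7 lets walk_up visit MC first and DIE second, which matches the bottom-up order in which domains are examined.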

Sched_group members

Member: Description
next: Points to the next group in the circular linked list of groups within the domain.
group_weight: Number of CPUs in the group.
sgc: Compute capacity information of the group.
cpumask: Mask indicating which CPUs belong to the group.
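The circular list is the detail that most affects how groups are traversed: iteration starts at the head and stops when it comes back around to it. The sketch below is a simplified model; the capacity field stands in for whatever sgc points to, and make_ring and iter_groups are hypothetical helper names.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Set


@dataclass
class SchedGroup:
    """Toy stand-in for a scheduling group (not the kernel struct)."""
    cpumask: Set[int]
    capacity: int = 1024    # placeholder for the data behind sgc
    next: Optional["SchedGroup"] = field(default=None, repr=False, compare=False)

    @property
    def group_weight(self) -> int:
        return len(self.cpumask)


def make_ring(groups: List[SchedGroup]) -> SchedGroup:
    """Link the groups into a circular list and return the head."""
    if not groups:
        raise ValueError("a domain needs at least one group")
    for i, group in enumerate(groups):
        group.next = groups[(i + 1) % len(groups)]
    return groups[0]


def iter_groups(head: SchedGroup) -> Iterator[SchedGroup]:
    """Visit every group exactly once, stopping when back at the head."""
    group = head
    while True:
        yield group
        group = group.next
        if group is head:
            break
```

A domain with a single group is a valid ring too: the group's next member points back to itself, and the loop still visits it once.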

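CPU topology can be examined under /sys/devices/system/cpu/cpuX/topology. Files there, such as core_siblings_list, list the CPUs that share a level of the hardware hierarchy with cpuX, which is the information the MC and DIE domains are built from.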
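A small script can turn those files into CPU lists. This is a sketch under the assumption that core_siblings_list and thread_siblings_list are present; which files exist varies by architecture and kernel version, so missing files are simply skipped. The function names are hypothetical.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Convert a sysfs CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_topology(cpu, root="/sys/devices/system/cpu"):
    """Return the sibling lists found for one CPU, skipping absent files."""
    topo = Path(root) / f"cpu{cpu}" / "topology"
    info = {}
    for name in ("core_siblings_list", "thread_siblings_list"):
        path = topo / name
        if path.is_file():
            info[name] = parse_cpu_list(path.read_text())
    return info
```

Running read_topology(0) on a Linux machine returns whatever sibling lists that system exposes for cpu0; comparing the results for different CPUs shows which of them sit in the same cluster or package.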

The load‑balancing software architecture relies on two tracking components:

CPU load tracking: aggregates load per cluster to detect inter‑cluster imbalance.

Task load tracking: determines task suitability for a CPU and selects tasks for migration.
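The two kinds of tracking can be sketched as two small functions. The numbers, the cluster names, and the 25% headroom used in the fit check are illustrative assumptions, not values taken from this article.

```python
def cluster_loads(cpu_load, clusters):
    """Sum per-CPU load into one figure per cluster.

    cpu_load: {cpu_id: load}; clusters: {cluster_name: [cpu_id, ...]}.
    """
    return {
        name: sum(cpu_load[cpu] for cpu in cpus)
        for name, cpus in clusters.items()
    }


def task_fits(task_util, cpu_capacity, headroom_pct=25):
    """True when the task's utilization leaves the requested headroom.

    With headroom_pct=25 the task must use less than 80% of the capacity.
    """
    return task_util * (100 + headroom_pct) < cpu_capacity * 100


if __name__ == "__main__":
    loads = cluster_loads(
        {0: 300, 1: 500, 2: 100, 3: 50},
        {"little": [0, 1], "big": [2, 3]},
    )
    print(loads)                 # per-cluster totals
    print(task_fits(400, 512))   # does a 400-unit task fit a 512-unit CPU?
```

Comparing the per-cluster totals reveals inter-cluster imbalance, while the fit check is the kind of test that decides whether a given task is a sensible candidate for a given destination CPU.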

Balancing is triggered by scheduling events such as task wake‑up, task creation, or tick interrupts, prompting the kernel to assess imbalance and possibly migrate tasks.

Linux’s CFS scheduler provides three types of load balancers:

Periodic load balancer: driven by the scheduler tick. When a domain's balancing interval has elapsed, the current CPU checks the balance of that domain and pulls runnable tasks from the busiest group and CPU to itself.

No‑hz load balancer: when a busy CPU detects idle CPUs, it sends an IPI (delivered through the GIC on ARM systems) to wake one idle CPU, which then performs balancing on behalf of all idle CPUs.

New idle load balancer: when a CPU is about to become idle, it checks whether other CPUs need help and pulls tasks from busy CPUs before going idle.
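The conditions under which each balancer runs can be summarized as a small decision function. This is only a restatement of the three descriptions above in code form; the event names "tick" and "going_idle" and the function choose_balancers are hypothetical and do not correspond to kernel symbols.

```python
from enum import Enum


class Balancer(Enum):
    PERIODIC = "periodic"
    NOHZ_IDLE = "nohz idle"
    NEW_IDLE = "new idle"


def choose_balancers(event, this_cpu_busy, idle_cpus_exist):
    """Return the balancers that the given event would involve.

    event: "tick" for the scheduler tick, "going_idle" when the CPU is
    about to enter the idle state.
    """
    chosen = []
    if event == "tick":
        chosen.append(Balancer.PERIODIC)
        # A busy CPU that sees idle CPUs asks one of them to balance.
        if this_cpu_busy and idle_cpus_exist:
            chosen.append(Balancer.NOHZ_IDLE)
    elif event == "going_idle":
        chosen.append(Balancer.NEW_IDLE)
    return chosen
```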

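The fundamental load‑balancing process starts at the base domain of the destination CPU and works upward through its parent domains. At each level it finds the busiest scheduling group, selects the busiest CPU in that group as the source runqueue, chooses the highest‑load tasks that fit within the measured imbalance, and migrates them to the destination CPU's runqueue.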
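The steps for a single domain level can be sketched as follows. This is a deliberately simplified model, not the kernel's algorithm: load is a plain sum of per-task numbers, the imbalance is assumed to be half the difference between source and destination, and the function names are only loosely modeled on the steps described above.

```python
def group_load(group, runqueues):
    """Total load of all tasks queued on the CPUs of one group."""
    return sum(sum(runqueues[cpu]) for cpu in group)


def find_busiest_group(groups, runqueues, local_group):
    """Return the most loaded non-local group, or None if none is busier."""
    others = [g for g in groups if g is not local_group]
    if not others:
        return None
    busiest = max(others, key=lambda g: group_load(g, runqueues))
    if group_load(busiest, runqueues) > group_load(local_group, runqueues):
        return busiest
    return None


def find_busiest_cpu(group, runqueues):
    """Return the CPU in the group whose runqueue carries the most load."""
    return max(group, key=lambda cpu: sum(runqueues[cpu]))


def balance_once(dst_cpu, groups, runqueues):
    """Pull tasks toward dst_cpu for one domain level; return moved loads.

    groups: list of CPU-id lists; runqueues: {cpu_id: [task_load, ...]}.
    """
    local = next(g for g in groups if dst_cpu in g)
    busiest = find_busiest_group(groups, runqueues, local)
    if busiest is None:
        return []
    src_cpu = find_busiest_cpu(busiest, runqueues)
    # Assumed target: move about half the difference between the two CPUs.
    imbalance = (sum(runqueues[src_cpu]) - sum(runqueues[dst_cpu])) // 2
    moved = []
    for load in sorted(runqueues[src_cpu], reverse=True):
        if load > imbalance:
            continue    # moving this task would overshoot the target
        runqueues[src_cpu].remove(load)
        runqueues[dst_cpu].append(load)
        moved.append(load)
        imbalance -= load
    return moved
```

Trying the heaviest tasks first while skipping any that would overshoot keeps the sketch short; a full pass would repeat balance_once for each parent domain, using that domain's groups.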

Written by

Refining Core Development Skills

Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.
