
Task Placement in the Linux CFS Scheduler: Scenarios, Code Framework, and Energy‑Aware Scheduling

The article explains how the Linux CFS scheduler places newly created or awakened tasks during fork, exec, or wake-up. Sched domains and their flags steer select_task_rq_fair(), which then chooses among the energy-aware, least-loaded, or idle-sibling paths based on capacity, energy impact, and affinity.

OPPO Kernel Craftsman

This article is the second part of a three‑article series on load balancing in the Linux kernel. It focuses on the task placement scenario, analyzing how the kernel scheduler distributes newly created or awakened tasks across CPUs to achieve balanced load.

The discussion is based on Linux kernel version 5.4.24 and assumes readers will refer to the source code while reading.

1. Task placement scenarios

Task placement occurs in three situations: (1) a process creates a child via fork, (2) a process starts execution via exec, and (3) a blocked process is woken up. In these cases the scheduler must decide on which CPU to place the task before it enters the runqueue.

2. Sched domain and flags

The kernel groups CPUs into hierarchical struct sched_domain objects. On a typical big.LITTLE system, the four little cores form a "little" domain and the four big cores form a "big" domain, both under a top-level DIE domain. Each domain carries a set of flags that indicate which load-balancing actions are applicable at that level; the placement code is told which flag to look for through its sd_flag argument. The three flags relevant to task placement are SD_BALANCE_WAKE, SD_BALANCE_FORK, and SD_BALANCE_EXEC.
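To make the role of the flags concrete, here is a minimal user-space sketch of a domain hierarchy and an upward walk that looks for the highest level carrying a given flag. The struct, the flag values, and the helper name are all hypothetical stand-ins, not the kernel's definitions, and the walk is a simplification of what the real code does.

```c
#include <stddef.h>

/* Illustrative flag bits only; the real values are defined by the kernel. */
enum {
    TOY_SD_BALANCE_EXEC = 1 << 0,
    TOY_SD_BALANCE_FORK = 1 << 1,
    TOY_SD_BALANCE_WAKE = 1 << 2,
};

/* Hypothetical, stripped-down stand-in for struct sched_domain. */
struct toy_sched_domain {
    const char *name;
    unsigned int flags;               /* balancing actions allowed at this level */
    struct toy_sched_domain *parent;  /* next level up, NULL at the top */
};

/*
 * Walk upward from a CPU's lowest-level domain and return the highest
 * domain that still has sd_flag set. The walk stops at the first level
 * that lacks the flag; NULL means even the lowest level lacks it.
 */
struct toy_sched_domain *
toy_highest_domain_with(struct toy_sched_domain *base, unsigned int sd_flag)
{
    struct toy_sched_domain *found = NULL;

    for (struct toy_sched_domain *sd = base; sd != NULL; sd = sd->parent) {
        if (!(sd->flags & sd_flag))
            break;
        found = sd;
    }
    return found;
}
```

With a "little" domain whose parent is DIE and both levels carrying the fork flag, the walk returns the DIE level, so a forked task may be placed anywhere in the system. If no level carries the requested flag, the walk returns NULL and the search is not widened.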

3. Task‑placement code framework

All three scenarios ultimately invoke the function select_task_rq() defined in kernel/sched/core.c. For the CFS class the call is forwarded to select_task_rq_fair(), whose prototype is:

static int select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_flags)

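The sd_flag argument identifies which scenario triggered the placement (SD_BALANCE_WAKE, SD_BALANCE_FORK, or SD_BALANCE_EXEC), so the scheduler knows which domain flag to look for. The wake_flags argument (e.g., WF_SYNC) provides additional information for wake-up-based placement.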
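The mapping from scenario to arguments can be sketched as follows. This is a user-space illustration based on a reading of the 5.4 call sites (fork and exec start from the task's current CPU and pass no wake flags, a wake-up starts from the recorded wake CPU); the enum, struct, and function names are hypothetical and the flag values are illustrative.

```c
/* Illustrative values only; not the kernel's definitions. */
enum {
    TOY_SD_BALANCE_EXEC = 1 << 0,
    TOY_SD_BALANCE_FORK = 1 << 1,
    TOY_SD_BALANCE_WAKE = 1 << 2,
};
enum { TOY_WF_SYNC = 1 << 0 };

enum toy_scenario { TOY_FORK, TOY_EXEC, TOY_WAKEUP };

/* Hypothetical bundle of the arguments handed to the placement function. */
struct toy_placement_args {
    int start_cpu;         /* plays the role of prev_cpu */
    unsigned int sd_flag;  /* which scenario triggered the placement */
    int wake_flags;        /* only meaningful for wake-ups */
};

struct toy_placement_args
toy_args_for(enum toy_scenario s, int task_cpu, int wake_cpu, int wake_flags)
{
    struct toy_placement_args a = { task_cpu, 0, 0 };

    switch (s) {
    case TOY_FORK:
        a.sd_flag = TOY_SD_BALANCE_FORK;
        break;
    case TOY_EXEC:
        a.sd_flag = TOY_SD_BALANCE_EXEC;
        break;
    case TOY_WAKEUP:
        a.start_cpu = wake_cpu;
        a.sd_flag = TOY_SD_BALANCE_WAKE;
        a.wake_flags = wake_flags;
        break;
    }
    return a;
}
```

Only the wake-up case forwards wake_flags, which is why hints such as WF_SYNC never influence fork or exec placement in this model.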

4. Core selection functions

The placement logic eventually calls one of three helper functions:

find_energy_efficient_cpu() – the Energy‑Aware Scheduling (EAS) path.

find_idlest_cpu() – the “slow” path that picks the least‑loaded CPU.

select_idle_sibling() – the “fast” path that prefers an idle sibling sharing cache with the waker or previous CPU.

The choice among these paths depends on the sd_flag, whether EAS is enabled, and whether the task is classified as wake-affine (i.e., it benefits from staying near the waking CPU).
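The decision among the three helpers can be modeled as a small function. The sketch below is a simplified reading of the control flow in select_task_rq_fair() for 5.4, not the kernel code itself: the boolean inputs are hypothetical summaries of checks that the kernel performs while walking the domains, and all names are made up for illustration.

```c
#include <stdbool.h>

/* Illustrative values only; not the kernel's definitions. */
enum {
    TOY_SD_BALANCE_EXEC = 1 << 0,
    TOY_SD_BALANCE_FORK = 1 << 1,
    TOY_SD_BALANCE_WAKE = 1 << 2,
};

enum toy_path {
    TOY_PATH_EAS,        /* find_energy_efficient_cpu() produced a CPU */
    TOY_PATH_SLOW,       /* find_idlest_cpu() */
    TOY_PATH_FAST,       /* select_idle_sibling() */
    TOY_PATH_KEEP_PREV,  /* nothing applies; stay on the previous CPU */
};

/*
 * eas_found_cpu:       EAS is enabled and returned a valid CPU.
 * affine_domain_found: the task is wake-affine and a suitable domain
 *                      spanning the previous CPU was found.
 * domain_has_flag:     some domain level carries sd_flag.
 */
enum toy_path toy_choose_path(unsigned int sd_flag, bool eas_found_cpu,
                              bool affine_domain_found, bool domain_has_flag)
{
    bool is_wakeup = (sd_flag & TOY_SD_BALANCE_WAKE) != 0;

    if (is_wakeup && eas_found_cpu)
        return TOY_PATH_EAS;       /* EAS is only consulted on wake-ups */
    if (is_wakeup && affine_domain_found)
        return TOY_PATH_FAST;      /* stay close to the waker or prev CPU */
    if (domain_has_flag)
        return TOY_PATH_SLOW;      /* wide search for the idlest CPU */
    if (is_wakeup)
        return TOY_PATH_FAST;
    return TOY_PATH_KEEP_PREV;
}
```

In this model fork and exec never reach the EAS or fast path; they either search widely through the slow path or leave the task where it is.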

5. Energy‑Aware Scheduling (EAS)

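When the kernel is built with CONFIG_ENERGY_MODEL and the schedutil cpufreq governor is in use, the scheduler can use an Energy Model (EM) to estimate the energy impact of placing a task on a particular CPU. Each struct perf_domain represents a performance domain, a group of CPUs whose performance is scaled together (on big.LITTLE this usually corresponds to a cluster of identical cores). For each domain the model holds a table of capacity states, each pairing a frequency with a power value. The function em_pd_energy() computes the estimated energy of a perf domain: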

static inline unsigned long em_pd_energy(struct em_perf_domain *pd, unsigned long max_util, unsigned long sum_util)

It uses the highest utilization in the domain to select a frequency, then aggregates the energy of all CPUs, ignoring idle time.
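A worked example helps show how the two utilization inputs are used. The sketch below is a user-space approximation of the calculation, assuming the 5.4 behaviour as read from the source: the frequency request includes 25% headroom, the lowest capacity state at or above that request is chosen, and the cost of a state is its power scaled by max_freq / freq. All type and function names are hypothetical, and the table values are invented.

```c
#include <stddef.h>

/* Hypothetical stand-in for one capacity state of the Energy Model. */
struct toy_cap_state {
    unsigned long frequency;  /* kHz */
    unsigned long power;      /* mW at this frequency */
};

/* Hypothetical stand-in for a performance domain. */
struct toy_perf_domain {
    const struct toy_cap_state *table;  /* sorted by ascending frequency */
    size_t nr_cap_states;
    unsigned long scale_cpu;            /* CPU capacity, e.g. 1024 */
};

unsigned long toy_pd_energy(const struct toy_perf_domain *pd,
                            unsigned long max_util, unsigned long sum_util)
{
    const struct toy_cap_state *top = &pd->table[pd->nr_cap_states - 1];
    const struct toy_cap_state *cs = top;

    /* Frequency the busiest CPU would request, with 25% headroom. */
    unsigned long freq =
        (top->frequency + (top->frequency >> 2)) * max_util / pd->scale_cpu;

    /* Pick the lowest capacity state that satisfies the request. */
    for (size_t i = 0; i < pd->nr_cap_states; i++) {
        if (pd->table[i].frequency >= freq) {
            cs = &pd->table[i];
            break;
        }
    }

    /* Cost of running at this state, normalized to the top frequency. */
    unsigned long cost = top->frequency * cs->power / cs->frequency;

    /* Busy-time energy of all CPUs in the domain; idle time is ignored. */
    return cost * sum_util / pd->scale_cpu;
}
```

For an invented table of {500000 kHz, 100 mW}, {1000000 kHz, 300 mW}, {2000000 kHz, 1000 mW} and a capacity of 1024, max_util = 256 requests 625000 kHz, which selects the 1000000 kHz state with a cost of 600. With sum_util = 512 the estimate is 600 * 512 / 1024 = 300. The result is an abstract unit suitable only for comparing placements, not a measurement in joules.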

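Within each performance domain, EAS picks the CPU with the largest spare capacity as a candidate and always considers the previous CPU as well. It evaluates the energy change if the task were placed on each candidate and keeps the one that yields the smallest energy increase. If the energy saving compared with the previous CPU exceeds roughly 6% (one sixteenth) of the estimated energy, the new CPU is selected; otherwise the task stays on its previous CPU. The scheduler falls back to the slow or fast path only when EAS cannot give an answer, for example when the system is overutilized.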
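The final comparison can be sketched in a few lines. This is a simplified model that works on total estimated energies; the exact expression in the kernel differs between versions, so the form below is an assumption that only preserves the one-sixteenth threshold. The function and parameter names are hypothetical.

```c
#include <stdbool.h>

/*
 * prev_energy: estimated system energy with the task on prev_cpu.
 * best_energy: estimated system energy with the task on best_cpu.
 * prev_usable: false if prev_cpu could not be evaluated as a candidate
 *              (for instance, the task is not allowed to run there).
 */
int toy_pick_cpu(int prev_cpu, int best_cpu, bool prev_usable,
                 unsigned long prev_energy, unsigned long best_energy)
{
    if (!prev_usable)
        return best_cpu;

    /* Guard against unsigned underflow when there is no saving at all. */
    if (best_energy >= prev_energy)
        return prev_cpu;

    /* Migrate only if the saving exceeds 1/16 (about 6%) of prev_energy. */
    if (prev_energy - best_energy > (prev_energy >> 4))
        return best_cpu;

    return prev_cpu;
}
```

With prev_energy = 1000 the threshold is 1000 >> 4 = 62, so a candidate estimated at 930 (saving 70) wins, while one estimated at 950 (saving 50) does not and the task stays put. The margin avoids migrating tasks for savings that are smaller than the estimation error.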

6. Summary

The article provides a high-level view of how the Linux CFS scheduler performs task placement, covering the three placement scenarios, the role of sched domains and flags, the three selection paths, and the Energy-Aware Scheduling mechanism. Readers are encouraged to explore the kernel source for deeper details, especially the implementations of find_energy_efficient_cpu(), find_idlest_cpu(), and select_idle_sibling().
