Understanding Linux Scheduler: Structures, Policies, and Context Switch Mechanics
Learn how Linux selects the next process for CPU execution by exploring key scheduler structures such as task_struct, sched_class, runqueue, the various scheduling policies (CFS, RT, Deadline, etc.), the scheduling flow, and the low‑level context_switch mechanism that swaps address spaces and registers.
What is Scheduling?
Scheduling is the act of choosing, according to a scheduling algorithm, which runnable process in the ready queue gets the CPU next, with the goal of keeping the CPU well utilized.
Scheduler‑related Structures
task_struct
The scheduling‑related fields of task_struct include:
struct task_struct {
    /* scheduling class abstraction */
    const struct sched_class *sched_class;
    struct sched_entity se;        /* CFS entity */
    struct sched_rt_entity rt;     /* RT entity */
#ifdef CONFIG_CGROUP_SCHED
    struct task_group *sched_task_group;
#endif
    struct sched_dl_entity dl;     /* Deadline entity */
    unsigned int policy;           /* scheduling policy */
    /* ... other members ... */
};
sched_class abstracts the scheduler; five classes exist:
Stop scheduler – highest priority, cannot be pre‑empted.
Deadline scheduler – uses a red‑black tree ordered by absolute deadline.
RT scheduler – maintains a queue per priority.
CFS scheduler – Completely Fair Scheduler with virtual runtime.
IDLE‑Task scheduler – runs an idle thread when no other task is runnable.
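The "queue per priority" idea behind the RT scheduler can be pictured with a small toy model. This is only a sketch: the names toy_rt_queue, toy_rt_enqueue, toy_rt_dequeue, and toy_rt_first_prio are hypothetical, a counter stands in for each per-priority task list, and the sketch assumes that a lower index means a more urgent priority.

```c
#include <stdint.h>

#define TOY_RT_PRIOS 100   /* assumption: RT priorities span 0..99 */

/* Hypothetical toy structure: one counter per priority stands in for a
 * per-priority task list; two 64-bit words record which slots are non-empty. */
struct toy_rt_queue {
    uint64_t bitmap[2];
    unsigned int nr_queued[TOY_RT_PRIOS];
};

/* Queue one task at the given priority (0 is the most urgent here). */
static void toy_rt_enqueue(struct toy_rt_queue *q, int prio)
{
    q->nr_queued[prio]++;
    q->bitmap[prio / 64] |= (uint64_t)1 << (prio % 64);
}

/* Remove one task; clear the bitmap bit when the slot becomes empty. */
static void toy_rt_dequeue(struct toy_rt_queue *q, int prio)
{
    if (q->nr_queued[prio] == 0)
        return;
    if (--q->nr_queued[prio] == 0)
        q->bitmap[prio / 64] &= ~((uint64_t)1 << (prio % 64));
}

/* Return the most urgent non-empty priority, or -1 if nothing is queued. */
static int toy_rt_first_prio(const struct toy_rt_queue *q)
{
    for (int word = 0; word < 2; word++) {
        if (q->bitmap[word] == 0)
            continue;
        for (int bit = 0; bit < 64; bit++)
            if (q->bitmap[word] & ((uint64_t)1 << bit))
                return word * 64 + bit;
    }
    return -1;
}
```

The bitmap means the picker never has to scan empty priority slots one task list at a time; it only needs to find the first set bit.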
policy defines six possible policies that user space can request:
SCHED_DEADLINE – selects the Deadline scheduler.
SCHED_RR – selects the RT scheduler with round‑robin time slices.
SCHED_FIFO – selects the RT scheduler; first‑in‑first‑out, no time slice.
SCHED_NORMAL – selects the CFS scheduler.
SCHED_BATCH – batch‑oriented CFS scheduling.
SCHED_IDLE – lowest‑priority CFS scheduling.
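From user space, the policy list above can be explored with the sched_getscheduler() system call wrapper. The sketch below is illustrative only: policy_to_class and print_own_policy are hypothetical helpers, the fallback #defines assume the Linux UAPI values in case the C library headers do not expose them, and user space spells SCHED_NORMAL as SCHED_OTHER.

```c
#include <sched.h>
#include <stdio.h>

/* Fallbacks (assumed Linux UAPI values) for headers that omit these. */
#ifndef SCHED_BATCH
#define SCHED_BATCH 3
#endif
#ifndef SCHED_IDLE
#define SCHED_IDLE 5
#endif
#ifndef SCHED_DEADLINE
#define SCHED_DEADLINE 6
#endif
#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000
#endif

/* Hypothetical helper: name the scheduler class that serves a policy. */
static const char *policy_to_class(int policy)
{
    switch (policy) {
    case SCHED_DEADLINE: return "deadline";
    case SCHED_FIFO:
    case SCHED_RR:       return "rt";
    case SCHED_OTHER:    /* the kernel calls this SCHED_NORMAL */
    case SCHED_BATCH:
    case SCHED_IDLE:     return "cfs";
    default:             return "unknown";
    }
}

/* Print the caller's policy; returns 0 on success, -1 on error. */
static int print_own_policy(void)
{
    int policy = sched_getscheduler(0);   /* pid 0 means the caller */
    if (policy < 0) {
        perror("sched_getscheduler");
        return -1;
    }
    policy &= ~SCHED_RESET_ON_FORK;       /* strip the optional flag bit */
    printf("policy %d is served by the %s class\n",
           policy, policy_to_class(policy));
    return 0;
}
```

Calling print_own_policy() from an ordinary shell-launched program would typically report the CFS class, since SCHED_OTHER is the default policy.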
Scheduling entities:
struct sched_entity se – CFS entity for normal non‑real‑time tasks.
struct sched_rt_entity rt – RT entity using RR or FIFO.
struct sched_dl_entity dl – Deadline entity using EDF.
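To make the CFS entity's virtual runtime more concrete, here is a minimal sketch of the classic weighting rule: real execution time is scaled by the ratio of the nice-0 weight to the entity's own weight, and the entity with the smallest virtual runtime runs next. The struct and function names are hypothetical, and only two fields of the real sched_entity are modeled.

```c
#include <stdint.h>

#define TOY_NICE_0_WEIGHT 1024u   /* load weight of a nice-0 task */

/* Hypothetical stand-in for a CFS entity: only what this sketch needs. */
struct toy_cfs_entity {
    uint64_t vruntime;   /* virtual runtime in nanoseconds */
    uint32_t weight;     /* load weight derived from the nice value */
};

/* Charge delta_exec_ns of real CPU time to the entity.
 * Heavier (higher-priority) entities accumulate vruntime more slowly. */
static void toy_update_vruntime(struct toy_cfs_entity *se,
                                uint64_t delta_exec_ns)
{
    se->vruntime += delta_exec_ns * TOY_NICE_0_WEIGHT / se->weight;
}

/* Pick whichever entity has received the least weighted CPU time. */
static const struct toy_cfs_entity *
toy_pick_cfs(const struct toy_cfs_entity *a, const struct toy_cfs_entity *b)
{
    return (b->vruntime < a->vruntime) ? b : a;
}
```

With two entities that each run for the same real time, the one with twice the weight ends up with half the virtual runtime and is therefore picked first. The Deadline entity follows the same "smallest key wins" pattern, with the absolute deadline as the key.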
runqueue (rq)
Each CPU has a runqueue that holds three scheduler‑specific queues:
struct rq {
    struct cfs_rq cfs;                      /* CFS queue */
    struct rt_rq rt;                        /* RT queue */
    struct dl_rq dl;                        /* Deadline queue */
    struct task_struct *curr, *idle, *stop; /* current, idle, stop tasks */
    /* ... other members ... */
};
Tasks become scheduling entities and are inserted into the appropriate queue of the runqueue.
Scheduling Flow
The scheduler chooses the next process in two steps:
Set the reschedule flag – the kernel sets TIF_NEED_RESCHED in the current thread’s thread_info.flags under conditions such as timer tick, wake‑up, fork, load‑balancing, or nice value change.
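Execute scheduling – if the flag is set, the kernel calls schedule() at the next preemption point to perform a context switch. Such points exist for both user‑mode preemption (for example, on return to user space) and kernel‑mode preemption (for example, when preemption is re‑enabled in a preemptible kernel).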
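The two steps can be sketched as a flag that one code path sets and another code path checks later. This is a toy model, not kernel code: toy_thread_info, toy_set_need_resched, and toy_preempt_point are hypothetical names, the bit position is arbitrary, and a counter stands in for the call to schedule().

```c
#include <stdbool.h>

#define TOY_TIF_NEED_RESCHED (1u << 1)   /* bit position is arbitrary here */

/* Hypothetical stand-in for the flags word in thread_info. */
struct toy_thread_info {
    unsigned int flags;
};

/* Step 1: a tick, wake-up, or similar event only marks the task. */
static void toy_set_need_resched(struct toy_thread_info *ti)
{
    ti->flags |= TOY_TIF_NEED_RESCHED;
}

static bool toy_need_resched(const struct toy_thread_info *ti)
{
    return (ti->flags & TOY_TIF_NEED_RESCHED) != 0;
}

/* Step 2: at a preemption point, act on the flag if it is set.
 * Returns true when the (simulated) scheduler ran. */
static bool toy_preempt_point(struct toy_thread_info *ti, int *schedule_calls)
{
    if (!toy_need_resched(ti))
        return false;
    ti->flags &= ~TOY_TIF_NEED_RESCHED;  /* the flag is consumed */
    (*schedule_calls)++;                 /* stands in for schedule() */
    return true;
}
```

Separating "mark" from "act" lets events that occur in awkward contexts, such as interrupt handlers, request a reschedule without performing the switch themselves.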
Context Switch (context_switch)
The actual switch occurs inside __schedule(), which calls pick_next_task() to select the next task_struct and then context_switch() to swap execution.
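The selection step can be pictured as a walk over the scheduler classes from highest to lowest priority, stopping at the first class that has something runnable. The sketch below is a simplified model under that assumption: toy_rq, toy_class, and toy_pick_next_class are hypothetical, and per-class counters replace the real queues.

```c
#include <stddef.h>

/* Toy run queue: just the number of runnable tasks in each class. */
struct toy_rq {
    unsigned int nr_stop, nr_dl, nr_rt, nr_cfs;
};

/* Toy scheduler class: a name plus one callback. */
struct toy_class {
    const char *name;
    int (*has_runnable)(const struct toy_rq *rq);
};

static int stop_has(const struct toy_rq *rq) { return rq->nr_stop > 0; }
static int dl_has(const struct toy_rq *rq)   { return rq->nr_dl > 0; }
static int rt_has(const struct toy_rq *rq)   { return rq->nr_rt > 0; }
static int cfs_has(const struct toy_rq *rq)  { return rq->nr_cfs > 0; }
static int idle_has(const struct toy_rq *rq) { (void)rq; return 1; }

/* Ordered from highest to lowest priority. */
static const struct toy_class toy_classes[] = {
    { "stop",     stop_has },
    { "deadline", dl_has   },
    { "rt",       rt_has   },
    { "cfs",      cfs_has  },
    { "idle",     idle_has },
};

/* Return the name of the first class with a runnable task. */
static const char *toy_pick_next_class(const struct toy_rq *rq)
{
    size_t n = sizeof(toy_classes) / sizeof(toy_classes[0]);

    for (size_t i = 0; i < n; i++)
        if (toy_classes[i].has_runnable(rq))
            return toy_classes[i].name;
    return NULL;   /* not reached: the idle class always has a task */
}
```

Because the idle class always reports a runnable task, the walk is guaranteed to find something, which matches the role of the idle thread described earlier.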
Context switching consists of two major parts:
Address‑space switch – on ARM64, the base address of the next task’s page‑global directory (PGD) is loaded into the TTBR0_EL1 register, allowing the MMU to translate user‑space addresses for the new process.
Register‑state switch – on ARM64, the callee‑saved registers x19‑x28, fp, sp, and pc are saved into the previous task’s cpu_context and then restored from the next task’s cpu_context. The next task’s task_struct address is placed in sp_el0 so that current can locate it.
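The register-state half of the switch can be simulated in portable C. This is only a model of the bookkeeping, not the real assembly routine: toy_cpu_context mirrors the register set named above, while toy_task, toy_cpu, and toy_switch_to are hypothetical, and the address-space switch is left out.

```c
#include <stdint.h>

/* Registers preserved across a switch, as listed in the text. */
struct toy_cpu_context {
    uint64_t x19_x28[10];   /* callee-saved x19..x28 */
    uint64_t fp;
    uint64_t sp;
    uint64_t pc;
};

/* Hypothetical task: only the saved context matters here. */
struct toy_task {
    struct toy_cpu_context cpu_context;
};

/* Simulated CPU: live register values plus the sp_el0 pointer
 * that identifies the current task. */
struct toy_cpu {
    struct toy_cpu_context regs;
    const struct toy_task *sp_el0;
};

static void toy_switch_to(struct toy_cpu *cpu,
                          struct toy_task *prev, struct toy_task *next)
{
    prev->cpu_context = cpu->regs;   /* save the outgoing task's registers */
    cpu->regs = next->cpu_context;   /* restore the incoming task's registers */
    cpu->sp_el0 = next;              /* "current" now resolves to next */
}
```

Only callee-saved registers need to be stored because the switch happens inside a function call, so the compiler has already preserved any caller-saved values it still needs on the stack.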
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
