Understanding Linux Process Management and the CFS Scheduler: From Fork to Fair Scheduling
This article explains Linux process concepts, creation methods like fork, vfork and clone, termination handling, and the inner workings of the Completely Fair Scheduler (CFS), including virtual runtime, red‑black tree organization, priority changes, and wake‑up compensation.
1. What is a Linux Process?
In Linux, a process is an executing instance of a program, the basic unit of resource allocation and scheduling. Each process has a unique PID, its own memory space (code, data, stack), and can be in states such as running, blocked, or ready.
1.1 Process PCB
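Linux describes each process with a process control block (PCB), the task_struct structure, which contains fields such as mm (address space), fs (filesystem context), files (open file table), and signal (signal state).
The root directory is a per‑process attribute, not a system‑wide one: it belongs to the process's filesystem context and can be changed with chroot().
File descriptors (fd) are also per‑process: each number is an index into that process's own open file table.
The PID is a global identifier, unique within its PID namespace.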
Commands to view limits:
ulimit -u # maximum number of processes
ulimit -a # show all limits
2. Linux Process Creation
Processes can be created by executing a program or by system calls. The most common system call is fork(), which creates a child that is a copy of the parent. Variants include vfork() and clone().
2.1 fork()
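Prototype: pid_t fork(void); The call returns twice: 0 in the child, and the child's PID in the parent (or -1 in the parent if no child could be created). The child has its own, non‑zero PID, which it can read with getpid(). The kernel uses copy‑on‑write (COW) so that parent and child share physical pages until one of them writes.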
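#include <stdio.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        /* fork() returned 0: this is the child */
        printf("Child PID: %d\n", (int)getpid());
    } else if (pid > 0) {
        /* fork() returned the child's PID: this is the parent */
        printf("Parent PID: %d, child PID: %d\n", (int)getpid(), (int)pid);
    } else {
        /* fork() returned -1: no child was created */
        perror("fork");
    }
    return 0;
}
2.2 vfork()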
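vfork() creates a child that shares the parent's address space, and suspends the parent, until the child calls exec() or _exit(). It is useful when the child immediately replaces its image. Until then, the child must not modify memory or return from the calling function.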
2.3 clone()
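The glibc wrapper int clone(int (*fn)(void *), void *child_stack, int flags, void *arg, ...); can create processes or threads. Flags such as CLONE_VM, CLONE_FS, and CLONE_FILES select which resources the child shares with its caller, which gives fine‑grained control over sharing.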
2.4 Kernel Threads
Kernel threads run in kernel space only, have mm = NULL, and are scheduled like normal tasks.
3. Process Termination
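A process ends either normally (by returning from main or calling exit()) or abnormally (for example when killed by a signal such as SIGSEGV after a segmentation fault). A terminated child stays in the process table as a zombie until its parent collects its exit status with wait() or waitpid(). Orphaned processes, whose parent exited first, are adopted by init (PID 1) or by the nearest subreaper process.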
3.5 exit() vs _exit()
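exit() runs handlers registered with atexit(), flushes and closes stdio streams, and then terminates the process. _exit() terminates immediately without this user‑space cleanup; the kernel still closes the process's open file descriptors in both cases.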
4. Linux Scheduler (CFS)
The Completely Fair Scheduler (CFS) assigns each runnable task a virtual runtime (vruntime) weighted by its nice value. The task with the smallest vruntime is selected next.
4.1 vruntime
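vruntime is the task's execution time scaled by its weight: real runtime is multiplied by the ratio of the nice‑0 weight to the task's own weight, so vruntime grows more slowly for higher‑priority (lower nice) tasks and faster for lower‑priority ones.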
4.2 Updating vruntime
On each timer tick the kernel calls update_curr(), which in turn calls __update_curr() to add the weighted execution delta to the current task's vruntime. The tick path then checks whether the current task should be preempted.
4.3 Selecting the next task
CFS stores tasks in a red‑black tree keyed by vruntime. The left‑most node (minimum vruntime) is chosen via pick_next_task_fair().
4.4 Wake‑up handling
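When a sleeping task wakes, place_entity() does not let it keep an arbitrarily old vruntime. It places the task at the run queue's minimum vruntime minus a threshold derived from sysctl_sched_latency, and never moves the task's own vruntime backward. The woken task therefore sits near the left of the tree, which gives I/O‑bound tasks a quick response while bounding how much credit a long sleep can earn.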
4.5 Changing priority
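Changing a task's nice value (for example through nice() or setpriority()) ends up in set_user_nice(), which dequeues the task, updates its weight via set_load_weight(), and re‑enqueues it. From then on its vruntime advances at the new rate, which changes where it sits in the tree.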
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.
