
Understanding Linux Process States, Scheduling, and Priority – A Deep Dive

This article explains Linux process states, how the kernel linked list works, commands for inspecting processes, the nature of zombie and orphan processes, priority handling, context switching, and the O(1) scheduler in the 2.6 kernel, providing a comprehensive overview for system developers.

Raymond Ops

1. Introduction

A process's state is an integer field in its task_struct. A runnable process sits on a CPU's run queue; a blocked process sits on the wait queue of the device or resource it is waiting for. A state change therefore moves the task_struct from one queue to another, which comes down to inserting into and removing from kernel data structures.

Understanding Kernel Linked List


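The kernel's doubly linked list is intrusive: the next and prev pointers live in a small node structure (struct list_head) that is embedded inside the data structure, instead of the list node containing the data. A structure that embeds several such nodes can belong to several containers at once, so a single task_struct can sit on a run queue, on the global task list, and (through a similar embedded tree node) in a binary tree simultaneously.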
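
The idea can be sketched in user space. The block below is a minimal illustration, not the kernel's list.h: struct list_node, struct demo_task, and the helper names are simplified stand-ins chosen for this example, and container_of is reimplemented with offsetof.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *next, *prev;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily reduced task structure with two embedded nodes. */
struct demo_task {
    int pid;
    struct list_node all_tasks;  /* links the task into a global list */
    struct list_node run_list;   /* links the same task into a run queue */
};

/* An empty circular list is a head that points at itself. */
void list_init(struct list_node *head)
{
    head->next = head->prev = head;
}

/* Insert node just before head, i.e. at the tail of the circular list. */
void list_add_tail(struct list_node *node, struct list_node *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Unlink node from whichever list it is on and leave it self-linked. */
void list_del(struct list_node *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->next = node->prev = node;
}

/* Count the entries on a list; the head itself is not an entry. */
int list_len(const struct list_node *head)
{
    int n = 0;
    for (const struct list_node *p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

Removing a task from the run queue with list_del(&t->run_list) leaves its all_tasks link untouched, which is why a state change only has to touch the queue the task is leaving and the one it is joining.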


2. Process States

A process can have several states (in the Linux kernel a process is also called a task).

The following states are defined in the kernel source:

/*
 * The task state array is a strange "bitmap" of
 * reasons to sleep. Thus "running" is zero, and
 * you can test for combinations of others with
 * simple bit tests.
 */
static const char * const task_state_array[] = {
    "R (running)",      /* 0 */
    "S (sleeping)",     /* 1 */
    "D (disk sleep)",   /* 2 */
    "T (stopped)",      /* 4 */
    "t (tracing stop)", /* 8 */
    "X (dead)",         /* 16 */
    "Z (zombie)",       /* 32 */
};

R (running or runnable): The process is executing on a CPU or waiting in the run queue to be scheduled. Trigger: the process is active, either executing or ready to execute.
S (interruptible sleep): The process is waiting for an event (for example I/O or a signal) and can be woken by a signal. Trigger: blocking calls such as sleep() or read().
D (uninterruptible sleep): The process is waiting for an operation that must not be interrupted (for example hardware I/O) and does not respond to signals until it completes. Trigger: common during disk I/O or certain kernel operations.
T (stopped): The process has been paused by a signal such as SIGSTOP or SIGTSTP and needs SIGCONT to resume. Trigger: a manual pause (for example Ctrl+Z).
t (tracing stop): The process has been stopped by a debugger (for example gdb) that is tracing it. Trigger: breakpoints or single-step execution.
Z (zombie): The process has terminated but its parent has not yet called wait() to reap it. Trigger: the parent does not collect the child's exit status.
X (dead): The process has been reaped and is being removed from the process table. The state is transient, so it is almost never visible: once the parent reaps the child, the entry disappears.
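
The "bitmap" comment in the array can be made concrete. The sketch below assumes the lookup works the way the comments suggest (index 0 for state 0, index i + 1 for state bit 1 << i); state_name and the local copy of the table are illustrative names, not kernel symbols.

```c
/* Same order as the array above: index 0 is "running", and index i + 1
 * corresponds to state bit (1 << i). */
static const char *const state_names[] = {
    "R (running)", "S (sleeping)", "D (disk sleep)", "T (stopped)",
    "t (tracing stop)", "X (dead)", "Z (zombie)",
};

/* Map a state value (0, 1, 2, 4, 8, 16, 32) to its display string by
 * counting how many times it can be shifted right before reaching zero. */
const char *state_name(unsigned int state)
{
    unsigned int index = 0;

    while (state) {
        index++;
        state >>= 1;
    }
    if (index >= sizeof(state_names) / sizeof(state_names[0]))
        return "? (unknown)";
    return state_names[index];
}
```

With this layout, state_name(0) yields the R entry and state_name(32) yields the Z entry, matching the numbers in the comments of the kernel array.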

2.1 Viewing Process States

Commands: ps aux or ps axj

a: Lifts the "only yourself" restriction, so processes of all users that are attached to a terminal are listed. Without options, ps shows only the current user's processes on the current terminal.
x: Also lists processes without a controlling terminal, typically background daemons. Combined with a as ps ax, it displays all processes regardless of user or terminal.
j: Displays job-control information such as parent PID (PPID), process group ID (PGID), session ID (SID), and the terminal's foreground process group ID (TPGID).
u: Shows user-oriented details: user, CPU and memory usage, virtual and resident size, controlling terminal, state, start time, CPU time, and command line. Usually combined with a and x as ps aux.
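
The state letter that ps prints is also available to programs through /proc. The following is a minimal sketch, assuming a Linux system with /proc mounted; proc_state is a name made up for this example.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the one-letter state of pid as reported by /proc/<pid>/stat,
 * or '?' if the file cannot be read or parsed. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* The command name is wrapped in parentheses and may itself contain
     * ')' or spaces, so locate the last ')' before reading the state. */
    char *close_paren = strrchr(buf, ')');
    if (!close_paren || close_paren[1] != ' ' || close_paren[2] == '\0')
        return '?';
    return close_paren[2];
}
```

Calling proc_state(getpid()) reports R, because a process that is reading its own entry is necessarily running at that moment.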

2.2 Zombie Processes


A process becomes a zombie when it has exited but its parent has not yet called wait() or waitpid() to read its exit status. The zombie stays in the process table until the parent collects that status.

The exit status must be retained so the parent can learn the outcome of its child.

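If the parent never reads the status, the child stays in the Z state indefinitely. Its memory and open files have already been released, but its task_struct and PID remain allocated, so a parent that keeps producing unreaped children wastes kernel memory and can eventually exhaust the PID space.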
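
A zombie can be observed deterministically with waitid() and the WNOWAIT flag, which waits for the child to exit without reaping it. This is a sketch under the assumption of a Linux system with /proc mounted; state_of and observe_zombie are illustrative names, and the exit code 7 is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Illustrative helper: state letter from /proc/<pid>/stat, or '?' on error. */
char state_of(pid_t pid)
{
    char path[64], buf[256];

    snprintf(path, sizeof(path), "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    size_t n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    char *p = strrchr(buf, ')');   /* the state follows "(command) " */
    return (p && p[1] == ' ' && p[2]) ? p[2] : '?';
}

/* Fork a child that exits at once, look at it before reaping, then reap it.
 * Returns the state letter seen while the child was still unreaped and
 * stores the child's exit code in *exit_code. */
char observe_zombie(int *exit_code)
{
    pid_t pid = fork();
    if (pid < 0)
        return '?';
    if (pid == 0)
        _exit(7);                  /* child: terminate immediately */

    /* WNOWAIT blocks until the child has exited but leaves it unreaped. */
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return '?';

    char state = state_of(pid);    /* the child is a zombie at this point */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)   /* reaping frees the entry */
        return '?';
    *exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return state;
}
```

The exit code survives in the zombie's process-table entry until the final waitpid() call, which is exactly why the kernel has to keep that entry around.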

2.3 Orphan Processes

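Consider an example in which a child runs for 5 seconds while its parent exits after 1 second. The child is now an orphan and is adopted by systemd (PID 1). Adoption matters because some process must call wait() when the orphan eventually exits; without a new parent it would stay a zombie forever and leak its process-table entry.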

The exiting parent leaves no zombie behind, because its own parent (bash) reaps it. Once a process is orphaned and adopted by PID 1, it no longer belongs to the shell's foreground job and effectively runs in the background: Ctrl+C does not reach it, and it must be terminated with kill.
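
The adoption can be watched from inside the child by polling getppid(). The sketch below compresses the timings of the example (the parent exits immediately instead of after 1 second) and reports the result over a pipe; observe_orphan is a hypothetical name. The new parent is usually PID 1, but it can be another process if a subreaper is configured, so the code only reports what it sees.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Create parent -> child, let the parent exit first, and report which
 * process the child sees as its parent afterwards. Returns 0 on success. */
int observe_orphan(pid_t *old_parent, pid_t *new_parent)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t parent = fork();         /* plays the short-lived parent */
    if (parent < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (parent == 0) {
        pid_t self = getpid();
        close(fds[0]);
        if (fork() == 0) {
            /* Child: poll (for at most 5 s) until the parent PID changes. */
            struct timespec pause = { 0, 10 * 1000 * 1000 };   /* 10 ms */
            for (int i = 0; i < 500 && getppid() == self; i++)
                nanosleep(&pause, NULL);
            pid_t now = getppid();
            ssize_t ignored = write(fds[1], &now, sizeof(now));
            (void)ignored;
            _exit(0);
        }
        _exit(0);                  /* parent exits first, orphaning the child */
    }

    close(fds[1]);
    waitpid(parent, NULL, 0);      /* reap the short-lived parent */

    pid_t now = -1;
    ssize_t got = read(fds[0], &now, sizeof(now));
    close(fds[0]);
    if (got != (ssize_t)sizeof(now))
        return -1;

    *old_parent = parent;
    *new_parent = now;
    return 0;
}
```

After a successful call, *new_parent differs from *old_parent, showing that the kernel re-parented the child as soon as its original parent was gone.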

3. Process Priority

3.1 Concept

Process priority is the order in which processes are given CPU time.

Because CPU time is scarce, priority decides which process runs first.

Higher-priority processes are scheduled first; tuning priorities lets important work run ahead of background work, which can improve overall system performance.

Priority is a numeric attribute in task_struct; lower numbers mean higher priority.

Priority vs. permission: priority decides order, permission decides whether a process can obtain the resource.

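The effective priority is the default PRI (80) plus the nice value (NI): PRI(new) = 80 + NI. Nice ranges from -20 to 19, so the effective priority ranges from 60 to 99. Adjusting nice is how a user changes a process's priority.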
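
As a worked example of that arithmetic, the sketch below computes the resulting priority for a given nice value. The constant and function names are made up for illustration; the -20 to 19 limits are the standard Linux nice range.

```c
enum { BASE_PRI = 80, NICE_MIN = -20, NICE_MAX = 19 };

/* Out-of-range nice requests are clamped to the valid range. */
int clamp_nice(int nice)
{
    if (nice < NICE_MIN)
        return NICE_MIN;
    if (nice > NICE_MAX)
        return NICE_MAX;
    return nice;
}

/* Effective priority: the default of 80 plus the (clamped) nice value. */
int effective_pri(int nice)
{
    return BASE_PRI + clamp_nice(nice);
}
```

For instance, effective_pri(-20) is 60 (the highest priority an ordinary process can reach) and effective_pri(19) is 99 (the lowest).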

3.2 Viewing System Processes

Key fields in the long listing (for example, ps -l):
UID: the user who started the process.
PID: the process identifier.
PPID: the parent process identifier.
PRI: the priority of the process; lower values mean higher priority, and the default is 80.
NI: the nice value, which adjusts the priority. The effective priority is 80 + NI.

In Linux, every resource access is performed by a process, and the process carries the UID of the user who launched it. Comparing the process UID with the file's owner UID determines whether the process is treated as the file's owner when permissions are checked.

3.3 Commands to View Priority

Use top and press r to change a running process's nice value.

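Other options: nice starts a command with a given nice value, renice changes the nice value of a running process, and the setpriority() system call does the same from within a program.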
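
From a program, the same adjustment can be sketched with getpriority() and setpriority(). The helper below only ever raises the nice value, because lowering it normally requires privileges; be_nicer and the -100 error value are choices made for this example.

```c
#include <errno.h>
#include <sys/resource.h>

/* Raise the caller's nice value by delta (>= 0), capped at 19.
 * Returns the nice value in effect afterwards, or -100 on error. */
int be_nicer(int delta)
{
    if (delta < 0)
        return -100;

    /* getpriority() may legitimately return -1, so errno is the only
     * reliable error indicator and must be cleared first. */
    errno = 0;
    int current = getpriority(PRIO_PROCESS, 0);
    if (current == -1 && errno != 0)
        return -100;

    int target = current + delta;
    if (target > 19)
        target = 19;
    if (setpriority(PRIO_PROCESS, 0, target) != 0)
        return -100;

    errno = 0;
    int now = getpriority(PRIO_PROCESS, 0);
    if (now == -1 && errno != 0)
        return -100;
    return now;
}
```

A call such as be_nicer(5) has the same effect on the calling process as running renice with a value five higher than the current one.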

3.4 Supplementary Concepts – Competition, Independence, Parallelism, Concurrency

Competition: Many processes contend for limited CPU resources, requiring priority to resolve conflicts.

Independence: Processes run independently without interfering with each other.

Parallelism: Multiple processes execute simultaneously on multiple CPUs.

Concurrency: Multiple processes share a single CPU by context switching, allowing each to make progress over time.

4. Process Switching

A running process does not occupy the CPU indefinitely; the scheduler gives it a time slice. When the slice expires or the process blocks, the kernel saves the process's CPU register state in kernel memory that belongs to that process (its kernel stack and the thread-specific area of its task_struct), then restores the saved registers of the next task. This save-and-restore is a context switch.

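The saved state covers the registers the process needs in order to resume exactly where it stopped, including the program counter and the stack pointer. A process that has never run has no saved context of its own yet, so the kernel prepares an initial one when the process is created.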
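
Context switches are invisible to the process itself, but Linux counts them. As a rough sketch, the function below uses getrusage() to measure how many voluntary switches (the process gave up the CPU by blocking) occur while it sleeps; the function name is hypothetical, and the exact count depends on the system.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/resource.h>
#include <time.h>

/* Sleep for ms milliseconds and return how many voluntary context
 * switches the process recorded while doing so, or -1 on error. */
long voluntary_switches_during_sleep(long ms)
{
    struct rusage before, after;

    if (getrusage(RUSAGE_SELF, &before) != 0)
        return -1;

    /* Blocking in nanosleep() makes the process give up the CPU. */
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);

    if (getrusage(RUSAGE_SELF, &after) != 0)
        return -1;
    return after.ru_nvcsw - before.ru_nvcsw;
}
```

The companion field ru_nivcsw counts involuntary switches, which correspond to the other case described above: the time slice ran out or a higher-priority task preempted the process.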

5. Linux 2.6 O(1) Scheduler

Each CPU has its own runqueue. Within it the scheduler keeps two priority arrays: an active array for processes that still have time slice left and an expired array for those that have used theirs up.

Ordinary priority range: 100–139 (corresponds to nice values -20 to 19).

Realtime priority range: 0–99.
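
The mapping between nice values and the ordinary range is a fixed offset. The sketch below mirrors the kernel's conversion (priority = 100 + nice + 20) in plain functions; the function names are illustrative and no range checking is done.

```c
enum { MAX_RT_PRIO = 100, MAX_PRIO = 140 };

/* nice -20..19 maps onto priority 100..139; nice 0 is priority 120. */
int nice_to_prio(int nice)
{
    return MAX_RT_PRIO + nice + 20;
}

/* Inverse mapping for ordinary (non-realtime) priorities. */
int prio_to_nice(int prio)
{
    return prio - MAX_RT_PRIO - 20;
}

/* Priorities 0..99 are reserved for realtime tasks. */
int is_realtime_prio(int prio)
{
    return prio >= 0 && prio < MAX_RT_PRIO;
}
```

So a default process (nice 0) lands at priority 120, in the middle of the ordinary range, and every realtime priority outranks every ordinary one.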

The active array contains queue[140], one FIFO queue per priority level. The scheduler runs the first task in the lowest-numbered non-empty queue. Because the number of levels is fixed at 140, the cost of finding that queue does not depend on how many processes are runnable, which is what makes selection O(1).

A bitmap of 5×32 bits (160 bits, enough for the 140 levels) records which queues are non-empty, so the search becomes a find-first-set-bit operation instead of a walk over the array.

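The expired array has the same structure. A process that uses up its time slice receives a new one and is moved to the expired array; when the active array becomes empty, the scheduler swaps the active and expired pointers, and the formerly expired processes become eligible to run again.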
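
The structure described in this section can be sketched in a few dozen lines. This is a simplified user-space model, not kernel code: all type and function names are invented for the example, time slices are left out, and pick_next removes the chosen task from its queue so that the caller decides where to re-enqueue it.

```c
#include <stddef.h>
#include <stdint.h>

enum { PRIO_LEVELS = 140, BITMAP_WORDS = 5 };   /* 5 x 32 bits >= 140 */

struct rq_task {
    int pid;
    int prio;                 /* 0..139, lower runs first */
    struct rq_task *next;     /* FIFO link within one priority level */
};

struct prio_array {
    int nr_active;                        /* tasks queued in this array */
    uint32_t bitmap[BITMAP_WORDS];        /* bit p set: queue p non-empty */
    struct rq_task *head[PRIO_LEVELS], *tail[PRIO_LEVELS];
};

struct runqueue {
    struct prio_array arrays[2];
    struct prio_array *active, *expired;
};

void rq_init(struct runqueue *rq)
{
    *rq = (struct runqueue){0};
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

/* Append t to the FIFO queue of its priority level and mark the level. */
void enqueue(struct prio_array *a, struct rq_task *t)
{
    t->next = NULL;
    if (a->tail[t->prio])
        a->tail[t->prio]->next = t;
    else
        a->head[t->prio] = t;
    a->tail[t->prio] = t;
    a->bitmap[t->prio / 32] |= UINT32_C(1) << (t->prio % 32);
    a->nr_active++;
}

/* Lowest-numbered non-empty level, or -1. At most 5 words are examined,
 * so the cost is bounded no matter how many tasks are queued. */
int first_set_prio(const struct prio_array *a)
{
    for (int w = 0; w < BITMAP_WORDS; w++) {
        uint32_t word = a->bitmap[w];
        if (!word)
            continue;
        int bit = 0;
        while (!(word & 1u)) {
            word >>= 1;
            bit++;
        }
        return w * 32 + bit;
    }
    return -1;
}

/* Dequeue the next task; swap the arrays first if active is empty. */
struct rq_task *pick_next(struct runqueue *rq)
{
    if (rq->active->nr_active == 0) {
        struct prio_array *tmp = rq->active;
        rq->active = rq->expired;
        rq->expired = tmp;
    }
    int prio = first_set_prio(rq->active);
    if (prio < 0)
        return NULL;

    struct prio_array *a = rq->active;
    struct rq_task *t = a->head[prio];
    a->head[prio] = t->next;
    if (!t->next) {
        a->tail[prio] = NULL;
        a->bitmap[prio / 32] &= ~(UINT32_C(1) << (prio % 32));
    }
    a->nr_active--;
    t->next = NULL;
    return t;
}
```

In this model, a caller that wants round-robin behaviour re-enqueues each task it has run onto rq->expired; once the active array drains, the next pick_next call swaps the two pointers and the cycle starts again.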

Picking the next task and swapping the two arrays both take constant time regardless of how many processes are runnable, while the nice value still determines each process's priority level and can be adjusted at any time.
