
Demystifying Linux Process States, Zombies, and the O(1) Scheduler

This article explains Linux process states stored in task_struct, how to interpret them with ps commands, the nature of zombie and orphan processes, priority handling via nice and top, and the inner workings of the Linux 2.6 O(1) scheduler including runqueues, active/expired queues, and bitmap optimizations.

Liangxu Linux

Process State Basics

In the Linux kernel, each process is described by a task_struct, which includes an integer field encoding its state. A process that is running or ready to run is marked as running, while a blocked process waits for I/O or another resource. A state change moves the process between kernel queues (for example, from a run queue to a wait queue), so at the data-structure level it amounts to list insertions and deletions.

Understanding Kernel Linked Lists

Kernel lists are intrusive: the next and prev pointers are embedded in the structure itself. Because a structure can embed several such pointer pairs, a single task_struct can belong to the run queue, a global task list, and even a tree at the same time.
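To make the idea concrete, here is a minimal Python model of a task that sits on two lists at once. The names ListHead, run_node, and all_node are illustrative assumptions, not kernel identifiers from this article, and the owner back-pointer stands in for the pointer arithmetic the kernel uses to get from a list node back to its containing structure.

```python
class ListHead:
    """A circular doubly linked list node; an empty list points to itself."""
    def __init__(self, owner=None):
        self.owner = owner      # back-pointer to the containing Task (model only)
        self.next = self
        self.prev = self


def list_add_tail(node, head):
    """Insert node just before head, i.e. at the tail of the list."""
    node.prev = head.prev
    node.next = head
    head.prev.next = node
    head.prev = node


def list_del(node):
    """Unlink node from whatever list it is on."""
    node.prev.next = node.next
    node.next.prev = node.prev
    node.next = node.prev = node


def pids(head):
    """Walk the list and collect the PIDs of the owning tasks."""
    out = []
    node = head.next
    while node is not head:
        out.append(node.owner.pid)
        node = node.next
    return out


class Task:
    """Toy task_struct with one embedded node per list it can join."""
    def __init__(self, pid):
        self.pid = pid
        self.run_node = ListHead(self)   # membership in the run queue
        self.all_node = ListHead(self)   # membership in the global task list


if __name__ == "__main__":
    run_queue, all_tasks = ListHead(), ListHead()
    tasks = [Task(pid) for pid in (1, 2, 3)]
    for t in tasks:
        list_add_tail(t.all_node, all_tasks)
    list_add_tail(tasks[0].run_node, run_queue)
    list_add_tail(tasks[2].run_node, run_queue)
    list_del(tasks[0].run_node)          # task 1 blocks: leaves the run queue only
    print(pids(run_queue), pids(all_tasks))
```

Removing a task from the run queue leaves its place on the global list untouched, because each membership uses its own embedded node.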

Linux Process States

R – Running or Runnable

State description: The process is executing on the CPU or waiting in the run queue for scheduling.

Trigger scenario: The process is active and ready to run.

S – Interruptible Sleep

State description: The process is waiting for an event (e.g., I/O, signal) and can be interrupted by signals.

Trigger scenario: Calls such as sleep() or read() cause this state.

D – Uninterruptible Sleep

State description: The process waits for non‑interruptible operations (e.g., disk I/O) and does not respond to signals.

Trigger scenario: Typical during low‑level I/O or kernel operations.

T – Stopped

State description: The process has been stopped by signals like SIGSTOP, SIGTSTP and can be resumed with SIGCONT.

Trigger scenario: Manual pause (Ctrl+Z) or debugging.

Z – Zombie

State description: The process has terminated but its parent has not yet called wait() to reap it.

Trigger scenario: Parent neglects to collect the child's exit status.

t – Tracing Stop

State description: The process is stopped by a debugger (e.g., gdb) during tracing.

Trigger scenario: Breakpoints or single‑step execution.

X – Dead

State description: The process has fully exited; its resources are being reclaimed.

Trigger scenario: After the parent has retrieved the child's status.
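The same state letters can be read directly from /proc. The sketch below assumes a Linux system with /proc mounted; the helper names parse_state and describe are hypothetical. The state is the third field of /proc/<pid>/stat, and because the second field (the command name in parentheses) may itself contain spaces or parentheses, the code splits after the last closing parenthesis.

```python
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "T": "stopped",
    "t": "tracing stop",
    "Z": "zombie",
    "X": "dead",
}


def parse_state(stat_line):
    """Return the one-letter state from the contents of /proc/<pid>/stat."""
    # Everything after the last ')' starts with the state field.
    rest = stat_line[stat_line.rindex(")") + 1:]
    return rest.split()[0]


def describe(pid="self"):
    """Read a process's state code and map it to a description."""
    with open(f"/proc/{pid}/stat") as f:
        code = parse_state(f.read())
    return code, STATE_NAMES.get(code, "unknown")
```

Calling describe() with no argument inspects the calling process itself, which is necessarily in state R while it reads the file.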

Viewing Process States with ps

Common commands:

ps aux – Show all processes for all users. The first letter of the STAT column is the state code described above.

ps axj – Include job‑control information such as PPID, PGID, SID, and TPGID.

ps u – Display processes in a user‑oriented format (user, CPU%, MEM%, etc.).
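For a quick overview, the state codes reported by ps can be tallied in a few lines. This is a sketch that assumes the procps version of ps, where `-eo stat=` prints only the STAT column with no header; the function names are hypothetical.

```python
import subprocess
from collections import Counter


def ps_states():
    """Return the raw STAT column for every process (one entry per line)."""
    result = subprocess.run(
        ["ps", "-eo", "stat="], capture_output=True, text=True, check=True
    )
    return result.stdout


def tally(stat_column):
    """Count processes by the first letter of STAT (the state code)."""
    return Counter(line.strip()[0] for line in stat_column.splitlines() if line.strip())
```

Running tally(ps_states()) returns a Counter keyed by state letter; the extra characters ps appends to STAT, such as s or +, are ignored.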

Zombie Processes

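A process enters the zombie (Z) state when it exits and its parent has not yet called wait(). The zombie has already released its memory and most other resources, but it still occupies a process‑table entry (holding its PID and exit status) until the parent reaps it. A large number of zombies can therefore exhaust PIDs and process‑table slots.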
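A zombie is easy to produce on purpose. The following sketch assumes Linux with /proc mounted and uses hypothetical helper names: the child exits immediately, the parent delays its wait() call long enough to observe the Z state, and then reaps the child.

```python
import os
import time


def read_state(pid):
    """Return the state letter from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    return data[data.rindex(")") + 1:].split()[0]


def make_zombie(exit_code=7, timeout=2.0):
    """Fork a child that exits at once; observe it as a zombie, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)              # child: terminate without cleanup handlers

    # Parent: the child stays in the process table until waitpid() is called.
    deadline = time.monotonic() + timeout
    state = read_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = read_state(pid)

    _, status = os.waitpid(pid, 0)       # reaping removes the zombie
    return state, os.WEXITSTATUS(status)
```

The exit code survives until waitpid() collects it, which is exactly the information the zombie entry exists to preserve.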

Orphan Processes

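If a parent exits before its children, the orphaned children are adopted by the init process (PID 1, typically systemd) or by a designated subreaper. After adoption they continue as regular processes. Adoption matters because the new parent reaps them when they exit; without a parent to call wait(), they would linger as zombies.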
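Re-parenting can be observed with a double fork. In this sketch (function name hypothetical, Unix only), an intermediate process forks a grandchild and exits right away; the grandchild waits until getppid() changes and reports its new parent through a pipe. The new parent is often PID 1, but it can be a subreaper, so the code only checks that it differs from the original parent.

```python
import os
import time


def orphan_new_parent(timeout=2.0):
    """Return (pid of the exited parent, pid of the orphan's new parent)."""
    r, w = os.pipe()
    mid = os.fork()
    if mid == 0:                              # intermediate parent
        os.close(r)
        my_pid = os.getpid()
        if os.fork() == 0:                    # grandchild, soon to be orphaned
            deadline = time.monotonic() + timeout
            while os.getppid() == my_pid and time.monotonic() < deadline:
                time.sleep(0.01)              # wait for re-parenting
            os.write(w, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)                           # exit first, orphaning the grandchild

    os.close(w)
    os.waitpid(mid, 0)                        # reap the intermediate parent
    data = os.read(r, 64)                     # blocks until the orphan reports
    os.close(r)
    return mid, int(data)
```

The orphan itself is never waited for by the caller; its adoptive parent reaps it when it exits.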

Process Priority

Priority determines the order in which the scheduler allocates CPU time. In Linux, a lower numeric value means higher priority. The priority shown by tools such as ps -l is the base priority plus the nice value: PRI = 80 + NI. The nice range is -20 to 19, giving 40 distinct levels, so the displayed priority ranges from 60 to 99.

PRI – Base priority (default 80).

NI – Nice adjustment added to the base priority.
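The arithmetic is small enough to check by hand, but a short sketch makes the bounds explicit. The function name is hypothetical, and the base value of 80 is the default quoted above.

```python
BASE_PRI = 80                 # default base priority quoted in the text
NICE_MIN, NICE_MAX = -20, 19  # valid nice range


def displayed_pri(nice, base=BASE_PRI):
    """Return the displayed priority for a given nice value."""
    if not NICE_MIN <= nice <= NICE_MAX:
        raise ValueError(f"nice must be in [{NICE_MIN}, {NICE_MAX}]")
    return base + nice


if __name__ == "__main__":
    for ni in (-20, 0, 19):
        print(ni, displayed_pri(ni))
```

A nice value of -20 gives the most favorable priority (60) and 19 the least favorable (99).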

Adjust priority with: nice – Launch a process with a specific nice value. renice – Change the nice value of an existing process.

Interactive adjustment via top: press “r”, enter the PID, then the new nice value.
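The same adjustment can be made programmatically. This sketch uses Python's os.getpriority and os.setpriority (Unix only); the function name is hypothetical. It only raises the nice value, since lowering it normally requires elevated privileges.

```python
import os


def bump_nice(step=1):
    """Raise the calling process's nice value by step, capped at 19."""
    before = os.getpriority(os.PRIO_PROCESS, 0)   # 0 means the calling process
    target = min(before + step, 19)
    os.setpriority(os.PRIO_PROCESS, 0, target)
    after = os.getpriority(os.PRIO_PROCESS, 0)
    return before, after
```

Passing another PID instead of 0 to os.setpriority plays the role of renice, subject to the usual permission checks.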

CPU Registers and Context Switch

When a process runs, the CPU fetches the process’s code and data and keeps intermediate results in registers. A context switch saves the outgoing process’s register state in kernel memory associated with its task_struct (its kernel stack and architecture‑specific thread data), then restores the saved registers of the next scheduled process. The saved state is what lets a process later resume exactly where it left off.
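The save and restore sequence can be modeled with a toy CPU. This is only an illustration under simplifying assumptions: the two "registers" (pc and acc) and all class names are invented for the example, and real context switches are done in architecture-specific kernel code.

```python
class Task:
    """Toy task holding a saved copy of the register state."""
    def __init__(self, name):
        self.name = name
        self.saved = {"pc": 0, "acc": 0}


class CPU:
    """Toy CPU with a single register set shared by all tasks."""
    def __init__(self):
        self.regs = {"pc": 0, "acc": 0}
        self.current = None

    def switch_to(self, nxt):
        if self.current is not None:
            self.current.saved = dict(self.regs)   # save the outgoing context
        self.regs = dict(nxt.saved)                # restore the incoming context
        self.current = nxt

    def step(self):
        """Pretend to execute one instruction of the current task."""
        self.regs["acc"] += 1
        self.regs["pc"] += 1


if __name__ == "__main__":
    cpu, a, b = CPU(), Task("a"), Task("b")
    cpu.switch_to(a)
    for _ in range(3):
        cpu.step()
    cpu.switch_to(b)
    cpu.step()
    cpu.switch_to(a)          # a resumes with the registers it had before
    print(cpu.regs, b.saved)
```

Each task sees its own register values across switches even though there is only one physical register set.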

Linux 2.6 O(1) Scheduler

The scheduler maintains a per‑CPU runqueue. Each runqueue contains:

Active queue – Processes whose time slice has not expired, organized by priority (indices 0‑139). The array queue[140] holds FIFO lists for each priority.

Expired queue – Processes whose time slice has expired; they are moved here until the active queue is empty.

Bitmap[5] – A 5‑word bitmap (5 × 32 = 160 bits) indicates which priority queues are non‑empty, allowing O(1) lookup of the highest‑priority runnable process.
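A minimal sketch of the bitmap lookup, assuming 32-bit words as in the description above. The function names are hypothetical. The loop visits at most five words regardless of how many processes are runnable, and the lowest set bit of a word is isolated with the expression word & -word.

```python
BITS_PER_WORD = 32
NUM_WORDS = 5          # 5 x 32 = 160 bits, enough for priorities 0..139


def set_bit(bitmap, prio):
    """Mark the queue for this priority as non-empty."""
    bitmap[prio // BITS_PER_WORD] |= 1 << (prio % BITS_PER_WORD)


def clear_bit(bitmap, prio):
    """Mark the queue for this priority as empty."""
    bitmap[prio // BITS_PER_WORD] &= ~(1 << (prio % BITS_PER_WORD))


def find_first_bit(bitmap):
    """Return the lowest set priority index, or None if no bit is set."""
    for i, word in enumerate(bitmap):          # at most NUM_WORDS iterations
        if word:
            lowest = word & -word              # isolate the lowest set bit
            return i * BITS_PER_WORD + lowest.bit_length() - 1
    return None


if __name__ == "__main__":
    bitmap = [0] * NUM_WORDS
    set_bit(bitmap, 120)
    set_bit(bitmap, 101)
    print(find_first_bit(bitmap))
```

Because the work is bounded by the fixed number of words rather than the number of tasks, the lookup cost does not grow with system load.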

Scheduling steps:

Use the bitmap to find the lowest‑numbered (highest‑priority) non‑empty bucket. This gives the same result as scanning upward from queue[0] but takes a fixed number of steps.

Select the first process in that bucket and run it.

When a process’s time slice ends, recalculate its priority and time slice based on the current nice value and move it to the expired queue.

When the active queue becomes empty, swap the active and expired pointers, making the expired queue the new active queue.

This design yields constant‑time scheduling decisions while still respecting priority and fairness.
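The steps above can be sketched as a small simulation. This is a simplified model, not the kernel's code: the class and method names are hypothetical, a single Python integer stands in for the 5-word bitmap, every task gets exactly one slice per round, and priorities are taken as 120 + nice (the mapping the 2.6 kernel uses for normal tasks) without any dynamic adjustment.

```python
from collections import deque

MAX_PRIO = 140


class PrioArray:
    """One FIFO queue per priority plus a bitmap of non-empty queues."""
    def __init__(self):
        self.queue = [deque() for _ in range(MAX_PRIO)]
        self.bitmap = 0          # bit p is set when queue[p] is non-empty
        self.nr_active = 0

    def enqueue(self, prio, task):
        self.queue[prio].append(task)
        self.bitmap |= 1 << prio
        self.nr_active += 1

    def dequeue_highest(self):
        # The lowest set bit is the highest-priority non-empty queue.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.queue[prio].popleft()
        if not self.queue[prio]:
            self.bitmap &= ~(1 << prio)
        self.nr_active -= 1
        return prio, task


class RunQueue:
    """Per-CPU run queue with active and expired arrays."""
    def __init__(self):
        self.active = PrioArray()
        self.expired = PrioArray()

    def add(self, name, nice=0):
        self.active.enqueue(120 + nice, name)

    def run_one_slice(self):
        """Pick the next task, let it use its slice, then expire it."""
        if self.active.nr_active == 0:
            # Active array drained: swap the two pointers.
            self.active, self.expired = self.expired, self.active
        if self.active.nr_active == 0:
            return None                          # nothing runnable at all
        prio, task = self.active.dequeue_highest()
        self.expired.enqueue(prio, task)         # slice used up
        return task


if __name__ == "__main__":
    rq = RunQueue()
    rq.add("a", nice=0)
    rq.add("b", nice=-5)
    rq.add("c", nice=0)
    print([rq.run_one_slice() for _ in range(6)])
```

The task with the lower nice value runs first in each round, tasks of equal priority run in FIFO order, and swapping the two array references starts a new round without copying any tasks.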
