Why D and Z Processes Stall Your CPU: Linux Performance Tuning Guide
The article explains the low‑level reasons why uninterruptible (D) and zombie (Z) processes inflate Linux load average, shows how to identify them with ps, wchan and /proc tools, and provides step‑by‑step diagnostics, handling strategies, kernel and I/O scheduler tweaks, and preventive measures to keep the system responsive.
What are D and Z processes
Linux defines five basic process states: running (R), interruptible sleep (S), uninterruptible sleep (D), stopped (T) and zombie (Z). D (TASK_UNINTERRUPTIBLE) indicates a task that is blocked inside the kernel, usually waiting for I/O to complete; signals, including the SIGKILL sent by kill -9, stay pending and are not delivered until the task leaves that state. Because the load‑average calculation counts tasks in D state as well as runnable ones, a high load with low CPU utilization often means many D processes are blocked on I/O. Z (EXIT_ZOMBIE) is a zombie: a child that has exited but has not yet been reaped by its parent. It holds a PID and a process‑table entry but no longer owns an address space. When the PID pool (kernel.pid_max, 32768 by default in the kernel, although some distributions raise it) is exhausted, new processes cannot be forked.
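To make the link between process states and load average concrete, here is a minimal sketch that tallies tasks by state. The function names count_states and load_contributors are hypothetical helpers, not standard tools; they only assume input with one STAT value per line, as printed by ps -eo stat=.

```shell
#!/bin/sh
# count_states: read one STAT value per line and print "<state> <count>"
# for each leading state letter, sorted by letter.
count_states() {
    awk '{ n[substr($1, 1, 1)]++ }
         END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# load_contributors: number of tasks whose state starts with R or D,
# the two states that the load average counts.
load_contributors() {
    awk '$1 ~ /^[RD]/ { c++ } END { print c + 0 }'
}
```

Typical use would be ps -eo stat= | count_states, followed by ps -eo stat= | load_contributors to compare the result with the first figure in /proc/loadavg.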
State transitions
When a process issues a blocking I/O request, the kernel calls set_current_state(TASK_UNINTERRUPTIBLE), places the task on a wait queue and invokes schedule() to yield the CPU. The interrupt handler wakes the task when the I/O completes and the state returns to TASK_RUNNING. For zombies, the sequence is: the child calls exit(), the kernel sends SIGCHLD to the parent, and the parent must invoke wait() or waitpid() to reap the child. If the parent never reaps, the child remains in Z state. If the parent itself exits, its children (zombies included) are adopted by init (PID 1), which reaps them with wait().
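The reaping half of this lifecycle can be sketched in shell, where the wait builtin plays the role of waitpid(). The helper name run_and_reap is hypothetical and used only for illustration.

```shell
#!/bin/sh
# run_and_reap: start a command as a background child, then block in the
# shell's wait builtin until it exits. wait reaps the child, so it does not
# linger in Z state, and returns the child's exit status.
run_and_reap() {
    "$@" &
    wait "$!"
}
```

For example, run_and_reap sh -c 'exit 3' returns status 3 to the caller. A parent that starts children in the background and never calls wait is the shell equivalent of the missing waitpid() described above.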
Locating D/Z processes
Common commands:
ps -eo pid,stat,comm,wchan --sort=-%cpu | awk 'NR == 1 || $2 ~ /^[DZ]/'
The wchan column shows the kernel function the task is waiting on; values such as sync_write, __lock_page or wait_on_page_bit usually point to I/O problems. Filtering on the STAT column avoids matching a D or Z that merely appears in a command name or in the header.
cat /proc/<pid>/stack
Displays the kernel stack trace for the task (requires root).
ps -e -o stat | grep -c '^D' # count D processes
ps -e -o stat | grep -c '^Z' # count Z processes
Diagnosing D processes
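Run iostat -x 1 5 and examine %util (near 100 % means the device is busy almost all the time) and the await columns (high values mean high latency). If the counters of completed reads and writes (rd_ios and wr_ios, which iostat reports as r/s and w/s) stop increasing while requests are still queued, the I/O is stuck.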
Identify the device with lsblk and inspect /proc/<pid>/io.
Inspect what the process is waiting for: cat /proc/<pid>/wchan, cat /proc/<pid>/stack or perf record -g -p <pid> && perf report.
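When many tasks are blocked at once, grouping them by wait channel often shows whether they share one cause. The sketch below assumes input in the column order pid, stat, wchan, comm; d_wait_summary is a hypothetical helper name.

```shell
#!/bin/sh
# d_wait_summary: read "PID STAT WCHAN COMMAND" rows and print
# "<count> <wchan>" for tasks whose STAT starts with D, most common first.
d_wait_summary() {
    awk '$2 ~ /^D/ { n[$3]++ }
         END { for (w in n) print n[w], w }' | LC_ALL=C sort -k1,1nr -k2,2
}
```

It could be fed with ps -eo pid,stat,wchan:32,comm | d_wait_summary. A single dominant wait channel suggests one stuck device or mount, while a wide spread points to general I/O pressure.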
Diagnosing Z processes
ps -eo pid,ppid,stat,cmd | awk 'NR == 1 || $3 ~ /^Z/'
Check the PPID column to find the parent that has not reaped the zombie. Then either send kill -s SIGCHLD <parent_pid> (this only helps if the parent handles the signal by calling wait()) or terminate the parent, trying kill <parent_pid> before kill -9 <parent_pid>. Be cautious because the parent may be a critical service.
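Since the fix always targets the parent, it helps to see which parents own the most zombies. This is a small sketch under the assumption that the input columns are pid, ppid, stat, cmd; zombies_by_parent is a hypothetical helper name.

```shell
#!/bin/sh
# zombies_by_parent: read "PID PPID STAT CMD" rows and print
# "<zombie count> <ppid>" for each parent, largest count first.
zombies_by_parent() {
    awk '$3 ~ /^Z/ { n[$2]++ }
         END { for (p in n) print n[p], p }' | LC_ALL=C sort -k1,1nr -k2,2n
}
```

A possible invocation is ps -eo pid,ppid,stat,cmd | zombies_by_parent. The PPID at the top of the output is the first process to inspect.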
Handling D processes
Check hardware health: smartctl -a /dev/sdX and device status with lspci -vvv.
Inspect kernel logs for I/O errors: journalctl -k -b.
If NFS is involved, verify with mount -t nfs4 and nfsstat.
Replace failed disks, unmount with umount -f, or upgrade kernel/drivers.
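As a last resort, trigger a reboot through the magic SysRq interface. echo b > /proc/sysrq-trigger reboots immediately without syncing or unmounting, so first try echo s > /proc/sysrq-trigger (sync) and echo u > /proc/sysrq-trigger (remount read‑only).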
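Do not rely on kill -9 for a D process: the signal stays pending and has no effect until the task leaves the uninterruptible wait. Avoid a forced reboot without first syncing caches (sync). echo w > /proc/sysrq-trigger prints the stacks of all uninterruptible (blocked) tasks to the kernel log (dmesg). echo t > /proc/sysrq-trigger dumps the state of every task.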
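The kernel log checks above can be narrowed to the hung‑task detector's reports. The sketch below assumes the usual message format, "INFO: task <comm>:<pid> blocked for more than <N> seconds."; the format may differ between kernel versions, and hung_tasks is a hypothetical helper name.

```shell
#!/bin/sh
# hung_tasks: read kernel log lines on stdin and print "<pid> <seconds> <comm>"
# for every hung-task report. The task name may itself contain colons
# (for example kworker/u8:2), so the PID is taken from the last ":<digits>".
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\2 \3 \1/p'
}
```

Either dmesg | hung_tasks or journalctl -k -b | hung_tasks would be a reasonable way to use it. The PIDs it prints are the ones to inspect with /proc/<pid>/stack.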
Handling Z processes
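Check PID limits: cat /proc/sys/kernel/pid_max and cat /proc/sys/kernel/threads-max. If forks fail with “fork: Resource temporarily unavailable” (or, on older kernels, “fork: Cannot allocate memory”) while memory is still free, the PID space or a process limit is likely exhausted.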
Fix the parent code to reap children (e.g., install a SIGCHLD handler or use waitpid() in a loop).
Increase the PID ceiling: sysctl kernel.pid_max=65535.
If the parent cannot be fixed, kill the offending parent; init will adopt and reap the zombies.
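To judge how close a host is to PID exhaustion, the number of live tasks can be compared with the ceiling. This is a minimal sketch; pid_usage_pct is a hypothetical helper, and it counts threads too because each thread takes an ID from the same pool.

```shell
#!/bin/sh
# pid_usage_pct: print the share of the PID space in use as an integer
# percentage (rounded down). $1 = tasks in use, $2 = kernel.pid_max.
pid_usage_pct() {
    used=$1
    max=$2
    echo $(( used * 100 / max ))
}
```

One way to call it is pid_usage_pct "$(ps -eL --no-headers | wc -l)" "$(cat /proc/sys/kernel/pid_max)". A value that keeps climbing while the workload is steady is a sign that zombies or leaked threads are accumulating.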
System tuning
I/O scheduler selection
SSD: use none (called noop on older single‑queue kernels; no reordering, lowest latency). HDD: use mq-deadline (called deadline on older single‑queue kernels; bounded request latency).
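The rule of thumb above can be sketched as two small helpers. It assumes the sysfs conventions that /sys/block/<dev>/queue/scheduler marks the active scheduler with square brackets and that /sys/block/<dev>/queue/rotational is 0 for SSDs and 1 for spinning disks; the function names are hypothetical.

```shell
#!/bin/sh
# active_scheduler: print the bracketed (active) entry from the contents of
# /sys/block/<dev>/queue/scheduler, passed as $1.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# suggest_scheduler: map the contents of /sys/block/<dev>/queue/rotational
# ($1) to the scheduler recommended in the text.
suggest_scheduler() {
    case "$1" in
        0) echo none ;;          # SSD or NVMe
        1) echo mq-deadline ;;   # rotating disk
        *) echo unknown ;;
    esac
}
```

For a device such as sda, active_scheduler "$(cat /sys/block/sda/queue/scheduler)" shows the current choice and suggest_scheduler "$(cat /sys/block/sda/queue/rotational)" shows the suggested one.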
# view current scheduler
cat /sys/block/sda/queue/scheduler
# set deadline scheduler
echo deadline > /sys/block/sda/queue/scheduler
Kernel parameters (add to /etc/sysctl.conf for persistence)
# reduce dirty page ratio
sysctl vm.dirty_ratio=10
sysctl vm.dirty_background_ratio=3
# extend hung_task timeout (seconds)
sysctl kernel.hung_task_timeout_secs=300
# enlarge PID limit
sysctl kernel.pid_max=65535
Filesystem tuning
# disable atime updates to reduce metadata I/O
mount -o remount,noatime,nodiratime /mount/point
Resource limits (in /etc/security/limits.conf)
* soft nproc 65535
* hard nproc 65535
Health‑check script
#!/bin/bash
# count tasks whose STAT column starts with D or Z
D_COUNT=$(ps -e -o stat | grep -c '^D')
Z_COUNT=$(ps -e -o stat | grep -c '^Z')
if [ "$Z_COUNT" -gt 10 ]; then
echo "Detected $Z_COUNT zombie processes" | mail -s "Zombie Alert" [email protected]
fi
Kubernetes example
livenessProbe:
  exec:
    command:
    - sh
    - -c
    - "ps -eo stat | grep -q 'D' && exit 1 || exit 0"
  initialDelaySeconds: 30
  periodSeconds: 15
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "kill -SIGTERM 1 && sleep 10"]
Diagnostic command reference
ps -eo pid,stat,comm,wchan – show process state and kernel wait location
cat /proc/<pid>/stack – current kernel stack trace
cat /proc/<pid>/wchan – kernel function the task is waiting on
iostat -x 1 – I/O performance metrics
dmesg | grep hung_task – check hung‑task timeout alerts
kill -s SIGCHLD <ppid> – trigger parent to reap zombies
strace -p <pid> – trace system calls of a process
perf record -g -p <pid> – sample performance data with call graph
ps -e -o stat | grep -c Z – count zombie processes
Example double‑fork daemon (prevents zombies)
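#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return EXIT_FAILURE;        // first fork failed
    }
    if (pid == 0) {
        setsid();                   // new session, detached from the terminal
        pid = fork();
        if (pid == 0) {
            // business process: the grandchild, reparented to init when its parent exits
        } else {
            _exit(0);               // intermediate child exits at once (also if the second fork failed)
        }
    } else {
        waitpid(pid, NULL, 0);      // reap the intermediate child so it never becomes a zombie
    }
    return EXIT_SUCCESS;
}
Signal‑based zombie reaping (C example)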
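// Needs <signal.h> and <sys/wait.h>.
// Option 1: signal(SIGCHLD, SIG_IGN); the kernel then reaps exited children automatically.
// Option 2: install this handler for SIGCHLD instead of ignoring the signal.
static void on_sigchld(int signo) {
    while (waitpid(-1, NULL, WNOHANG) > 0) {
        // reap every child that has already exited
    }
}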
Tech Stroll Journey
The philosophy behind "Stroll": continuous learning, curiosity‑driven, and practice‑focused.