How Linux Process States Lead to Orphan & Zombie Issues in Docker and Kubernetes
This article explains the three‑ and five‑state process models, the Linux task_struct states, and how orphan and zombie processes arise in containers. It covers PID namespaces, signal handling in Docker, the Kubernetes pause container, and practical solutions for Kubernetes pods.
Three‑State Process Model
During execution, a process is in one of at least three states:
Running: the process holds the CPU and its program is executing.
Ready: the process has all resources except the CPU and will run as soon as the CPU is allocated.
Waiting (blocked or sleeping): the process lacks a required resource and cannot run until an event occurs.
Typical state transitions are:
Running → Waiting: the process blocks on I/O or waits for another event, such as operator input.
Waiting → Ready: resource becomes available.
Running → Ready: time slice expires or a higher‑priority process preempts.
Ready → Running: CPU becomes idle and a scheduler selects a ready process.
Five‑State Process Model
The five‑state model adds New and Exit states to the three‑state model.
New: the process is being created and has not yet entered the ready queue.
Exit: the process has finished or been terminated and will be reclaimed by the system.
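The additional transitions are New → Ready, when the process is admitted to the ready queue, and Running → Exit, when it finishes or is terminated. A process can also reach Exit from other states if it is killed.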
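The transitions listed so far can be written down as a small lookup. This is only a sketch of the textbook model, not of any kernel data structure; the enum and the function name `can_transition` are made up for illustration, and only the transitions the text names explicitly are modeled.

```c
#include <stdbool.h>

/* The five states named in the text. */
enum proc_state { P_NEW, P_READY, P_RUNNING, P_WAITING, P_EXIT };

/* Returns true if `from -> to` is one of the transitions named in the text.
 * Other paths into P_EXIT (for example, killing a waiting process) are
 * deliberately left out of this sketch. */
bool can_transition(enum proc_state from, enum proc_state to) {
    switch (from) {
    case P_NEW:     return to == P_READY;
    case P_READY:   return to == P_RUNNING;
    case P_RUNNING: return to == P_READY || to == P_WAITING || to == P_EXIT;
    case P_WAITING: return to == P_READY;
    case P_EXIT:    return false;   /* terminal state */
    }
    return false;
}
```

Note that a waiting process never goes straight back to Running: it has to pass through Ready and be picked by the scheduler again.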
Linux Process States
In the Linux kernel, both processes and threads are represented by the task_struct structure, which is the basic scheduling unit.
Linux process states are:
TASK_RUNNING: ready or running state (R).
TASK_INTERRUPTIBLE: light sleep, can be awakened by signals (S).
TASK_UNINTERRUPTIBLE: deep sleep, does not respond to signals (D).
TASK_ZOMBIE (named EXIT_ZOMBIE in current kernels): exited but not yet reaped by the parent (Z).
EXIT_DEAD: final exit state (X).
TASK_STOPPED: stopped by a job‑control signal such as SIGSTOP (T); a process stopped by a debugger is in the related TASK_TRACED state (t).
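The one-letter codes in parentheses are what the kernel exposes in /proc/&lt;pid&gt;/stat. The sketch below reads that letter for a child that has exited but has not been reaped yet. It assumes a Linux system with /proc mounted; the helper names `proc_state_letter` and `state_of_unreaped_child` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reads the one-letter state of `pid` from /proc/<pid>/stat.
 * Returns '?' if the file cannot be read or parsed. */
char proc_state_letter(pid_t pid) {
    char path[64], line[512];
    snprintf(path, sizeof path, "/proc/%ld/stat", (long)pid);
    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    char *ok = fgets(line, sizeof line, f);
    fclose(f);
    if (!ok)
        return '?';
    /* Format is "pid (comm) state ..."; comm may itself contain ')',
     * so search for the last one. */
    char *close_paren = strrchr(line, ')');
    if (!close_paren || close_paren[1] != ' ' || close_paren[2] == '\0')
        return '?';
    return close_paren[2];
}

/* Forks a child that exits at once, waits until it has exited WITHOUT
 * reaping it (WNOWAIT), reads its state letter, then reaps it. */
char state_of_unreaped_child(void) {
    pid_t pid = fork();
    if (pid < 0)
        return '?';
    if (pid == 0)
        _exit(0);
    siginfo_t info;
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0)
        return '?';
    char state = proc_state_letter(pid);
    waitpid(pid, NULL, 0);   /* now actually reap the child */
    return state;
}
```

On Linux, `state_of_unreaped_child()` should report `Z`, while `proc_state_letter(getpid())` reports `R` because the caller is running while it reads its own entry.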
Orphan Processes
A process becomes an orphan when its parent exits while it is still running; the init process (PID 1) adopts it and reaps it when it eventually exits.
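Adoption can be observed directly with getppid(). In the sketch below, an intermediate child forks a grandchild and exits; once the intermediate has been reaped, the grandchild reports who its parent is now. The function name `adopter_of_orphan` and the two-pipe handshake are illustrative choices. On systems that use a subreaper (for example under some service managers) the adopter may be a PID other than 1.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Creates an orphan and returns the PID of the process that adopted it,
 * or -1 on error. The PID of the exited intermediate parent is stored in
 * *intermediate_out when that pointer is non-NULL. */
pid_t adopter_of_orphan(pid_t *intermediate_out) {
    int go[2], res[2];   /* go: parent -> grandchild, res: grandchild -> parent */
    if (pipe(go) < 0)
        return -1;
    if (pipe(res) < 0) {
        close(go[0]); close(go[1]);
        return -1;
    }

    pid_t child = fork();
    if (child < 0) {
        close(go[0]); close(go[1]); close(res[0]); close(res[1]);
        return -1;
    }
    if (child == 0) {
        pid_t grandchild = fork();
        if (grandchild < 0)
            _exit(1);
        if (grandchild > 0)
            _exit(0);            /* intermediate exits, orphaning the grandchild */
        /* Grandchild: wait for the go-ahead, then report the current parent. */
        char c;
        close(go[1]);
        close(res[0]);
        if (read(go[0], &c, 1) == 1) {
            pid_t adopter = getppid();
            if (write(res[1], &adopter, sizeof adopter) != (ssize_t)sizeof adopter)
                _exit(1);
        }
        _exit(0);
    }

    close(go[0]);
    close(res[1]);
    int status;
    pid_t adopter = -1;
    /* Reap the intermediate first; by then the grandchild has been reparented. */
    int ok = waitpid(child, &status, 0) == child
             && WIFEXITED(status) && WEXITSTATUS(status) == 0;
    if (ok && write(go[1], "x", 1) == 1) {
        if (read(res[0], &adopter, sizeof adopter) != (ssize_t)sizeof adopter)
            adopter = -1;
    }
    close(go[1]);
    close(res[0]);
    if (intermediate_out)
        *intermediate_out = child;
    return adopter;
}
```

The returned PID differs from the intermediate's PID, which shows that the orphan now belongs to another process.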
Zombie Processes
A zombie is a process that has terminated but still occupies an entry in the process table because its parent has not called wait(). Excessive zombies waste PIDs and can prevent new processes from being created.
<code>[root@k8s-dev]# cat /proc/sys/kernel/pid_max
32768</code>
Linux derives the default pid_max from the CPU count: with 32 or fewer CPUs it is 32768; with more than 32 CPUs it is 1024 × the CPU count.
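The rule can be expressed as a one-line calculation. This sketch only restates the two cases given above. It ignores the kernel's upper limit and any value an administrator or distribution writes to /proc/sys/kernel/pid_max, and the function name `default_pid_max` is hypothetical.

```c
/* Default pid_max for a machine with `cpus` CPUs, following the rule in
 * the text: 32768 for small machines, otherwise 1024 PIDs per CPU. */
long default_pid_max(long cpus) {
    const long base = 32768;    /* value for 32 or fewer CPUs */
    const long per_cpu = 1024;  /* PIDs allotted per CPU */
    long scaled = per_cpu * cpus;
    return scaled > base ? scaled : base;
}
```

For example, a 64-CPU host gets 1024 × 64 = 65536, while an 8-CPU host stays at 32768.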
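Existing zombies can be cleaned up by killing their parent, so that init adopts and reaps them. Two techniques prevent zombies from forming in the first place:
Set the parent’s SIGCHLD handler to SIG_IGN, so that exited children are discarded without being waited for.
Fork twice: the intermediate child exits immediately and is reaped by its parent, so the grandchild becomes an orphan that init adopts and eventually reaps.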
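The effect of ignoring SIGCHLD can be checked with a short experiment. On Linux, a child that exits while the parent ignores SIGCHLD is released immediately, so a later waitpid() finds nothing to collect and fails with ECHILD. The function name below is hypothetical, and the sketch restores the previous signal disposition before returning.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ignores SIGCHLD, forks a child that exits, and returns the errno that
 * waitpid() reports for it (0 if waitpid() unexpectedly succeeds,
 * -1 if the setup itself fails). */
int wait_errno_with_sigchld_ignored(void) {
    struct sigaction ign, old;
    memset(&ign, 0, sizeof ign);
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    if (sigaction(SIGCHLD, &ign, &old) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(0);

    int result = 0;
    errno = 0;
    /* Blocks until the child is gone, then fails: no zombie was left behind. */
    if (waitpid(pid, NULL, 0) == -1)
        result = errno;

    sigaction(SIGCHLD, &old, NULL);   /* restore the previous disposition */
    return result;
}
```

The trade-off is that the parent can no longer learn the exit status of its children, so this only suits programs that do not care how a child ended.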
Processes in Docker Containers
PID in Containers
Docker relies on Linux PID namespaces; each container gets its own namespace, forming a hierarchical tree where the root namespace is created at system boot.
Container Exit
When the PID 1 process inside a container exits, the kernel tears down the container's PID namespace and sends SIGKILL to any processes still running in it.
docker stop first sends SIGTERM to PID 1 and waits (10 s by default) for a graceful shutdown before sending SIGKILL.
docker kill sends SIGKILL by default.
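A graceful shutdown only happens if the application actually handles SIGTERM. This matters most for PID 1: inside a PID namespace, the kernel does not apply the default action for signals that PID 1 has no handler for, so an unhandled SIGTERM is simply dropped. A minimal sketch of such a handler follows; the names `install_sigterm_handler` and `shutdown_was_requested` are illustrative, and a real service would poll the flag from its main loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Set from the signal handler; read from the main loop. */
static volatile sig_atomic_t shutdown_requested = 0;

static void on_sigterm(int signo) {
    (void)signo;
    shutdown_requested = 1;   /* only set a flag: async-signal-safe */
}

/* Installs the SIGTERM handler. Returns 0 on success, -1 on error. */
int install_sigterm_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Returns nonzero once SIGTERM has been received. */
int shutdown_was_requested(void) {
    return shutdown_requested != 0;
}
```

The main loop would then finish in-flight work and exit once `shutdown_was_requested()` returns nonzero, ideally well within the stop timeout.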
Zombie Processes in Containers
Causes
In single‑process containers there is no traditional init; the application itself runs as PID 1. If it never reaps its children, including orphans that are reparented to it, zombies accumulate.
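Signal handling adds to the problem: when PID 1 is a wrapper such as a shell, SIGTERM is not forwarded to the application, so child processes are not shut down cleanly and the container keeps running until the stop timeout expires.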
Exited child processes remain zombies until the parent (PID 1) calls wait().
Solutions
Run a small init process (e.g., tini) that forwards signals and reaps zombies; Docker provides this via the --init flag.
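The two jobs of such an init, forwarding signals and reaping, fit in a few dozen lines. The following is a simplified sketch of the idea and not tini's actual implementation. `run_as_init` is a hypothetical helper that starts one command, forwards SIGTERM and SIGINT to it, reaps whatever children exit in the meantime, and returns once the main command has finished.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t main_child = 0;   /* PID of the supervised process */

/* Pass the received signal on to the supervised process. */
static void forward_signal(int signo) {
    int saved_errno = errno;
    if (main_child > 0)
        kill((pid_t)main_child, signo);
    errno = saved_errno;
}

/* Runs argv as a child and supervises it. Returns the child's exit code,
 * 128 + signal number if it was killed by a signal, or -1 on error. */
int run_as_init(char *const argv[]) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = forward_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTERM, &sa, NULL) < 0 || sigaction(SIGINT, &sa, NULL) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                  /* exec failed */
    }
    main_child = pid;

    int exit_code = -1;
    for (;;) {
        int status;
        pid_t done = wait(&status);  /* also collects adopted orphans */
        if (done < 0) {
            if (errno == EINTR)
                continue;            /* interrupted while forwarding a signal */
            break;                   /* ECHILD: nothing left to wait for */
        }
        if (done == pid) {
            exit_code = WIFEXITED(status) ? WEXITSTATUS(status)
                                          : 128 + WTERMSIG(status);
            /* Sweep up any other children that have already exited. */
            while (waitpid(-1, NULL, WNOHANG) > 0)
                ;
            break;
        }
    }
    return exit_code;
}
```

A container entrypoint built on this would call `run_as_init` with the application's command line and exit with the returned code, at which point the kernel cleans up anything still left in the namespace.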
Kubernetes Pod Zombie Processes
Kubernetes can share one PID namespace among the containers in a pod. When sharing is enabled, the pause container runs as PID 1 and acts as the init process for the pod.
Example pod spec (shareProcessNamespace: true):
<code>apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  shareProcessNamespace: true
  containers:
  - name: nginx
    image: nginx
  - name: shell
    image: busybox
    securityContext:
      capabilities:
        add:
        - SYS_PTRACE
    stdin: true
    tty: true</code>
Running ps ax inside the pod shows the pause process (PID 1) alongside the processes of the other containers, all in one PID namespace.
<code>$ kubectl attach POD -c CONTAINER
/ # ps ax
PID   USER     TIME  COMMAND
    1 root      0:00 /pause
    8 root      0:00 nginx: master process nginx -g daemon off;
   14 101       0:00 nginx: worker process
   15 root      0:00 sh
   21 root      0:00 ps ax</code>
The pause source (pause.c) installs a SIGCHLD handler that repeatedly calls waitpid(-1, NULL, WNOHANG) to reap any zombie children.
<code>#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* SIGINT/SIGTERM: log the signal and exit cleanly. */
static void sigdown(int signo) {
  psignal(signo, "Shutting down, got signal");
  exit(0);
}

/* SIGCHLD: reap every child that has exited so far. */
static void sigreap(int signo) {
  while (waitpid(-1, NULL, WNOHANG) > 0)
    ;
}

int main(int argc, char **argv) {
  // ... register handlers for SIGINT, SIGTERM, SIGCHLD ...
  for (;;) pause();  /* sleep until the next signal arrives */
  return 42;         /* not reached */
}
</code>
This mechanism ensures that the pause container, acting as the pod’s init, reaps orphaned zombies.
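The loop inside sigreap matters because SIGCHLD is not queued: if several children exit close together, the handler may run only once. The sketch below illustrates the same pattern in a self-contained, testable form. It is not the pause source; `spawn_and_reap`, the counter, and the sigsuspend-based wait are additions made for the demonstration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped = 0;   /* children collected so far */

/* SIGCHLD handler: one signal may stand for several exits, so loop. */
static void sigreap(int signo) {
    (void)signo;
    int saved_errno = errno;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Forks n children that exit immediately and lets the handler reap them.
 * Returns the number of children reaped, or -1 on setup failure. */
int spawn_and_reap(int n) {
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    int spawned = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sigreap;
    sa.sa_flags = SA_NOCLDSTOP;        /* ignore stop/continue notifications */
    sigemptyset(&sa.sa_mask);
    reaped = 0;
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;

    /* Block SIGCHLD while forking so no notification slips past sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)
            _exit(0);
        spawned++;
    }

    /* Sleep until the handler has collected every child we started. */
    while (reaped < spawned)
        sigsuspend(&wait_mask);

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return reaped;
}
```

Replacing the while loop in the handler with a single waitpid() call would make this function hang or undercount whenever two exits collapse into one signal delivery.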
Conclusion
The pause container provides the foundation for shared namespaces in a pod.
It also functions as the init process, reaping zombie processes for all containers in the pod.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.