
How to Diagnose a Stuck Linux Process: From ps to /proc Stack Tracing

This step-by-step guide shows how to diagnose a Linux process frozen in uninterruptible sleep, using top, ps and its WCHAN column, strace, pstack, and the per-process files under /proc (wchan, syscall, and the kernel stack trace) to pinpoint the offending system call and its root cause.

ITPUB

Problem Discovery

A server alarm indicated that a process was completely hung; attaching gdb produced no response and log files gave no clues.

Initial Observation

Running top showed the process's CPU usage at 0%, suggesting it was blocked (most likely inside the kernel) rather than looping in user space.

Attempted Tracing

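Both strace and pstack hung as well. Both rely on ptrace, which needs the target to stop first, and a task in uninterruptible sleep cannot do so. This was consistent with the process being blocked deep inside the kernel.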

Using ps and the WCHAN column

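Running ps -p $PID -o pid,stat,wchan,cmd showed state “D” (uninterruptible sleep) and a WCHAN value of rpc_wa. Note that ps may truncate this column to its display width; a wider column (for example ps -o wchan:32) shows more of the name. A task in state D does not act on signals until its wait completes, which is why even kill -9 appears to have no effect.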

ps output showing state D and WCHAN rpc_wa
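
The same state information that ps prints can be read straight from /proc. The following is a minimal sketch, not part of the original investigation: it assumes the State: line of /proc/<pid>/status carries the one-letter code, and the helper names parse_state and find_uninterruptible are made up for illustration.

```python
import os


def parse_state(status_text):
    """Return the one-letter state code from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            fields = line.split()
            # A typical line looks like "State:\tD (disk sleep)".
            return fields[1] if len(fields) > 1 else None
    return None


def find_uninterruptible(proc_root="/proc"):
    """Return the sorted PIDs under proc_root whose state is D."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                if parse_state(fh.read()) == "D":
                    pids.append(int(entry))
        except OSError:
            continue  # the process exited or the file is not readable
    return sorted(pids)
```

On a Linux host, calling find_uninterruptible() with no arguments lists every task currently in state D, which can be a quick way to see whether one process or many are stuck.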

Reading /proc for precise information

WCHAN function: cat /proc/$PID/wchan returns the full name of the kernel function in which the task is sleeping.

Current system call: cat /proc/$PID/syscall prints one line of numbers; the first is the syscall ID (e.g., 262), followed by the argument registers in hexadecimal and then the stack pointer and program counter.

Kernel call stack: cat /proc/$PID/stack shows the kernel-mode backtrace (reading it usually requires root).

/proc/$PID/wchan output
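
To make the layout of /proc/$PID/syscall concrete, here is a small parsing sketch. It assumes the format documented in proc(5): a decimal syscall number, six argument registers in hex, then the stack pointer and program counter, with "running" or a leading -1 when the task is not blocked inside a system call. The function name parse_syscall and the sample values in the usage note are illustrative only.

```python
def parse_syscall(text):
    """Parse the single line found in /proc/<pid>/syscall.

    Returns None when the task is running or is blocked outside a
    system call; otherwise returns a dict with the syscall number,
    its argument registers, and the trailing SP/PC values.
    """
    fields = text.split()
    if not fields or fields[0] == "running":
        return None
    number = int(fields[0])
    if number == -1:
        return None  # blocked, but not inside a system call
    # The remaining fields are hexadecimal register values.
    values = [int(field, 16) for field in fields[1:]]
    return {"number": number, "args": values[:6], "sp_pc": values[6:]}
```

For a made-up line such as "262 0xffffff9c 0x7ffc0 0x7ffc8 0x0 0x0 0x0 0x7ffd0 0x7f00", the function reports number 262 with six argument values; the arguments are raw register contents, so pointers still have to be interpreted by hand.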

Identifying the system call

The syscall ID 262 maps to newfstatat on 64-bit x86 Linux. The mapping can be verified in /usr/include/asm/unistd_64.h (e.g., grep -w 262 /usr/include/asm/unistd_64.h) or by consulting the man page:

man newfstatat

newfstatat retrieves file metadata; it is the system call behind the fstatat() library function on this architecture.

newfstatat man page
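
The header lookup can also be scripted. The sketch below assumes the header contains lines of the form #define __NR_<name> <number>, as unistd_64.h does on x86-64; the function name syscall_names is hypothetical, and the header path may differ between distributions.

```python
import re

# Matches lines such as "#define __NR_newfstatat 262".
_DEFINE = re.compile(r"^#define\s+__NR_(\w+)\s+(\d+)\s*$")


def syscall_names(header_text):
    """Build a {number: name} map from the text of a unistd header."""
    names = {}
    for line in header_text.splitlines():
        match = _DEFINE.match(line.strip())
        if match:
            names[int(match.group(2))] = match.group(1)
    return names
```

A typical use is to read /usr/include/asm/unistd_64.h into a string, build the map once, and then look up the first number from /proc/$PID/syscall. Syscall numbers are architecture-specific, so the header has to match the machine being debugged.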

Analyzing the kernel stack

The top entry of /proc/$PID/stack matches the WCHAN value (rpc_wa), which corroborates the kernel function where the task is sleeping. Deeper entries contain a chain of NFS-related symbols such as nfs_file_open, nfs_lookup, and nfs_getattr, indicating that the process was performing a network file system operation when it blocked.

kernel stack showing NFS functions
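
Scanning a long kernel stack by eye is error-prone, so a small filter can help. This sketch assumes each line of /proc/$PID/stack looks like "[<address>] symbol+0xoffset/0xsize [module]", which is the usual layout but may vary between kernel versions. The function names and the default prefixes are illustrative choices, not something taken from the original investigation.

```python
import re

# Captures the symbol name that follows the bracketed address.
_FRAME = re.compile(r"^\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)")


def stack_symbols(stack_text):
    """Return the symbol names from /proc/<pid>/stack, top frame first."""
    symbols = []
    for line in stack_text.splitlines():
        match = _FRAME.match(line.strip())
        if match:
            symbols.append(match.group(1))
    return symbols


def frames_with_prefix(symbols, prefixes=("nfs", "rpc")):
    """Keep only symbols starting with one of the given prefixes."""
    return [name for name in symbols if name.startswith(prefixes)]
```

Feeding the file contents through stack_symbols and then frames_with_prefix leaves only the frames that hint at NFS or RPC involvement; changing the prefixes adapts the same filter to other subsystems.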

Root cause

The blocked newfstatat call was issued on a remote NFS mount. A network failure left the underlying RPC waiting for a reply that never arrived, so the process sat in uninterruptible sleep (state D). The kernel did not abandon the pending RPC, and the process remained stuck until the network issue was resolved.
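
To relate a stuck process to a particular remote filesystem, it helps to know which mounts are NFS in the first place. The sketch below assumes the whitespace-separated layout of /proc/mounts (device, mount point, filesystem type, options, then two numeric fields); the function name nfs_mounts and the server and path names in the usage note are invented for illustration.

```python
def nfs_mounts(mounts_text):
    """Return (source, mount_point, options) for each NFS entry.

    mounts_text is expected to hold the contents of /proc/mounts.
    Only the client filesystem types "nfs" and "nfs4" are matched.
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] in ("nfs", "nfs4"):
            result.append((fields[0], fields[1], fields[3].split(",")))
    return result
```

Given a hypothetical entry such as "fileserver:/export /mnt/data nfs4 rw,relatime,hard 0 0", the function returns the server export, the local mount point, and the option list, which can then be compared with the paths the stuck process was working on.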

Key take‑aways

The /proc filesystem exposes per‑process kernel state (process state, WCHAN, current syscall, kernel stack).

State “D” together with a non‑empty WCHAN column immediately points to an uninterruptible sleep in a specific kernel function.

Inspecting /proc/$PID/syscall identifies the exact system call; cross‑referencing the syscall number with kernel headers yields its name.

The kernel stack (/proc/$PID/stack) reveals the call chain that led to the block, useful for pinpointing the subsystem (e.g., NFS).

When a process is stuck in an RPC on a remote filesystem, the practical remedy is to fix the underlying network or storage problem; until the wait ends, the process generally cannot be killed.
