
How to Diagnose Linux Kernel Crashes with kdump and Crash Tool

Learn step‑by‑step how to capture a Linux core dump with kdump, install the crash utility and debuginfo packages, and use commands like sys, bt, ps, files, and sym to trace the root cause of kernel panics, illustrated with a real‑world case study.

Liangxu Linux

Sep 29, 2024


1. Obtain a Core Dump

A kernel core dump (also called a memory dump, or vmcore) captures the contents of memory at the moment the kernel crashes. The most reliable way to generate one is kdump, which must be installed, given reserved memory through the crashkernel= kernel boot parameter, and enabled as a service.
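
Before relying on kdump, it helps to confirm that both prerequisites are in place. The following is a minimal sketch, assuming a Linux system that exposes /proc/cmdline and /sys/kernel/kexec_crash_loaded; the function names crashkernel_setting and kdump_status are hypothetical helpers, not part of kdump.

```python
from pathlib import Path


def crashkernel_setting(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


def kdump_status(cmdline_path="/proc/cmdline",
                 loaded_path="/sys/kernel/kexec_crash_loaded"):
    """Report whether memory is reserved and a capture kernel is loaded."""
    cmdline = Path(cmdline_path).read_text()
    loaded_file = Path(loaded_path)
    # The sysfs file contains "1" once the kdump service has loaded
    # a capture kernel into the reserved region.
    loaded = loaded_file.exists() and loaded_file.read_text().strip() == "1"
    return {
        "crashkernel": crashkernel_setting(cmdline),
        "capture_kernel_loaded": loaded,
    }
```

Calling kdump_status() on the target machine returns a small dictionary; a missing crashkernel value or a capture kernel that is not loaded means a panic would not produce a vmcore.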

2. Locate and Prepare the Dump

After a crash, kdump writes the dump file, vmcore, under /var/crash (the examples here use /var/crash/vmcore; many distributions place it in a per-crash subdirectory). To analyze it you need the crash utility (provided by Red Hat) and the debuginfo package that exactly matches the crashed kernel's version, which supplies a vmlinux file containing full symbol information.
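
Locating the two input files can be sketched as follows. This assumes the dump is a file named vmcore somewhere under /var/crash and that the debuginfo package installs vmlinux under /usr/lib/debug/lib/modules/<release>/, as in the example command in this article; newest_vmcore and debuginfo_vmlinux are hypothetical helper names, and the kernel release passed in must be that of the crashed kernel.

```python
from pathlib import Path


def newest_vmcore(crash_dir="/var/crash"):
    """Return the most recently modified file named 'vmcore', or None."""
    root = Path(crash_dir)
    if not root.is_dir():
        return None
    # rglob also matches a vmcore placed directly in crash_dir.
    candidates = [p for p in root.rglob("vmcore") if p.is_file()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def debuginfo_vmlinux(kernel_release):
    """Build the expected path of the debuginfo vmlinux for a kernel release."""
    return Path("/usr/lib/debug/lib/modules") / kernel_release / "vmlinux"
```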

3. Using the Crash Utility

The crash tool offers many commands:

log – view kernel log messages.
bt – display a backtrace of the call stack.
ps – list processes at the time of the crash.
files – show the files a process has open.
sym – translate an address to its symbol name (and source location, when available).
sys – show basic system information.

Example to open the dump:

crash /usr/lib/debug/lib/modules/2.6.32-754.35.1.el6.x86_64/vmlinux /var/crash/vmcore
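
For repeatable analysis, the same session can be scripted. This is a minimal sketch, assuming crash reads commands from standard input when it is not attached to a terminal and that the -s option suppresses the startup banner; crash_script and run_crash are hypothetical helper names.

```python
import subprocess


def crash_script(commands):
    """Join crash commands into stdin text, ending the session with 'exit'."""
    lines = [c.strip() for c in commands if c.strip()]
    return "\n".join(lines + ["exit"]) + "\n"


def run_crash(vmlinux, vmcore, commands, crash_bin="crash"):
    """Run crash non-interactively and return everything it printed."""
    result = subprocess.run(
        [crash_bin, "-s", str(vmlinux), str(vmcore)],
        input=crash_script(commands),
        capture_output=True,
        text=True,
        check=True,  # raise if crash exits with an error
    )
    return result.stdout
```

For example, run_crash(vmlinux, vmcore, ["sys", "bt", "ps"]) would collect the output of the first three commands discussed below in one pass.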

4. Inspect Basic System Info with sys

The sys command reports the kernel version, memory size, number of tasks, and the panic message, for example:

BUG: unable to handle kernel paging request at ffffffffa0395070

Typical causes of a panic include illegal memory access, out‑of‑memory conditions, or hardware faults.
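
The faulting address in the panic message is itself a clue. The sketch below classifies an address by region, assuming the x86_64 layout used by kernels of this generation without address randomization (kernel text mapped from ffffffff80000000, module space from ffffffffa0000000 up to ffffffffff000000); these constants are assumptions and differ on other architectures, and fault_address and classify are hypothetical helper names.

```python
import re

# Assumed x86_64 layout (no KASLR); adjust for other kernels or architectures.
KERNEL_TEXT_START = 0xFFFFFFFF80000000
MODULES_VADDR = 0xFFFFFFFFA0000000
MODULES_END = 0xFFFFFFFFFF000000


def fault_address(panic_line):
    """Extract the 64-bit address from a 'paging request at ...' message."""
    match = re.search(r"at ([0-9a-f]{16})\b", panic_line)
    return int(match.group(1), 16) if match else None


def classify(addr):
    """Name the region of the assumed layout that contains addr."""
    if KERNEL_TEXT_START <= addr < MODULES_VADDR:
        return "kernel text mapping"
    if MODULES_VADDR <= addr < MODULES_END:
        return "module mapping space"
    return "other"
```

Under these assumptions, the address in the example message falls in module mapping space, which already hints that a loadable module is involved.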

5. Examine the Call Stack with bt

The backtrace shows the call chain that led to the crash. In the example, the header line shows that the ss program was running when the fault occurred. Frames are listed newest first: reading upward from page_fault, the chain of kernel functions ends with machine_kexec, which booted the kdump capture kernel.

crash> bt
PID: 177488 TASK: ffff880435b92ab0 CPU: 2 COMMAND: "ss"
#0 [ffff880437c0b7e0] machine_kexec at ffffffff8104179b
#1 [ffff880437c0b840] crash_kexec at ffffffff810d7a52
#2 [ffff880437c0b910] oops_end at ffffffff81560310
... 
#8 [ffff880437c0bb40] page_fault at ffffffff8155f265
...
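
When a backtrace is long, it can be easier to work with it as data. The following is a minimal sketch that parses frame lines of the form shown above; the Frame type and parse_bt function are hypothetical names, and the regular expression assumes the "#N [stack] function at address" layout of this example.

```python
import re
from typing import NamedTuple


class Frame(NamedTuple):
    index: int
    stack_addr: int
    function: str
    text_addr: int


FRAME_RE = re.compile(r"#(\d+)\s+\[([0-9a-f]+)\]\s+(\S+)\s+at\s+([0-9a-f]+)")


def parse_bt(output):
    """Return the frames found in bt output, in the order printed (newest first)."""
    frames = []
    for line in output.splitlines():
        match = FRAME_RE.search(line)
        if match:
            frames.append(Frame(
                int(match.group(1)),
                int(match.group(2), 16),
                match.group(3),
                int(match.group(4), 16),
            ))
    return frames
```

Reversing the returned list gives the functions in the order they were called, which is usually the more natural direction to read when looking for the first sign of trouble.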

6. Identify the Faulting Process with ps

Running ps inside crash lists all processes at crash time. The output includes the ss process (PID 177488) in the running (RU) state, the task that triggered the fault.

crash> ps
PID    PPID    CPU  TASK           ST %MEM VSZ RSS COMM
... 
177488 64302   1    ffff880436662ab0 RU 0.0 6280 568 ss
...
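
On a busy system the ps listing can run to thousands of lines, so filtering helps. This is a minimal sketch, assuming the nine-column layout shown in the header above and an optional leading ">" marker on some lines; parse_ps is a hypothetical helper name.

```python
def parse_ps(output):
    """Parse crash 'ps' output into a list of dictionaries, one per task."""
    tasks = []
    for line in output.splitlines():
        # Drop indentation and an optional leading '>' marker.
        fields = line.lstrip().lstrip(">").split(None, 8)
        if len(fields) < 9 or not fields[0].isdigit():
            continue  # header, ellipsis, or unrelated line
        tasks.append({
            "pid": int(fields[0]),
            "ppid": int(fields[1]),
            "cpu": int(fields[2]),
            "task": int(fields[3], 16),
            "state": fields[4],
            "mem_pct": float(fields[5]),
            "vsz": int(fields[6]),
            "rss": int(fields[7]),
            "comm": fields[8].strip(),
        })
    return tasks


def running_tasks(tasks):
    """Keep only tasks in the RU (running) state."""
    return [t for t in tasks if t["state"] == "RU"]
```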

7. Find the Accessed File with files

First read the f_path member of the process's struct file to get its dentry address, then pass that address to files -d to resolve it to a path. The example reveals that ss was accessing /proc/slabinfo.

crash> struct file.f_path ffff880436662ab0
f_path = { mnt = 0xffff880432adbe80, dentry = 0xffff880101cae5c0 }

crash> files -d 0xffff880101cae5c0
DENTRY           INODE           SUPERBLK TYPE PATH
ffff880101cae5c0 ffff880101c1d598 ffff88043a23e800 REG /proc/slabinfo

8. Resolve Addresses to Source with sym

The backtrace points to address ffffffff812ae3a9, which sym resolves to strnlen+9 at string.c:407. The faulting address from the panic message, ffffffffa0395070, resolves to hash_info_mempool_name in the vmsecmod driver.

crash> sym ffffffff812ae3a9
ffffffff812ae3a9 (T) strnlen+9 /usr/src/debug/kernel-2.6.32-754.35.1.el6/.../string.c:407

crash> sym ffffffffa0395070
ffffffffa0395070 (r) hash_info_mempool_name [vmsecmod]
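
The bracketed module name is the key detail in sym output. The sketch below extracts it, assuming lines shaped like the two examples above (address, symbol type in parentheses, symbol with optional offset, then either a source location or a module name in brackets); parse_sym is a hypothetical helper name.

```python
import re

SYM_RE = re.compile(r"^([0-9a-f]+)\s+\((.)\)\s+(\S+)\s*(.*)$")


def parse_sym(line):
    """Split one line of crash 'sym' output into its parts, or return None."""
    match = SYM_RE.match(line.strip())
    if not match:
        return None
    addr, sym_type, symbol, rest = match.groups()
    name, _, offset = symbol.partition("+")
    module = re.search(r"\[([^\]]+)\]", rest)
    return {
        "addr": int(addr, 16),
        "type": sym_type,
        "name": name,
        # int(..., 0) accepts both decimal and 0x-prefixed offsets.
        "offset": int(offset, 0) if offset else 0,
        "module": module.group(1) if module else None,
        "source": None if module or not rest else rest.strip(),
    }
```

A non-empty module field marks a symbol that lives in a loadable module rather than in the core kernel.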

9. Examine Kernel Logs with log

The log output shows the modules syshook_linux and vmsecmod being loaded and later unloaded, indicating that the antivirus driver had been active and was removed before the crash.

crash> log
... 
VMAGENTMOD: 3846: init_module: get into init_module, syshook_enable:1
VMAGENTMOD: 177350: cleanup_module: get into cleanup_module
...

10. Identify the Root Cause

Combining the evidence:

The ss tool reads /proc/slabinfo to obtain socket‑related memory statistics.

The antivirus driver vmsecmod allocated slab memory but failed to release it when the driver was unloaded.

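After the driver was removed, ss read /proc/slabinfo and the kernel dereferenced a stale pointer into the unloaded module (the strnlen call on hash_info_mempool_name seen above), causing a page fault and kernel panic.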

This bug resides in the antivirus client’s cleanup routine, which neglects to free its slab allocations.

11. Lessons Learned

Even a mature Linux kernel can crash due to third‑party kernel modules that mishandle memory. Properly installing matching debuginfo packages and using the crash utility enables systematic root‑cause analysis of such failures.
