Master Linux Filesystem Performance Debugging with strace, iostat, and pidstat
This guide walks through a four‑step workflow for pinpointing Linux filesystem bottlenecks, combining strace with top, iostat, pidstat, and lsof. It covers the core VFS syscalls, system‑load checks, disk utilization, per‑process I/O analysis, and detailed file‑operation tracing.
VFS Core Operations
VFS centralizes syscalls such as open, read, write, close, stat, fsync. Monitoring latency and frequency of these calls helps locate filesystem bottlenecks. Additional calls like mmap and access can be examined when needed.
Typical focus:
open/close – call frequency and open latency.
read/write – IOPS, throughput, and buffer sizes.
stat/lstat/fstat – metadata retrieval.
fsync – data‑flush latency.
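As a quick illustration of how call frequency and latency can be tallied, here is a minimal sketch that aggregates a few strace -T lines with awk (the <...> suffix is the per‑call duration). The log lines below are fabricated for the example; in practice you would capture real output with strace -T -o trace.log.

```shell
# Hypothetical strace -T output; in practice capture it with
# `strace -T -e trace=file,read,write -o trace.log <cmd>`.
cat <<'EOF' > trace.log
openat(AT_FDCWD, "a.txt", O_RDONLY) = 3 <0.000040>
read(3, "data", 4096) = 4 <0.000012>
read(3, "", 4096) = 0 <0.000008>
close(3) = 0 <0.000010>
EOF
# Split on '(', '<', '>' so $1 is the syscall name and $3 its duration,
# then report call count and total time per syscall.
awk -F'[(<>]' '{ n[$1]++; t[$1] += $3 }
  END { for (s in n) printf "%s calls=%d total=%.6fs\n", s, n[s], t[s] }' trace.log
# → includes: read calls=2 total=0.000020s (output order is unspecified)
```

The same aggregation is what strace -c computes for you; doing it by hand lets you filter or bucket calls however you like.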
Four‑step performance analysis
Step 1 – System overview
Collect overall load, CPU usage and memory pressure with top or free. Example output from top:
top - 11:19:22 up 790 days, 4 users, load average: 0.00, 0.00, 0.00
%Cpu(s): 6.8 us, 3.3 sy, 0.0 ni, 89.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2041048 total, 428584 free, 246764 used, 1365700 buff/cache
KiB Swap: 14401532 total, 14029564 free, 371968 used
Key observations:
Low load average → no CPU saturation.
CPU in kernel mode (sy) ~3 % indicates moderate system‑call activity.
I/O wait (wa) near 0 % – values >10 % usually signal I/O pressure.
Idle CPU (id) ~90 % → ample CPU capacity.
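The same load‑versus‑CPU check can be scripted directly from /proc, the data source top and free read (Linux‑only sketch; the rule of thumb of one runnable task per CPU is a rough heuristic, not a hard limit):

```shell
# Read the 1-minute load average and compare it with the CPU count.
read load1 load5 load15 _ < /proc/loadavg
ncpu=$(nproc)
awk -v l="$load1" -v n="$ncpu" 'BEGIN {
  printf "load1=%s cpus=%d cpu_saturated=%s\n", l, n, (l > n) ? "yes" : "no"
}'
# MemAvailable gives a quick read on memory pressure, like free(1) does.
awk '/^MemAvailable:/ { printf "mem_available=%d kB\n", $2 }' /proc/meminfo
```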
Step 2 – Disk utilization
Use iostat -x -d 1 to monitor per‑device metrics (%util, avgqu‑sz, await latency). Sample output for sda:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq‑sz avgqu‑sz await r_await w_await svctm %util
sda 0.00 7.94 0.03 14.19 0.88 98.64 14.00 0.00 0.07 0.37 0.07 0.01 0.01 0.01
If %util stays high, identify the processes responsible (next step).
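Flagging saturated devices can be automated by filtering the extended stats with awk. The sketch below uses a fabricated two‑device sample rather than live iostat -x output, and the 80 % threshold is an arbitrary illustration:

```shell
# Fabricated sample in iostat -x layout (last column is %util).
cat <<'EOF' > iostat.txt
Device: r/s w/s rkB/s wkB/s await %util
sda 0.03 14.19 0.88 98.64 0.07 0.01
sdb 120.00 300.00 5000.00 90000.00 45.00 98.50
EOF
# Print any device whose %util exceeds 80.
awk 'NR > 1 && $NF + 0 > 80 { print $1, "util=" $NF "%" }' iostat.txt
# → sdb util=98.50%
```

In a live setup you would pipe `iostat -x -d 1` through the same filter to get an alert stream instead of reading the table by eye.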
Step 3 – Per‑process I/O
List processes with significant read/write rates using pidstat -d 1 or iotop. Example pidstat excerpt:
11:41:10 AM UID PID kB_rd/s kB_wr/s kB_ccwr/s iodelay Command
11:41:11 AM 0 4855 0.00 29.63 0.00 0 bkunifylogbeat
11:41:11 AM 0 4893 0.00 3.70 0.00 0 exceptionbeat
11:41:11 AM 0 24244 0.00 7.41 0.00 0 qemu-system-x86
...
High kB_rd/s or kB_wr/s values pinpoint offending PIDs. Use ps -p <PID> -o pid,cmd to view full command lines.
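Ranking writers is a simple numeric sort over the pidstat columns. This sketch replays the excerpt above from a heredoc; in this layout column 6 is kB_wr/s and column 4 is the PID:

```shell
# The pidstat excerpt from above, replayed from a heredoc.
cat <<'EOF' > pidstat.txt
11:41:11 AM 0 4855 0.00 29.63 0.00 0 bkunifylogbeat
11:41:11 AM 0 4893 0.00 3.70 0.00 0 exceptionbeat
11:41:11 AM 0 24244 0.00 7.41 0.00 0 qemu-system-x86
EOF
# Sort numerically on the kB_wr/s column, highest first.
sort -k6,6 -rn pidstat.txt |
  awk '{ printf "PID %s writes %s kB/s (%s)\n", $4, $6, $9 }' |
  head -1
# → PID 4855 writes 29.63 kB/s (bkunifylogbeat)
```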
Step 4 – Detailed syscall tracing
Attach strace to the target PID to capture file‑related syscalls:
strace -e trace=file -tt -T -p <PID>
Important options:
-e trace=file – trace only file‑related syscalls.
-tt – print timestamps with microsecond precision.
-T – show the duration of each syscall.
-p <PID> – attach to a running process.
-c – print summary statistics instead of individual calls.
-o <file> – write output to a file.
-f – follow forked child processes.
-s 1024 – increase the maximum printed string length.
Example tracing stat calls:
strace -e trace=stat -p 3442
stat("promote", 0x7ffd312ccf10) = -1 ENOENT (No such file or directory)
stat("fallback_promote", 0x7ffd312ccf10) = -1 ENOENT ...
If read() or write() calls show only file descriptors, map them to paths with lsof -p <PID>:
lsof -p 60274
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
qemu-syst 60274 root cwd DIR 8,1 4096 2 /
qemu-syst 60274 root txt REG 8,1 22519968 1587004 /usr/bin/qemu-system-x86_64
Combining strace output with lsof information enables a thorough investigation of the root cause of filesystem performance problems.
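The fd‑to‑path mapping lsof reports can also be read straight from /proc, where each entry under /proc/<PID>/fd is a symlink to the open file (Linux only). This sketch lists the current shell's own descriptors:

```shell
# List the current shell's open file descriptors and their targets,
# the same information lsof's FD/NAME columns show.
pid=$$
for fd in /proc/"$pid"/fd/*; do
  printf "fd %s -> %s\n" "${fd##*/}" "$(readlink "$fd")"
done
```

This is handy on minimal systems where lsof is not installed; `ls -l /proc/<PID>/fd` gives the same view in one command.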
Following these four steps—system overview, disk utilization, per‑process I/O, and detailed syscall tracing—provides a systematic method for locating and resolving Linux filesystem bottlenecks.