Identify the Process Writing to a File on CentOS 7 with SystemTap
This guide explains how to monitor disk usage on CentOS 7, collect snapshots with iostat, sar and pidstat, troubleshoot common errors, and finally use SystemTap to pinpoint the exact process that is writing to a specific file.
Background
A CentOS 7 host raised a disk usage alert at 99%, but the existing monitoring provided only summary information, with no per-process I/O snapshots. To obtain detailed snapshots, the following commands were used:
iostat -dx -k: view avgqu-sz, await, svctm, %util
sar -u: view %iowait, %user
pidstat -d: view per-process I/O read/write
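Since %util is the column behind this kind of alert, a small filter can flag busy devices in saved iostat -dx output. This is only a sketch: the function name util_over and its threshold argument are illustrative and not part of the original procedure, and it assumes the extended-report layout in which the header row starts with "Device" and contains a %util column.

```shell
#!/bin/sh
# util_over LIMIT: read `iostat -dx` output on stdin and print
# "device %util" for every device whose %util is at least LIMIT.
# (Hypothetical helper name; the column position is taken from the header row.)
util_over() {
    awk -v limit="$1" '
        # Header row: remember which column holds %util.
        $1 ~ /^Device:?$/ {
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        # Data row: compare the %util field numerically against the limit.
        col && NF >= col && ($col + 0) >= (limit + 0) { print $1, $col }
    '
}
```

It could be used as iostat -dx -k 1 1 | util_over 80, or run over a saved log file redirected to stdin.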
Steps
Generate statistics file
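cat >/tmp/at_task.sh <<'EOF'
pidstat -d 2 >/tmp/pidstat_`date +%F_%T`.log 2>&1 &
sar -u 2 >/tmp/sar_`date +%F_%T`.log 2>&1 &
while [ 1 ]; do echo -n `date +%T` >>/tmp/iostat_`date +%F`.log 2>&1 && iostat -dx -k 1 1 >>/tmp/iostat_`date +%F`.log 2>&1; sleep 2; done &
EOF
The while loop writes the current time before each iostat report, so every snapshot is timestamped. The heredoc delimiter is quoted ('EOF') so that the backticks are evaluated when the job runs, not when the file is created; with an unquoted EOF, the date values would be frozen at creation time.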
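The loop stamps only the start of each report. If a timestamp on every output line is wanted, one possible approach is a small filter like the one below. The name stamp_lines is hypothetical and this is a sketch, not the method used in the original job.

```shell
#!/bin/sh
# stamp_lines: copy stdin to stdout, prefixing each line with HH:MM:SS.
# (Hypothetical helper; date is called once per line, which is fine for
# low-volume output such as a single iostat report.)
stamp_lines() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date +%T)" "$line"
    done
}
```

Inside the loop it would replace the echo -n step, for example: iostat -dx -k 1 1 | stamp_lines >>/tmp/iostat_`date +%F`.log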
Schedule with at
at 15:14 today -f /tmp/at_task.sh
If the atd service is not running, the error "Can't open /var/run/atd.pid to signal atd. No atd running?" appears. Restart the service:
service atd restart
Then re-schedule the job:
at 15:14 today -f /tmp/at_task.sh
job 2 at Wed Mar 13 15:14:00 2019
Collected snapshot examples
iostat:
15:13:35Linux 3.10.0-862.14.4.el7.x86_64 (ip-xxxxx) 03/13/2019 _x86_64_ (4 CPU)
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
vda 0.12 0.07 17.31 19.41 580.79 90.52 36.57 0.09 2.39 4.42 0.57 0.72 2.63
sar:
03:14:00 PM CPU %user %nice %system %iowait %steal %idle
03:14:02 PM all 0.25 0.00 0.38 0.00 0.00 99.37
pidstat:
03:14:00 PM UID PID kB_rd/s kB_wr/s kB_ccwr/s Command
03:14:02 PM 5700 9089 0.00 6.00 0.00 uxxx
Killing the collection processes
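ps -ef | egrep 'iostat|sar|pidstat|while' | grep -v grep | awk '{print $2}' | xargs -l kill
The while loop itself does not appear in the ps output: it runs inside a shell process whose command line does not contain the word "while", so the egrep pattern never matches it. The loop therefore keeps writing to /tmp/iostat_… indefinitely.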
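Matching on command names is fragile, as the surviving loop shows. A more dependable pattern is to record each background PID at launch and kill exactly those PIDs later. The sketch below assumes bash; the names PIDFILE, start_collector and stop_collectors, and the default path, are illustrative and do not come from the original job.

```shell
#!/bin/bash
# Hypothetical PID file; any writable path would do.
PIDFILE=${PIDFILE:-/tmp/collectors.pids}

# start_collector CMD [ARGS...]: run CMD in the background and record its PID.
start_collector() {
    "$@" &
    echo "$!" >>"$PIDFILE"
}

# stop_collectors: signal every recorded PID, then empty the PID file.
stop_collectors() {
    [ -f "$PIDFILE" ] || return 0
    while read -r pid; do
        kill "$pid" 2>/dev/null
        wait "$pid" 2>/dev/null   # reaps the process if it is our own child
    done <"$PIDFILE"
    : >"$PIDFILE"
}
```

In the job file this might look like start_collector pidstat -d 2 >/tmp/pidstat.log 2>&1, with the while loop moved into its own script so that it can be started the same way. Because PIDs can be reused, the PID file should be cleared once the collectors are stopped, as stop_collectors does.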
Using lsof
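lsof /tmp/iostat_2019-03-13
returns nothing, although lsof can locate the process holding mysql-error.log. lsof only reports files that a process has open at the moment it runs. The process writing mysql-error.log keeps that file open continuously, whereas the loop opens the snapshot file, appends to it, and closes it again within each iteration, so the file is almost never open when lsof looks.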
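lsof reports only files that are open at the instant it runs, which is why a writer that opens, appends and closes on every iteration is easy to miss. The same point-in-time check can be sketched by scanning /proc directly. The function name holders is hypothetical, and the sketch assumes Linux with bash and GNU readlink.

```shell
#!/bin/bash
# holders FILE: print the PIDs that have FILE open right now, by scanning
# /proc/<pid>/fd (Linux only; root is needed to see other users' processes).
holders() {
    local target fd pid
    target=$(readlink -f "$1") || return 1
    for fd in /proc/[0-9]*/fd/*; do
        # Each fd entry is a symlink to the file it refers to.
        [ "$(readlink "$fd" 2>/dev/null)" = "$target" ] || continue
        pid=${fd#/proc/}
        echo "${pid%%/*}"
    done | sort -un
}
```

Run against a long-lived log this should print the owning PID every time, while for a file written by a short append-and-close loop it will usually print nothing. That gap is what the SystemTap approach below closes.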
Finding the writer with SystemTap
Install SystemTap:
yum -y install systemtap
Obtain the inode of the target file:
stat -c '%i' /tmp/iostat_2019-03-13
4210339
Determine the device major/minor numbers:
ls -al /dev/vda1
brw-rw---- 1 root disk 253, 1 Jan 30 13:57 /dev/vda1
Run the SystemTap script inodewatch.stp with the device numbers and inode:
stap /usr/share/systemtap/examples/io/inodewatch.stp 253 1 4210339
Initial attempts fail because kernel-devel is missing and the installed kernel headers do not match the running kernel. Install the matching kernel-devel package and the kernel debuginfo, then rebuild the SystemTap cache:
wget ftp://.../kernel-devel-3.10.0-862.14.4.el7.x86_64.rpm
rpm -ivh kernel-devel-3.10.0-862.14.4.el7.x86_64.rpm
debuginfo-install kernel-3.10.0-862.14.4.el7.x86_64
If a module version mismatch persists, edit /usr/src/kernels/3.10.0-862.14.4.el7.x86_64/include/generated/compile.h to update UTS_VERSION to the build string of the running kernel (as reported by uname -v), then clear the SystemTap cache:
vim /usr/src/kernels/.../compile.h
# define UTS_VERSION "#1 SMP Wed Sep 26 15:12:11 UTC 2018"
rm -rf /root/.systemtap/cache/*
After the fix, the script reports the writing process repeatedly, e.g.:
iostat(4671) vfs_write 0xfd00001/4210339
To avoid ending up with an anonymous loop that is hard to find and stop, put the while loop in its own script, schedule it with at now + 1 minute, and then identify the process with ps -ef | grep iostat:
cat >/tmp/iostat.sh <<'EOF'
while [ 1 ]; do echo -n `date +%T` >>/tmp/iostat_`date +%F` 2>&1 && iostat -dx -m 1 1 >>/tmp/iostat_`date +%F` 2>&1; sleep 2; done &
EOF
at now + 1 minute
bash /tmp/iostat.sh
ps -ef | grep iostat
# root 8593 1 0 16:16 pts/2 00:00:00 bash /tmp/iostat.sh
Because the loop now runs under a process whose command line is bash /tmp/iostat.sh, its PID (8593 here) is easy to find, and the loop can be stopped with kill.
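The three arguments passed to inodewatch.stp earlier (major, minor, inode) can also be derived from the target file itself rather than read from ls output. The sketch below assumes bash, GNU stat, and an ordinary filesystem sitting directly on a block device; all function names are hypothetical. It also shows how the 0xfd00001 value in the script output relates to 253 and 1.

```shell
#!/bin/bash
# Split a userspace dev_t (as printed by `stat -c %d`) into major and minor,
# following the glibc encoding.
dev_major() { echo $(( (($1 >> 32) & 0xfffff000) | (($1 >> 8) & 0xfff) )); }
dev_minor() { echo $(( (($1 >> 12) & 0xffffff00) | ($1 & 0xff) )); }

# Kernel-internal device number, (major << 20) | minor, printed in hex
# in the same form as the device part of the inodewatch.stp output above.
kernel_dev() { printf '0x%x\n' $(( ($1 << 20) | $2 )); }

# inodewatch_args FILE: print "major minor inode" for FILE.
inodewatch_args() {
    local dev ino
    dev=$(stat -c '%d' "$1") || return 1
    ino=$(stat -c '%i' "$1") || return 1
    echo "$(dev_major "$dev") $(dev_minor "$dev") $ino"
}
```

For example, kernel_dev 253 1 prints 0xfd00001, matching the device part of the output shown earlier, and stap /usr/share/systemtap/examples/io/inodewatch.stp $(inodewatch_args /tmp/iostat_2019-03-13) would pass the three values in one step.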
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
