Understanding Disk Performance Metrics and Process‑Level I/O on Linux
This guide explains Linux disk performance indicators—utilization, saturation, IOPS, throughput, and latency—their interrelationships, and how to monitor both overall and per‑process I/O using tools like iotop, pidstat, and the /proc filesystem.
Building on a previous discussion of disk workflow and common performance indicators, this article delves deeper into how those metrics interact and how to examine I/O at the process level to pinpoint performance issues.
1. Disk Performance Metrics
Utilization measures the percentage of time the disk is busy handling I/O during a sampling interval.
Saturation reflects the length of the I/O queue, i.e., the number of requests waiting for service.
IOPS (I/O operations per second) gauges the speed of request processing, especially important for workloads with many small files.
Throughput is the amount of data transferred per second and is closely tied to bandwidth.
Response time (latency) is the total time from issuing an I/O request to its completion, covering both the time the request spends queued and the time the device takes to service it.
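These values can be read from the iostat tool. With extended statistics (iostat -x), %util reports utilization, aqu-sz (avgqu-sz in older sysstat versions) the average queue length, r/s and w/s the IOPS, rkB/s and wkB/s the throughput, and r_await and w_await the latency in milliseconds.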
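The five metrics above can also be derived by hand from two snapshots of /proc/diskstats, which is the kernel interface iostat reads. The following is a minimal sketch rather than a replacement for iostat: the names DiskSample, parse_diskstats_line, and disk_metrics are illustrative, and the sampling interval is assumed to be measured by the caller.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

SECTOR_BYTES = 512  # /proc/diskstats counts 512-byte sectors


@dataclass
class DiskSample:
    reads: int          # reads completed
    read_sectors: int   # sectors read
    read_ms: int        # total ms spent on reads
    writes: int         # writes completed
    write_sectors: int  # sectors written
    write_ms: int       # total ms spent on writes
    io_ticks_ms: int    # ms during which the device had I/O in flight
    weighted_ms: int    # ms in flight, weighted by outstanding requests


def parse_diskstats_line(line: str) -> Tuple[str, DiskSample]:
    """Parse one /proc/diskstats line into (device name, counters)."""
    fields = line.split()
    # fields[0:3] are major, minor, name; the next 11 are the classic counters.
    v = [int(x) for x in fields[3:14]]
    return fields[2], DiskSample(v[0], v[2], v[3], v[4], v[6], v[7], v[9], v[10])


def disk_metrics(prev: DiskSample, curr: DiskSample, interval_s: float) -> Dict[str, float]:
    """Turn two cumulative samples into per-interval metrics."""
    ios = (curr.reads - prev.reads) + (curr.writes - prev.writes)
    sectors = (curr.read_sectors - prev.read_sectors) + (
        curr.write_sectors - prev.write_sectors
    )
    wait_ms = (curr.read_ms - prev.read_ms) + (curr.write_ms - prev.write_ms)
    interval_ms = interval_s * 1000
    return {
        "util_pct": 100 * (curr.io_ticks_ms - prev.io_ticks_ms) / interval_ms,
        "avg_queue": (curr.weighted_ms - prev.weighted_ms) / interval_ms,
        "iops": ios / interval_s,
        "throughput_bytes_s": sectors * SECTOR_BYTES / interval_s,
        "await_ms": wait_ms / ios if ios else 0.0,
    }
```

To use it, read /proc/diskstats twice with a known delay, parse the line for the device of interest each time, and pass both samples to disk_metrics along with the elapsed seconds.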
Metric Relationships
Utilization vs. Saturation: When utilization climbs toward 100 %, the disk is heavily loaded and new requests begin to queue, causing saturation to rise sharply. However, saturation can also be high at modest utilization when the device itself is slow or a fault causes requests to back up.
Utilization vs. IOPS/Throughput: Within the disk’s performance ceiling, increasing workload raises IOPS and throughput together with utilization. Once the ceiling is reached, further load may cause IOPS and throughput to plateau or drop while saturation and latency increase.
Latency vs. Utilization/Saturation: Low latency requires a fast transfer path (the bus, or the network for remote storage), minimal queuing (low saturation), and rapid disk processing (SSDs are faster than HDDs). Once utilization passes a threshold, saturation grows, latency rises rapidly, and the system becomes noticeably sluggish.
IOPS vs. Throughput: Small I/O requests (e.g., 4 KB) may hit the IOPS limit before throughput becomes a factor, whereas large requests may be bandwidth‑limited, hitting the throughput ceiling first.
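The IOPS-versus-throughput relationship comes down to one identity: throughput = IOPS × request size. The sketch below works out which ceiling a given request size hits first. The device limits used in the usage note (10,000 IOPS and 200 MiB/s) are hypothetical figures chosen for the arithmetic, not measurements of any real disk, and the function name bottleneck is illustrative.

```python
def bottleneck(block_bytes, max_iops, max_bytes_s):
    """Return (limiting factor, achievable IOPS, achievable bytes/s).

    block_bytes: size of each I/O request
    max_iops:    the device's IOPS ceiling (assumed known)
    max_bytes_s: the device's bandwidth ceiling (assumed known)
    """
    # IOPS the device could sustain if bandwidth were the only limit.
    iops_at_bandwidth_cap = max_bytes_s / block_bytes
    if iops_at_bandwidth_cap >= max_iops:
        # Requests are small: the IOPS ceiling is reached first.
        return "iops", max_iops, max_iops * block_bytes
    # Requests are large: the bandwidth ceiling is reached first.
    return "throughput", iops_at_bandwidth_cap, max_bytes_s
```

With the hypothetical limits above, bottleneck(4096, 10000, 200 * 1024 * 1024) reports an IOPS-bound workload moving 40,960,000 bytes per second (about 39 MiB/s), while bottleneck(1024 * 1024, 10000, 200 * 1024 * 1024) reports a throughput-bound workload at only 200 IOPS.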
In practice, as load (IOPS and throughput) rises, utilization increases smoothly while saturation stays low. Past a critical point, saturation and latency begin to climb, and as utilization approaches 100 % the queue grows sharply and latency surges.
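The shape of that latency curve can be illustrated with a textbook single-server queueing approximation (M/M/1), in which response time equals service time divided by (1 − utilization). This model is an assumption brought in for illustration and is not part of the article's own analysis. Real disks, especially SSDs that service requests in parallel, deviate from it, so treat the numbers as qualitative.

```python
def mm1_response_ms(service_ms, utilization):
    """Approximate mean response time under the M/M/1 model.

    service_ms:  mean time the device needs per request with no queuing
    utilization: fraction of time the device is busy, in [0, 1)
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_ms / (1 - utilization)


def latency_curve(service_ms, utilizations):
    """Return (utilization, response time) pairs for plotting or printing."""
    return [(u, mm1_response_ms(service_ms, u)) for u in utilizations]
```

For a 1 ms service time the model gives 2 ms at 50 % utilization, roughly 10 ms at 90 %, and roughly 100 ms at 99 %, which matches the pattern described above: a gentle rise at first, then a surge near full utilization.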
2. Process‑Level I/O Monitoring
Common tools for per‑process I/O inspection include pidstat, iotop, and the /proc filesystem.
# Install iotop and pidstat on Ubuntu (pidstat ships in the sysstat package)
sudo apt install iotop
sudo apt install sysstat
# /proc is provided by the kernel
Using iotop
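Run iotop -o -d 1 (usually as root) to view real‑time per‑thread read/write speeds and I/O wait percentages. The -o option shows only threads that are actually doing I/O, and -d 1 sets the refresh interval to one second.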
Total DISK READ : 0.00 B/s | Total DISK WRITE : 499.55 K/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 436.22 K/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
27464 be/4 root 0.00 B/s 3.52 K/s 0.00 % 0.76 % qemu-system-x86_64 ...
1792 be/3 root 0.00 B/s 66.84 K/s 0.00 % 0.01 % [jbd2/sda3-8]
913 be/4 root 0.00 B/s 182.93 K/s 0.00 % 0.00 % systemd-journald
DISK READ/DISK WRITE: Real‑time read/write speed (B/s, K/s, M/s).
IO>: Percentage of time the thread spent waiting for I/O; lower is better.
SWAPIN: Percentage of time the thread spent swapping in.
TID: Thread ID.
PRIO: I/O priority (scheduling class/level, e.g. be/4).
Using pidstat
Run pidstat -d 1 to continuously display per‑process I/O statistics such as kilobytes read (kB_rd/s) and written (kB_wr/s).
# Example output
Linux 4.15.0-58-generic (cs1ahyper01n07) 11/13/2025 _x86_64_ (64 CPU)
10:12:05 PM UID PID kB_rd/s kB_wr/s kB_ccwr/s iodelay Command
10:12:06 PM 0 913 0.00 3.70 0.00 0 systemd-journal
10:12:06 PM 0 2598 0.00 155.56 3.70 0 cron
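...
PID: Process ID.
kB_rd/s: Kilobytes read from disk per second.
kB_wr/s: Kilobytes written to disk per second.
kB_ccwr/s: Kilobytes per second whose write to disk was cancelled by the task, for example when it truncates dirty page cache.
iodelay: Block I/O delay of the task, measured in clock ticks.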
Inspecting /proc/<pid>/io
The /proc/<pid>/io file provides cumulative I/O counters for a process since it started. Example:
# cat /proc/123/io
rchar: 145678 # characters read (including cache)
wchar: 234567 # characters written (including cache)
syscr: 1234 # read syscalls
syscw: 5678 # write syscalls
read_bytes: 45678 # actual bytes read from storage
write_bytes: 123456 # actual bytes written to storage
cancelled_write_bytes: 12345 # bytes of writes that were cancelled
Note that these counters reset when the process restarts.
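Because the counters are cumulative, a rate is obtained by sampling the file twice and dividing the difference by the elapsed time. The sketch below shows one way to do that. The helper names parse_proc_io, read_proc_io, and io_rates are illustrative, and reading the file for a process owned by another user generally requires root.

```python
def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        parts = value.split()
        if sep and parts:
            # Take the first token so trailing annotations are ignored.
            counters[key.strip()] = int(parts[0])
    return counters


def read_proc_io(pid):
    """Read and parse /proc/<pid>/io for a running process (Linux only)."""
    with open(f"/proc/{pid}/io") as f:
        return parse_proc_io(f.read())


def io_rates(prev, curr, interval_s):
    """Per-second rates between two cumulative samples of the same process."""
    return {k: (curr[k] - prev[k]) / interval_s for k in curr if k in prev}
```

A typical use is to call read_proc_io(pid), sleep for a fixed interval, call it again, and pass both results to io_rates. The read_bytes and write_bytes rates then approximate the storage traffic that pidstat reports as kB_rd/s and kB_wr/s.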
By combining overall disk metrics with per‑process observations, administrators can more precisely diagnose performance bottlenecks and plan further investigations, such as exploring factors that affect disk latency.
