How to Precisely Pinpoint Performance Bottlenecks in the Linux I/O Stack
This article provides a comprehensive walkthrough of the Linux I/O stack architecture, explains key performance metrics, demonstrates how to use tools such as iostat, pidstat, strace and lsof, and presents a step‑by‑step case study for locating I/O bottlenecks on a high‑traffic web server.
Part 1 – Linux I/O Stack Overview
1.1 I/O Stack Architecture Overview
The Linux I/O stack is organized like a multi‑storey building with three main layers: the file‑system layer (the front desk for user‑level operations), the generic block layer (the middle‑floor scheduler that merges and orders requests), and the device layer (the foundation that talks directly to storage hardware).
1.2 File‑System Layer
The Virtual File System (VFS) acts as a translator between concrete file‑system implementations (e.g., ext4, XFS) and applications, exposing a uniform API. ext4 is noted for stability and wide adoption, while XFS excels with large files in data‑center environments.
For every open file, the VFS keeps a struct file (abridged from the kernel sources):

struct file {
        union {
                struct llist_node       fu_llist;
                struct rcu_head         fu_rcuhead;
        } f_u;
        struct path                     f_path;
        struct inode                    *f_inode;   /* cached value */
        const struct file_operations    *f_op;
        struct mutex                    f_pos_lock;
        loff_t                          f_pos;
        /* ... */
};

Key structures such as struct dentry (directory entry) and struct inode (index node) work alongside struct file, together with the superblock, which records overall file-system metadata.
1.3 Generic Block Layer
This layer performs request merging (combining adjacent I/O operations into larger ones) and I/O scheduling. The CFQ algorithm maintains fair per-process queues, while the Deadline algorithm prioritises requests whose deadline has expired, which benefits latency-sensitive workloads such as databases. Note that since the move to the multi-queue block layer (kernel 5.0), CFQ and the legacy Deadline scheduler have been superseded by BFQ and mq-deadline respectively.
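Which scheduler a disk is using can be read from sysfs: /sys/block/<dev>/queue/scheduler lists the available schedulers with the active one in brackets. A minimal sketch of parsing that format (the sample strings below are illustrative, not taken from a real machine):

```python
import re

def active_scheduler(sysfs_text: str) -> str:
    """Return the bracketed (active) entry from a
    /sys/block/<dev>/queue/scheduler string."""
    match = re.search(r"\[([\w-]+)\]", sysfs_text)
    if match is None:
        raise ValueError(f"no active scheduler marked in: {sysfs_text!r}")
    return match.group(1)

# An older single-queue kernel vs. a modern multi-queue one:
print(active_scheduler("noop [deadline] cfq"))           # deadline
print(active_scheduler("mq-deadline kyber [bfq] none"))  # bfq
```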
1.4 Device Layer
Device drivers translate kernel I/O requests into hardware commands. Mechanical HDDs are characterised by high latency on random accesses, whereas SSDs deliver low latency for both sequential and random I/O but may suffer from write‑amplification and garbage‑collection overhead.
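The latency gap between HDDs and SSDs can be made concrete with a back-of-the-envelope model: each random I/O on a mechanical disk costs roughly one average seek plus half a rotation. A rough sketch, using typical textbook figures rather than measurements:

```python
def hdd_random_iops(avg_seek_ms: float, rpm: int) -> float:
    """Approximate random-IOPS ceiling for a mechanical disk:
    one average seek plus half a rotation per request
    (transfer time ignored)."""
    half_rotation_ms = 60_000 / rpm / 2  # ms per half revolution
    return 1000 / (avg_seek_ms + half_rotation_ms)

# A 7200 rpm disk with an 8 ms average seek manages only ~82 random IOPS,
# versus tens of thousands for a typical SSD.
print(round(hdd_random_iops(8.0, 7200)))  # 82
```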
Part 2 – I/O Performance Metrics and Interpretation
2.1 File‑System I/O Metrics
Space usage – high utilisation slows file creation and can cause errors.
Inode usage – exhausted inodes prevent new files.
Cache hit rate – higher rates reduce disk accesses.
IOPS – measures how many I/O operations can be processed per second.
Response time – lower values improve user experience.
Throughput – amount of data transferred per unit time.
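To make the cache-hit-rate metric concrete, here is a minimal sketch of how it is computed; the counters are hypothetical (in practice, tools such as cachestat from the bcc toolkit derive them from kernel events):

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Fraction of page-cache lookups served from memory."""
    total = hits + misses
    return hits / total if total else 0.0

# 900 of 1000 reads served from the page cache -> 90 % hit rate,
# i.e. only 100 reads actually touched the disk.
print(f"{cache_hit_rate(900, 100):.0%}")  # 90%
```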
2.2 Disk I/O Metrics
Disk utilisation (%util) – sustained values above 80 % indicate overload on devices that serve one request at a time; for SSDs and RAID arrays, which process requests in parallel, a high %util alone does not prove saturation.
IOPS – critical for random‑read/write workloads such as OLTP.
Throughput – important for sequential large‑file workloads like video streaming.
Response time – overall latency of I/O operations.
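These metrics are linked: throughput is IOPS times the mean request size, and for a device that services one request at a time, utilisation is roughly IOPS times the mean service time. A sketch of both relations (the figures are illustrative):

```python
def throughput_mb_s(iops: float, avg_request_kb: float) -> float:
    """Throughput implied by IOPS and the mean request size."""
    return iops * avg_request_kb / 1024

def estimated_util(iops: float, avg_service_ms: float) -> float:
    """Fraction of time a serial (one-request-at-a-time) device is busy."""
    return min(iops * avg_service_ms / 1000, 1.0)

# 5000 x 4 KB random reads stays under 20 MB/s despite the high IOPS:
print(f"{throughput_mb_s(5000, 4):.1f} MB/s")   # 19.5 MB/s
# An HDD doing 150 IOPS at ~6 ms per request is already ~90 % utilised:
print(f"{estimated_util(150, 6):.0%}")          # 90%
```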
Part 3 – Practical Tools for Bottleneck Identification
3.1 iostat – Disk I/O Insight
Install via sudo apt-get install sysstat (Debian/Ubuntu) or sudo yum install sysstat (CentOS/RHEL). Run iostat -x 1 to view extended statistics refreshed every second.
Linux 4.15.0-20-generic (hostname)  2023-10-31  _x86_64_  (2 CPU)

Device:  rrqm/s  wrqm/s  r/s    w/s    rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda      0.00    0.50    30.00  13.00  120.00  62.00   5.00      0.02      0.60   0.03   5.00

Key fields: rrqm/s and wrqm/s (read/write requests merged per second), r/s and w/s (reads/writes completed per second), await (average time each I/O spends queued plus being serviced, in ms), and %util (fraction of time the device was busy).
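When watching iostat continuously, it can help to parse rows programmatically. A minimal sketch that pairs the header with one device row (column names differ across sysstat versions, so the header is treated as data; the sample values are from the output above):

```python
def parse_iostat_row(header: str, row: str) -> dict:
    """Zip an iostat -x header line with one device row."""
    names = header.replace("Device:", "device").split()
    values = row.split()
    return {k: (v if k == "device" else float(v))
            for k, v in zip(names, values)}

header = ("Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s "
          "avgrq-sz avgqu-sz await svctm %util")
row = "sda 0.00 0.50 30.00 13.00 120.00 62.00 5.00 0.02 0.60 0.03 5.00"
stats = parse_iostat_row(header, row)
print(stats["await"], stats["%util"])  # 0.6 5.0
```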
3.2 pidstat – Per‑Process I/O Analysis
Run pidstat -d 1 to display block‑device I/O per process each second.
Linux 4.15.0-20-generic (hostname)  2023-10-31  _x86_64_  (2 CPU)

14:00:01  UID  PID    kB_rd/s  kB_wr/s  kB_ccwr/s  Command
14:00:01  0    12345  1024     512      0          mysqld
14:00:01  0    23456  0        2048     0          java

The output shows per-process read and write rates, enabling quick identification of high-I/O processes.
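Once pidstat rows are parsed, finding the heaviest I/O consumer is a one-liner. A sketch using the sample figures above:

```python
def busiest_process(rows):
    """rows: (pid, kB_rd/s, kB_wr/s, command) tuples; return the command
    moving the most data per second, reads plus writes."""
    return max(rows, key=lambda r: r[1] + r[2])[3]

samples = [
    (12345, 1024.0, 512.0, "mysqld"),
    (23456, 0.0, 2048.0, "java"),
]
print(busiest_process(samples))  # java
```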
3.3 Additional Helpers
top – real-time system resource monitor; a high %iowait value often signals an I/O bottleneck.
strace -p [PID] – traces the system calls of a specific process, revealing slow read/write calls.
lsof -p [PID] – lists the files a process has open, helping spot excessive file accesses.
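The %iowait figure that top reports is derived from the cpu line of /proc/stat: the iowait tick delta divided by the total tick delta between two samples. A sketch with two hypothetical samples (fields: user, nice, system, idle, iowait, irq, softirq):

```python
def iowait_percent(prev, curr):
    """Percentage of CPU time spent in iowait between two samples of the
    numeric fields of the 'cpu' line in /proc/stat."""
    deltas = [c - p for p, c in zip(prev, curr)]
    total = sum(deltas)
    return 100 * deltas[4] / total if total else 0.0  # field 4 = iowait

# Two hypothetical samples taken one second apart:
prev = [1000, 0, 500, 8000, 400, 0, 10]
curr = [1010, 0, 505, 8100, 480, 0, 15]
print(f"{iowait_percent(prev, curr):.0f}%")  # 40%
```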
Part 4 – Hands‑On Bottleneck Diagnosis
4.1 Scenario and Problem Statement
A Linux‑based web server for a high‑traffic e‑commerce site experiences slow page loads. Initial suspicion points to an I/O performance bottleneck.
4.2 Investigation Steps and Reasoning
① System Overview: Using top, the %iowait metric hovers around 40 % (normal < 10 %). CPU and memory usage appear normal, focusing attention on I/O.
② Disk I/O Evaluation: iostat -x 1 shows disk utilisation > 85 % and low IOPS, with await exceeding 50 ms (normal < 10 ms). This confirms a severe disk I/O bottleneck.
③ Process‑Level I/O Check: pidstat -d 1 reveals a php‑fpm process performing > 5000 KB/s reads and > 2000 KB/s writes, far above other processes.
④ Deep Dive with strace and lsof: strace -p on php‑fpm shows prolonged read and write system calls. lsof -p lists frequent accesses to log and cache files. The analysis concludes that inefficient read/write patterns on these files overload the disk, causing the observed performance degradation.
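A common remediation for the pattern diagnosed above (many small synchronous writes to log files) is application-side buffering, which trades a little durability for far fewer write() system calls. A minimal sketch of the idea; the class and file path are hypothetical, not part of php-fpm:

```python
class BufferedLog:
    """Coalesce many small log writes into fewer large ones."""

    def __init__(self, path, flush_bytes=64 * 1024):
        self.path = path
        self.flush_bytes = flush_bytes
        self.buffer = []
        self.buffered = 0
        self.flushes = 0  # counts actual disk writes

    def write(self, line: str):
        self.buffer.append(line)
        self.buffered += len(line)
        if self.buffered >= self.flush_bytes:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        with open(self.path, "a") as f:  # one write per batch
            f.write("".join(self.buffer))
        self.buffer, self.buffered = [], 0
        self.flushes += 1

log = BufferedLog("/tmp/app.log", flush_bytes=1024)
for i in range(100):
    log.write(f"request {i} handled\n")  # ~20 bytes each
log.flush()
print(log.flushes)  # 2 batched writes instead of 100
```

The same effect can often be had without code changes by pointing the application's logs at a buffering layer (e.g. syslog) or by relaxing fsync frequency where the workload tolerates it.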
