How to Monitor Server I/O Performance Using top, iostat, and iotop

This article explains the 2019 Alibaba Cloud IO HANG incident, defines IO HANG, and provides step‑by‑step guidance on using the Linux commands top, iostat, and iotop (including examples and key options) to monitor and troubleshoot server disk I/O performance.

Big Data Technology & Architecture

On March 3, 2019, Alibaba Cloud suffered an IO HANG incident in the North China region, causing many ECS instances to become unresponsive and leading to widespread service outages; the provider later confirmed that slow disk read/write operations caused the problem.

IO HANG refers to extremely slow disk I/O that makes threads and processes hang, which can bring down servers, especially database services such as RDS and HybridDB where I/O speed directly impacts SQL execution.

Three common Linux commands help monitor server I/O: top, iostat, and iotop.

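The top command provides real‑time monitoring of CPU, memory, and process information. Running top displays columns such as PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, and COMMAND. The columns are explained in the table below:

PID      Process ID
USER     User name of the process owner
PR       Scheduling priority
NI       Nice value
VIRT     Total virtual memory used by the process
RES      Resident (non-swapped) physical memory used by the process
SHR      Shared memory size
S        Process state (S = sleeping, T = stopped or traced, R = running, Z = zombie, D = uninterruptible sleep)
%CPU     Share of CPU time used since the last refresh
%MEM     Share of physical memory used
TIME+    Total CPU time used by the process, shown to hundredths of a second
COMMAND  Command name or command line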

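Commonly used command‑line options include -d (refresh interval in seconds), -p (monitor specific PIDs), -S (cumulative time mode), -s (secure mode), -i (hide idle processes), and -c (show the full command line). Inside the interactive display, q quits, d changes the refresh interval, and S, i, and c toggle the same behaviors as their command‑line counterparts.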
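For I/O troubleshooting, the S column is often the most telling part of top: a process in state D (uninterruptible sleep) is usually waiting on I/O, and the wa value in the CPU summary line reports time spent waiting for I/O to complete. The following is a minimal sketch, assuming the default procps column layout in which S is the 8th field; the helper name top_d_state is hypothetical and not part of top.

```shell
#!/bin/sh
# Sketch: list processes that top reports in state D (uninterruptible sleep).
# top_d_state is a hypothetical helper name; it reads top's batch output on stdin.
top_d_state() {
  awk '
    seen && $8 == "D" { print $1, $12 }   # print PID and first word of COMMAND
    /^ *PID/          { seen = 1 }        # header row marks the start of the task list
  '
}

# Take one batch-mode snapshot (-b) with a single iteration (-n 1), if top is installed.
if command -v top >/dev/null 2>&1; then
  top -b -n 1 | top_d_state
fi
```

An empty result simply means no process was in state D at the moment of the snapshot; repeated hits for the same PID are a stronger signal than a single one.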

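The iostat command monitors device‑level I/O load. A typical usage is $ iostat -d -k 2, where -d shows device statistics, -k reports throughput in kilobytes per second, and 2 sets a 2‑second refresh interval. The output includes metrics such as tps, kB_read/s, and kB_wrtn/s. Adding -x enables extended statistics, including %util, the percentage of elapsed time during which the device was servicing I/O requests. Values close to 100% suggest saturation on devices that serve requests serially, although RAID arrays and modern SSDs that serve requests in parallel can still have headroom at that reading. The first report covers the period since boot; each later report covers the preceding interval. If the command is missing, install the sysstat package, for example with yum install sysstat on RHEL or CentOS.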
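As an illustration of reading %util, here is a minimal sketch that filters iostat output for busy devices. It assumes extended statistics are requested with -x, since %util is only reported in that mode, and it locates the %util column from the header row because column positions differ between sysstat versions. The helper name busy_devices and the default threshold of 90 are hypothetical choices, not part of iostat.

```shell
#!/bin/sh
# Sketch: print devices whose %util exceeds a threshold.
# busy_devices is a hypothetical helper; it reads "iostat -d -x -k" output on stdin.
busy_devices() {
  threshold="${1:-90}"   # assumed default threshold, in percent
  awk -v limit="$threshold" '
    # Header row: remember which column holds %util.
    /^Device/ { for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
    # Data rows: print device name and %util when above the limit.
    col && NF >= col && $col + 0 > limit { print $1, $col }
  '
}

# Two reports, 2 seconds apart; the first one reflects averages since boot.
if command -v iostat >/dev/null 2>&1; then
  iostat -d -x -k 2 2 | busy_devices 90
fi
```

Because the first report is an average since boot, a device that appears only in the second report is busy right now rather than historically.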

The iotop command is the I/O‑focused counterpart of top. Running iotop, which usually requires root privileges, shows per‑process I/O usage, indicating which processes are reading or writing and how much data they transfer. An alternative is pidstat -d, part of the sysstat package, which also reports per‑process I/O statistics.
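For scripted checks, iotop can run non‑interactively with -b (batch mode), -o (show only processes doing I/O), and -n (number of iterations), for example iotop -o -b -n 3. The sketch below takes the pidstat -d route instead and ranks processes by write rate. It assumes the usual pidstat header with PID, kB_wr/s, and Command columns and looks them up by name; the helper name io_writers is hypothetical.

```shell
#!/bin/sh
# Sketch: rank processes by write throughput using "pidstat -d" output on stdin.
# io_writers is a hypothetical helper, not part of sysstat.
io_writers() {
  awk '
    /^Average/  { next }                       # skip the trailing summary section
    /kB_wr\/s/  {                              # header row: locate columns by name
      for (i = 1; i <= NF; i++) {
        if ($i == "PID")     p = i
        if ($i == "kB_wr/s") w = i
        if ($i == "Command") c = i
      }
      next
    }
    p && NF >= c && $w + 0 > 0 { print $p, $w, $c }   # only processes that wrote data
  ' | sort -k2,2nr                             # highest write rate first
}

# One 2-second sample, if pidstat is installed.
if command -v pidstat >/dev/null 2>&1; then
  pidstat -d 2 1 | io_writers
fi
```

Each output line holds the PID, the write rate in kB/s, and the command name, which makes it easy to spot the process responsible for a busy disk.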

In production environments, real‑time monitoring of server I/O is crucial, especially for database servers, because degraded I/O can slow down reads/writes, cause SQL latency, and ultimately lead to process hangs, database congestion, and server crashes.

Written by

Big Data Technology & Architecture

Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
