Master Linux Performance: Essential CPU, Memory, I/O Metrics & Tools
This article explains how to monitor Linux system performance by covering CPU metrics with top and vmstat, memory usage via top columns and cache details, I/O health using iostat and zero‑copy techniques, as well as network statistics with sar, providing practical commands and interpretation guidance.
1. CPU
The CPU is the most important component of the system, and top is the usual starting point for observing how it is performing.
1.1 top command
Run top and press 1 to see per‑core details.
Key metrics include:
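us percentage of CPU time spent running user-mode code.
sy percentage of CPU time spent in kernel mode.
ni percentage of CPU time spent running user processes with an adjusted nice priority.
wa percentage of time the CPU was idle while waiting for I/O to complete.
hi percentage of CPU time spent servicing hardware interrupts.
si percentage of CPU time spent servicing software interrupts.
st percentage of time stolen from this virtual machine by the hypervisor (only relevant in virtualized environments).
id percentage of time the CPU was idle.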
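The percentages that top reports can be derived from the cumulative counters on the first line of /proc/stat by sampling twice and comparing the deltas. The following is a minimal sketch under that assumption; the function names are hypothetical and it is not how top itself is implemented.

```python
# Order of the first eight counters on the aggregate "cpu" line of /proc/stat:
# user, nice, system, idle, iowait, irq, softirq, steal.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")


def parse_cpu_line(line):
    """Turn a 'cpu ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    return dict(zip(FIELDS, (int(value) for value in parts[1:9])))


def cpu_percentages(before, after):
    """Share of elapsed CPU time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_sample(path="/proc/stat"):
    """Read the aggregate CPU counters (Linux only)."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())
```

To use it, call read_cpu_sample() twice about a second apart and pass both results to cpu_percentages(). A large us share points at application code, while a large wa share points at the I/O subsystem.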
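The load average in the header of top reflects the length of the run queue (on Linux it also counts tasks in uninterruptible sleep), so a value only makes sense relative to the number of CPU cores.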
1.2 Load average
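A load of 1 means a single-core system is fully occupied. The equivalent saturation point is 2 on a dual-core system and 4 on a quad-core system. A load above the core count means tasks are queuing for CPU time.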
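The relationship between load and core count can be expressed as a per-core ratio. This is a small sketch using the standard library; the helper names are hypothetical, and treating a ratio of 1.0 as the saturation point is the rule of thumb from the text, not a hard limit.

```python
import os


def load_per_core(load, cores):
    """Normalize a load average by the number of CPU cores."""
    return load / cores


def is_saturated(load, cores):
    """True when there is at least one runnable task per core on average."""
    return load_per_core(load, cores) >= 1.0


def current_load_per_core():
    """1-minute load average of this machine divided by its core count."""
    one_minute, _five, _fifteen = os.getloadavg()
    return load_per_core(one_minute, os.cpu_count() or 1)
```

For example, a load of 2.0 gives a ratio of 0.5 on a quad-core machine but 2.0 on a single-core machine, where half of the runnable tasks are waiting at any moment.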
1.3 vmstat
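Use vmstat to see how busy the CPU is. Important columns are r (processes waiting to run), b (processes blocked in uninterruptible sleep, usually on I/O), cs (context switches per second), and si/so (memory swapped in from and out to disk per second).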
$ vmstat 1
procs ------------memory------------ ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd      free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
34  0      0 200889792  73708 591828    0    0     0     5    6   10 96  1  3  0  0
...
2. Memory
2.1 Observation commands
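Use top to view the VIRT, RES, and SHR columns. VIRT is the size of the virtual address space, RES is the physical memory the process actually occupies, and SHR is the part of RES that can be shared with other processes.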
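The same three values can be read per process from /proc/<pid>/statm, whose first three fields are total size, resident size, and shared size in pages. The sketch below converts them to bytes; the function names are hypothetical, and the mapping onto top's column names is an assumption made for illustration.

```python
import os


def parse_statm(statm_line, page_size):
    """Convert the first three fields of /proc/<pid>/statm from pages to bytes."""
    fields = [int(value) for value in statm_line.split()]
    size_pages, resident_pages, shared_pages = fields[0], fields[1], fields[2]
    return {
        "VIRT": size_pages * page_size,
        "RES": resident_pages * page_size,
        "SHR": shared_pages * page_size,
    }


def read_process_memory(pid="self"):
    """Read memory figures for a process (Linux only); defaults to this one."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open(f"/proc/{pid}/statm") as statm_file:
        return parse_statm(statm_file.read(), page_size)
```

Calling read_process_memory() with a PID returns byte counts that can be compared against what top shows for that process.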
2.2 CPU cache
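CPU caches bridge the speed gap between the CPU and main memory. False sharing occurs when threads running on different cores modify different variables that happen to sit on the same cache line, which forces the line to bounce between cores.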
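Whether two variables can falsely share a line is simple address arithmetic. The sketch below assumes a 64-byte line, which is common but should be checked on the target machine, and its helper names are hypothetical.

```python
def cache_line_index(address, line_size=64):
    """Index of the cache line that contains the given byte address."""
    return address // line_size


def shares_cache_line(addr_a, addr_b, line_size=64):
    """True when both addresses fall into the same cache line."""
    return cache_line_index(addr_a, line_size) == cache_line_index(addr_b, line_size)


def padded_stride(item_size, line_size=64):
    """Smallest multiple of line_size that holds one item.

    Items laid out at this stride from a line-aligned base never share a line.
    """
    return -(-item_size // line_size) * line_size
```

Two 8-byte counters at offsets 0 and 8 land on the same line, so giving each counter its own 64-byte slot is the usual padding remedy.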
Cache line size can be read with:
cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
Cache sizes for each level can be queried:
cat /sys/devices/system/cpu/cpu0/cache/index1/size # 32K
cat /sys/devices/system/cpu/cpu0/cache/index2/size # 256K
cat /sys/devices/system/cpu/cpu0/cache/index3/size # 20480K
2.3 HugePage
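HugePages raise the page size above the 4 KB default (commonly to 2 MB on x86-64), so the same amount of memory is covered by far fewer page-table entries. This reduces TLB pressure, which matters most on machines with large memory.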
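The effect on the number of pages is easy to quantify. The following arithmetic sketch assumes a 2 MB huge page size and a hypothetical 64 GiB working set; the actual huge page size is listed as Hugepagesize in /proc/meminfo.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(memory_bytes, page_size):
    """Number of pages required to map memory_bytes, rounded up."""
    return -(-memory_bytes // page_size)


if __name__ == "__main__":
    working_set = 64 * GIB  # hypothetical workload size
    print(pages_needed(working_set, 4 * KIB))  # 16777216 pages of 4 KB
    print(pages_needed(working_set, 2 * MIB))  # 32768 pages of 2 MB
```

With 512 times fewer pages to translate, a fixed-size TLB covers a much larger share of the working set.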
2.4 Pre‑touch
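The JVM option -XX:+AlwaysPreTouch touches every heap page at startup, so physical memory is committed up front rather than on first use. Startup takes longer, but the application avoids page-fault pauses at runtime.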
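The idea behind pre-touching can be illustrated outside the JVM: an anonymous mapping is only backed by physical pages once each page is written. This is a conceptual sketch with hypothetical function names, not a description of how the JVM implements the option.

```python
import mmap


def pretouch(region):
    """Write one byte into every page so the kernel backs it with memory now."""
    touched = 0
    for offset in range(0, len(region), mmap.PAGESIZE):
        region[offset] = 0  # the write fault makes the kernel allocate the page
        touched += 1
    return touched


def allocate_pretouched(size_bytes):
    """Create an anonymous memory mapping and touch all of its pages up front."""
    region = mmap.mmap(-1, size_bytes)
    pretouch(region)
    return region
```

The cost of the page faults is paid once inside allocate_pretouched() instead of being spread across later accesses.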
3. I/O
3.1 Observation commands
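Disk I/O is often the bottleneck. iostat (with the -x option for extended statistics) shows %util, avgqu-sz, await, svctm, and more. Newer sysstat releases rename avgqu-sz to aqu-sz and no longer report svctm.
%util near 100% means the device was busy almost the whole interval and is likely saturated.
As a rule of thumb, await (average time per request, including time spent queued) should stay below 5 ms; values above 10 ms indicate a problem.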
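These rules of thumb can be encoded in a small helper for alerting scripts. In the sketch below the 5 ms and 10 ms limits come from the text, while the 95% cut-off for "near 100%" and the function names are assumptions.

```python
def classify_await(await_ms):
    """Grade average request latency using the 5 ms and 10 ms rules of thumb."""
    if await_ms < 5:
        return "ok"
    if await_ms <= 10:
        return "watch"
    return "problem"


def device_saturated(util_percent, threshold=95.0):
    """True when %util is at or above the (assumed) saturation threshold."""
    return util_percent >= threshold
```

Suitable thresholds depend on the hardware, so treat these values as starting points and tune them for the storage in use.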
3.2 Zero copy
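Zero-copy (e.g., sendfile) lets the kernel move data from a file to a socket directly, without copying it through a user-space buffer. This removes the copies between kernel and user space and reduces CPU load.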
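A minimal sketch of the sendfile approach using Python's os.sendfile wrapper follows. The function name is hypothetical, the socket is assumed to be connected and blocking, and the code targets Linux.

```python
import os


def send_file_zero_copy(sock, path):
    """Send a whole file over a connected socket with sendfile(2).

    Returns the number of bytes handed to the socket.
    """
    sent_total = 0
    with open(path, "rb") as source:
        size = os.fstat(source.fileno()).st_size
        while sent_total < size:
            # The kernel copies from the page cache to the socket directly;
            # the file contents never enter this process's buffers.
            sent = os.sendfile(
                sock.fileno(), source.fileno(), sent_total, size - sent_total
            )
            if sent == 0:
                break  # unexpected end of file
            sent_total += sent
    return sent_total
```

A conventional read/write loop would copy each chunk into a user-space buffer and back into the kernel, which is exactly the work this call avoids.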
4. Network
Use sar to monitor network traffic (-n DEV) and TCP statistics (-n TCP).
$ sar -n DEV 1
...
5. Conclusion
These metrics give a quick view of Linux performance, but deeper analysis may require advanced tools such as eBPF‑based BCC.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.