Master Linux CPU, Memory, I/O, and Network Performance with Essential Commands

This guide explains how to monitor Linux system performance by using core commands such as top, vmstat, iostat, and sar to evaluate CPU usage, memory allocation, I/O activity, and network traffic, while also covering load interpretation, cache behavior, huge pages, zero‑copy techniques, and practical command examples.

Liangxu Linux

Oct 23, 2022

CPU

Use top to view real‑time CPU usage. Press 1 after launching to display per‑core statistics.

us – user‑mode CPU percentage.

sy – kernel‑mode CPU percentage.

ni – user‑mode CPU time spent on processes with a positive nice value (lowered priority).

wa – time the CPU was idle while waiting for outstanding I/O to complete.

hi – hardware interrupt handling.

si – software interrupt handling.

st – time stolen by the hypervisor (relevant for VMs).

id – idle CPU percentage.
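The percentages that top reports can be derived from the cumulative tick counters on the first line of /proc/stat. The following is a minimal Python sketch, assuming the standard field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are illustrative and are not part of top or any other tool.

```python
import time

# Assumed field order of the aggregate "cpu" line in /proc/stat,
# mapped to the abbreviations that top displays.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")


def parse_cpu_line(line):
    """Return the cumulative tick counters of one 'cpu' line as a dict."""
    parts = line.split()
    return dict(zip(FIELDS, (int(v) for v in parts[1:1 + len(FIELDS)])))


def cpu_percentages(before, after):
    """Convert two counter snapshots into percentages of elapsed ticks."""
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values()) or 1  # avoid division by zero
    return {k: 100.0 * v / total for k, v in deltas.items()}


def sample_cpu(interval=1.0, path="/proc/stat"):
    """Read the live counters twice, `interval` seconds apart (Linux only)."""
    def read():
        with open(path) as f:
            return parse_cpu_line(f.readline())

    first = read()
    time.sleep(interval)
    return cpu_percentages(first, read())
```

On a Linux host, sample_cpu() returns a dict keyed by the same abbreviations as the list above. Because the counters are cumulative since boot, only the difference between two samples is meaningful.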

Load average

top shows three load values: the average number of processes that were runnable or in uninterruptible sleep (typically waiting on disk I/O) over the last 1, 5, and 15 minutes. How to read a load value depends on the number of CPU cores: on a single‑core system a load of 1 means the CPU is fully utilized, while on a 16‑core system the same load leaves most cores idle.
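Dividing the load average by the core count makes values comparable across machines. A small Python sketch using only the standard library; load_per_core and current_load_per_core are hypothetical helper names, and treating a result near 1.0 as "saturated" is a rule of thumb rather than a hard limit.

```python
import os


def load_per_core(load, cores):
    """Divide a load average by the core count; about 1.0 means saturated."""
    return load / cores


def current_load_per_core():
    """Return the 1, 5 and 15 minute load averages divided by the core count."""
    cores = os.cpu_count() or 1
    return tuple(load_per_core(value, cores) for value in os.getloadavg())
```

With these helpers, a load of 1 yields 1.0 on a single core and 0.0625 on 16 cores, which matches the interpretation above.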

vmstat

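The vmstat command adds run‑queue, swap, and context‑switch statistics to the CPU percentages. Important columns:

r – number of runnable processes (running or waiting for CPU time).

b – number of processes blocked waiting for I/O.

si / so – memory swapped in from disk and out to disk per second.

cs – context switches per second.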

$ vmstat 1
procs ---------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
34  0    0 200889792  73708 591828    0    0     0     5    6   10 96  1  3  0  0
...
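For scripted checks, the vmstat output can be parsed into named columns. The sketch below is one possible approach in Python; parse_vmstat and vmstat_notes are hypothetical names, and the rule "r (the first column, the run‑queue length) greater than the core count" is an assumed heuristic, not something vmstat itself reports.

```python
def parse_vmstat(text):
    """Parse vmstat output into a list of {column: value} dicts."""
    header, rows = None, []
    for line in text.splitlines():
        tokens = line.split()
        if tokens[:2] == ["r", "b"]:
            header = tokens  # the column-name line
        elif header and len(tokens) == len(header) and all(
            t.isdigit() for t in tokens
        ):
            rows.append(dict(zip(header, map(int, tokens))))
    return rows


def vmstat_notes(row, cores):
    """Flag conditions worth a closer look in one sample row."""
    notes = []
    if row["r"] > cores:
        notes.append("run queue longer than core count")
    if row["b"] > 0:
        notes.append("processes blocked on I/O")
    if row["si"] or row["so"]:
        notes.append("swap activity")
    return notes
```

Applied to the sample row above, the parser yields r = 34 and us = 96; whether that run queue is a problem depends on how many cores the host has.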

Memory

Observation commands

The top display includes three memory columns:

VIRT – virtual memory size (usually large, not a primary concern).

RES – resident memory actually used by the process.

SHR – shared memory, such as common libraries.
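The same three figures are exposed per process in /proc/<pid>/statm, whose first three fields are the total program size, the resident set, and the shared pages, all counted in pages. A minimal Python sketch, assuming that layout; the function names are illustrative.

```python
import os


def parse_statm(text, page_size):
    """Convert the first three statm fields from pages to KiB."""
    size, resident, shared = (int(v) for v in text.split()[:3])
    kib_per_page = page_size // 1024
    return {
        "VIRT": size * kib_per_page,
        "RES": resident * kib_per_page,
        "SHR": shared * kib_per_page,
    }


def process_memory(pid="self"):
    """Read VIRT, RES and SHR (in KiB) for a process (Linux only)."""
    with open(f"/proc/{pid}/statm") as f:
        return parse_statm(f.read(), os.sysconf("SC_PAGE_SIZE"))
```

Calling process_memory() without arguments inspects the Python interpreter itself; passing a PID inspects another process, subject to the usual /proc permissions.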

CPU cache and false sharing

Modern CPUs have multi‑level caches. False sharing occurs when multiple threads modify variables that reside on the same cache line, causing unnecessary cache‑line invalidations.

Cache line size can be read from:

cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size

Example cache sizes reported on one system (the level file in each index directory identifies which cache level the entry describes):

# cat /sys/devices/system/cpu/cpu0/cache/index1/size
32K
# cat /sys/devices/system/cpu/cpu0/cache/index2/size
256K
# cat /sys/devices/system/cpu/cpu0/cache/index3/size
20480K

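In Java 8, the @sun.misc.Contended annotation pads fields to mitigate false sharing (since Java 9 it lives at jdk.internal.vm.annotation.Contended). The JVM ignores the annotation on application classes unless it is started with -XX:-RestrictContended.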
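Whether two fields can suffer from false sharing comes down to simple arithmetic on their byte offsets. The Python sketch below only illustrates that arithmetic; it does not reproduce the JVM's padding logic, the helper names are hypothetical, and the 64‑byte default is an assumption that should be replaced with the value read from coherency_line_size.

```python
def cache_line_index(offset, line_size=64):
    """Return which cache line a byte offset falls into."""
    return offset // line_size


def share_cache_line(offset_a, offset_b, line_size=64):
    """True if two byte offsets land on the same cache line."""
    return cache_line_index(offset_a, line_size) == cache_line_index(
        offset_b, line_size
    )


def padded_stride(field_size, line_size=64):
    """Smallest multiple of line_size that can hold field_size bytes."""
    return -(-field_size // line_size) * line_size
```

Two adjacent 8‑byte counters at offsets 0 and 8 share a line; spacing them padded_stride(8) = 64 bytes apart places each on its own line, which is the effect padding aims for.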

HugePages

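TLB entries are limited, so mapping a large region with small pages causes frequent TLB misses. Larger page sizes (HugePages) reduce the number of entries needed to cover the same memory. The default page size on x86‑64 is 4 KB and huge pages are typically 2 MB, so a single huge‑page entry covers the same range as 512 regular entries.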
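The saving can be worked out directly, and the huge‑page configuration of a host can be read from /proc/meminfo. A short Python sketch, assuming the usual HugePages_Total, HugePages_Free, and Hugepagesize keys; the function names are illustrative.

```python
def pages_needed(mapping_bytes, page_bytes):
    """Number of pages (and thus page-table mappings) to cover a region."""
    return -(-mapping_bytes // page_bytes)  # ceiling division


def hugepage_info(meminfo_text):
    """Extract the huge-page counters from /proc/meminfo content."""
    wanted = ("HugePages_Total", "HugePages_Free", "Hugepagesize")
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            info[key] = int(rest.split()[0])  # Hugepagesize is in kB
    return info
```

For a 1 GiB mapping, pages_needed gives 262144 pages at 4 KiB but only 512 pages at 2 MiB.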

Pre‑touch

The JVM option -XX:+AlwaysPreTouch forces the JVM to allocate and touch all heap pages at startup, reducing page‑fault latency during runtime at the cost of slower startup.
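The idea behind pre‑touching can be illustrated outside the JVM: writing one byte to every page of a fresh mapping makes the kernel back each page with physical memory immediately. The Python sketch below is an analogy only and says nothing about how the JVM implements the flag; pretouch is a hypothetical helper.

```python
import mmap


def pretouch(region, page_size=mmap.PAGESIZE):
    """Write one byte per page so every page is faulted in now.

    Returns the number of pages touched.
    """
    touched = 0
    for offset in range(0, len(region), page_size):
        region[offset] = 0  # the first write to a page triggers the fault
        touched += 1
    return touched
```

For example, pretouch(mmap.mmap(-1, 64 * mmap.PAGESIZE)) touches 64 pages of an anonymous mapping up front, so later accesses to that region do not pay the first‑touch page‑fault cost.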

I/O

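Disk I/O is typically the slowest subsystem. The iostat command (for example, iostat -x 1) provides detailed per‑device metrics; a %util value above 80 % indicates that the device is near saturation, although devices that serve requests in parallel, such as SSDs and RAID arrays, can still have headroom at high %util.

%util – percentage of elapsed time during which the device had I/O requests outstanding (100 % means fully loaded for a device that serves requests serially).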

Device – identifier of the disk.

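avgqu‑sz – average request queue length (shorter is better); newer sysstat versions name this column aqu‑sz.

await – average time per I/O request, including time spent waiting in the queue; values >10 ms suggest a bottleneck, though a sensible threshold depends on the device type.

svctm – average service time per I/O operation; the iostat man page warns that this field is unreliable, and recent sysstat versions no longer report it.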
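These figures are computed from the cumulative counters in /proc/diskstats. The Python sketch below approximates %util, await, and the average queue length from two snapshots, assuming the documented field layout (reads, writes, time spent reading and writing, time spent doing I/O, weighted time); it is an approximation of the calculation, not iostat's actual code, and the function names are hypothetical.

```python
def parse_diskstats(text):
    """Map device name to the counters needed for the metrics below."""
    stats = {}
    for line in text.splitlines():
        t = line.split()
        if len(t) < 14:
            continue  # skip malformed or truncated lines
        stats[t[2]] = {
            "ios": int(t[3]) + int(t[7]),       # completed reads + writes
            "wait_ms": int(t[6]) + int(t[10]),  # ms spent on reads + writes
            "busy_ms": int(t[12]),              # ms the device had I/O in flight
            "weighted_ms": int(t[13]),          # in-flight requests x ms
        }
    return stats


def disk_metrics(before, after, interval_ms):
    """Derive utilization, latency and queue length from two snapshots."""
    ios = after["ios"] - before["ios"]
    wait = after["wait_ms"] - before["wait_ms"]
    return {
        "util": 100.0 * (after["busy_ms"] - before["busy_ms"]) / interval_ms,
        "await": wait / ios if ios else 0.0,
        "aqu_sz": (after["weighted_ms"] - before["weighted_ms"]) / interval_ms,
    }
```

Both snapshots must come from the same device, and interval_ms is the wall‑clock time between them.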

Zero‑copy

Zero‑copy techniques such as sendfile avoid copying data between kernel space and user space, reducing CPU usage and improving throughput. A traditional file‑to‑socket transfer using read and write involves four copies: disk → kernel page cache, page cache → user buffer, user buffer → kernel socket buffer, socket buffer → network card. With sendfile the data never enters user space, so the two copies through the user buffer are eliminated.
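Python exposes the sendfile system call as os.sendfile, which makes the technique easy to try. The following is a minimal sketch for Linux, assuming a blocking, already connected socket; send_file and its chunk parameter are illustrative names, not an established API.

```python
import os
import socket


def send_file(sock, path, chunk=1 << 20):
    """Send a whole file over `sock` with sendfile(2); return bytes sent.

    The file content goes from the page cache to the socket inside the
    kernel and is never read into a Python bytes object.
    """
    sent_total = 0
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        while sent_total < size:
            count = min(chunk, size - sent_total)
            sent = os.sendfile(sock.fileno(), f.fileno(), sent_total, count)
            if sent == 0:
                break  # file was truncated while sending
            sent_total += sent
    return sent_total
```

The loop is needed because a single sendfile call may transfer fewer bytes than requested. The standard library's socket.sendfile method offers a similar, higher‑level wrapper.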

Network

The sar utility can monitor network traffic. Example for interface statistics:

$ sar -n DEV 1
Linux 3.13.0-49-generic (host) 07/14/2015 _x86_64_ (32 CPU)
12:16:48 AM IFACE   rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s %ifutil
12:16:49 AM eth0    18763.00 5032.00 20686.42 478.30 0.00 0.00 0.00 0.00
...
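Per‑interface rates like these can be derived from the cumulative counters in /proc/net/dev. A Python sketch under the assumption of the standard column layout (receive bytes and packets in columns 1 and 2, transmit bytes and packets in columns 9 and 10); the function names are hypothetical and the code is not how sar itself is implemented.

```python
def parse_net_dev(text):
    """Map interface name to cumulative byte and packet counters."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, _, rest = line.partition(":")
        fields = rest.split()
        if len(fields) < 10:
            continue
        stats[name.strip()] = {
            "rx_bytes": int(fields[0]),
            "rx_packets": int(fields[1]),
            "tx_bytes": int(fields[8]),
            "tx_packets": int(fields[9]),
        }
    return stats


def iface_rates(before, after, interval_s):
    """Compute per-second rates between two snapshots of one interface."""
    return {
        "rxpck/s": (after["rx_packets"] - before["rx_packets"]) / interval_s,
        "txpck/s": (after["tx_packets"] - before["tx_packets"]) / interval_s,
        "rxkB/s": (after["rx_bytes"] - before["rx_bytes"]) / 1024 / interval_s,
        "txkB/s": (after["tx_bytes"] - before["tx_bytes"]) / 1024 / interval_s,
    }
```

Reading /proc/net/dev twice, one second apart, and passing the two entries for the same interface to iface_rates gives values comparable to the sar columns above.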

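TCP‑specific metrics can be obtained with sar -n TCP,ETCP 1, which reports connections opened per second (active/s, passive/s), segments received and sent per second (iseg/s, oseg/s), and error counters such as retransmissions (retrans/s).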
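The underlying TCP counters are available in /proc/net/snmp as a pair of lines starting with "Tcp:", one with names and one with values. The Python sketch below computes a retransmission percentage from two snapshots, assuming the OutSegs and RetransSegs counters are present; the helper names are illustrative.

```python
def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from /proc/net/snmp content as a dict."""
    rows = [
        line.split()[1:]
        for line in snmp_text.splitlines()
        if line.startswith("Tcp:")
    ]
    names, values = rows[0], rows[1]  # first line: names, second: values
    return dict(zip(names, map(int, values)))


def retransmit_percent(before, after):
    """Retransmitted segments as a percentage of segments sent."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return 100.0 * retrans / sent if sent else 0.0
```

A persistently non‑trivial result usually points to packet loss or congestion somewhere on the path, which is the same signal retrans/s gives in sar.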

Conclusion

These metrics give a high‑level view of system health but cannot pinpoint every performance issue. For deeper analysis, specialized tools such as eBPF‑based BCC utilities are recommended.
