Master Linux Performance: Load, CPU Context Switches, Memory & Swap Optimization
This guide explains Linux performance fundamentals—including throughput, latency, average load, CPU context switching, memory management, swap behavior, and the essential monitoring tools such as vmstat, pidstat, perf, and strace—while providing concrete command examples and troubleshooting steps.
Performance Optimization Overview
High concurrency and fast response correspond to two core performance metrics: throughput and latency.
Application-load perspective: throughput and latency, which directly affect the end-user experience.
System-resource perspective: resource utilization, saturation, and similar indicators.
The essence of a performance problem is that system resources have reached a bottleneck while request processing is still too slow, causing high load. Performance analysis aims to locate the bottleneck and mitigate it.
Choosing Metrics
Set performance goals for the application and the system.
Conduct performance baseline testing.
Use profiling to pinpoint bottlenecks.
Implement monitoring and alerting.
Different problems call for different analysis tools. The Tool Selection Matrix at the end of this guide maps common Linux performance tools to the types of issues they address.
Understanding "Average Load"
Average load is the average number of processes in runnable or uninterruptible states during a time interval; it is not directly related to CPU utilization.
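Uninterruptible processes (shown as state D in ps and top) are in a kernel-mode critical path, most commonly waiting for I/O to complete. The kernel does not allow them to be interrupted, so that process data and hardware state stay consistent.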
When Is the Load Reasonable?
In production, monitor average load over time so that you have a baseline and can spot trends. As a rule of thumb, investigate when the load exceeds about 70 % of the CPU count or rises sharply against its usual level.
CPU‑intensive workloads raise both load and CPU usage.
I/O‑intensive workloads raise load while CPU usage may stay low.
Heavy scheduling (many processes waiting for CPU) also raises load.
Use mpstat or pidstat to identify the source of the load.
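The rule of thumb above can be checked with a few lines of Python. This is a minimal sketch, assuming a Linux host that exposes /proc/loadavg; the helper names and the 0.7 default threshold are illustrative choices, not part of any standard tool.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def is_overloaded(load, cpu_count, threshold=0.7):
    """True when load per CPU exceeds the threshold (0.7 is an assumed default)."""
    return load / cpu_count > threshold


if __name__ == "__main__":
    path = "/proc/loadavg"
    if os.path.exists(path):
        with open(path) as f:
            one, five, fifteen = parse_loadavg(f.read())
        cpus = os.cpu_count() or 1
        for label, load in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
            flag = "investigate" if is_overloaded(load, cpus) else "ok"
            print(f"{label}: load {load:.2f} on {cpus} CPUs -> {flag}")
```

Comparing the three values also shows the trend: a 1-minute figure well above the 15-minute figure suggests the load is still climbing.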
CPU Context Switching
What Is a Context Switch?
A CPU context switch saves the registers and program counter of the current task, loads the saved registers and program counter of the next task, and resumes that task where it left off. The saved state is kept by the kernel until the task is scheduled again.
Types of Context Switches
Process context switch
Thread context switch
Interrupt context switch
Process Context Switch
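When a user-space process invokes a system call, the CPU context is switched twice: once to enter kernel mode and once to return to user mode. Because the call stays within the same process, a system call is usually described as a privileged-mode switch rather than a process context switch.
A true process context switch is more expensive: the kernel saves the outgoing process's user-space state (virtual memory, stack, global variables) together with its kernel state (kernel stack, registers), then loads the corresponding state of the next process.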
Thread Context Switch
Threads belonging to the same process share the same virtual memory, so switching between them only requires saving/restoring registers and private thread data, which is cheaper than a full process switch.
Interrupt Context Switch
Interrupt switches occur entirely in kernel mode; they save only the CPU registers, kernel stack, and hardware interrupt parameters. Interrupt handling has higher priority than process scheduling, so the two do not occur simultaneously.
Observing Context Switches
Use vmstat to view overall switch and interrupt rates:
vmstat 5 # output every 5 seconds
Key columns:
cs – total context switches per second.
in – interrupts per second.
r – length of the run queue (processes ready for CPU).
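To see where these rates come from, the sketch below computes them from two snapshots of /proc/stat. It is a hedged illustration, assuming a Linux host; the function names are made up for this example, and the 1-second interval is an arbitrary choice.

```python
import os
import time


def read_counters(stat_text):
    """Extract the cumulative 'ctxt' and 'intr' totals from /proc/stat text."""
    counters = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] in ("ctxt", "intr"):
            counters[parts[0]] = int(parts[1])
    return counters


def per_second(before, after, seconds):
    """Turn two cumulative snapshots into per-second rates."""
    return {name: (after[name] - before[name]) / seconds for name in before}


if __name__ == "__main__":
    path = "/proc/stat"
    if os.path.exists(path):
        interval = 1  # seconds between snapshots (arbitrary for the demo)
        with open(path) as f:
            first = read_counters(f.read())
        time.sleep(interval)
        with open(path) as f:
            second = read_counters(f.read())
        print(per_second(first, second, interval))
```

The counters are totals since boot, so only the difference between two snapshots is meaningful as a rate.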
For per-process details, use pidstat -w:
pidstat -w 5
The fields cswch/s and nvcswch/s report voluntary and involuntary switches per second, respectively.
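The same voluntary and involuntary counters are exposed per process in /proc/<pid>/status. The following sketch reads them for the current process; it assumes a Linux host, and the helper name is hypothetical.

```python
import os


def ctxt_switches(status_text):
    """Return (voluntary, involuntary) switch counts from /proc/<pid>/status text."""
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            counts[key] = int(value)  # int() ignores the surrounding whitespace
    return counts["voluntary_ctxt_switches"], counts["nonvoluntary_ctxt_switches"]


if __name__ == "__main__":
    path = "/proc/self/status"
    if os.path.exists(path):
        with open(path) as f:
            voluntary, involuntary = ctxt_switches(f.read())
        print(f"voluntary={voluntary} involuntary={involuntary}")
```

A high voluntary count usually means the process often waits for a resource such as I/O, while a high involuntary count usually means it is being preempted because many tasks compete for the CPU.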
CPU Usage Cases
High CPU by a Single Application
Run top or ps to locate the offending process, then drill down with perf top -p <pid> to find hot functions.
perf top -g -p 1234 # show call graph for PID 1234
Example: a PHP-FPM worker was calling a test function containing a million-iteration loop; removing the loop reduced Nginx load dramatically.
High System CPU with No Visible Process
If top shows high CPU usage but no single process stands out, check the run-queue length (the r column in vmstat). A large value indicates that many processes are ready to run. Processes that start and exit quickly are easy to miss in top, so use pstree to find them and their parent:
pstree | grep stress
Such short-lived processes are usually launched by a parent application; in this example, stress is invoked by PHP-FPM to simulate I/O pressure.
Memory Fundamentals
How Linux Manages Memory
Only the kernel can access physical RAM. Each process receives a contiguous virtual address space split into kernel and user regions. The kernel maintains page tables to map virtual pages to physical frames.
When a process accesses a virtual page that is not present, a page‑fault occurs; the kernel allocates a physical page, updates the page table, and resumes the process.
Virtual Address Layout
Read‑only segment – code and constants.
Data segment – global variables.
Heap – dynamically allocated memory (grows upward).
Memory‑mapped region – shared libraries, mmap files (grows downward).
Stack – local variables and call frames (grows on demand up to a limit, typically 8 MiB by default).
Allocation Mechanisms
malloc() relies on two kernel interfaces:
brk() – for small allocations (below 128 KiB by default in glibc); it grows the heap by moving the program break.
mmap() – for large allocations (128 KiB and above); it creates an anonymous memory mapping outside the heap.
Both allocate virtual memory; physical pages are only committed on first access, which may trigger a page‑fault.
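The lazy commitment described above can be observed from user space. The sketch below is an illustration under stated assumptions rather than a benchmark: it assumes a Linux host with /proc/self/statm, uses an anonymous mapping of an arbitrary 64 MiB, and the helper names are invented for this example.

```python
import mmap
import os


def resident_kib(statm_text, page_size):
    """Resident set size in KiB; the second statm field counts resident pages."""
    return int(statm_text.split()[1]) * page_size // 1024


def touch_pages(buf, page_size):
    """Write one byte per page so each page must be backed by physical memory."""
    touched = 0
    for offset in range(0, len(buf), page_size):
        buf[offset] = 1
        touched += 1
    return touched


if __name__ == "__main__":
    statm = "/proc/self/statm"
    if os.path.exists(statm):
        def rss():
            with open(statm) as f:
                return resident_kib(f.read(), mmap.PAGESIZE)

        size = 64 * 1024 * 1024        # 64 MiB, an arbitrary demo size
        buf = mmap.mmap(-1, size)      # anonymous mapping: virtual memory only so far
        before = rss()
        pages = touch_pages(buf, mmap.PAGESIZE)  # first access triggers page faults
        after = rss()
        print(f"RSS before: {before} KiB, after touching {pages} pages: {after} KiB")
        buf.close()
```

On a typical system the resident size should stay roughly flat after the mapping is created and grow only once the pages are written.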
Reclaiming Memory
Cache reclamation via LRU.
Swapping out rarely used pages.
OOM killer terminating the biggest memory consumers.
Adjust /proc/sys/vm/swappiness to control how aggressively the kernel swaps.
Inspecting Memory Usage
Use free for system-wide memory, top or ps for per-process details, and pidstat -r for granular statistics:
pidstat -r 1 10
Key columns:
VSZ – virtual size (KB).
RSS – resident set size (physical memory, KB).
%MEM – proportion of total RAM.
minflt/s, majflt/s – minor and major page-fault rates.
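The size columns map onto fields that the kernel publishes under /proc. As a rough sketch, assuming a Linux host, the code below derives the virtual size, resident size, and memory share of the current process from /proc/self/status and /proc/meminfo; the function names are illustrative only.

```python
import os


def kib_fields(text, wanted):
    """Collect 'Key:   <n> kB' style fields from /proc status or meminfo text."""
    found = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted and rest.split():
            found[key] = int(rest.split()[0])
    return found


def mem_percent(rss_kib, total_kib):
    """Share of total RAM held by the resident set, in percent."""
    return 100.0 * rss_kib / total_kib


if __name__ == "__main__":
    if os.path.exists("/proc/self/status") and os.path.exists("/proc/meminfo"):
        with open("/proc/self/status") as f:
            proc = kib_fields(f.read(), {"VmSize", "VmRSS"})
        with open("/proc/meminfo") as f:
            total = kib_fields(f.read(), {"MemTotal"})["MemTotal"]
        share = mem_percent(proc["VmRSS"], total)
        print(f"VSZ={proc['VmSize']} KiB RSS={proc['VmRSS']} KiB %MEM={share:.2f}")
```

A virtual size far larger than the resident size is normal, because virtual memory that has never been touched consumes no physical pages.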
Detecting Memory Leaks
Run the application inside a container, then attach BCC's memleak tool to it:
/usr/share/bcc/tools/memleak -a -p $(pidof app)
The tool reports allocations that have not yet been freed, together with their call stacks, which lets you pinpoint the leaking function (e.g., a recursive fibonacci implementation).
Swap Analysis
When RAM is scarce, the kernel writes anonymous pages to swap. High swap usage can also appear on NUMA systems where one node is memory‑starved while others have free RAM.
Check swap with free and monitor with sar -r -S 1. Use dstat to correlate I/O spikes with swap activity.
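For a quick scripted check, swap usage and swap activity can also be read straight from /proc. This is a minimal sketch, assuming a Linux host; the helper names are hypothetical, and the pswpin and pswpout counters may be missing on kernels built without swap support.

```python
import os


def swap_used_kib(meminfo_text):
    """Swap in use (KiB), computed as SwapTotal - SwapFree from /proc/meminfo text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("SwapTotal", "SwapFree"):
            values[key] = int(rest.split()[0])
    return values["SwapTotal"] - values["SwapFree"]


def swap_page_counters(vmstat_text):
    """Cumulative pages swapped in and out since boot, from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo") and os.path.exists("/proc/vmstat"):
        with open("/proc/meminfo") as f:
            print("swap used (KiB):", swap_used_kib(f.read()))
        with open("/proc/vmstat") as f:
            print("swap page counters:", swap_page_counters(f.read()))
```

A large amount of swap in use is not a problem by itself; counters that keep rising between two readings indicate that pages are actively moving to and from disk.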
sudo dd if=/dev/sda1 of=/dev/null bs=1G count=2048 # generate heavy I/O
If swap rises while iowait spikes, investigate disk-bound processes with pidstat -d or strace -p <pid>. Direct I/O (O_DIRECT) bypasses the page cache and can cause high iowait despite low CPU usage.
Tool Selection Matrix
Choose tools based on the metric you need:
Overall load & CPU: top, vmstat, pidstat
Context switches: vmstat → pidstat -w
Per-process CPU: pidstat -u, perf top
Memory & page faults: pidstat -r, free, memleak
I/O & swap: dstat, iostat, sar -r -S
Kernel-level tracing: strace, perf record/report, bcc tools
Start with broad tools (top/vmstat/pidstat) to locate the symptom, then drill down with specialized utilities (strace, perf, memleak) to find the root cause.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)