
Diagnosing Linux CPU Context Switch Problems with vmstat and pidstat

This article explains how excessive Linux CPU context switches affect system performance and shows step‑by‑step how to monitor and analyze them using vmstat, pidstat, and sysbench, including interpreting voluntary versus involuntary switches and interrupt statistics.

MaGe Linux Operations

In the previous article I discussed how Linux CPU context switches work, covering process, thread, and interrupt switches. This follow‑up explains how to analyze context‑switch issues.

Checking CPU Context Switches

Too many context switches waste CPU time saving and restoring registers, program counters, kernel stacks, and virtual memory, which can noticeably degrade performance. To inspect them you can use the vmstat tool.

vmstat

vmstat is a common system-performance utility that reports memory usage as well as CPU context-switch and interrupt counts.

Example command:

# Report every 5 seconds
$ vmstat 5

The output columns of interest are:

cs (context switch): number of context switches per second.

in (interrupt): number of interrupts per second.

r (running or runnable): length of the run queue, that is, processes that are running or waiting for a CPU.

b (blocked): number of processes in uninterruptible sleep.

In this example cs is 33 and in is 25 per second, and both r and b are 0, which indicates an idle system.
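The per-second figures can also be approximated by hand from the cumulative counters in /proc/stat. The following is a minimal sketch, not a replacement for vmstat: the function names read_counter and rate_per_sec are hypothetical, and it assumes the ctxt and intr lines of /proc/stat carry the totals since boot in their second field.

```shell
#!/bin/sh
# Read one cumulative counter ("ctxt" or "intr") from /proc/stat-formatted text.
# The second field on the matching line is the total since boot.
read_counter() {
    # $1: counter name, $2: file to read (defaults to /proc/stat)
    awk -v key="$1" '$1 == key { print $2; exit }' "${2:-/proc/stat}"
}

# Convert two cumulative samples into an integer per-second rate.
rate_per_sec() {
    # $1: earlier sample, $2: later sample, $3: seconds between the samples
    echo $(( ($2 - $1) / $3 ))
}

# Usage on a live Linux system (5-second window, as in "vmstat 5"):
#   a=$(read_counter ctxt); sleep 5; b=$(read_counter ctxt)
#   echo "cs/s: $(rate_per_sec "$a" "$b" 5)"
```

Sampling twice and dividing by the interval gives an average over the window, which is why short bursts can be hidden by a long interval.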

pidstat

While vmstat gives a system‑wide view, pidstat provides per‑process details. Adding the -w option shows each process’s context‑switch statistics.

# Report every 5 seconds
$ pidstat -w 5
Linux 4.15.0 (ubuntu)  09/23/18  _x86_64_  (2 CPU)
08:18:26      UID       PID   cswch/s nvcswch/s  Command
08:18:31        0         1      0.20      0.00  systemd
08:18:31        0         8      5.40      0.00  rcu_sched
...

The two columns to note are cswch/s (voluntary context switches per second) and nvcswch/s (involuntary, or non-voluntary, context switches per second).

Voluntary context switch: occurs when a process cannot obtain a resource it needs and gives up the CPU to wait, for example when I/O or memory is in short supply.

Involuntary context switch: occurs when the scheduler forces a process off the CPU, for example because its time slice has expired. This is common when many processes compete for the CPU.
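The kernel also exposes both counters for each process in /proc/<pid>/status, as cumulative totals rather than the per-second rates that pidstat prints. A small sketch of reading them follows; the function name ctxt_counts is hypothetical, and the default of inspecting the current shell is only a convenience.

```shell
#!/bin/sh
# Print the cumulative voluntary and involuntary context-switch counts
# recorded in a /proc/<pid>/status file.
ctxt_counts() {
    # $1: status file, e.g. /proc/1234/status (defaults to the current shell)
    awk '$1 == "voluntary_ctxt_switches:"    { v = $2 }
         $1 == "nonvoluntary_ctxt_switches:" { n = $2 }
         END { printf "voluntary=%d involuntary=%d\n", v, n }' "${1:-/proc/$$/status}"
}

# Usage on a live Linux system:
#   ctxt_counts /proc/1/status
```

Taking two readings a known number of seconds apart and dividing the difference by that interval yields figures comparable to cswch/s and nvcswch/s.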

Case Study

To see what constitutes a normal switch rate, we use sysbench (a multithreaded benchmark) to generate load. First, run vmstat on an idle system:

The idle output shows 35 context switches, 19 interrupts, and both r and b at 0.

Next, run a sysbench test with ten threads for 300 seconds:

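# Run the threads benchmark with 10 threads for 300 seconds
# (sysbench 1.0 and later accept --time in place of the deprecated --max-time)
$ sysbench --threads=10 --max-time=300 threads run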

After the load starts, vmstat shows a dramatic increase:

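The cs column jumps from 35 to about 139,000 switches per second. Other observations:

r: the run-queue length rises to 8, more than the machine's 2 CPUs.

us + sy: user and system CPU usage together reach 100%, with system usage at 84%, so most of the CPU time is spent in the kernel.

in: interrupts climb to about 10,000 per second, which suggests that interrupt handling is also under pressure.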

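Taken together, these metrics show a run queue that is too long: too many tasks are running or waiting for a CPU under the benchmark, which drives up context switching, and the switching in turn drives up system CPU usage.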
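To watch only these columns while a test runs, the vmstat output can be piped through a small filter. This is a sketch under one assumption: the default procps column order (r b swpd free buff cache si so bi bo in cs us sy id wa st), so that in, cs, us, and sy are fields 11 to 14. The function name vmstat_summary and the run-queue>cpus marker are hypothetical.

```shell
#!/bin/sh
# Condense vmstat data rows to r, in, cs, us, sy and flag rows where the
# run queue is longer than the number of CPUs.
vmstat_summary() {
    # $1: number of CPUs (for example the output of nproc)
    awk -v ncpu="$1" '
        $1 ~ /^[0-9]+$/ {    # data rows start with a number; header rows do not
            flag = ($1 + 0 > ncpu + 0) ? " run-queue>cpus" : ""
            printf "r=%d in=%d cs=%d us=%d sy=%d%s\n", $1, $11, $12, $13, $14, flag
        }'
}

# Usage on a live Linux system:
#   vmstat 1 | vmstat_summary "$(nproc)"
```

Comparing r with the CPU count is a rough heuristic only; a queue that briefly exceeds the CPU count is not necessarily a problem.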

Further analysis with pidstat -w -u 1 shows that sysbench accounts for the CPU usage, at 100%, while the process-level view attributes many of the context switches, especially involuntary ones, to other processes such as kernel worker threads (kworker) and sshd.

# 1: report every second
# -w: report context-switch statistics
# -u: report CPU usage statistics
$ pidstat -w -u 1
08:06:33      UID       PID    %usr %system  %guest %wait %CPU  CPU  Command
08:06:34        0     10488   30.00 100.00   0.00   0.00 100.00  0  sysbench
...

Note: By default pidstat reports context switches per process; add -t to include per-thread figures.
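A per-thread view can also be approximated directly from /proc, where each thread of a process has its own status file under /proc/<pid>/task/<tid>/. The sketch below lists cumulative counts per thread; the function name thread_ctxt is hypothetical, and only the first word of a thread name is printed.

```shell
#!/bin/sh
# For every thread under a task directory, print:
#   <tid> <name> <voluntary switches> <involuntary switches>
thread_ctxt() {
    # $1: task directory, e.g. /proc/1234/task
    for status in "$1"/*/status; do
        [ -r "$status" ] || continue
        tid=$(basename "$(dirname "$status")")
        awk -v tid="$tid" '
            $1 == "Name:"                       { name = $2 }
            $1 == "voluntary_ctxt_switches:"    { v = $2 }
            $1 == "nonvoluntary_ctxt_switches:" { n = $2 }
            END { printf "%s %s %d %d\n", tid, name, v, n }' "$status"
    done
}

# Usage on a live Linux system (replace 1234 with a real PID):
#   thread_ctxt /proc/1234/task
```

If the main thread shows few switches while its worker threads show many, a per-process listing will understate how much switching the program causes.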

Interrupts

To investigate the high interrupt count, examine /proc/interrupts:

# -d: highlight the values that change between updates
$ watch -d cat /proc/interrupts
           CPU0       CPU1
... 
RES:    2450431    5279697   Rescheduling interrupts
...

The fastest-changing entry is RES (rescheduling interrupts). This interrupt wakes an idle CPU so that it can schedule a new task, so its rapid growth confirms that excessive scheduling is the root cause.
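Instead of judging the change by eye, the growth of a single row can be measured by summing its per-CPU counters in two snapshots. The following is a sketch; the function name irq_total is hypothetical, and it assumes the first line of /proc/interrupts lists one column per CPU.

```shell
#!/bin/sh
# Sum the per-CPU counters of one /proc/interrupts row.
irq_total() {
    # $1: row label such as RES or LOC; $2: file (defaults to /proc/interrupts)
    awk -v label="$1:" '
        NR == 1 { ncpu = NF; next }    # header row: one column per CPU
        $1 == label {
            total = 0
            for (i = 2; i <= ncpu + 1; i++) total += $i
            printf "%d\n", total
            exit
        }' "${2:-/proc/interrupts}"
}

# Usage on a live Linux system (rescheduling interrupts per second):
#   a=$(irq_total RES); sleep 1; b=$(irq_total RES); echo "RES/s: $((b - a))"
```

Running the same measurement for a few other labels shows whether RES really dominates the increase.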

What Is a Normal Switch Rate?

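What counts as normal depends on the hardware and the workload. As a rule of thumb, a stable system sees from a few hundred up to ten thousand context switches per second. A rate that stays above 10,000, or that rises sharply within a short time, usually signals a performance problem.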
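This rule of thumb can be written as a simple check against a baseline measured while the system was healthy. In the sketch below, the function name classify_cs is hypothetical, and treating a tenfold rise over the baseline as a rapid increase is an assumption of this example, not a figure from the article.

```shell
#!/bin/sh
# Classify a context-switch rate as "high", "jump", or "normal".
classify_cs() {
    # $1: current switches per second
    # $2: baseline switches per second from a healthy period
    if [ "$1" -gt 10000 ]; then
        echo "high"      # above the 10,000 per second rule of thumb
    elif [ "$2" -gt 0 ] && [ "$1" -ge $(( $2 * 10 )) ]; then
        echo "jump"      # assumed threshold: ten times the baseline
    else
        echo "normal"
    fi
}

# Usage:
#   classify_cs 139000 35
```

Both thresholds are starting points and are worth tuning against the normal behavior of the system being monitored.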

Conclusion

By examining the type and frequency of context switches and interrupts, you can pinpoint whether the issue stems from resource‑waiting (voluntary switches), CPU contention (involuntary switches), or interrupt overload.

Many voluntary switches suggest processes are blocked on I/O or other resources.

Many involuntary switches indicate CPU bottlenecks and heavy contention.

A rising interrupt count means the CPU is spending more time in interrupt handlers; inspect /proc/interrupts to identify which interrupt type is responsible.
