How Much CPU Time Does a Linux Context Switch Really Cost?
This article measures the CPU time consumed by Linux process and thread context switches, explains the underlying operations, compares simple token‑passing tests with lmbench benchmarks, and shows how to monitor and interpret switch statistics on a production server.
Process abstraction hides CPU scheduling and memory management, but each context switch incurs overhead because the kernel must save the state of the current execution context and restore the state of the next one.
Process and Process Switching
A context switch saves the current process's registers, page‑table base (CR3), kernel stack pointer, and other architectural state, then loads the target process's saved state. The cost is negligible for low‑frequency switches but becomes noticeable on servers that handle thousands of requests per second.
Simple Process‑Switch Overhead Test
The test creates two processes that exchange a token through a pipe. Each process blocks on read() or write() until the other side sends the token back. After many iterations the elapsed time is divided by the number of exchanges to obtain the average switch time.
# gcc main.c -o main
# ./main
Before Context Switch Time 1565352257 s, 774767 us
After Context Switch Time 1565352257 s, 842852 us
Repeated runs on the author’s machine give an average of ≈3.5 µs per process context switch. The exact value varies with CPU model, kernel version, and whether the two processes run on the same core.
Breakdown of Switch Overhead
Overhead can be divided into:
Direct costs: switching the page table (CR3), switching the kernel stack, saving and restoring CPU registers (IP, BP, SP, etc.), flushing the TLB, and executing the scheduler’s code.
Indirect costs: loss of cache and TLB locality. The incoming process starts with cold caches, especially when it resumes on a different core, which causes additional memory accesses and pipeline stalls.
Professional Benchmark – lmbench
lmbench is an open‑source multi‑platform benchmark that measures process creation, destruction, and context‑switch latency. A typical run on a Linux 2.6.32 system produced the following table (values are in microseconds):
-------------------------------------------------------------------------
Host      OS            2p/0K  2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
bjzw_46_7 Linux 2.6.32- 2.78   2.78   2.70   4.38   4.04   4.75    5.48
Each column label gives the number of processes and the per-process working-set size, so 2p/16K means two processes with 16 KB each. lmbench reports process‑switch latencies ranging from 2.70 µs to 5.48 µs, reflecting both direct and indirect costs.
Thread Context Switch Overhead
Linux threads are implemented as lightweight processes that share the same address space. The same token‑passing technique was applied with 20 threads communicating via a pipe. The program was compiled with the pthread library:
# gcc -lpthread main.c -o main
Typical output shows an average of ≈3.8 µs per thread switch, about the same as the ≈3.5 µs measured for process switches, so threads offer little or no switching advantage on this platform.
Monitoring Context Switches on a Live System
Standard Linux utilities report context‑switch rates:
vmstat 1 – shows the cs column (system‑wide context switches per second).
sar -w 1 – reports the system‑wide context‑switch rate (cswch/s).
pidstat -w 1 – breaks the rate down per PID into voluntary (cswch/s) and involuntary (nvcswch/s) switches.
Example vmstat snapshot from an 8‑core KVM VM running nginx + php‑fpm (1000 workers, ~100 req/s):
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 2  0      0 595504   5724 190884    0    0   295   297    0    0 14  6 75  0  4
The author observed the cs column reaching about 40 k switches per second (the single row reproduced here does not show that peak). Spread over 8 cores, that is roughly 5 k switches per core per second, or about 20 ms of CPU time per core per second at 3 to 5 µs per switch.
Per‑process details can be obtained from /proc/<pid>/status:
# grep ctxt /proc/32583/status
voluntary_ctxt_switches: 573066
nonvoluntary_ctxt_switches: 89260
On the same machine, pidstat -w 1 identified several php‑fpm workers as the main contributors, with most switches being voluntary (caused by I/O blocking) and a smaller fraction involuntary (time‑slice expiration).
Conclusion
On the systems measured here, a context switch costs between 2.7 µs and 5.5 µs. Developers can reproduce these numbers using the simple token‑passing program or the more comprehensive lmbench suite, and they can monitor live systems with vmstat, sar, or pidstat to detect excessive switching that may degrade performance.
Liangxu Linux
Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)