Linux Kernel Performance Profiling: A Comprehensive Guide to On-CPU and Off-CPU Analysis
This guide explains Linux kernel performance profiling, both on-CPU and off-CPU. It stresses targeting the critical 3% of code and covers throughput, latency, and power metrics, scalability laws, flame-graph visualizations, perf and eBPF tools, lock-contention analysis, and recommendations for further reading.
This article provides an in-depth guide to Linux kernel performance profiling and optimization, starting with Donald Knuth's famous quote that "premature optimization is the root of all evil." The author emphasizes that optimization should focus on the critical 3% of code that actually creates bottlenecks, and that profiling to find that code is the most important step.
The article covers three fundamental performance metrics: throughput (measured with tools such as netperf, sysbench, and vm-scalability), latency (which often follows a multi-modal distribution, so it calls for histogram analysis rather than simple averages), and power consumption. It discusses the Universal Scalability Law (USL) and Amdahl's Law, explaining how the contention (σ) and coherency (κ) coefficients limit scalability as CPU core count increases.
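To make the two scalability laws concrete, here is a minimal Python sketch using their standard textbook forms. The function names and the coefficient values (5% contention, a coherency coefficient of 0.0001) are illustrative assumptions, not figures from the article.

```python
import math


def amdahl_speedup(n: int, sigma: float) -> float:
    """Amdahl's Law: speedup on n cores when a fraction sigma of the work is serialized."""
    return n / (1.0 + sigma * (n - 1))


def usl_speedup(n: int, sigma: float, kappa: float) -> float:
    """USL: Amdahl's Law plus a coherency penalty that grows with n * (n - 1)."""
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak_cores(sigma: float, kappa: float) -> float:
    """Core count at which the USL curve peaks; beyond it, adding cores lowers throughput."""
    return math.sqrt((1.0 - sigma) / kappa)


if __name__ == "__main__":
    sigma, kappa = 0.05, 0.0001  # assumed values, for illustration only
    for cores in (1, 8, 32, 96, 200):
        print(f"{cores:4d} cores  Amdahl {amdahl_speedup(cores, sigma):6.2f}x"
              f"  USL {usl_speedup(cores, sigma, kappa):6.2f}x")
    print(f"USL peak near {usl_peak_cores(sigma, kappa):.0f} cores")
```

With a coherency coefficient of zero the USL reduces to Amdahl's Law, which only flattens out. Any positive coherency cost makes the curve turn downward past the peak, which is why the article treats the two coefficients separately.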
For on-CPU analysis, the article explores flame graphs for visualizing CPU consumption, perf report for generating text-based reports, perf annotate for identifying hot code lines, and the top-down microarchitecture analysis method (CPI/IPC, Front End Bound, Back End Bound, Retiring, Bad Speculation). The author explains how to use Intel VTune Profiler and pmu-tools for top-down analysis.
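As a rough illustration of the top-down categories named above, the sketch below computes CPI/IPC and a level-1 breakdown from raw counter values. It assumes the commonly published level-1 formulas for a 4-wide Intel core. The counter values are made up, the field names are hypothetical, and the hardware events noted in the comments vary by microarchitecture. In practice tools such as pmu-tools or VTune collect and compute these.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Raw PMU counts for one measurement interval (hypothetical field names)."""
    cycles: int
    instructions: int
    uops_issued: int         # e.g. UOPS_ISSUED.ANY
    uops_retired_slots: int  # e.g. UOPS_RETIRED.RETIRE_SLOTS
    fetch_bubbles: int       # e.g. IDQ_UOPS_NOT_DELIVERED.CORE
    recovery_cycles: int     # e.g. INT_MISC.RECOVERY_CYCLES


def top_down_level1(c: Counters, width: int = 4) -> dict:
    """Split all issue slots into the four level-1 top-down categories."""
    slots = width * c.cycles  # assumption: `width` issue slots per cycle
    frontend = c.fetch_bubbles / slots
    bad_spec = (c.uops_issued - c.uops_retired_slots
                + width * c.recovery_cycles) / slots
    retiring = c.uops_retired_slots / slots
    backend = 1.0 - (frontend + bad_spec + retiring)  # whatever is left over
    return {
        "ipc": c.instructions / c.cycles,
        "cpi": c.cycles / c.instructions,
        "frontend_bound": frontend,
        "bad_speculation": bad_spec,
        "retiring": retiring,
        "backend_bound": backend,
    }


if __name__ == "__main__":
    sample = Counters(cycles=1_000_000, instructions=1_200_000,
                      uops_issued=1_500_000, uops_retired_slots=1_400_000,
                      fetch_bubbles=600_000, recovery_cycles=25_000)
    for name, value in top_down_level1(sample).items():
        print(f"{name:16s} {value:.3f}")
```

The largest of the four fractions indicates where to drill down next. A high Retiring share means the slots are doing useful work, while a high Back End Bound share usually points at memory or execution-port stalls.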
For off-CPU analysis, the article covers histogram creation using kernel tracepoints and eBPF/BCC tools (biolatency, funclatency, runqlat), off-CPU flame graphs using the offcputime.py tool, and lock contention analysis using perf lock. The article provides practical examples with code and demonstrates how to analyze kswapd behavior, memory reclaim, and lock contention scenarios.
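The value of latency histograms over averages can be shown with a small Python sketch. It buckets samples by powers of two, similar in spirit to the histograms BCC tools print, and compares the result with the mean. The samples are synthetic and the function names are hypothetical. Real data would come from a tracing tool such as biolatency or funclatency.

```python
from collections import Counter
from statistics import mean


def log2_histogram(samples_us):
    """Count samples per power-of-two bucket k, covering [2**k, 2**(k+1)) microseconds."""
    buckets = Counter()
    for s in samples_us:
        k = max(int(s), 1).bit_length() - 1  # values below 1 us fall into bucket 0
        buckets[k] += 1
    return dict(sorted(buckets.items()))


def render(hist, bar_width=40):
    """Print the histogram as text rows with proportional bars."""
    peak = max(hist.values())
    for k, count in hist.items():
        bar = "*" * max(1, count * bar_width // peak)
        print(f"{2 ** k:>8d} -> {2 ** (k + 1) - 1:<8d} : {count:<6d} |{bar}")


# Synthetic bimodal latencies: a fast mode near 100 us and a slow mode near 7.5 ms.
fast = [90 + i % 20 for i in range(90)]
slow = [7000 + 100 * i for i in range(10)]
samples = fast + slow

if __name__ == "__main__":
    render(log2_histogram(samples))
    print(f"mean = {mean(samples):.2f} us")
```

In this synthetic data the mean lands in a bucket that contains no samples at all, so it describes neither the fast path nor the slow one. The histogram exposes both modes immediately.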
The author recommends two books for further reading: Brendan Gregg's "Systems Performance: Enterprise and the Cloud" and Denis Bakhvalov's "Performance Analysis and Tuning on Modern CPUs."
OPPO Kernel Craftsman
Sharing Linux kernel-related cutting-edge technology, technical articles, technical news, and curated tutorials