Unlock Linux Performance Secrets with SystemTap: A Practical Guide
SystemTap gives Linux developers and administrators a low-overhead dynamic tracing tool for monitoring kernel and user-space events. With it you can pinpoint performance bottlenecks, debug crashes, and build custom monitoring without rebooting or recompiling the kernel.
1. What is SystemTap?
SystemTap is a powerful Linux dynamic tracing tool that acts like a "transparent lens" for developers and ops engineers. It can capture kernel and user‑space events—function calls, system calls, memory allocations, network packets—without rebooting or modifying kernel code.
1.1 Overview
SystemTap lets you quickly locate performance bottlenecks and diagnose hard‑to‑track faults by inserting probes dynamically, avoiding the time‑consuming compile‑and‑restart cycle.
2. Core Concepts
2.1 Probes
Probes are like monitoring cameras placed at specific events (function entry/exit, system call, timer). When the event occurs, the associated handler runs.
probe syscall.open {
    printf("%s opened %s\n", execname(), filename)
}
2.2 Handlers
Handlers contain the actions executed when a probe fires, such as printing information, counting, or storing data.
2.3 Tapset
Tapsets are libraries of predefined probes and functions that simplify script writing for common analysis scenarios (network, performance, etc.).
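As a rough illustration of what a tapset provides, the sketch below uses only names from the standard tapset library: the syscall.openat probe alias (which sets filename) and the helper functions execname(), pid(), ctime(), and gettimeofday_s(). The 5-second timer is an arbitrary choice to keep the run short.

```systemtap
// syscall.openat is a probe alias defined in the syscall tapset; it resolves to
// the underlying kernel probe point and sets convenience variables such as filename.
probe syscall.openat {
    printf("%s %s(%d) openat %s\n",
           ctime(gettimeofday_s()), execname(), pid(), filename)
}

// Stop after 5 seconds (arbitrary, for demonstration only).
probe timer.s(5) {
    exit()
}
```

On recent glibc versions, open() is usually implemented through the openat system call, so syscall.openat tends to catch more activity than syscall.open.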
3. SystemTap Architecture
SystemTap scripts are translated into C code, compiled into a kernel module, loaded, and executed. The workflow includes script writing, translation (stap), compilation to a .ko module, loading, data collection, and cleanup.
4. Installation & Execution
4.1 Installation
On Debian/Ubuntu: sudo apt-get install systemtap
On CentOS/RHEL: sudo yum install systemtap
Kernel debug info packages may be required for deep tracing.
4.2 Running Scripts
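Run a script file: stap test.stp
Run with verbose output: stap -v test.stp
Read the script from stdin: stap -
One-liner:
stap -e 'probe syscall.open { printf("%s opened %s\n", execname(), filename) }'
To run a script directly, start it with a shebang line such as #!/usr/bin/stap (the path depends on your installation), then make it executable with chmod +x test.stp.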
5. Script Syntax & Examples
5.1 Basic Structure
probe probe_point [, probe_point] {
    handler_statement
}
5.2 Variables
Variable types are inferred (string or 64-bit integer) rather than declared; the global keyword declares variables shared across probes. Example counting write syscalls:
global total_writes = 0
probe syscall.write {
    total_writes++
    printf("Total write syscalls so far: %d\n", total_writes)
}
5.3 Conditional & Loops
probe syscall.open {
    if (execname() == "target_process") {
        printf("%s opened %s\n", execname(), filename)
    }
}
Foreach loops iterate over associative arrays.
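A foreach loop can also sort and truncate its output. The following is a minimal sketch, assuming the syscall.openat tapset probe is available on your system; the array name opens and the 5-second window are illustrative choices, not part of any standard script.

```systemtap
global opens   // opens[process name] = number of openat calls seen

probe syscall.openat {
    opens[execname()]++
}

probe timer.s(5) {
    // "opens-" walks the array in descending order of value;
    // "limit 5" stops after the first five entries.
    foreach (name in opens- limit 5) {
        printf("%-20s %d\n", name, opens[name])
    }
    exit()
}
```

Appending + instead of - to the array name sorts in ascending order.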
6. Advanced Features
6.1 Conditional Filtering
probe syscall.* if (pid() == 1234) {
    printf("%s called syscall %s\n", execname(), name)
}
6.2 Associative Arrays & Statistics
Use global associative arrays to aggregate data, e.g., count reads per process and file descriptor. (The syscall.read probe exposes the descriptor as fd; it does not provide a filename.)
global read_count
probe syscall.read { read_count[execname(), fd]++ }
probe end {
    foreach ([proc, fdnum] in read_count) {
        printf("%s read fd %d %d times\n", proc, fdnum, read_count[proc, fdnum])
    }
}
6.3 Embedding C Code
Embed C with %{ … %} and enable guru mode (-g). Example retrieving process info in a vfs_read return probe:
function getprocname:string(task:long) %{ ... %}
function getprocid:long(task:long) %{ ... %}
probe kernel.function("vfs_read").return {
    task = pid2task(pid())
    // vfs_read returns a byte count (or a negative error code), not a pointer
    printf("vfs_read return: %d, pid: %d, getprocname: %s, getprocid: %d\n", $return, pid(), getprocname(task), getprocid(task))
}
7. Application Scenarios
7.1 Performance Bottleneck Analysis
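Trace function execution time, e.g., do_sys_open. The entry timestamp is stored in a global array keyed by thread ID so that the return probe can find it:
global start
probe kernel.function("do_sys_open") { start[tid()] = gettimeofday_us() }
probe kernel.function("do_sys_open").return {
    if (tid() in start) printf("open took %d us\n", gettimeofday_us() - start[tid()])
    delete start[tid()]
}
7.2 Memory Leak Detection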
Track malloc/free counts with a global associative array to report unreleased allocations at script end.
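One way this could look is sketched below. It assumes the path to the C library is passed as the first script argument (@1), that the traced process is selected with -c or -x so target() returns its PID, and that the register-based tapset helpers returnval() and pointer_arg() work on your architecture. The array name live is illustrative.

```systemtap
global live   // live[address] = 1 for each block malloc returned and free has not yet seen

probe process(@1).function("malloc").return {
    if (pid() != target()) next
    addr = returnval()
    if (addr) live[addr] = 1
}

probe process(@1).function("free") {
    if (pid() != target()) next
    // Deleting an address that was never recorded is harmless.
    delete live[pointer_arg(1)]
}

probe end {
    foreach (addr in live) {
        printf("unreleased block at %p\n", addr)
    }
}
```

A hypothetical invocation would be stap -c ./myprog leak.stp /lib/x86_64-linux-gnu/libc.so.6, where both the program name and the library path are placeholders. The sketch ignores calloc and realloc, so blocks moved by realloc can show up as false positives, and long-running programs may exceed the default array size limit (adjustable with -DMAXMAPENTRIES=N).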
7.3 Network Packet Analysis
Monitor ICMP packets to/from a target IP using embedded C helpers to extract IP headers.
global TARGET_IP = "192.168.1.100"
function ip_to_int(ip_str) %{ ... %}
function get_ip_protocol:long(skb) %{ ... %}
probe kernel.function("ip_rcv") {
    // extract protocol, src, dst, compare with TARGET_IP
    if (protocol == 1 && (saddr == target || daddr == target)) {
        printf("[RCV %05d] %s ICMP %s -> %s\n", pid(), ctime(gettimeofday_s()), ip_ntop(htonl(saddr)), ip_ntop(htonl(daddr)))
    }
}
probe begin { println("[+] Tracing ICMP to/from: ", TARGET_IP) }
7.4 Tips & Debugging
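Use -v for verbose output, -p N to stop after pass N (1 parse, 2 elaborate, 3 translate to C, 4 compile the module, 5 run), and -g when embedding C code. Avoid heavy I/O in handlers, since every probe hit pays that cost; aggregate in the handler and print periodically or at script end instead.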
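The advice about keeping I/O out of hot handlers can be sketched with statistical aggregates. In this example the handler only records a value with the <<< operator, and all printing happens once in a timer probe. The array name req and the 10-second interval are illustrative assumptions; count is the requested byte count exposed by the syscall.read tapset probe.

```systemtap
global req   // req[process name] aggregates the byte counts requested via read()

probe syscall.read {
    // Hot path: record the value only, no output here.
    req[execname()] <<< count
}

probe timer.s(10) {
    // Cold path: print one summary line per process, then stop.
    foreach (name in req) {
        printf("%-20s calls=%d avg=%d max=%d\n",
               name, @count(req[name]), @avg(req[name]), @max(req[name]))
    }
    exit()
}
```

Compared with a printf in every syscall.read hit, this keeps per-event work to a single aggregate update.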
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.
