Comprehensive Guide to bpftrace: Features, Architecture, Installation, and Practical Use Cases
This article introduces bpftrace, an eBPF-based dynamic tracing tool for Linux. It explains the tool's core concepts, technical architecture, installation methods, and basic syntax, demonstrates real-world performance analysis, fault diagnosis, and security monitoring scenarios, and compares bpftrace with DTrace, SystemTap, and BCC.
In operations or development work, you may encounter Linux servers that become unexpectedly sluggish despite normal CPU, memory, disk I/O, and network usage, leaving you puzzled about where to start diagnosing the issue.
Traditional performance analysis tools often provide only surface‑level information, making it hard to uncover deep system behavior and hidden bottlenecks. bpftrace, built on eBPF technology, offers a powerful and flexible way to perform full‑stack, low‑overhead dynamic tracing and performance analysis.
1. What is bpftrace
bpftrace is an advanced dynamic tracing tool based on eBPF, designed for Linux environments. It allows developers, system administrators, and operations engineers to explore the inner workings of the kernel without modifying source code or loading kernel modules. eBPF acts as a tiny virtual machine inside the kernel that can execute user‑defined code safely.
Using a concise, awk‑ and C‑inspired DSL, bpftrace scripts can capture events such as process activity, file operations, network communication, memory usage, and CPU scheduling. For example, to count system calls of a specific process (PID 1234), you can run:
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_* /pid == 1234/ { @[probe] = count(); }'
The script filters events by PID and aggregates counts per probe, which helps to quickly locate problematic system calls.
Key features of bpftrace:
Lightweight with minimal overhead, suitable for production tracing.
Powerful event capture across kernel and user space.
Flexible data processing (count, sum, histogram, etc.) and associative arrays.
Dynamic tracing that can add or remove probes at runtime.
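The aggregation functions and associative arrays listed above can be sketched in a small script. This is a minimal illustration, not a tuned tool: the file name read_stats.bt and the map names @reads, @bytes, and @size are made up for this example.

```shell
# Write a small bpftrace script to a file (the file name is hypothetical).
cat > read_stats.bt <<'EOF'
// Aggregate read() results in kernel maps; nothing is printed per event.
tracepoint:syscalls:sys_exit_read
/args->ret > 0/
{
    @reads[comm] = count();         // successful reads per process name
    @bytes[comm] = sum(args->ret);  // total bytes returned per process name
    @size = hist(args->ret);        // power-of-two histogram of read sizes
}
EOF
```

Run it with sudo bpftrace read_stats.bt and stop it with Ctrl-C; bpftrace prints the contents of all maps when the script exits.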
2. Technical Principles
2.1 eBPF Virtual Machine
The eBPF VM provides a safe, efficient execution environment in the kernel. bpftrace scripts are compiled to eBPF bytecode, verified for safety (no infinite loops, illegal memory access), then JIT‑compiled to native machine code for high performance.
2.2 bpftrace Front‑end
The front‑end parses the DSL, builds an abstract syntax tree, and uses LLVM to generate eBPF bytecode. It then loads the bytecode via libbpf, attaching it to kernel hooks such as kprobes, uprobes, and tracepoints.
2.3 Tracing Mechanisms
bpftrace leverages kprobes and kretprobes (kernel function entry and return), uprobes and uretprobes (user-space function entry and return), and tracepoints (static instrumentation points in the kernel) to monitor a wide range of events. Example kprobe for file open:
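sudo bpftrace -e 'kprobe:vfs_open { printf("File %s opened by process %s (PID %d)\n", str(((struct path *)arg0)->dentry->d_name.name), comm, pid); }'
A kprobe exposes its arguments as arg0, arg1, and so on, so the first argument of vfs_open is cast to struct path * here. Resolving that struct assumes a kernel with BTF; without it, the script needs #include <linux/path.h> and #include <linux/dcache.h>.
Example uretprobe for bash readline (retval is only available in return probes, so a uretprobe is used rather than a uprobe):
sudo bpftrace -e 'uretprobe:/usr/bin/bash:readline { printf("User %d executed command: %s\n", uid, str(retval)); }'
3. Installation and Usage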
3.1 Installation Methods
Ubuntu (19.04+): sudo apt-get install -y bpftrace
Ubuntu 16.04+ via snap: sudo snap install --devmode bpftrace, then sudo snap connect bpftrace:system-trace
Fedora (28+): sudo dnf install -y bpftrace
CentOS: add the repository, then install with yum install bpftrace bpftrace-tools bpftrace-doc bcc-static bcc-tools
3.2 Basic Syntax
Scripts follow the pattern probe /filter/ { action }, where probe names the event to attach to (tracepoint, kprobe, uprobe, etc.), filter is an optional condition, and action performs operations such as printf, count(), or histogram generation with hist().
Common commands include listing probes (sudo bpftrace -l '*sleep*') and using built-in variables (pid, comm, uid, nsecs). Scratch variables are prefixed with $ and are local to a probe action; map variables are prefixed with @ and are global.
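To make the probe /filter/ { action } pattern and the two kinds of variables concrete, here is a small sketch. The file name opens.bt, the process name "bash" in the filter, and the variable names $name and @opens are illustrative assumptions, not anything prescribed by bpftrace.

```shell
# Write an annotated example script (the file name is hypothetical).
cat > opens.bt <<'EOF'
// probe:  fires on every openat() syscall entry
// filter: only continue for processes named "bash"
tracepoint:syscalls:sys_enter_openat
/comm == "bash"/
{
    // action: $name is a scratch variable, local to this probe firing
    $name = str(args->filename);
    printf("%d %s %s\n", pid, comm, $name);

    // @opens is a map variable; it is global and printed on exit
    @opens[comm] = count();
}
EOF
```

Run it with sudo bpftrace opens.bt, open a few files from a bash shell, and press Ctrl-C to see the per-process counts.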
3.3 Running Scripts
A script can be run as a one-liner with -e (for example, to count all system calls) or saved to a .bt file and run with sudo bpftrace script.bt.
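As a sketch of the script-file form, the following saves a multi-probe script and marks it executable. The file name syscalls.bt and the 5-second reporting interval are arbitrary choices for illustration.

```shell
# Save a multi-probe script to a file (the file name is hypothetical).
cat > syscalls.bt <<'EOF'
#!/usr/bin/env bpftrace

BEGIN
{
    printf("Counting syscalls per process... Ctrl-C to stop.\n");
}

tracepoint:raw_syscalls:sys_enter
{
    @calls[comm] = count();
}

// Every 5 seconds, print the counts collected so far and reset them.
interval:s:5
{
    print(@calls);
    clear(@calls);
}
EOF
chmod +x syscalls.bt
```

Either sudo bpftrace syscalls.bt or, thanks to the shebang line, sudo ./syscalls.bt should start it, assuming bpftrace is on root's PATH.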
4. Practical Cases
4.1 Performance Analysis
To diagnose a slow web server, count system calls:
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[probe] = count(); }'
Then analyze read latency with a histogram:
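sudo bpftrace -e 'tracepoint:syscalls:sys_enter_read { @start[tid] = nsecs; } tracepoint:syscalls:sys_exit_read /@start[tid]/ { $elapsed = nsecs - @start[tid]; @latency = hist($elapsed); delete(@start[tid]); }'
The /@start[tid]/ filter skips reads that began before tracing started, which would otherwise be recorded with a bogus latency.
4.2 Fault Diagnosis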
Detect file deletions:
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_unlinkat { printf("%s deleted by process %s (PID %d)\n", str(args->pathname), comm, pid); }'
Track process exits to find segmentation faults:
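sudo bpftrace -e 'kprobe:do_exit { printf("Process %s (PID %d) exited: status %d, signal %d\n", comm, pid, (arg0 >> 8) & 0xff, arg0 & 0x7f); }'
A kprobe exposes its arguments as arg0, arg1, and so on. do_exit receives the exit code as its first argument, with the terminating signal in the low 7 bits, so a segmentation fault shows up as signal 11 (SIGSEGV).
4.3 Security Monitoring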
Monitor execve calls for unauthorized executions:
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_execve { printf("Process %s (PID %d) executed %s\n", comm, pid, str(args->filename)); }'
Track TCP connections:
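sudo bpftrace -e 'tracepoint:sock:inet_sock_set_state /args->protocol == 6 && args->family == 2/ { if (args->oldstate == 2 && args->newstate == 1) { printf("TCP connect from %s:%d to %s:%d\n", ntop(args->saddr), args->sport, ntop(args->daddr), args->dport); } if (args->newstate == 7) { printf("TCP close from %s:%d to %s:%d\n", ntop(args->saddr), args->sport, ntop(args->daddr), args->dport); } }'
This uses the sock:inet_sock_set_state tracepoint (Linux 4.16 or newer) and ntop() to format addresses. The numeric constants are protocol 6 (IPPROTO_TCP), family 2 (AF_INET, so IPv4 only), and TCP states 1 (TCP_ESTABLISHED), 2 (TCP_SYN_SENT), and 7 (TCP_CLOSE), so the first branch reports outbound connections once they are established.
5. Comparison with Other Tools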
5.1 bpftrace vs DTrace
DTrace is the original dynamic tracing system but is not natively available on Linux. bpftrace offers a simpler awk/C‑like syntax, comparable low overhead, and tighter integration with the Linux kernel via eBPF.
5.2 bpftrace vs SystemTap
SystemTap requires compiling scripts into kernel modules, which adds complexity and risk. bpftrace runs bytecode directly in the eBPF VM, providing higher safety and easier usage.
5.3 bpftrace vs BCC
BCC uses Python or C++ APIs and is suited for building complex BPF tools, while bpftrace provides a high‑level DSL for quick, ad‑hoc tracing, lowering the learning curve.
6. Precautions
6.1 Kernel Version
bpftrace requires Linux kernel 4.9 or newer; older kernels may lack necessary eBPF features.
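A quick way to check the running kernel against this minimum is sketched below. The helper name version_ge is made up for this example, and the comparison relies on sort -V (version sort), which is assumed to be available, as it is in GNU coreutils.

```shell
# Succeeds if the version in $1 is greater than or equal to the one in $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Strip the distribution suffix, e.g. 5.15.0-91-generic becomes 5.15.0.
kernel="$(uname -r | cut -d- -f1)"

if version_ge "$kernel" "4.9"; then
    echo "kernel $kernel meets the 4.9 minimum"
else
    echo "kernel $kernel is older than 4.9"
fi
```

Passing this check only confirms the version number; individual eBPF features may still depend on how the kernel was configured.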
6.2 Permissions and Security
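Running bpftrace typically needs root or CAP_SYS_ADMIN. The verifier rejects programs that would access kernel memory unsafely, but a script can still expose sensitive data such as command lines and file paths, so review scripts carefully before running them on production systems.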
6.3 Performance Overhead
Although eBPF is efficient, overly aggressive tracing (e.g., frequent I/O or heavy computation in scripts) can impact system performance. Be mindful of kernel resource limits such as memory and instruction count.
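One way to keep overhead bounded is to sample at a fixed rate, aggregate in kernel maps instead of printing per event, and stop after a fixed time. The sketch below assumes a 99 Hz sample rate and a 10-second run, both arbitrary; the file name sample_cpu.bt is hypothetical.

```shell
# Write a low-overhead sampling script (the file name is hypothetical).
cat > sample_cpu.bt <<'EOF'
// Sample each CPU at 99 Hz rather than tracing every event.
profile:hz:99
/pid != 0/
{
    // Count samples per process name in a map; no per-event output.
    @samples[comm] = count();
}

// Stop after 10 seconds; the map is printed on exit.
interval:s:10
{
    exit();
}
EOF
```

Run it with sudo bpftrace sample_cpu.bt. The /pid != 0/ filter drops samples taken while a CPU is idle, so the result approximates which processes were on-CPU during the window.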
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.