Unlocking Linux Observability: A Hands‑On Guide to eBPF with Real‑World Examples
This article introduces eBPF, explains its origins and how it extends BPF for kernel‑level observability, compares it with SystemTap and DTrace, outlines common use cases, details its loading‑compile‑execute workflow, and provides step‑by‑step Python/BCC examples with installation instructions and advanced latency measurement code.
What is eBPF?
eBPF (extended Berkeley Packet Filter) is a Linux kernel feature that allows user‑space programs to load custom bytecode into the kernel and attach it to kernel hooks (e.g., syscalls, network events). The bytecode is verified for safety, normally JIT‑compiled to native instructions (the kernel falls back to an interpreter when the JIT is disabled), and executed in‑kernel without modifying the kernel source. It first appeared in Linux 3.18 (2014) and requires Linux 4.4+ for full functionality.
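As a quick way to compare a machine against the Linux 4.4+ figure quoted above, the following minimal sketch parses the running kernel's release string. The helper names are illustrative only, and the check says nothing about which individual eBPF features a given kernel build has enabled.

```python
import platform
import re

MIN_KERNEL = (4, 4)  # threshold quoted in the text above


def parse_kernel_release(release):
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return int(match.group(1)), int(match.group(2))


def kernel_is_recent_enough(release, minimum=MIN_KERNEL):
    """Return True when the release is at or above the minimum (major, minor)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    release = platform.release()
    print(release, "ok" if kernel_is_recent_enough(release) else "too old")
```

Comparing tuples rather than strings avoids the classic mistake of ranking "4.10" below "4.4".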
Comparison with SystemTap and DTrace
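SystemTap compiles its scripts into loadable kernel modules, and DTrace, which originated on Solaris, runs scripts in its own in‑kernel virtual machine. eBPF is part of the mainline Linux kernel and checks every program with a verifier before it runs, which generally means lower overhead and tighter integration than a custom kernel module. All three can collect runtime data, but eBPF provides a programmable, safe interface for networking, security, and performance analysis.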
Typical Use Cases
Network monitoring – capture packets and analyse traffic patterns.
Security filtering – block or alert on malicious packets.
Performance analysis – gather kernel metrics and visualise bottlenecks.
Virtualisation – monitor VM performance and balance loads.
How eBPF Works
The workflow consists of three steps:
Loading: a user‑space program calls bpf() (or a wrapper library) to load the eBPF program into the kernel.
Verification & compilation: the kernel verifier checks the program for safety; a JIT compiler translates the verified bytecode to native instructions.
Execution: the program is attached to a hook (e.g., tcp_sendmsg) and runs whenever the event occurs.
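Whether the compilation step actually produces native code depends on a kernel setting. The sketch below reads the net.core.bpf_jit_enable sysctl through /proc and describes its value; the function name and the wording of the descriptions are illustrative, and the file may be absent on kernels built without BPF JIT support.

```python
from pathlib import Path

JIT_SYSCTL = Path("/proc/sys/net/core/bpf_jit_enable")

JIT_MODES = {
    0: "disabled (programs run in the interpreter)",
    1: "enabled",
    2: "enabled, with debug output to the kernel log",
}


def describe_jit(path=JIT_SYSCTL):
    """Return a human-readable JIT mode, or a note if the sysctl is unreadable."""
    try:
        value = int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return "unknown (cannot read %s)" % path
    return JIT_MODES.get(value, "unrecognised value %d" % value)


if __name__ == "__main__":
    print("BPF JIT:", describe_jit())
```

The path is a parameter so the function can be exercised against a temporary file without touching the real sysctl.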
Simple Example – Counting TCP Packets
This Python script uses the BCC library to count calls to tcp_sendmsg:
#!/usr/bin/python3
from bcc import BPF
from time import sleep

# BPF program (C), compiled by BCC when the script starts
bpf_text = """
#include <uapi/linux/ptrace.h>

BPF_HASH(stats, u32);   // key: u32, value: u64 (BCC default)

int count(struct pt_regs *ctx) {
    u32 key = 0;
    u64 *val, zero = 0;
    // lookup_or_try_init replaces the older lookup_or_init and may return NULL
    val = stats.lookup_or_try_init(&key, &zero);
    if (val) {
        (*val)++;
    }
    return 0;
}
"""

b = BPF(text=bpf_text, cflags=["-Wno-macro-redefined"])
b.attach_kprobe(event="tcp_sendmsg", fn_name="count")
name = {0: "tcp_sendmsg"}

while True:
    try:
        # Print the running total once per second
        for k, v in b["stats"].items():
            print("%s: %d" % (name[k.value], v.value))
        sleep(1)
    except KeyboardInterrupt:
        exit()
The program creates a BPF hash map named stats to store a counter, attaches the count function to the tcp_sendmsg kprobe, and prints the running total each second.
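Because the map holds a running total, the printed number only ever grows. If a per-second rate is more useful, the difference between consecutive readings can be taken in user space. This is a minimal sketch of that idea; RateTracker is a hypothetical helper, not part of BCC, and the readings in the demo are made up.

```python
class RateTracker:
    """Remember the last cumulative reading and report the increase since then."""

    def __init__(self):
        self._last = None

    def update(self, total):
        """Return the increase since the previous call (0 on the first call)."""
        previous, self._last = self._last, total
        return 0 if previous is None else total - previous


if __name__ == "__main__":
    tracker = RateTracker()
    for total in (120, 180, 180, 450):  # made-up cumulative readings
        print("calls this interval:", tracker.update(total))
```

In the script above, the value passed to update() would be v.value from the stats map on each pass through the loop.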
Installation on Ubuntu
Install the BCC Python bindings from the repository (sufficient for recent Ubuntu releases):
sudo apt install python3-bpfcc
If the packages are outdated, build BCC from source:
sudo apt purge bpfcc-tools libbpfcc python3-bpfcc
wget https://github.com/iovisor/bcc/releases/download/v0.25.0/bcc-src-with-submodule.tar.gz
tar xf bcc-src-with-submodule.tar.gz
cd bcc
sudo apt install -y python-is-python3 bison build-essential cmake flex git libedit-dev libllvm11 llvm-11-dev libclang-11-dev zlib1g-dev libelf-dev libfl-dev python3-distutils checkinstall
mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX=/usr -DPYTHON_CMD=python3 ..
make
sudo checkinstall
Running the Simple Example
Save the script as netstat.py, make it executable, and run with root privileges:
chmod +x netstat.py
sudo ./netstat.py
The output shows the number of tcp_sendmsg calls observed.
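If the script fails to start, the two most likely causes are missing Python bindings or missing privileges. The following sketch checks both before any BPF code is loaded. The preflight function and its messages are illustrative; it assumes, as this article does, that scripts are run as root, although some systems grant the necessary capabilities in other ways.

```python
import importlib.util
import os


def preflight(euid=None, find_spec=importlib.util.find_spec):
    """Return a list of problems that would stop a BCC script from running."""
    problems = []
    # find_spec returns None when the top-level module cannot be located
    if find_spec("bcc") is None:
        problems.append("Python module 'bcc' not found; install python3-bpfcc")
    if euid is None:
        euid = os.geteuid()
    if euid != 0:
        problems.append("not running as root; re-run with sudo")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("ready to run BCC scripts")
```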
Advanced Example – Measuring TCP Latency
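This script timestamps each tcp_sendmsg call and reports the time until the same thread is next seen in tcp_v4_do_rcv. Because entries are keyed by the PID/TGID value rather than by connection, treat the result as a rough approximation of round‑trip latency rather than an exact measurement: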
#!/usr/bin/python3
from bcc import BPF

# Note: "\\n" below is written with a double backslash so that the
# C compiler, not Python, interprets the newline escape.
bpf_text = """
#include <uapi/linux/ptrace.h>
#include <net/sock.h>
#include <net/inet_sock.h>

struct packet_t {
    u64 ts, size;
    u32 pid;
    u32 saddr, daddr;
    u16 sport, dport;
};

BPF_HASH(packets, u64, struct packet_t);

int on_send(struct pt_regs *ctx, struct sock *sk, struct msghdr *msg, size_t size) {
    u64 id = bpf_get_current_pid_tgid();
    struct packet_t pkt = {};
    pkt.ts = bpf_ktime_get_ns();
    pkt.size = size;
    pkt.pid = id >> 32;   // upper 32 bits hold the TGID (the user-space PID)
    pkt.saddr = sk->__sk_common.skc_rcv_saddr;
    pkt.daddr = sk->__sk_common.skc_daddr;
    struct inet_sock *sockp = (struct inet_sock *)sk;
    pkt.sport = sockp->inet_sport;
    pkt.dport = sk->__sk_common.skc_dport;
    packets.update(&id, &pkt);
    return 0;
}

int on_recv(struct pt_regs *ctx, struct sock *sk) {
    u64 id = bpf_get_current_pid_tgid();
    struct packet_t *pkt = packets.lookup(&id);
    if (!pkt) return 0;
    u64 delta = bpf_ktime_get_ns() - pkt->ts;   // nanoseconds
    bpf_trace_printk("tcp_time: %llu us, size: %llu\\n", delta / 1000, pkt->size);
    packets.delete(&id);
    return 0;
}
"""

b = BPF(text=bpf_text, cflags=["-Wno-macro-redefined"])
b.attach_kprobe(event="tcp_sendmsg", fn_name="on_send")
b.attach_kprobe(event="tcp_v4_do_rcv", fn_name="on_recv")
print("Tracing TCP latency... Hit Ctrl-C to end.")

while True:
    try:
        (task, pid, cpu, flags, ts, msg) = b.trace_fields()
    except ValueError:
        continue   # skip trace lines that cannot be parsed
    except KeyboardInterrupt:
        exit()
    # trace_fields() returns bytes under Python 3, so decode before printing
    print("%-18.9f %-16s %-6d %s" % (ts, task.decode("utf-8", "replace"),
                                     pid, msg.decode("utf-8", "replace")))
The program prints one measurement, in microseconds, each time a thread that called tcp_sendmsg is subsequently seen in tcp_v4_do_rcv.
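The packet_t struct also records addresses and ports, which the script stores but never prints. If those fields were exported to user space, they would arrive as raw kernel values in network byte order and would need converting before display. The sketch below shows one way to do that with the standard library; the function names are illustrative, and it assumes the values are read on the same machine that captured them.

```python
import socket
import struct


def format_ipv4(addr):
    """Render a u32 IPv4 address copied verbatim from kernel memory."""
    # Packing in native order restores the original in-memory bytes,
    # which are already in network order.
    return socket.inet_ntop(socket.AF_INET, struct.pack("I", addr))


def format_port(port_be):
    """Convert a 16-bit port from network to host byte order."""
    return socket.ntohs(port_be)


def format_flow(saddr, sport_be, daddr, dport_be):
    """Build a 'src:port -> dst:port' string from raw packet_t fields."""
    return "%s:%d -> %s:%d" % (
        format_ipv4(saddr), format_port(sport_be),
        format_ipv4(daddr), format_port(dport_be),
    )


if __name__ == "__main__":
    # Simulate raw kernel values for a made-up connection
    raw_addr = struct.unpack("I", socket.inet_aton("192.168.1.10"))[0]
    print(format_flow(raw_addr, socket.htons(5000), raw_addr, socket.htons(80)))
```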
Available BCC Tools
bcc-tools – collection of ready‑made tracing utilities.
bpftrace – high‑level language for writing BPF programs (a separate project from BCC).
tcptop – real‑time TCP traffic monitor.
execsnoop – tracks process execution.
filetop – monitors file‑system activity.
trace – generic function‑call tracer.
funccount – counts function calls.
opensnoop – watches file open operations.
pidstat – per‑process performance statistics (part of the sysstat package rather than BCC).
profile – CPU usage profiler.
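On Ubuntu, the packaged BCC tools are installed with a -bpfcc suffix (for example execsnoop-bpfcc), while a source build typically places them under /usr/share/bcc/tools, which is not on the default PATH. The sketch below looks a tool up under both the bare and the suffixed name; find_tool is a hypothetical helper, and it will not find tools in directories that are missing from PATH.

```python
import shutil


def find_tool(name, which=shutil.which):
    """Return the path of a BCC tool, trying the bare and '-bpfcc' names."""
    for candidate in (name, name + "-bpfcc"):
        path = which(candidate)
        if path:
            return path
    return None


if __name__ == "__main__":
    for tool in ("execsnoop", "opensnoop", "tcptop"):
        print("%-10s %s" % (tool, find_tool(tool) or "not found on PATH"))
```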
Further Reading
Brendan Gregg, BPF Performance Tools: Linux System and Application Observability.
eBPF official site: https://ebpf.io/ (maintained by Cilium).
Cilium BPF and XDP Reference Guide: http://docs.cilium.io/en/latest/bpf/.
Linux kernel BPF documentation: https://www.kernel.org/doc/html/latest/bpf/.
Awesome eBPF list on GitHub: https://github.com/zoidbergwill/awesome-ebpf.
IT Architects Alliance
Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.
