How to Trace Linux Packet Drops with eBPF and kfree_skb_reason
This article explains why packets are dropped in Linux, introduces the kfree_skb_reason API added in kernel 5.17, and shows step‑by‑step how to use bpftrace to capture drop reasons, five‑tuple details, and stack traces for precise network debugging.
Packet loss on Linux servers can arise from interface buffer overflow, high CPU load, misconfigured kernel parameters, iptables rules, driver bugs, or unsuitable congestion-control algorithms. Starting with Linux 5.17, the kernel provides a diagnostic entry point, kfree_skb_reason. It takes over the work of the older kfree_skb, which remains as a wrapper that passes SKB_DROP_REASON_NOT_SPECIFIED, and it reports the drop cause through enum skb_drop_reason. Recent kernels define the enum in include/net/dropreason-core.h; in 5.17 it lived in include/linux/skbuff.h. Key enum values include SKB_NOT_DROPPED_YET (0), SKB_CONSUMED (1), SKB_DROP_REASON_NOT_SPECIFIED (2), SKB_DROP_REASON_NO_SOCKET (3), SKB_DROP_REASON_PKT_TOO_SMALL (4), SKB_DROP_REASON_TCP_CSUM, SKB_DROP_REASON_SOCKET_FILTER, SKB_DROP_REASON_UDP_CSUM, SKB_DROP_REASON_NETFILTER_DROP, and others up to the sentinel SKB_DROP_REASON_MAX (42 in the listing used here). The numbering is not stable across releases (5.17, for instance, had no SKB_NOT_DROPPED_YET or SKB_CONSUMED entries), so check the values against the headers of the kernel being traced. The function signature is:
void kfree_skb_reason(struct sk_buff *skb, enum skb_drop_reason reason) {
// kernel implementation …
}
Tracing kfree_skb_reason with bpftrace
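Because the numeric values of enum skb_drop_reason differ between kernel releases, it can help to derive the number-to-name table from the header of the kernel actually being traced instead of typing it by hand. The following is a minimal Python sketch of that idea, not part of the original workflow. The function name parse_enum and the abbreviated SAMPLE_HEADER text are illustrative assumptions; the parser only handles plain entries and simple numeric assignments, not macro-generated or expression-valued entries.

```python
import re

# Abbreviated, illustrative enum text; a real kernel header has many more entries.
SAMPLE_HEADER = """
enum skb_drop_reason {
    SKB_NOT_DROPPED_YET = 0,
    /* packet was consumed, not dropped */
    SKB_CONSUMED,
    SKB_DROP_REASON_NOT_SPECIFIED,
    SKB_DROP_REASON_NO_SOCKET,
    SKB_DROP_REASON_PKT_TOO_SMALL,
};
"""


def parse_enum(text, name="skb_drop_reason"):
    """Return {value: identifier} for a simple C enum found in text."""
    # Strip comments first so commas or braces inside them cannot confuse the parse.
    text = re.sub(r"/\*.*?\*/", "", text, flags=re.S)
    text = re.sub(r"//[^\n]*", "", text)
    match = re.search(r"enum\s+" + re.escape(name) + r"\s*\{(.*?)\}", text, re.S)
    if match is None:
        raise ValueError("enum %s not found" % name)
    table = {}
    value = 0
    for entry in match.group(1).split(","):
        ident, _, explicit = entry.strip().partition("=")
        ident = ident.strip()
        if not ident:
            continue  # trailing comma or blank entry
        if explicit.strip():
            # Explicit assignment such as "= 0" or "= 0x10" resets the counter.
            value = int(explicit.strip(), 0)
        table[value] = ident
        value += 1
    return table


reasons = parse_enum(SAMPLE_HEADER)
print(reasons[3])  # SKB_DROP_REASON_NO_SOCKET
```

With a real header, the text would be read from the kernel source or headers package for the running kernel, and the resulting table used to write or verify the if/else mapping in the probe.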
The following bpftrace script captures every invocation of kfree_skb_reason, extracts the five‑tuple (source/destination IP and ports), the protocol, the drop reason, and prints a kernel stack trace.
#!/usr/bin/env bpftrace
#include <linux/socket.h>
#include <linux/skbuff.h>
#include <linux/ip.h>
#include <net/sock.h>

BEGIN {
  printf("Tracing packet drops. Hit Ctrl-C to end.\n");
  printf("%-8s %-8s %-16s %-10s %-16s %-21s %-21s\n",
         "TIME", "PID", "COMM", "IP_PROTO", "REASON", "SADDR:SPORT", "DADDR:DPORT");
}

kprobe:kfree_skb_reason {
  $reason = arg1;
  // Skip NOT_DROPPED_YET (0), CONSUMED (1), NOT_SPECIFIED (2), and the sentinel MAX (42).
  // These numbers follow the enum listing above; verify them against the running kernel.
  if ($reason <= 2 || $reason >= 42) { return; }

  // Values 0-2 and 42 never reach this point, so they need no mapping.
  $reason_str = "UNKNOWN";
  if ($reason == 3) { $reason_str = "NO_SOCKET"; }
  else if ($reason == 4) { $reason_str = "PKT_TOO_SMALL"; }
  // …additional mappings up to 41…

  $skb = (struct sk_buff *)arg0;
  // skb->sk can be NULL (for example on the receive path before socket lookup);
  // the socket-derived fields below then typically read as zero.
  $sk = (struct sock *)$skb->sk;
  $family = $sk->__sk_common.skc_family;
  if ($family == AF_INET) {
    $daddr = ntop($sk->__sk_common.skc_daddr);
    $saddr = ntop($sk->__sk_common.skc_rcv_saddr);
  } else {
    $daddr = ntop($sk->__sk_common.skc_v6_daddr.in6_u.u6_addr8);
    $saddr = ntop($sk->__sk_common.skc_v6_rcv_saddr.in6_u.u6_addr8);
  }
  // skc_num is in host byte order; skc_dport is in network byte order.
  $lport = $sk->__sk_common.skc_num;
  $dport = bswap($sk->__sk_common.skc_dport);

  // Assumes an IPv4 header at the network-header offset.
  $ipheader = (struct iphdr *)($skb->head + $skb->network_header);
  $protocol_str = "UNKNOWN";
  if ($ipheader->protocol == 1) { $protocol_str = "ICMP"; }
  else if ($ipheader->protocol == 6) { $protocol_str = "TCP"; }
  else if ($ipheader->protocol == 17) { $protocol_str = "UDP"; }
  // …other protocol mappings…

  time("%H:%M:%S ");
  printf("%-8d %-16s %-10s %-16s ", pid, comm, $protocol_str, $reason_str);
  printf("%15s:%-6d %15s:%-6d\n", $saddr, $lport, $daddr, $dport);
  printf("%s\n", kstack);
}
Running the script
Save the script as packet_drop.bt and execute it as root:
bpftrace packet_drop.bt
In a second terminal, trigger a drop, for example by using nc to connect to a TCP port that has no listener:
nc 127.0.0.1 9999
The bpftrace output prints a line similar to:
12:34:56 12345 curl TCP SKB_DROP_REASON_NO_SOCKET 192.168.1.10:54321 192.168.1.20:9999
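[stack trace …]
This line shows the PID and command name (curl), the protocol (TCP), the drop reason (SKB_DROP_REASON_NO_SOCKET; the script as listed prints the short form NO_SOCKET), and the five-tuple, followed by the kernel stack that led to the drop. Note that pid and comm identify whichever task was running on the CPU when kfree_skb_reason fired; for drops that happen in softirq context, this is not necessarily the process that owns the connection. By correlating the stack trace with kernel source, developers can pinpoint the code path responsible for the loss and adjust configuration, firewall rules, or driver settings accordingly.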
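The port columns depend on a byte-order detail: skc_num is stored in host byte order, while skc_dport is stored in network (big-endian) byte order, which is why the script applies bswap only to the destination port. The following Python sketch illustrates what that 16-bit swap does on a little-endian machine. The helper name bswap16 is hypothetical and simply mimics the effect of bpftrace's bswap on a 16-bit field.

```python
def bswap16(value):
    """Swap the two bytes of a 16-bit integer."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


# Port 9999 is 0x270F; on the wire its bytes are 0x27 followed by 0x0F.
wire = (9999).to_bytes(2, "big")

# Interpreting those same bytes as a little-endian integer gives the raw value
# a little-endian CPU sees in skc_dport before any swap.
raw = int.from_bytes(wire, "little")

print(raw, bswap16(raw))  # 3879 9999
```

If a trace shows implausible destination ports such as 3879 where 9999 was expected, a missing or doubled byte swap is a likely cause.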
Key takeaways
Instrument kfree_skb_reason with a kprobe to obtain deterministic drop reasons.
Map the numeric enum skb_drop_reason values to human‑readable strings inside the probe.
Extract IPv4/IPv6 addresses and ports from the associated sock structure.
Translate the IP header's protocol field to a readable name (ICMP, TCP, UDP, …).
Print a timestamp, PID, command name, protocol, reason, five‑tuple, and kernel stack for each drop event.
Use the output to drive concrete debugging actions (e.g., adjust iptables, fix driver bugs, tune congestion‑control algorithms).
All steps rely only on standard kernel headers and the bpftrace tool, so the workflow can be reproduced on Linux 5.17+ systems without additional instrumentation, provided the numeric reason mappings are checked against the running kernel's enum.
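To turn the raw output into something actionable, the summary lines can be aggregated by drop reason and destination. The sketch below is one possible post-processing step in Python, not part of the original article. It assumes the seven whitespace-separated columns printed by the script and a command name without spaces; the function names parse_drop_line and count_drops are hypothetical.

```python
from collections import Counter


def parse_drop_line(line):
    """Split one summary line into a dict, or return None if it does not match."""
    fields = line.split()
    # Summary lines have exactly seven columns and a numeric PID;
    # the header row and stack-trace lines fail this check and are skipped.
    if len(fields) != 7 or not fields[1].isdigit():
        return None
    time_s, pid, comm, proto, reason, src, dst = fields
    # rpartition splits on the last colon, so IPv6 addresses stay intact.
    saddr, _, sport = src.rpartition(":")
    daddr, _, dport = dst.rpartition(":")
    return {
        "time": time_s, "pid": int(pid), "comm": comm, "proto": proto,
        "reason": reason, "saddr": saddr, "sport": int(sport),
        "daddr": daddr, "dport": int(dport),
    }


def count_drops(lines):
    """Count drop events per (reason, destination address, destination port)."""
    counts = Counter()
    for line in lines:
        event = parse_drop_line(line)
        if event is not None:
            counts[(event["reason"], event["daddr"], event["dport"])] += 1
    return counts


sample = [
    "TIME     PID      COMM             IP_PROTO   REASON           SADDR:SPORT           DADDR:DPORT",
    "12:34:56 12345 curl TCP SKB_DROP_REASON_NO_SOCKET 192.168.1.10:54321 192.168.1.20:9999",
    "[stack trace …]",
]
counts = count_drops(sample)
for key, n in counts.most_common():
    print(n, key)
```

In practice the lines would come from the saved bpftrace output or from standard input rather than from an in-memory list.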
BirdNest Tech Talk
Author of the rpcx microservice framework, original book author, and chair of Baidu's Go CMC committee.
