Why Is My Kubernetes Pod Dropping Packets? A Step‑by‑Step Diagnosis
This guide walks through a real‑world Kubernetes incident where a pod experienced packet loss, detailing how to identify the impact scope, observe drop patterns, trace the veth pair, capture traffic with tcpdump, and resolve the issue by disabling the unnecessary lldpd service.
Background
The incident began with a red alert on the CAT monitoring dashboard. Checking the container overview panel showed packet loss on one specific pod.
Symptoms
The affected pod's drop counter kept climbing, but business traffic was unaffected, which suggested that the dropped frames were not application traffic.
Root‑Cause Analysis
Determine impact scope: Identify which pods, nodes, or clusters exhibit drops. In this case, the problematic pods were concentrated on a single node.
Find commonality: Verify whether the pod’s business traffic is impacted (it was not) and note that the node’s packet loss persisted for a long time without user complaints.
Real‑time observation: Monitor drop frequency inside the pod. The drop count increased by one every 30 seconds.
watch -n 1 cat /sys/class/net/eth0/statistics/rx_dropped
# same observation with other commands
ifconfig eth0 | grep drop
cat /proc/net/dev
netstat -i
Locate the veth pair:
Inside the pod, get the peer index:
cat /sys/class/net/eth0/iflink
On the host, find the corresponding veth interface (anchor the match so the index is not matched elsewhere in the output):
ip a | grep "^${index}:"
Alternative host lookup: Use routing information to map the pod IP to the host interface.
route -n | grep ${pod_ip}
Packet capture: Capture traffic on the host’s veth interface and analyze with Wireshark.
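tcpdump -i calif33e3f0e409 -nn -w /tmp/container.pcap
The pattern of one additional drop every 30 seconds pointed to LLDP traffic as the likely culprit, since LLDP agents advertise on a fixed interval that commonly defaults to 30 seconds. To isolate it, filter on the LLDP EtherType (0x88cc):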
tcpdump -i calif33e3f0e409 ether proto 0x88cc -vv
Supplementary verification: Run standard diagnostics (dmesg, /var/log/messages, history). The shell history revealed that lldpd had been installed and started only on the faulty host.
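The veth lookup earlier in this analysis can also be scripted against sysfs, which avoids grepping free-form `ip a` output. The following is a minimal sketch, not the method used in the incident: `find_iface_by_ifindex` is a hypothetical helper name, and it assumes the pod's veth peer lives in the host network namespace, so the value read from `iflink` inside the pod equals an `ifindex` on the host.

```shell
#!/usr/bin/env bash
# Sketch: resolve a host interface name from an interface index via sysfs.
# find_iface_by_ifindex is a hypothetical helper, not a standard tool.

# Usage: find_iface_by_ifindex <index> [sysfs_net_dir]
# Prints the name of the interface whose ifindex file equals <index>.
# Returns 1 if no interface matches.
find_iface_by_ifindex() {
    local index="$1"
    local net_dir="${2:-/sys/class/net}"   # overridable, mainly for testing
    local dir
    for dir in "$net_dir"/*; do
        # Skip entries that are not interfaces (no readable ifindex file)
        [ -r "$dir/ifindex" ] || continue
        if [ "$(cat "$dir/ifindex")" = "$index" ]; then
            basename "$dir"
            return 0
        fi
    done
    return 1
}
```

Typical use would be to read the index inside the pod with `cat /sys/class/net/eth0/iflink`, then run `find_iface_by_ifindex <that value>` on the host and pass the printed name to `tcpdump -i`.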
Solution
Since the container environment does not require lldpd, stop the service to eliminate the packet loss.
systemctl stop lldpd
Follow‑up Actions
Add pod packet‑loss monitoring.
Optionally monitor lldpd status, add it to regular inspections, and block installation of lldpd via bastion hosts.
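As one possible starting point for the pod packet-loss monitoring mentioned above, the sketch below compares the `rx_dropped` counter against the value saved by the previous run. The function name `check_rx_dropped`, the state-file approach, and the threshold are illustrative assumptions, not part of the original incident tooling.

```shell
#!/usr/bin/env bash
# Sketch of a drop-counter check. Names and threshold are assumptions.

# Usage: check_rx_dropped <counter_file> <state_file> [threshold]
# Prints how much the counter grew since the previous run and saves the
# current value to <state_file>. Returns 0 if growth <= threshold,
# 1 if it exceeds the threshold, 2 if the counter cannot be read.
check_rx_dropped() {
    local counter_file="$1" state_file="$2" threshold="${3:-0}"
    local current previous delta
    current=$(cat "$counter_file") || return 2
    if [ -r "$state_file" ]; then
        previous=$(cat "$state_file")
    else
        previous=$current   # first run: no baseline yet, so report zero
    fi
    echo "$current" > "$state_file"
    delta=$((current - previous))
    # A recreated interface resets the counter; treat a negative delta as zero
    if [ "$delta" -lt 0 ]; then
        delta=0
    fi
    echo "$delta"
    [ "$delta" -le "$threshold" ]
}
```

Inside a pod, the counter file would be `/sys/class/net/eth0/statistics/rx_dropped`. Run on a schedule, a non-zero exit status can feed whatever alerting is already in place. The lldpd status check from the same list could be as simple as `systemctl is-active lldpd` on each node.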
dbaplus Community
Enterprise-level professional community for Database, BigData, and AIOps. Daily original articles, weekly online tech talks, monthly offline salons, and quarterly XCOPS&DAMS conferences—delivered by industry experts.
