Using tcpdump to Pinpoint Production Network Anomalies
This guide explains how tcpdump, built on libpcap and the kernel's BPF filter, captures packets inside the network stack. It compares tcpdump with Wireshark, covers practical filter syntax and performance considerations, walks through typical cases such as TCP retransmissions, DNS timeouts and TLS handshake failures, and closes with a capture script and best practices for production troubleshooting.
Overview
Packet capture is the last resort for diagnosing network faults. When logs and coarse metrics give no clues, tcpdump delivers definitive evidence by copying packets directly from the kernel.
Packet Capture Principle
Network card receives packet
|
Kernel network stack
|
BPF filter (tcpdump rule)
|
Match? – No → packet is not copied to tcpdump (normal traffic is unaffected)
|
Yes → copy to user‑space buffer
|
tcpdump reads and formats the output
BPF runs in kernel space, so filtering is highly efficient and packets that do not match are never copied to user space. tcpdump only receives copies of packets; it never modifies or blocks the original flow.
For inbound traffic, capture occurs before firewall processing, so packets later dropped by INPUT rules are still visible.
tcpdump vs Wireshark
tcpdump: command‑line, ideal for servers, real‑time capture and saving to pcap files.
Wireshark: graphical, provides deep protocol analysis, stream tracking and statistics.
Capture Point in the Network Stack
Application Layer (HTTP/MySQL/DNS …)
|
Transport Layer (TCP/UDP)
|
Network Layer (IP)
|
┌───────────────────────────────────────┐
│ tcpdump capture point │
│ ← inbound: before iptables INPUT chain │
│ → outbound: after iptables OUTPUT chain│
└───────────────────────────────────────┘
|
Data‑link Layer (Ethernet)
|
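Physical Layer (NIC)
tcpdump can see inbound packets that are later DROPped by the firewall.
It cannot see outbound packets dropped by the OUTPUT chain because they never reach the capture point.
If an outbound SYN is captured but no SYN‑ACK ever arrives, the problem lies beyond this host, either at the remote end or on an intermediate link. If an inbound SYN is captured but this host never answers, suspect the local firewall or the listening service.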
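This reasoning can be applied mechanically to tcpdump's text output. The following is a minimal sketch, not part of tcpdump: `handshake_hint` is a hypothetical helper name, and the sample lines are made up for illustration. It reads the `Flags [...]` field of each line and compares the number of SYNs with the number of SYN-ACKs.

```shell
# handshake_hint: read `tcpdump -nn` text output on stdin and compare
# the number of SYNs with the number of SYN-ACKs (hypothetical helper).
handshake_hint() {
  awk '
    match($0, /Flags \[[^]]*\]/) {
      # Text between the brackets, e.g. "S", "S." or "P."
      flags = substr($0, RSTART + 7, RLENGTH - 8)
      if (flags ~ /S/) {
        if (flags ~ /\./) synack++; else syn++
      }
    }
    END {
      printf "SYN=%d SYN-ACK=%d\n", syn, synack
      if (syn > 0 && synack == 0)
        print "hint: SYNs seen but never answered"
      else if (syn > synack)
        print "hint: some SYNs were not answered"
      else
        print "hint: handshakes are being answered"
    }'
}

# Illustrative input: one SYN and its retransmission, no reply.
printf '%s\n' \
  '12:00:00.000001 IP 10.0.0.2.51000 > 10.0.0.5.8080: Flags [S], seq 100, win 64240, length 0' \
  '12:00:01.000001 IP 10.0.0.2.51000 > 10.0.0.5.8080: Flags [S], seq 100, win 64240, length 0' \
  | handshake_hint
# prints "SYN=2 SYN-ACK=0" followed by the "never answered" hint
```

On a live host the same function could be fed from a capture, for example `sudo tcpdump -i eth0 -nn -c 200 'tcp port 8080' | handshake_hint`.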
Applicable Scenarios
TCP connection timeouts, retransmissions, RST events.
DNS resolution verification.
SSL/TLS handshake failures.
Application‑layer protocol anomalies (HTTP headers, MySQL protocol, etc.).
Packet loss, out‑of‑order delivery.
Firewall rule validation.
Load‑balancer health‑check debugging.
Environment Requirements
Ubuntu 24.04 LTS or Rocky Linux 9.5 with the distribution's stock kernel (6.8 and 5.14 respectively); no newer kernel is required. tcpdump ≥ 4.99, libpcap ≥ 1.10.
Optional: Wireshark/tshark ≥ 4.x, termshark ≥ 2.x.
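Once the tools are installed, the minimum versions listed above can be checked from a script. This is a sketch under stated assumptions: `version_ge` and `parse_version` are hypothetical helper names, and the comparison relies on `sort -V` from GNU coreutils.

```shell
# version_ge A B: succeed when version A >= version B (needs GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# parse_version: print the first dotted version number found on stdin.
parse_version() {
  grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Only runs when tcpdump is installed; `tcpdump --version` reports both
# the tcpdump and the libpcap version.
if command -v tcpdump >/dev/null 2>&1; then
  tcpdump_ver=$(tcpdump --version 2>&1 | grep '^tcpdump' | parse_version)
  libpcap_ver=$(tcpdump --version 2>&1 | grep '^libpcap' | parse_version)
  version_ge "$tcpdump_ver" 4.99 || echo "tcpdump $tcpdump_ver is older than 4.99"
  version_ge "$libpcap_ver" 1.10 || echo "libpcap $libpcap_ver is older than 1.10"
fi
```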
# Ubuntu 24.04 installation
sudo apt install -y tcpdump tshark
# Rocky Linux 9.5 installation
sudo dnf install -y tcpdump wireshark-cli
# Verify versions (tcpdump reports its own version and the libpcap version)
tcpdump --version # tcpdump version 4.99.x, libpcap version 1.10.x
# Allow non‑root capture (optional); the binary path differs between distributions
sudo setcap cap_net_raw,cap_net_admin=eip "$(command -v tcpdump)"
Detailed Steps
Interface Selection
# List interfaces
tcpdump -D
# Capture on eth0
sudo tcpdump -i eth0
# Capture on all interfaces
sudo tcpdump -i any
# Disable name resolution for addresses and ports (recommended)
sudo tcpdump -i eth0 -nn
Protocol Filtering
# TCP only
sudo tcpdump -i eth0 tcp
# UDP only
sudo tcpdump -i eth0 udp
# ICMP only
sudo tcpdump -i eth0 icmp
# ARP only
sudo tcpdump -i eth0 arp
Port Filtering
# Specific port
sudo tcpdump -i eth0 port 80
# Source port
sudo tcpdump -i eth0 src port 80
# Destination port range
sudo tcpdump -i eth0 dst portrange 8080-8090
# Combine ports
sudo tcpdump -i eth0 port 80 or port 443
IP Filtering
# Specific host
sudo tcpdump -i eth0 host 192.168.1.100
# Source IP
sudo tcpdump -i eth0 src host 192.168.1.100
# Network segment
sudo tcpdump -i eth0 net 192.168.1.0/24
BPF Expression Details
BPF (Berkeley Packet Filter) is the filtering language used by tcpdump. It combines primitives (type, direction, protocol) with logical operators.
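To make the composition concrete, here is a small sketch. `and_filter` and `tcp_header_len` are hypothetical helper names introduced only for illustration. The first joins primitives with `and`, parenthesising each part so that an inner `or` keeps its meaning. The second reproduces in shell arithmetic the `(tcp[12:1] & 0xf0) >> 2` idiom that the payload filters in this guide use to find where the TCP payload starts. The high nibble of byte 12 is the header length in 32-bit words, so shifting right by 2 instead of 4 yields bytes.

```shell
# and_filter PART...: join BPF sub-expressions with "and".
# Each part is parenthesised so "a or b" inside one part stays grouped.
and_filter() {
  local out="" part
  for part in "$@"; do
    [ -n "$part" ] || continue
    out="${out:+$out and }($part)"
  done
  printf '%s\n' "$out"
}

# tcp_header_len BYTE12: TCP header length in bytes, computed from the
# value of byte 12 of the TCP header, as (tcp[12:1] & 0xf0) >> 2 does.
tcp_header_len() {
  echo $(( ($1 & 0xf0) >> 2 ))
}

and_filter 'host 192.168.1.100' 'port 80 or port 443' 'not port 22'
# prints: (host 192.168.1.100) and (port 80 or port 443) and (not port 22)
tcp_header_len 0x50   # prints 20 (no TCP options)
tcp_header_len 0x80   # prints 32 (12 bytes of options)
```

The composed string can be passed to tcpdump as a single quoted argument, for example `sudo tcpdump -i eth0 -nn "$(and_filter 'host 192.168.1.100' 'port 80 or port 443')"`.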
Common Quick Filters
# Exclude SSH traffic
sudo tcpdump -i eth0 -nn not port 22
# Capture only new connections (SYN set, ACK clear; also matches SYNs that carry ECN flags)
sudo tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'
# Capture connection resets and teardowns (RST or FIN)
sudo tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-rst|tcp-fin) != 0'
# Capture HTTP GET request by payload
sudo tcpdump -i eth0 -nn -A 'tcp dst port 80 and tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x47455420'
# Exclude multicast/broadcast
sudo tcpdump -i eth0 -nn 'not (broadcast or multicast)'
File Management – Saving and Rotation
# Save to pcap
sudo tcpdump -i eth0 -w /tmp/capture.pcap port 80
# Rotate every 100 MB, keep 5 files
sudo tcpdump -i eth0 -w /tmp/cap.pcap -C 100 -W 5 port 80
# Limit packet count
sudo tcpdump -i eth0 -c 10000 port 80
# Capture only first 96 bytes (header)
sudo tcpdump -i eth0 -s 96 -w /tmp/head.pcap port 80
Performance Impact Control
Low‑traffic ports (<100 Mbps) with simple filters usually consume <1 % CPU and can run continuously.
High‑traffic (>1 Gbps) without filters can use 5‑15 % CPU; always add host/port filters.
When writing to disk at >500 Mbps, monitor I/O and ensure sufficient buffer size (‑B option).
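A rough sizing calculation helps when choosing `-B` (capture buffer, in KiB) and `-C`/`-W` (rotation, with `-C` in units of 1,000,000 bytes). The sketch below is a back-of-the-envelope estimate, not a tcpdump feature. The helper names are hypothetical, and it assumes full packets are written at the stated line rate. With a small snaplen the real numbers will be far more favourable.

```shell
# buffer_kib RATE_MBPS STALL_MS: KiB of capture buffer needed to absorb
# STALL_MS milliseconds of traffic at RATE_MBPS while the writer is stalled.
buffer_kib() {
  local bytes=$(( $1 * 1000000 / 8 * $2 / 1000 ))
  echo $(( (bytes + 1023) / 1024 ))
}

# ring_seconds RATE_MBPS FILE_MB FILES: seconds of full-rate traffic that a
# -C FILE_MB -W FILES ring retains before the oldest file is overwritten.
ring_seconds() {
  echo $(( $2 * $3 * 8 / $1 ))
}

buffer_kib 500 100       # prints 6104: about 6 MiB covers a 100 ms stall at 500 Mbps
ring_seconds 500 100 5   # prints 8: a 5 x 100 MB ring holds only 8 s at 500 Mbps
```

Under these assumptions a 500 Mbps capture would be started with something like `sudo tcpdump -i eth0 -B 6104 -w /tmp/cap.pcap -C 100 -W 5 port 80`. The short retention window shows why header-only capture (`-s 96`) and tight filters matter on busy links.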
Case Studies
Case 1 – TCP Retransmission Causing API Timeout
# Capture API traffic on port 8080
sudo tcpdump -i eth0 -nn -s 96 -c 50000 -w /tmp/api.pcap port 8080
# Count retransmissions with tshark
tshark -r /tmp/api.pcap -Y "tcp.analysis.retransmission" | wc -l # 87
# Show RTO values
tshark -r /tmp/api.pcap -Y "tcp.analysis.retransmission" -T fields -e frame.time -e ip.src -e ip.dst -e tcp.analysis.rto
# Identify offending IPs
tshark -r /tmp/api.pcap -Y "tcp.analysis.retransmission" -T fields -e ip.dst | sort | uniq -c | sort -rn | head -5
# Verify network quality to that IP
mtr -r -c 100 10.0.0.5 # 2 % loss on hop 3
# Conclusion: upstream switch loss → TCP retransmission → API latency
Case 2 – DNS Resolution Timeout
# Capture DNS traffic
sudo tcpdump -i eth0 -nn -c 10000 -w /tmp/dns.pcap port 53
# Trigger DNS query from the application
curl http://api.example.com/health
# Inspect queries and responses
tshark -r /tmp/dns.pcap -Y "dns" -T fields -e frame.time -e ip.src -e ip.dst -e dns.qry.name -e dns.flags.response | head -20
# Observation: AAAA queries never receive a response; A queries succeed only after the 5 s timeout.
# Fix: prefer IPv4 in /etc/gai.conf, or add "options single-request-reopen" to /etc/resolv.conf.
Case 3 – TLS Handshake Failure
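# Capture TLS handshake (0x16) and alert (0x15) records; a handshake-only filter would miss the alert
sudo tcpdump -i eth0 -nn -s 0 -w /tmp/tls.pcap 'port 443 and (tcp[((tcp[12:1] & 0xf0) >> 2):1] = 0x16 or tcp[((tcp[12:1] & 0xf0) >> 2):1] = 0x15)'
# Analyze TLS handshake messages and alerts
tshark -r /tmp/tls.pcap -Y "tls.handshake || tls.alert_message" -T fields -e frame.time -e ip.src -e ip.dst -e tls.handshake.type -e tls.handshake.version -e tls.alert_message.desc
# Example output shows a fatal alert (bad_certificate)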
# Verify certificate chain
openssl s_client -connect 192.168.1.100:443 -servername example.com
Case 4 – MySQL Slow Connection
# Capture MySQL handshake (port 3306)
sudo tcpdump -i eth0 -nn -s 0 -c 5000 -w /tmp/mysql.pcap port 3306
# Measure SYN‑ACK latency
tshark -r /tmp/mysql.pcap -Y "tcp.flags.syn==1 && tcp.flags.ack==1" -T fields -e frame.time_relative -e ip.src -e ip.dst
# Observation: 1.2 s gap between the completed TCP handshake and the MySQL Server Greeting.
# Root cause: MySQL performs reverse DNS lookup on client IP.
# Fix: add "skip-name-resolve" to my.cnf.
Case 5 – Load‑Balancer Health‑Check Failures
# Capture health‑check traffic between the LB (10.0.0.1) and backend port 8080, in both directions so SYN‑ACKs are included
sudo tcpdump -i eth0 -nn -c 1000 -w /tmp/hc.pcap host 10.0.0.1 and port 8080
# Count SYNs and SYN‑ACKs
tshark -r /tmp/hc.pcap -Y "tcp.flags.syn==1 && tcp.flags.ack==0" | wc -l # 200
tshark -r /tmp/hc.pcap -Y "tcp.flags.syn==1 && tcp.flags.ack==1" | wc -l # 195
# 5 SYNs received no SYN‑ACK; unanswered SYNs or RSTs point to listen‑backlog overflow on the backend, so increase net.core.somaxconn.
One‑Click Capture Script
#!/bin/bash
# quick_capture.sh – fast capture and basic analysis
# Usage: quick_capture.sh <interface> '<bpf filter>' [duration_seconds]
if [ "$#" -lt 2 ]; then
  echo "Usage: $0 <interface> '<bpf filter>' [duration_seconds]" >&2
  exit 1
fi
IFACE="$1"
FILTER="$2"
DURATION="${3:-30}"
TS=$(date +%Y%m%d_%H%M%S)
PCAP="/tmp/capture_${TS}.pcap"
echo "=== Quick Capture ==="
echo "Interface: $IFACE"
echo "Filter: $FILTER"
echo "Duration: $DURATION s"
echo "File: $PCAP"
# Capture: timeout runs under sudo so it can signal tcpdump; the filter is passed as one quoted argument
sudo timeout "$DURATION" tcpdump -i "$IFACE" -nn -s 0 -w "$PCAP" "$FILTER" 2>/dev/null
# Basic stats (heuristic: tcpdump prints "Flags [...]" for TCP segments, not the word "TCP")
echo "--- Protocol distribution ---"
tcpdump -r "$PCAP" -nn 2>/dev/null | awk '{if($0 ~ /Flags \[/)proto="TCP"; else if($0 ~ /UDP/)proto="UDP"; else if($0 ~ /ICMP/)proto="ICMP"; else proto="Other"; count[proto]++} END{for(p in count)printf " %-10s %d\n",p,count[p]}'
# TCP flag summary
echo "--- TCP flag summary ---"
tcpdump -r "$PCAP" -nn 2>/dev/null | grep -oP 'Flags \[\K[^\]]+' | sort | uniq -c | sort -rn
# Retransmission count (requires tshark)
if command -v tshark >/dev/null; then
  RETRANS=$(tshark -r "$PCAP" -Y "tcp.analysis.retransmission" 2>/dev/null | wc -l)
  echo "--- TCP retransmissions: $RETRANS ---"
fi
echo "Capture saved to $PCAP. Open with Wireshark for deep analysis."
Best Practices & Pitfalls
Always add a filter; unrestricted capture can degrade performance.
Limit packet count (‑c) or duration (timeout) to bound resource usage.
Use snaplen (‑s) to capture only headers when payload is unnecessary.
Rotate files (‑C/‑W) to avoid filling disks.
Secure stored pcap files – encrypt or anonymize sensitive fields.
Common errors: wrong interface, missing packets due to Docker bridge, empty pcap caused by filter syntax, performance impact on high‑throughput links.
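The "empty pcap" pitfall is easy to detect automatically. A classic pcap file starts with a 24-byte global header, so a capture file of 24 bytes or less holds no packets. The sketch below assumes the classic pcap format that tcpdump writes with `-w` on Linux; `pcap_has_packets` is a hypothetical helper name.

```shell
# pcap_has_packets FILE: succeed when FILE is larger than the 24-byte
# classic pcap global header, i.e. it contains at least one packet record.
# Returns 2 when the file cannot be read.
pcap_has_packets() {
  local size
  [ -r "$1" ] || return 2
  size=$(wc -c < "$1" | tr -d ' ')
  [ "$size" -gt 24 ]
}

# Illustration with a header-sized dummy file instead of a real capture.
tmp=$(mktemp)
head -c 24 /dev/zero > "$tmp"
pcap_has_packets "$tmp" || echo "no packets captured: check the filter and the interface"
rm -f "$tmp"
```

Filter syntax can also be checked before capturing: `tcpdump -d 'port 80 and host 192.168.1.100'` compiles the expression, prints the resulting BPF program and exits, reporting a syntax error if the expression is invalid.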
Container & Kubernetes Capture
For Docker, capture on docker0 or locate the veth interface linked to a container. For Kubernetes, use nsenter, kubectl debug with a netshoot image, or the ksniff plugin.
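Locating the veth interface relies on a sysfs detail: the `iflink` value of `eth0` inside the container equals the `ifindex` of its peer veth on the host. The sketch below is one way to do the lookup. `find_veth` is a hypothetical helper name, and the container name `web` in the comments is a placeholder.

```shell
# find_veth IFLINK [SYSFS_NET_DIR]: print the name of the host interface
# whose ifindex equals IFLINK. SYSFS_NET_DIR defaults to /sys/class/net
# and is a parameter only so the function can be exercised on test data.
find_veth() {
  local want="$1" dir="${2:-/sys/class/net}" f
  for f in "$dir"/*/ifindex; do
    [ -r "$f" ] || continue
    if [ "$(cat "$f")" = "$want" ]; then
      basename "$(dirname "$f")"
      return 0
    fi
  done
  return 1
}

# Typical use on a Docker host ("web" is a placeholder container name):
#   iflink=$(docker exec web cat /sys/class/net/eth0/iflink)
#   sudo tcpdump -i "$(find_veth "$iflink")" -nn port 80
```

An alternative that avoids the lookup is to enter the container's network namespace directly, for example `sudo nsenter -t "$(docker inspect -f '{{.State.Pid}}' web)" -n tcpdump -i eth0 -nn`.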
Monitoring & Metrics
RTT – measured from SYN to SYN‑ACK.
Packet loss – approximated as retransmitted segments / total segments sent.
RST rate – RST packets per connection.
Zero‑window events – indicate receiver congestion.
System‑level tools (ss, nstat, /proc/net/snmp) and Prometheus node_exporter can expose these metrics. Reasonable alert thresholds are a TCP retransmission rate above 2 % and an RST rate above 5 %.
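As an illustration of reading such a metric without any agent, the retransmission rate can be derived from the `OutSegs` and `RetransSegs` counters in /proc/net/snmp. This is a minimal sketch: `tcp_retrans_pct` is a hypothetical helper name and the sample counter values are invented. The counters are cumulative since boot, so for alerting the difference between two samples is more meaningful than the absolute ratio.

```shell
# tcp_retrans_pct [FILE]: print RetransSegs / OutSegs as a percentage.
# FILE must be in /proc/net/snmp format (a "Tcp:" header line followed by a
# "Tcp:" value line) and defaults to /proc/net/snmp.
tcp_retrans_pct() {
  awk '
    $1 == "Tcp:" && !seen {            # header line: remember column positions
      for (i = 2; i <= NF; i++) col[$i] = i
      seen = 1
      next
    }
    $1 == "Tcp:" {                     # value line
      out = $(col["OutSegs"]); re = $(col["RetransSegs"])
      if (out > 0) printf "%.2f\n", 100 * re / out; else print "0.00"
    }' "${1:-/proc/net/snmp}"
}

# Illustrative sample with made-up counters.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts InCsumErrors
Tcp: 1 200 120000 -1 100 50 3 2 10 90000 100000 2500 0 40 0
EOF
tcp_retrans_pct "$sample"   # prints 2.50, above the 2 % threshold
rm -f "$sample"
```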
Conclusion
tcpdump provides a lightweight, kernel‑level packet capture mechanism. Combined with precise BPF filters, file rotation, and downstream analysis in Wireshark/tshark, it enables systematic troubleshooting of a wide range of network anomalies. Integrating capture scripts into monitoring pipelines and following the best practices above keeps the performance impact low while delivering the visibility needed to resolve latency, connectivity, and security issues.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
