
Using tcpdump to Pinpoint Online Network Anomalies

This guide explains how tcpdump, built on libpcap and the kernel BPF filter, captures packets at the network stack. It compares tcpdump with Wireshark, covers practical filtering syntax and performance considerations, walks through typical use cases such as TCP retransmissions, DNS timeouts and TLS handshake failures, and provides a capture script and best-practice recommendations for production troubleshooting.

Raymond Ops

Overview

The ultimate method for diagnosing network faults is packet capture. When logs and coarse metrics provide no clues, tcpdump can deliver deterministic evidence by copying packets directly from the kernel.

Packet Capture Principle

Network card receives packet
   |
Kernel network stack
   |
BPF filter (tcpdump rule)
   |
Match? – No → packet is not copied to tcpdump (normal processing continues unaffected)
   |
Yes → copy to user‑space buffer
   |
tcpdump reads and formats the output

BPF runs in kernel space, so filtering is highly efficient and does not affect normal traffic. tcpdump only mirrors packets; it never modifies or blocks the original flow.

On the inbound path, capture occurs before firewall processing, so packets that are later dropped by INPUT rules are still visible.

tcpdump vs Wireshark

tcpdump: command-line, ideal for servers, real-time capture and saving to pcap files.

Wireshark: graphical, provides deep protocol analysis, stream tracking and statistics.

Capture Point in the Network Stack

Application Layer (HTTP/MySQL/DNS …)
   |
Transport Layer (TCP/UDP)
   |
Network Layer (IP)
   |
┌───────────────────────────────────────┐
│            tcpdump capture point      │
│ ← inbound: before iptables INPUT chain │
│ → outbound: after iptables OUTPUT chain│
└───────────────────────────────────────┘
   |
Data‑link Layer (Ethernet)
   |
Physical Layer (NIC)

tcpdump can see inbound packets that are later DROPped by the firewall.

It cannot see outbound packets dropped by the OUTPUT chain because they never reach the capture point.

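If an outbound SYN is seen leaving the host but no SYN-ACK comes back, the problem lies beyond the host, either on the remote end or on an intermediate link. If an inbound SYN is seen but the host never answers, look locally instead (firewall rules or a full accept queue).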

Applicable Scenarios

TCP connection timeouts, retransmissions, RST events.

DNS resolution verification.

SSL/TLS handshake failures.

Application‑layer protocol anomalies (HTTP headers, MySQL protocol, etc.).

Packet loss, out‑of‑order delivery.

Firewall rule validation.

Load‑balancer health‑check debugging.

Environment Requirements

Ubuntu 24.04 LTS or Rocky Linux 9.5 with the stock distribution kernel (6.8 and 5.14 respectively). tcpdump ≥ 4.99, libpcap ≥ 1.10.

Optional: Wireshark/tshark ≥ 4.x, termshark ≥ 2.x.

# Ubuntu 24.04 installation
sudo apt install -y tcpdump tshark

# Rocky Linux 9.5 installation
sudo dnf install -y tcpdump wireshark-cli

# Verify versions (tcpdump also prints the libpcap version it is linked against)
tcpdump --version   # tcpdump version 4.99.x, libpcap version 1.10.x

# Allow non-root capture (optional); the binary path differs between distributions
sudo setcap cap_net_raw,cap_net_admin=eip "$(command -v tcpdump)"

Detailed Steps

Interface Selection

# List interfaces
tcpdump -D
# Capture on eth0
sudo tcpdump -i eth0
# Capture on all interfaces
sudo tcpdump -i any
# Disable hostname and port-name resolution (recommended)
sudo tcpdump -i eth0 -nn

Protocol Filtering

# TCP only
sudo tcpdump -i eth0 tcp
# UDP only
sudo tcpdump -i eth0 udp
# ICMP only
sudo tcpdump -i eth0 icmp
# ARP only
sudo tcpdump -i eth0 arp

Port Filtering

# Specific port
sudo tcpdump -i eth0 port 80
# Source port
sudo tcpdump -i eth0 src port 80
# Destination port range
sudo tcpdump -i eth0 dst portrange 8080-8090
# Combine ports
sudo tcpdump -i eth0 port 80 or port 443

IP Filtering

# Specific host
sudo tcpdump -i eth0 host 192.168.1.100
# Source IP
sudo tcpdump -i eth0 src host 192.168.1.100
# Network segment
sudo tcpdump -i eth0 net 192.168.1.0/24

BPF Expression Details

tcpdump filter expressions use the pcap-filter syntax, which libpcap compiles into a BPF (Berkeley Packet Filter) program. An expression combines primitives (qualified by type, direction and protocol) with the logical operators and, or and not.
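
As a rough illustration of how primitives and operators combine, the sketch below assembles an expression from a host and a list of ports to exclude. The helper name build_filter and its argument order are assumptions made for this example; they are not part of tcpdump.

```bash
# build_filter HOST [EXCLUDED_PORT...]   (hypothetical helper, not part of tcpdump)
# Prints a pcap-filter expression: "host H and not port X and not port Y ..."
build_filter() {
  expr_out="host $1"
  shift
  for p in "$@"; do
    expr_out="$expr_out and not port $p"
  done
  printf '%s\n' "$expr_out"
}

# Print the expression for one host, excluding SSH and DNS traffic
build_filter 192.168.1.100 22 53

# Possible use with a live capture (needs root, so it is left commented out):
# sudo tcpdump -i eth0 -nn "$(build_filter 192.168.1.100 22 53)"
```

Passing the whole expression as one quoted argument keeps the shell from interpreting parentheses or other special characters in more complex filters.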

Common Quick Filters

# Exclude SSH traffic
sudo tcpdump -i eth0 -nn not port 22
# Capture only new connections (SYN set, ACK clear)
sudo tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-syn|tcp-ack) == tcp-syn'
# Capture connection teardown (RST or FIN); unexpected RSTs usually mark an abnormal close
sudo tcpdump -i eth0 -nn 'tcp[tcpflags] & (tcp-rst|tcp-fin) != 0'
# Capture HTTP GET request by payload
sudo tcpdump -i eth0 -nn -A 'tcp dst port 80 and tcp[((tcp[12:1] & 0xf0) >> 2):4] = 0x47455420'
# Exclude multicast/broadcast
sudo tcpdump -i eth0 -nn 'not (broadcast or multicast)'
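
The constant 0x47455420 in the HTTP GET filter above is simply the ASCII bytes of "GET " (including the trailing space), and (tcp[12:1] & 0xf0) >> 2 is the TCP header length in bytes, so the comparison reads the first four payload bytes. A small sketch for deriving such constants for other four-byte prefixes follows; hex4 is a hypothetical helper name chosen for this example.

```bash
# hex4 STRING: print the first four bytes of STRING as a 0x-prefixed hex constant
# (hypothetical helper; assumes a plain ASCII argument)
hex4() {
  printf '0x%s\n' "$(printf '%.4s' "$1" | od -An -tx1 | tr -d ' \n')"
}

hex4 'GET '   # 0x47455420
hex4 'POST'   # 0x504f5354
```

The printed value can be substituted for 0x47455420 in the filter to match a different request method.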

File Management – Saving and Rotation

# Save to pcap
sudo tcpdump -i eth0 -w /tmp/capture.pcap port 80
# Rotate every 100 MB (-C counts millions of bytes), keep a ring of 5 files
sudo tcpdump -i eth0 -w /tmp/cap.pcap -C 100 -W 5 port 80
# Limit packet count
sudo tcpdump -i eth0 -c 10000 port 80
# Capture only first 96 bytes (header)
sudo tcpdump -i eth0 -s 96 -w /tmp/head.pcap port 80

Performance Impact Control

Low-traffic links (<100 Mbps) with simple filters usually cost less than 1 % CPU, so a capture can run continuously.

High-traffic links (>1 Gbps) captured without a filter can cost 5-15 % CPU; always add host or port filters.

When writing to disk at more than 500 Mbps, monitor disk I/O and enlarge the kernel capture buffer with the -B option (size in KiB) so that bursts are not dropped.
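
To get a feel for the disk side, the following back-of-the-envelope sketch estimates the write rate and how long a -C/-W ring lasts before it wraps. It assumes every packet is at least snaplen bytes long (a worst case), adds the 16-byte per-packet record header of the pcap file format, and treats -C as millions of bytes as described in the tcpdump 4.99 man page. The helper name ring_seconds and the sample numbers are illustrative only.

```bash
# ring_seconds PPS SNAPLEN FILE_MB FILE_COUNT   (hypothetical helper)
#   PPS        packets per second matching the filter
#   SNAPLEN    value passed to -s
#   FILE_MB    value passed to -C (millions of bytes)
#   FILE_COUNT value passed to -W
ring_seconds() {
  bytes_per_sec=$(( $1 * ($2 + 16) ))   # payload up to snaplen + 16-byte record header
  ring_bytes=$(( $3 * 1000000 * $4 ))   # total size of the file ring
  echo "write_rate_bytes_per_sec=$bytes_per_sec ring_seconds=$(( ring_bytes / bytes_per_sec ))"
}

# 100k packets/s, -s 96, -C 100 -W 5
ring_seconds 100000 96 100 5
```

With these sample numbers the ring covers well under a minute of traffic, which is a hint to tighten the filter or enlarge the ring if the capture must span a longer incident.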

Case Studies

Case 1 – TCP Retransmission Causing API Timeout

# Capture API traffic on port 8080
sudo tcpdump -i eth0 -nn -s 96 -c 50000 -w /tmp/api.pcap port 8080
# Count retransmissions with tshark
tshark -r /tmp/api.pcap -Y "tcp.analysis.retransmission" | wc -l   # 87
# Show RTO values
tshark -r /tmp/api.pcap -Y "tcp.analysis.retransmission" -T fields -e frame.time -e ip.src -e ip.dst -e tcp.analysis.rto
# Identify offending IPs
tshark -r /tmp/api.pcap -Y "tcp.analysis.retransmission" -T fields -e ip.dst | sort | uniq -c | sort -rn | head -5
# Verify network quality to that IP
mtr -r -c 100 10.0.0.5   # 2 % loss on hop 3
# Conclusion: upstream switch loss → TCP retransmission → API latency

Case 2 – DNS Resolution Timeout

# Capture DNS traffic
sudo tcpdump -i eth0 -nn -c 10000 -w /tmp/dns.pcap port 53
# Trigger a DNS query from the application (in a second terminal while the capture runs)
curl http://api.example.com/health
# Inspect queries and responses
tshark -r /tmp/dns.pcap -Y "dns" -T fields -e frame.time -e ip.src -e ip.dst -e dns.qry.name -e dns.qry.type -e dns.flags.response | head -20
# Observation: AAAA queries never receive a response; A queries succeed only after the 5 s resolver timeout.
# Fix: add "options single-request-reopen" to /etc/resolv.conf. Preferring IPv4 in /etc/gai.conf only reorders the results and does not stop the AAAA queries.
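
One way to spot unanswered queries without reading the listing by eye is to pair queries and responses by DNS transaction ID. The sketch below assumes input produced by something like tshark -r /tmp/dns.pcap -Y dns -T fields -e dns.id -e dns.qry.name -e dns.flags.response (tab-separated by default); the response flag may print as 1 or True depending on the tshark version, so both are accepted. The function name unanswered_dns and the sample IDs are made up for illustration.

```bash
# unanswered_dns: read "id<TAB>name<TAB>response-flag" lines on stdin and
# print every query whose transaction ID never appears in a response.
unanswered_dns() {
  awk -F'\t' '
    $3 == "1" || $3 == "True" { answered[$1] = 1; next }   # a response was seen for this ID
    { name[$1] = $2 }                                      # remember the query name per ID
    END { for (id in name) if (!(id in answered)) print name[id] " (id " id ")" }
  '
}

# Sample input standing in for tshark output (made-up transaction IDs)
printf '0x1a2b\tapi.example.com\t0\n0x1a2b\tapi.example.com\t1\n0x3c4d\tapi.example.com\t0\n' | unanswered_dns
```

Transaction IDs can repeat over a long capture, so this pairing is only a first pass; a short capture window keeps collisions unlikely.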

Case 3 – TLS Handshake Failure

# Capture TLS handshake (0x16) and alert (0x15) records
sudo tcpdump -i eth0 -nn -s 0 -w /tmp/tls.pcap 'port 443 and (tcp[((tcp[12:1] & 0xf0) >> 2):1] = 0x16 or tcp[((tcp[12:1] & 0xf0) >> 2):1] = 0x15)'
# Analyze handshake messages and alerts
tshark -r /tmp/tls.pcap -Y "tls.handshake || tls.alert_message" -T fields -e frame.time -e ip.src -e ip.dst -e tls.handshake.type -e tls.handshake.version -e tls.alert_message.desc
# Example output shows a fatal alert (bad_certificate, description 42)
# Verify certificate chain
openssl s_client -connect 192.168.1.100:443 -servername example.com

Case 4 – MySQL Slow Connection

# Capture MySQL traffic (port 3306)
sudo tcpdump -i eth0 -nn -s 0 -c 5000 -w /tmp/mysql.pcap port 3306
# Show the TCP handshake and the MySQL packets with relative timestamps
tshark -r /tmp/mysql.pcap -Y "tcp.flags.syn==1 || mysql" -T fields -e frame.time_relative -e ip.src -e ip.dst
# Observation: the SYN-ACK returns immediately, but the Server Greeting arrives 1.2 s after the TCP handshake completes.
# Root cause: MySQL performs a reverse DNS lookup on the client IP before sending the greeting.
# Fix: add "skip-name-resolve" to my.cnf.
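
Gaps like the one in this case are easier to find mechanically than by scanning timestamps. The sketch below reads one relative timestamp per line (for example the frame.time_relative column of a single connection, assuming the lines are in capture order) and reports the largest gap between consecutive packets. The function name max_gap and the sample timestamps are illustrative.

```bash
# max_gap: read one relative timestamp (seconds) per line on stdin and
# print the largest gap between consecutive lines and where it starts.
max_gap() {
  awk 'NR > 1 { d = $1 - prev; if (d > max) { max = d; at = prev } }
       { prev = $1 }
       END { printf "max_gap=%.3f after t=%.3f\n", max, at }'
}

# Made-up timestamps: the third packet arrives 1.2 s after the second
printf '0.000\n0.001\n1.201\n1.250\n' | max_gap

# Possible use against a capture (first TCP stream only):
# tshark -r /tmp/mysql.pcap -Y 'tcp.stream==0' -T fields -e frame.time_relative | max_gap
```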

Case 5 – Load‑Balancer Health‑Check Failures

# Capture health-check traffic between the LB (10.0.0.1) and backend port 8080, in both directions
sudo tcpdump -i eth0 -nn -c 1000 -w /tmp/hc.pcap host 10.0.0.1 and port 8080
# Count SYNs and SYN-ACKs
tshark -r /tmp/hc.pcap -Y "tcp.flags.syn==1 && tcp.flags.ack==0" | wc -l   # 200
tshark -r /tmp/hc.pcap -Y "tcp.flags.syn==1 && tcp.flags.ack==1" | wc -l   # 195
# Unanswered SYNs or RSTs from the backend point to accept-queue overflow; confirm with "nstat -az TcpExtListenOverflows", then raise net.core.somaxconn together with the application's listen backlog.

One‑Click Capture Script

#!/bin/bash
# quick_capture.sh – fast capture and basic analysis
# Usage: quick_capture.sh <interface> [filter] [duration_seconds]
IFACE="$1"
FILTER="$2"
DURATION="${3:-30}"
TS=$(date +%Y%m%d_%H%M%S)
PCAP="/tmp/capture_${TS}.pcap"

if [ -z "$IFACE" ]; then
  echo "Usage: $0 <interface> [filter] [duration_seconds]" >&2
  exit 1
fi

echo "=== Quick Capture ==="
echo "Interface: $IFACE"
echo "Filter: ${FILTER:-<none>}"
echo "Duration: $DURATION s"
echo "File: $PCAP"

# Capture for DURATION seconds. timeout runs under sudo so it can signal tcpdump directly.
# The filter is passed as one argument and omitted entirely when it is empty.
sudo timeout "$DURATION" tcpdump -i "$IFACE" -nn -s 0 -w "$PCAP" ${FILTER:+"$FILTER"} 2>/dev/null

if [ ! -s "$PCAP" ]; then
  echo "No capture file written; check the interface name and filter syntax." >&2
  exit 1
fi

# Basic stats: count packets per protocol by re-reading the file with a filter
echo "--- Protocol distribution ---"
for proto in tcp udp icmp; do
  printf ' %-10s %d\n' "$proto" "$(tcpdump -r "$PCAP" -nn "$proto" 2>/dev/null | wc -l)"
done
printf ' %-10s %d\n' "total" "$(tcpdump -r "$PCAP" -nn 2>/dev/null | wc -l)"

# TCP flag summary
echo "--- TCP flag summary ---"
tcpdump -r "$PCAP" -nn 2>/dev/null | grep -oP 'Flags \[\K[^\]]+' | sort | uniq -c | sort -rn

# Retransmission count (requires tshark)
if command -v tshark >/dev/null; then
  RETRANS=$(tshark -r "$PCAP" -Y "tcp.analysis.retransmission" 2>/dev/null | wc -l)
  echo "--- TCP retransmissions: $RETRANS ---"
fi

echo "Capture saved to $PCAP. Open with Wireshark for deep analysis."

Best Practices & Pitfalls

Always add a filter; unrestricted capture can degrade performance.

Limit packet count (-c) or duration (timeout) to bound resource usage.

Use snaplen (-s) to capture only headers when payload is unnecessary.

Rotate files (-C/-W) to avoid filling disks.

Secure stored pcap files – encrypt or anonymize sensitive fields.

Common errors: wrong interface, missing packets due to Docker bridge, empty pcap caused by filter syntax, performance impact on high‑throughput links.

Container & Kubernetes Capture

For Docker, capture on docker0 or locate the veth interface linked to a container. For Kubernetes, use nsenter, kubectl debug with a netshoot image, or the ksniff plugin.
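
Locating the veth interface can be sketched as follows: inside the container, /sys/class/net/eth0/iflink holds the interface index of the host-side peer, and that index can be looked up in the host's ip -o link output. The helper name host_if_for_index, the CONTAINER placeholder and the sample interface names are assumptions for this example, and the container image is assumed to ship cat.

```bash
# host_if_for_index INDEX: read `ip -o link` output on stdin and print the
# name of the interface whose index is INDEX (with any "@ifN" suffix removed).
host_if_for_index() {
  awk -F': ' -v idx="$1" '$1 == idx { sub(/@.*/, "", $2); print $2 }'
}

# Sample `ip -o link` lines standing in for a real host (made-up names)
printf '%s\n' \
  '2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP' \
  '7: veth1a2b3c@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master docker0 state UP' \
  | host_if_for_index 7

# Possible use on a Docker host (CONTAINER is a placeholder):
# idx=$(docker exec CONTAINER cat /sys/class/net/eth0/iflink)
# sudo tcpdump -i "$(ip -o link | host_if_for_index "$idx")" -nn
```

Capturing on the veth interface shows only that container's traffic, which is usually easier to read than a capture on docker0.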

Monitoring & Metrics

RTT – measured from SYN to SYN‑ACK.

Packet loss – retransmissions / total packets.

RST rate – RST packets per connection.

Zero‑window events – indicate receiver congestion.

System-level tools (ss, nstat, /proc/net/snmp) and Prometheus node_exporter can expose these metrics. Reasonable starting thresholds for alerting are a TCP retransmission rate above 2 % and an RST rate above 5 %.
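
As a minimal sketch of the retransmission metric, the function below computes RetransSegs / OutSegs from input in the /proc/net/snmp format, locating the columns by name from the Tcp: header line. The counters are cumulative since boot, so this yields a long-term average; a rate over an interval would need two samples and their difference. The function name tcp_retrans_pct and the sample values are illustrative.

```bash
# tcp_retrans_pct: read /proc/net/snmp-style text on stdin and print
# RetransSegs as a percentage of OutSegs (cumulative since boot).
tcp_retrans_pct() {
  awk '
    $1 == "Tcp:" && !seen { for (i = 2; i <= NF; i++) col[$i] = i; seen = 1; next }  # header line
    $1 == "Tcp:" {                                                                   # values line
      out = $(col["OutSegs"]); re = $(col["RetransSegs"])
      if (out > 0) printf "%.2f\n", re * 100 / out; else print "0.00"
    }
  '
}

# Shortened sample input with made-up counters (the real file has more columns)
printf 'Tcp: RtoAlgorithm OutSegs RetransSegs\nTcp: 1 20000 400\n' | tcp_retrans_pct

# On a Linux host:
# tcp_retrans_pct < /proc/net/snmp
```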

Conclusion

tcpdump provides a lightweight, kernel-level packet capture mechanism that, when combined with precise BPF filters, file rotation, and downstream analysis with Wireshark/tshark, enables systematic troubleshooting of a wide range of network anomalies. Integrating capture scripts into monitoring pipelines and following production-grade best practices ensures minimal performance impact while delivering the visibility needed to resolve latency, connectivity, and security issues.

References

tcpdump official manual: https://www.tcpdump.org/manpages/tcpdump.1.html

BPF filter syntax: https://www.tcpdump.org/manpages/pcap-filter.7.html

Wireshark user guide: https://www.wireshark.org/docs/wsug_html_chunked/

TCP/IP Illustrated, Volume 1, Chapters 18‑21
