
10 Real‑World TCPDump Cases That Reveal Hidden Network Issues

This guide walks senior operations engineers through ten authentic production‑level network problems, demonstrating how to capture and analyze packets with TCPDump, interpret key TCP/IP indicators, and apply concrete fixes ranging from firewall rules to load‑balancer health‑check settings.

Liangxu Linux

Why TCPDump Is a Must‑Have Tool for Operations

When alerts wake you at 3 am, users complain about latency, or developers blame the network, TCPDump records every packet truthfully, letting you pinpoint the root cause without guesswork.

What You’ll Learn

10 real‑world troubleshooting cases

Essential TCPDump parameters and filters

Golden rules for packet‑level analysis

A systematic methodology for rapid problem isolation

Case 1: Mysterious Connection Timeout – TCP Three‑Way Handshake Failure

Problem

# User feedback: frequent API timeouts
curl: (7) Failed to connect to api.example.com port 443: Connection timed out

Capture

tcpdump -i eth0 -nn -s0 -w timeout.pcap host api.example.com

Analysis

10:30:15.123456 IP 192.168.1.100.45678 > 203.0.113.10.443: Flags [S], seq 1000, win 65535
10:30:18.123456 IP 192.168.1.100.45678 > 203.0.113.10.443: Flags [S], seq 1000, win 65535
10:30:24.123456 IP 192.168.1.100.45678 > 203.0.113.10.443: Flags [S], seq 1000, win 65535

Key finding: Only SYN packets are seen – no SYN‑ACK replies.
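
When a capture holds many flows, the same check can be scripted. The sketch below assumes tcpdump's default one-line text output (for example from `tcpdump -nn -r timeout.pcap`). The function name `unanswered_syns` is hypothetical, not part of tcpdump.

```shell
# Hypothetical helper: list flows that sent SYNs but never got a SYN-ACK back.
# Reads tcpdump text output on stdin; assumes the default "IP src > dst: Flags [...]" layout.
unanswered_syns() {
  awk '
    / Flags \[S\],/   { sub(/:$/, "", $5); syn[$3 " > " $5]++ }
    / Flags \[S\.\],/ { sub(/:$/, "", $5); synack[$5 " > " $3] = 1 }
    END {
      for (flow in syn)
        if (!(flow in synack)) print flow, "SYNs=" syn[flow]
    }'
}
```

Usage: `tcpdump -nn -r timeout.pcap | unanswered_syns`. A flow listed with several SYNs and no reply matches the pattern shown above.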

Solution

Inspect the firewall rules along the path. Here the rule allowing TCP port 443 had been removed by mistake, so the handshake was dropped at the firewall and the client never received a SYN‑ACK.

Ops tip: Firewall misconfigurations are a frequent cause of connection timeouts, and a capture showing repeated SYNs with no reply points to them quickly.

Case 2: Strange Slow Query – TCP Zero‑Window Problem

Problem

Intermittent database queries take up to 30 seconds while CPU and memory look normal.

Capture

# Capture MySQL traffic only
tcpdump -i any -nn -s0 -w mysql_slow.pcap port 3306 and host 192.168.1.50

Analysis

# Normal
10:45:01 IP client.45678 > mysql.3306: win 65535
# Abnormal
10:45:02 IP client.45678 > mysql.3306: win 0
10:45:02 IP mysql.3306 > client.45678: win 32768 [window probe]

Finding: The client’s receive window drops to 0, triggering zero‑window probes and halting data flow.

Root Cause

The application processes MySQL result sets too slowly, exhausting the receive buffer.
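
To see which endpoint advertises a zero window most often, the text output can be tallied. This is a minimal sketch under the assumption that each line follows tcpdump's default layout with a `win N` field. The name `zero_window_senders` is hypothetical.

```shell
# Hypothetical helper: count zero-window advertisements per sender.
# RST segments are skipped because they commonly carry win 0 for unrelated reasons.
zero_window_senders() {
  awk '
    / win 0(,|$)/ && !/Flags \[R/ { count[$3]++ }
    END { for (s in count) print count[s], s }' | sort -rn
}
```

Usage: `tcpdump -nn -r mysql_slow.pcap | zero_window_senders`. The capture itself can also be narrowed with the filter `'tcp[14:2] = 0'`, which matches the 16‑bit window field at byte offset 14 of the TCP header.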

Case 3: Load‑Balancer Hidden Killer – RST Packet Tracing

Problem

Backend connections are frequently reset; logs show “Connection reset by peer”.

Capture

# Capture only RST packets (stop after 100)
tcpdump -i eth0 -nn -c 100 'tcp[tcpflags] & tcp-rst != 0'

Analysis Steps

Identify which side (client or server) sends the RST.

Determine when the RST occurs (during data transfer or immediately after connection setup).

Check sequence numbers to see if the RST is legitimate.

Final discovery: The load‑balancer’s health‑check timeout was too short, causing healthy connections to be killed.
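
The first analysis step, finding which side sends the RST, can be sketched as a small tally over the text output. It assumes tcpdump's default line layout, where the flags field follows the word `Flags`. The name `rst_senders` is hypothetical.

```shell
# Hypothetical helper: count RST segments per sending host (port stripped).
rst_senders() {
  awk '
    $6 == "Flags" && $7 ~ /R/ {
      host = $3
      sub(/\.[0-9]+$/, "", host)   # drop the trailing port
      count[host]++
    }
    END { for (h in count) print count[h], h }' | sort -rn
}
```

Usage: `tcpdump -i eth0 -nn -c 100 'tcp[tcpflags] & tcp-rst != 0' | rst_senders`. If one address, such as the load balancer, dominates the list, that is the place to look next.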

Case 4: HTTP Request Vanishes – Application‑Layer Analysis

Problem

POST requests succeed only 70 % of the time; GET requests work fine.

Capture

# Capture HTTP traffic on port 8080
tcpdump -i eth0 -nn -s0 -w http_post.pcap port 8080
# Print payloads as ASCII and filter POST requests
tcpdump -r http_post.pcap -nn -A | grep 'POST /'

Findings

Successful POST includes correct Content‑Length: 256; failed POST shows Content‑Length: 512 and mismatched payload.

Root Cause

Nginx’s client_max_body_size limit was exceeded, but the error log level hid the warning.
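
To compare body sizes across many requests, the Content‑Length of each POST can be pulled out of the ASCII dump. This is a rough sketch that assumes plain‑text HTTP and request headers that fit in the printed payload. The name `post_content_lengths` is hypothetical.

```shell
# Hypothetical helper: print the Content-Length of every POST request
# found in `tcpdump -A` output read from stdin.
post_content_lengths() {
  awk '
    { gsub(/\r/, "") }                                   # strip stray carriage returns
    /POST \//                                   { in_post = 1; next }
    in_post && tolower($0) ~ /^content-length:/ { print $2; in_post = 0; next }
    /^[ \t]*$/                                  { in_post = 0 }   # blank line ends the headers
  '
}
```

Usage: `tcpdump -r http_post.pcap -nn -A | post_content_lengths | sort -n | uniq -c`. The values can then be compared with the configured `client_max_body_size`.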

Case 5: DNS Resolution Trap

Problem

Service works for a few minutes after startup, then DNS lookups fail until a restart.

Capture

# Capture DNS queries
tcpdump -i eth0 -nn -w dns_issue.pcap port 53

Analysis

# Normal query
12:30:01 IP 192.168.1.100.12345 > 8.8.8.8.53: 12345+ A? api.example.com
12:30:01 IP 8.8.8.8.53 > 192.168.1.100.12345: 12345 1/0/0 A 203.0.113.10
# Abnormal query – no reply
12:35:01 IP 192.168.1.100.12346 > 8.8.8.8.53: 12346+ A? api.example.com

Root cause: The DNS server had dual‑stack (IPv4/IPv6) configuration, but IPv6 routing was broken, causing intermittent failures.
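
Spotting queries that never get a reply is tedious by eye. The sketch below pairs queries and responses by client socket and DNS transaction ID. It assumes tcpdump's default DNS text output as shown above, and the name `unanswered_dns` is hypothetical.

```shell
# Hypothetical helper: print DNS query lines that have no matching response.
# A response is recognised by its "answers/authority/additional" counter, e.g. 1/0/0.
unanswered_dns() {
  awk '
    { id = $6; gsub(/[^0-9]/, "", id); dst = $5; sub(/:$/, "", dst) }
    / [0-9]+\/[0-9]+\/[0-9]+ / { answered[dst "|" id] = 1; next }
    /\? /                      { asked[$3 "|" id] = $0 }
    END { for (k in asked) if (!(k in answered)) print asked[k] }'
}
```

Usage: `tcpdump -nn -r dns_issue.pcap | unanswered_dns`. Checking whether the unanswered lines are mostly `AAAA?` or `A?` queries helps separate IPv6 from IPv4 problems.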

Case 6: SSL Handshake Dark Moment

Problem

HTTPS endpoints intermittently return “ERR_SSL_PROTOCOL_ERROR”.

Capture

# Capture the TLS handshake
tcpdump -i eth0 -nn -s0 -w ssl_handshake.pcap port 443 and host web.example.com

Analysis

# Normal flow (TLS 1.2)
Client Hello -> Server Hello -> Certificate -> Server Hello Done -> Client Key Exchange -> Change Cipher Spec -> Finished
# Abnormal flow – connection drops at Certificate stage

Deep dive: The server’s certificate chain missed an intermediate CA, causing some clients to abort the handshake.
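
A quick cross‑check outside the capture is to count how many certificates the server actually sends. This is a sketch that assumes the `openssl` command‑line tool is installed. The helper name `count_pem_certs` is hypothetical.

```shell
# Hypothetical helper: count PEM certificates on stdin.
# Intended to be fed by:
#   openssl s_client -connect web.example.com:443 -servername web.example.com -showcerts </dev/null 2>/dev/null
count_pem_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----'
}
```

If the count is 1 for a certificate that was issued by an intermediate CA, the server is probably sending only the leaf certificate, which fits the symptom described above.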

Case 7: Microservice Communication Hazard

Problem

Service A calls Service B with 95 % success; the remaining 5 % failures are random.

Capture

# Capture traffic between the two services on any interface
tcpdump -i any -nn -s0 -w microservice.pcap '(src host serviceA and dst host serviceB) or (src host serviceB and dst host serviceA)'

Findings

# Normal connection
192.168.1.10.8080 -> 192.168.1.20.9090: established
# Abnormal connection – port reuse, mixed sequence numbers
192.168.1.10.8080 -> 192.168.1.20.9090: [port reused, sequence chaos]

Root cause: Service B’s restart left sockets in TIME_WAIT, leading to port‑reuse collisions.
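
To confirm a TIME_WAIT build‑up on the host itself, socket states can be summarised from `ss` output. This is a minimal sketch that assumes the column layout of `ss -tan` (state in the first column). The name `socket_state_counts` is hypothetical.

```shell
# Hypothetical helper: count sockets per TCP state from `ss -tan` output on stdin.
socket_state_counts() {
  awk '
    $1 != "State" && NF { count[$1]++ }     # skip the header row and blank lines
    END { for (s in count) print count[s], s }' | sort -rn
}
```

Usage: `ss -tan | socket_state_counts`. A large `TIME-WAIT` count right after Service B restarts would support the root cause above.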

Case 8: Bandwidth Saturation Truth

Problem

Server bandwidth utilization spikes to 95 % without a corresponding surge in user traffic.

Capture

# List top talkers by packet count (first 1000 IPv4 packets, source ports stripped)
tcpdump -i eth0 -nn -q -c 1000 ip | awk '{print $3}' | cut -d. -f1-4 | sort | uniq -c | sort -nr | head
# Capture the heavy‑hitter traffic
tcpdump -i eth0 -nn -s0 -w bandwidth_hog.pcap src host 192.168.1.100

Discovery: A single internal IP was sending health‑check requests every 1 ms because of a misconfigured load balancer.

Solution

Adjust the health‑check interval; bandwidth usage returned to normal.
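
Packet counts can mislead when packet sizes differ, so it may help to rank sources by bytes instead. The sketch below assumes the `-q` output format, where TCP lines end in `tcp <length>` and UDP lines in `UDP, length <length>`. The name `top_talkers_by_bytes` is hypothetical.

```shell
# Hypothetical helper: rank source hosts by TCP/UDP payload bytes.
# Reads `tcpdump -nn -q` output on stdin; optional first argument limits the rows (default 10).
top_talkers_by_bytes() {
  awk '
    /: tcp [0-9]+$/ || /: UDP, length [0-9]+$/ {
      src = $3
      sub(/\.[0-9]+$/, "", src)   # drop the trailing port
      bytes[src] += $NF           # last field is the payload length
    }
    END { for (s in bytes) print bytes[s], s }' | sort -rn | head -n "${1:-10}"
}
```

Usage: `tcpdump -i eth0 -nn -q -c 10000 | top_talkers_by_bytes 5`. The totals cover payload bytes only, so they understate the load from floods of tiny packets. For those, the packet count is still the better signal.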

Case 9: Packet Mutation Mystery

Problem

Clients send correct payloads, but servers receive truncated or garbled data.

Capture

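# Capture on the gateway between client and server
tcpdump -i eth0 -nn -s0 -w packet_corruption.pcap host client.ip and host server.ip
# Print the saved packets as hex, including link-level headers
tcpdump -r packet_corruption.pcap -nn -xx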

Analysis

# Client packet (correct)
45 00 05 dc ... [full packet]
# Server packet (corrupted)
45 00 05 dc ... [modified by middle device]

Culprit: A switch firmware bug recalculated checksums incorrectly for a specific packet pattern.
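
tcpdump can also verify checksums for you: with `-vv` it annotates segments whose TCP or UDP checksum does not match as `(incorrect -> 0x....)`. The sketch below simply counts those lines. The name `count_bad_checksums` is hypothetical.

```shell
# Hypothetical helper: count packets that tcpdump -vv flags with an incorrect checksum.
# Caveat: on a host with TX checksum offload, locally sent packets often show as
# "incorrect" because the NIC fills the checksum in later, so capture on the
# receiving side or on a mirror port.
count_bad_checksums() {
  grep -c -F '(incorrect'
}
```

Usage: `tcpdump -r packet_corruption.pcap -nn -vv | count_bad_checksums`.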

Case 10: Latency Analysis – Time Is Money

Problem

API P99 latency reaches 5 s, while the underlying DB query finishes in 50 ms.

Capture

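# Capture traffic on port 8080 (timestamps are always stored in the pcap)
tcpdump -i eth0 -nn -w latency_analysis.pcap port 8080
# Read it back with the delta since the previous packet (-ttt)
tcpdump -r latency_analysis.pcap -nn -ttt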

Decomposition

# TCP connection time: 150 ms
# SSL handshake time: 300 ms
# HTTP request processing: 50 ms
# Network transmission time: 4500 ms ← problem area!

Final root cause: The outbound gateway’s QoS policy mistakenly marked API traffic as low priority, inflating transmission delay.
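
One way to find where the time goes is to rank the gaps between consecutive packets. This sketch assumes the capture is printed with `-tt` (epoch timestamps in the first column) and filtered down to a single connection. The name `largest_gaps` is hypothetical.

```shell
# Hypothetical helper: show the packets preceded by the longest silence.
# Reads `tcpdump -tt` output on stdin; optional first argument limits the rows (default 3).
largest_gaps() {
  awk '
    NR > 1 { printf "%.3f %s\n", $1 - prev, $0 }   # gap in seconds, then the packet line
           { prev = $1 }' | sort -rn | head -n "${1:-3}"
}
```

Usage: `tcpdump -r latency_analysis.pcap -nn -tt | largest_gaps`. The packet that follows the longest gap shows which phase (handshake, request, or response transfer) is stalling.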

Practical TCPDump Tips

Golden Parameter Combination

# Universal capture command
tcpdump -i any -nn -s0 -w capture.pcap
# Explanation
-i any      # listen on all interfaces
-nn          # don’t resolve names
-s0          # capture full packet
-w file      # write to file

Filter Magic

# Host filter
tcpdump host 192.168.1.100
# Port filter
tcpdump port 80 or port 443
# Protocol filter (TCP, excluding SSH on port 22)
tcpdump tcp and not port 22
# Flag filter (SYN packets)
tcpdump 'tcp[tcpflags] & tcp-syn != 0'
# Combined filter example
tcpdump -i eth0 -nn 'host 192.168.1.100 and (port 80 or port 443) and tcp[tcpflags] & tcp-syn != 0'

Analysis Framework

Four‑step network troubleshooting:

Describe the symptom and collect logs.

Form a hypothesis and design a capture plan.

Deep‑dive into the packet data to spot anomalies.

Confirm the root cause and implement a fix.

Five analysis dimensions:

Connection layer – handshake and teardown.

Transport layer – sequence numbers, ACKs, window size.

Application layer – HTTP status, TLS handshake.

Time layer – latency distribution, timeout settings.

Statistics layer – retransmission rate, packet loss, connection count.
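
For the statistics layer, a rough retransmission count can be derived from the text output by looking for repeated sequence ranges in the same direction. This is only an approximation under the assumption that all lines come from one capture read in a single pass. Dedicated analyzers are more precise. The name `count_retransmissions` is hypothetical.

```shell
# Hypothetical helper: count segments whose (src, dst, seq range) was already seen.
# Pure ACKs carry no "seq" field in tcpdump's default output and are ignored.
count_retransmissions() {
  awk '
    match($0, /seq [0-9]+(:[0-9]+)?/) {
      key = $3 " " $5 " " substr($0, RSTART, RLENGTH)
      if (seen[key]++) retrans++
    }
    END { print retrans + 0 }'
}
```

Usage: `tcpdump -nn -r capture.pcap tcp | count_retransmissions`. Dividing the result by the number of data segments gives an approximate retransmission rate.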

Advanced Skills

1. Automated Diagnosis Script

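#!/bin/bash
# Quick network diagnosis: ping a host, then capture up to 100 packets to and from it
target="${1:?usage: $0 <host>}"
echo "Starting network diagnosis..."
ping -c 4 "$target" > /tmp/ping.log 2>&1
# Stop after 100 packets or 30 seconds, whichever comes first
timeout 30 tcpdump -i any -nn -c 100 -w /tmp/capture.pcap host "$target" 2>/dev/null
echo "=== Connectivity Test ==="
cat /tmp/ping.log
echo "=== First 10 Packets ==="
tcpdump -r /tmp/capture.pcap -nn 2>/dev/null | head -10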

2. Monitoring Integration

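Example shell check, suitable for a Zabbix action script, that launches TCPDump when the error rate exceeds 5%:

if [ "${NETWORK_ERROR_RATE:-0}" -gt 5 ]; then
    # -G 300 -W 2: rotate every 300 s and exit after two files (10 minutes total)
    tcpdump -i eth0 -G 300 -W 2 -w /var/log/auto_capture_%Y%m%d_%H%M%S.pcap &
    echo "Automatic capture started"
fi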

3. Large‑Scale Production Tips

Use ring buffers to avoid disk exhaustion.

Apply precise filters to reduce CPU load.

Capture on a network‑mirroring port to keep the production path untouched.

Implement archiving and cleanup policies for pcap files.
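
As an illustration of the cleanup point, old capture files can be pruned with `find`. This is a minimal sketch: the function name `prune_old_pcaps`, the 7‑day default, and the `.pcap` suffix are assumptions to adjust to your own retention policy.

```shell
# Hypothetical helper: delete *.pcap files older than N days (default 7) under a directory.
# Prints each file it removes so the action can be logged.
prune_old_pcaps() {
  find "$1" -type f -name '*.pcap' -mtime +"${2:-7}" -print -delete
}
```

Usage: `prune_old_pcaps /var/log 14`, for example from a daily cron job. For the ring‑buffer point, tcpdump's own `-C` (file size in millions of bytes) and `-W` (number of files) options can be combined, as in `tcpdump -i eth0 -nn -C 100 -W 10 -w ring.pcap`, so that the oldest file is overwritten once the limit is reached.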

Conclusion – Mastering the Core Weapon of Network Diagnosis

For operations engineers, network issues are inevitable, but TCPDump turns raw packets into hard evidence. By mastering the commands, filters, and systematic analysis steps presented here, you can quickly locate the problems that hide in the network stack and resolve them with confidence.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Liangxu Linux

Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
