How to Diagnose Kubernetes Pod Network Issues: Tools, Models, and Real-World Cases

This article introduces a systematic approach for troubleshooting Kubernetes pod network problems, covering anomaly classification, essential diagnostic tools such as tcpdump, nsenter, paping and mtr, a step‑by‑step troubleshooting workflow, and detailed case studies that illustrate root‑cause analysis and resolution techniques.

Open Source Linux

Jul 31, 2023

1. Pod Network Anomalies

The article classifies pod network problems into four main categories: network unreachable (ping fails), port unreachable (ping works but telnet fails), DNS resolution failure (domain name cannot be resolved while IP works), and large‑packet loss (small packets succeed, large packets are dropped).

Network unreachable causes include firewall rules, incorrect routing, overloaded host/network interfaces, and link failures.

Port unreachable causes include firewall restrictions, high connection counts, and services not listening on the expected port.

DNS resolution failures stem from misconfigured pod DNS, DNS service issues, or communication problems with the DNS server.
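
To check for a misconfigured pod resolver, one starting point is to list the nameservers the pod is actually configured with. The sketch below is only an illustration: `resolv_nameservers` is a hypothetical helper name, and the `kubernetes.default` lookup in the comment assumes a standard cluster DNS setup.

```shell
# Print the nameserver addresses from a resolv.conf-style file.
# resolv_nameservers is a hypothetical helper name; the file defaults to
# /etc/resolv.conf, where a pod's resolver settings normally live.
resolv_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example (run inside the pod; kubernetes.default is an assumed test name):
#   resolv_nameservers
#   nslookup kubernetes.default "$(resolv_nameservers | head -n 1)"
```

If the listed address is not the cluster DNS service IP, the pod's DNS configuration is the likely suspect; if it is, the DNS service itself or the path to it deserves a closer look.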

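Large‑packet loss can be confirmed by increasing the payload size with ping -s (on Linux, adding -M do forbids fragmentation so oversized packets fail visibly). It usually results from MTU mismatches between Docker, the CNI plugin, and the host NICs.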
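
The payload size to pass to ping -s follows from the MTU under test. The sketch below shows the arithmetic, assuming IPv4 without header options and the Linux iputils version of ping; `icmp_payload_for_mtu` is a hypothetical helper name.

```shell
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Example: probe a path assumed to carry a 1500-byte MTU.
# -M do sets the don't-fragment bit (Linux iputils ping), so an oversized
# packet produces an error instead of being fragmented along the way.
#   ping -c 3 -M do -s "$(icmp_payload_for_mtu 1500)" <target-ip>
```

If the probe sized for the expected MTU fails while a smaller payload succeeds, some hop or interface on the path has a lower MTU than assumed.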

Anomalies in the cluster networking components themselves (e.g., kube‑proxy failure or pod CIDR exhaustion in the CNI plugin) also affect pod connectivity.

The overall classification is illustrated in an accompanying diagram.

2. Common Network Troubleshooting Tools

After understanding typical anomalies, the article introduces several essential tools.

tcpdump

A powerful packet sniffer that can capture traffic on any interface and filter by host, IP, network, port, packet size, and TCP flags.

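tcpdump -D                  # list the interfaces available for capture

tcpdump host 1.1.1.1        # traffic to or from a host

tcpdump src 1.1.1.1         # traffic from a source address

tcpdump dst 1.1.1.1         # traffic to a destination address

tcpdump net 1.2.3.0/24      # traffic to or from a subnet

tcpdump -c 1 -X icmp        # capture one ICMP packet and print it in hex and ASCII

tcpdump port 3389           # traffic on a single port

tcpdump portrange 21-23     # traffic on a range of ports

tcpdump less 32             # packets of at most 32 bytes

tcpdump -w capture_file     # write raw packets to a file for later analysis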

Logical operators (and, or, not) can combine filters, e.g.:

tcpdump -i eth0 -nn host 220.181.57.216 and 10.0.0.1

nsenter

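Runs a command from the host inside the network namespace of a running container, which is useful when the container image lacks tools like sudo or netstat. The <pid> argument is the host PID of a process in the container.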

nsenter -t <pid> -n <command>
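
In practice the PID has to be looked up first. The sketch below wraps the call in a hypothetical `netns_run` helper with a dry-run switch; the `docker inspect` lookup in the comment assumes a Docker runtime, and other runtimes need their own way of finding the PID.

```shell
# Run a command inside the network namespace of a process.
# netns_run is a hypothetical wrapper; set DRY_RUN=1 to print the command
# instead of executing it (executing needs root on the node).
netns_run() {
    pid=$1; shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "nsenter -t $pid -n $*"
    else
        nsenter -t "$pid" -n "$@"
    fi
}

# Example, assuming a Docker runtime and a container ID in $cid:
#   pid=$(docker inspect -f '{{.State.Pid}}' "$cid")
#   netns_run "$pid" ip addr
#   netns_run "$pid" tcpdump -i eth0 -nn port 80
```

Because only the network namespace is entered, the tools that run are the node's own binaries, so nothing needs to be installed in the container image.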

paping

Repeatedly opens TCP connections to a port to test reachability and to measure connection time and failure rate.

paping -p 80 -c 10 192.168.1.1
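
Where paping is not installed, a rough substitute can be built from bash's /dev/tcp redirection. This is a different technique from paping and only reports whether a connect succeeded; the helper names `tcp_probe` and `tcp_probe_loss` are hypothetical, and the sketch assumes bash and coreutils timeout are present.

```shell
# Try one TCP connect to host:port, giving up after 2 seconds.
# Uses bash's built-in /dev/tcp redirection and coreutils timeout.
tcp_probe() {
    timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Repeat the probe and report how many attempts failed.
tcp_probe_loss() {
    host=$1; port=$2; count=${3:-10}; failed=0; i=0
    while [ "$i" -lt "$count" ]; do
        tcp_probe "$host" "$port" || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed/$count failed"
}

# Example:
#   tcp_probe_loss 192.168.1.1 80 10
```

Unlike paping, this sketch does not report connection latency, so it is best treated as a quick reachability check.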

mtr

Combines traceroute and ping, providing loss percentage, latency statistics, and per‑hop analysis.

mtr google.com

mtr -n google.com

mtr -b google.com

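Key columns: Loss%, Snt, Last, Avg, Best, Wrst, StDev. High Loss% or StDev points to a problematic hop, although loss that shows up at an intermediate hop but not at later hops usually reflects ICMP rate limiting on that router rather than real packet loss.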
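
For longer paths it can help to filter the report for lossy hops. The sketch below assumes the text layout produced by `mtr --report -n` (hop lines containing `|--`, host in the second field, loss in the third); `lossy_hops` is a hypothetical helper name.

```shell
# Read an `mtr --report -n` style report on stdin and print every hop whose
# Loss% exceeds the threshold given as the first argument (default 0).
lossy_hops() {
    awk -v t="${1:-0}" '
        index($0, "|--") > 0 {   # hop lines look like: " 1.|-- 10.0.0.1  0.0% ..."
            loss = $3
            sub(/%/, "", loss)   # "20.0%" -> "20.0"
            if (loss + 0 > t + 0) print $2, $3
        }'
}

# Example:
#   mtr --report -n -c 10 <target> | lossy_hops 5
```

The -n flag matters here: with hostname output enabled, the field positions in each hop line can shift and the parsing assumption no longer holds.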

3. Pod Network Troubleshooting Process

The article presents a flowchart for diagnosing pod network issues, guiding users from symptom identification to tool selection and root‑cause isolation.
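
As a rough stand-in for the symptom-identification step, the four categories from section 1 can be expressed as a small decision function. This is not the article's flowchart: the order of checks is an assumption, and `classify_pod_net` is a hypothetical helper name.

```shell
# Map four probe results (1 = ok, anything else = failed) to the anomaly
# categories from section 1. The check order is an assumption.
classify_pod_net() {
    ping_ok=$1; port_ok=$2; dns_ok=$3; bigping_ok=$4
    if   [ "$ping_ok" != 1 ];    then echo "network unreachable"
    elif [ "$port_ok" != 1 ];    then echo "port unreachable"
    elif [ "$dns_ok" != 1 ];     then echo "DNS resolution failure"
    elif [ "$bigping_ok" != 1 ]; then echo "large-packet loss"
    else echo "no anomaly detected"
    fi
}

# Example: ping works, the port check fails.
#   classify_pod_net 1 0 1 1
```

Each category then points to its own tools: routing and firewall checks for the first, listener and connection checks for the second, resolver checks for the third, and MTU checks for the fourth.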

4. Case Studies

Node Expansion Causes Service Inaccessibility

After adding a new worker node, the node could not reach a registry service via its ClusterIP, while other nodes could. Investigation showed kube‑proxy and iptables were healthy, but the new node had two IP addresses (static + DHCP) on the same NIC, causing NAT mismatches. Removing the DHCP configuration resolved the issue.
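
A duplicate address of this kind can be spotted by counting the IPv4 addresses on the interface. The sketch below parses the one-line-per-address output of `ip -o -4 addr show`; `ipv4_count` is a hypothetical helper name and `eth0` is only an example interface.

```shell
# Count the IPv4 addresses assigned to one interface, reading
# `ip -o -4 addr show` output on stdin.
ipv4_count() {
    awk -v dev="$1" '$2 == dev && $3 == "inet" { n++ } END { print n + 0 }'
}

# Example:
#   ip -o -4 addr show | ipv4_count eth0
```

A count above one on the node's primary NIC is worth a closer look; in the `ip` output, an address obtained through DHCP is typically marked `dynamic`.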

External Cloud Host Times Out When Calling Cluster Service

A cloud VM could telnet to a NodePort service but HTTP requests timed out. Packet captures revealed a successful TCP handshake, but large packets (≈1514 B on the wire, i.e. a full 1500‑byte IP packet plus the 14‑byte Ethernet header) were repeatedly retransmitted. The root cause was an MTU mismatch: the host used MTU 1500 while the Calico tunnel interface used MTU 1440. Aligning MTU values fixed the problem.
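
A mismatch like this is easy to see when the MTUs of all interfaces on a node are listed side by side. The sketch below reads them from sysfs; `list_mtus` is a hypothetical helper name, and the optional directory argument exists only so the function can be pointed at a different path.

```shell
# Print "<interface> <mtu>" for every interface under a sysfs-style directory
# (default /sys/class/net, where Linux exposes one directory per interface).
list_mtus() {
    dir=${1:-/sys/class/net}
    for f in "$dir"/*/mtu; do
        [ -r "$f" ] || continue
        iface=${f%/mtu}
        echo "${iface##*/} $(cat "$f")"
    done
}

# Example (on the node):
#   list_mtus
```

Comparing the value of the physical NIC with that of the tunnel interface shows at a glance whether the two differ.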

Pod Fails to Access Object Storage by Domain Name

Pods could reach the storage IP but DNS resolution failed. The cluster’s DNS pods were pending because the kube‑proxy pods had not been given the highest priority class and were evicted under resource pressure. Assigning the system-node-critical priority class to kube‑proxy and adding readiness probes for application pods restored DNS functionality.
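
To see which system pods would be exposed to this kind of eviction, the priority class of each pod can be listed and filtered. The sketch below assumes the two-column output of the `kubectl` command shown in the comment; `pods_without_system_priority` is a hypothetical helper name.

```shell
# Read "<pod-name> <priority-class>" lines on stdin and print the pods that
# use neither of the two built-in system priority classes.
pods_without_system_priority() {
    awk '$2 != "system-node-critical" && $2 != "system-cluster-critical" { print $1 }'
}

# Example:
#   kubectl -n kube-system get pods --no-headers \
#     -o custom-columns=NAME:.metadata.name,PRIORITY:.spec.priorityClassName \
#     | pods_without_system_priority
```

Pods with no priority class set appear as `<none>` in that column and are therefore included in the output.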

These examples demonstrate how systematic classification, appropriate tooling, and careful inspection of CNI, kube‑proxy, iptables, and DNS components can quickly pinpoint and resolve Kubernetes network anomalies.
