How Do Packets Flow Inside and Outside Kubernetes? A Deep Dive into Pod Networking
This article traces how Kubernetes forwards packets, from the initial web request down to the containers that run the application. It covers the Kubernetes network model, Linux network namespaces, the role of the pause container, pod-to-pod communication on the same and different nodes, CNI plugins, and how Services use Netfilter and iptables to rewrite traffic.
Linux Network Namespaces in Pods
Kubernetes defines a network model with three basic rules: Pods can communicate with any other Pod without NAT, processes on a node can communicate with any Pod on that node without NAT, and each Pod has its own IP address that other Pods can use.
To illustrate, consider a Pod with an nginx and a busybox container:
apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod  # Pod names must be lowercase
spec:
  containers:
    - name: container-1
      image: busybox
      command: ['/bin/sh', '-c', 'sleep 1d']
    - name: container-2
      image: nginx
When the Pod is created:
The Pod receives its own network namespace on the node.
An IP address is assigned to the Pod, and its containers share that address and a single port space, so two containers in the Pod cannot listen on the same port.
Both containers share the same network namespace and can reach each other over localhost.
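Because the containers share one network namespace, they also share one port space. The following is a minimal sketch of that consequence. It uses two sockets on the loopback interface as stand-ins for the two containers, which is an assumption made for illustration; it does not create a Pod.

```python
import socket


def second_bind_conflicts() -> bool:
    """Return True if a second socket cannot bind an address:port already in use."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the kernel for any free port on loopback.
        first.bind(("127.0.0.1", 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # Same namespace, same address, same port: the kernel refuses.
            second.bind(("127.0.0.1", port))
        except OSError:
            return True
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("second bind rejected:", second_bind_conflicts())
```

In a Pod this is why nginx and a second web server in the same Pod would need different ports, while the same two containers in separate Pods would not.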
Linux network namespaces are isolated logical copies of the network stack. Each namespace has its own interfaces, routing tables, firewall rules, and other network resources, so processes inside it see only that namespace's network.
The ip netns list command can list namespaces on the host, and you will see entries such as cni-xxxx created by CNI plugins.
When a Pod is created, the container runtime (containerd or CRI‑O) creates the network namespace before any containers start; manual ip netns commands are not required.
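On Linux, each process exposes its network namespace as the symlink /proc/<pid>/ns/net, whose target looks like net:[4026531992]. Two processes in the same namespace show the same number. The sketch below compares them; the helper names netns_inode and same_netns are hypothetical, and reading another process's link may require elevated privileges.

```python
import os
import re


def netns_inode(link_target: str) -> int:
    """Extract the inode number from a link target such as 'net:[4026531992]'."""
    match = re.fullmatch(r"net:\[(\d+)\]", link_target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link_target!r}")
    return int(match.group(1))


def same_netns(pid_a: int, pid_b: int) -> bool:
    """Compare the network namespaces of two processes (Linux only)."""
    link_a = os.readlink(f"/proc/{pid_a}/ns/net")
    link_b = os.readlink(f"/proc/{pid_b}/ns/net")
    return netns_inode(link_a) == netns_inode(link_b)
```

Run on a node against the PIDs of two containers from the same Pod, same_netns should report True; against containers from different Pods it should report False.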
Pause Container Creates the Network Namespace
Every Pod has an additional hidden container called the pause container. It runs a minimal process that only sleeps until it is signalled, and its role is to hold the Pod's network namespace open for as long as the Pod exists, even if the application containers restart.
Listing containers on a node shows a pause container for each Pod, e.g. k8s.gcr.io/pause:3.4.1. The namespace is set up together with the pause container, and all other containers in the Pod join that namespace.
Pod‑to‑Pod Traffic
Pod‑to‑Pod communication can happen in two scenarios: both Pods on the same node or on different nodes.
On the same node, a virtual Ethernet (veth) pair connects the Pod's namespace to the root namespace, and an Ethernet bridge in the root namespace links all veth interfaces together. The sending Pod uses ARP to resolve the destination Pod's MAC address, and the bridge acts as a virtual switch that learns which MAC address sits behind which interface and forwards frames accordingly.
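The switching behaviour of the bridge can be sketched as a small MAC-learning table. This is a toy model under simplifying assumptions (no ageing, no VLANs), not the Linux bridge implementation, and the class name LearningBridge and the port names are hypothetical.

```python
from typing import Dict, List


class LearningBridge:
    """Toy model of a bridge that learns MAC addresses per port."""

    def __init__(self, ports: List[str]) -> None:
        self.ports = list(ports)
        self.mac_table: Dict[str, str] = {}  # MAC address -> port name

    def receive(self, in_port: str, src_mac: str, dst_mac: str) -> List[str]:
        """Return the ports a frame is sent out of."""
        # Learn which port the sender is reachable through.
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown or broadcast destination: flood to every other port.
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []  # destination is on the same segment as the sender
        return [out_port]
```

An ARP request is a broadcast, so the bridge floods it; once both Pods have sent a frame, later traffic between them goes out of exactly one veth interface.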
When Pods are on different nodes, the packet is first forwarded to the node’s default gateway after the bridge stage. The gateway then routes the packet to the destination node, where the same bridge and veth mechanisms deliver it to the target Pod.
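To decide whether a destination IP is in the local subnet, the sender applies a bitwise AND between the subnet mask and both its own IP and the destination IP. If the two results match, the destination is local; if not, the packet is sent to the default gateway.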
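The subnet check can be written out in a few lines. This is a minimal sketch for IPv4 only; the function name next_hop and all addresses are made up for illustration.

```python
import ipaddress


def next_hop(src_ip: str, prefix_len: int, dst_ip: str, gateway: str) -> str:
    """Return dst_ip if it shares the sender's subnet, otherwise the gateway."""
    # Build the 32-bit mask for the prefix, e.g. /24 -> 0xFFFFFF00.
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    # Same network bits after the AND means the destination is local.
    if (src & mask) == (dst & mask):
        return dst_ip
    return gateway
```

For a Pod at 10.0.1.5/24, a destination of 10.0.1.9 is delivered directly, while 10.0.2.9 is handed to the gateway.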
Container Network Interface (CNI)
CNI plugins implement the networking requirements of Kubernetes. Common plugins include Calico, Cilium, Flannel, Weave Net, and others. They perform actions such as creating interfaces and veth pairs, setting up network namespaces, configuring static routes and bridges, allocating IP addresses, and installing NAT rules.
CNI defines four operations: ADD (add a container to the network), DEL (remove a container from the network), CHECK (verify that a container's networking is still as expected), and VERSION (report the versions the plugin supports).
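A CNI plugin is an executable: the runtime sets the operation in the CNI_COMMAND environment variable, passes the network configuration as JSON on standard input, and reads a JSON result from standard output. The skeleton below sketches only that dispatch. It configures nothing, the result fields are reduced to a bare minimum, and the SUPPORTED list and function names are assumptions for illustration.

```python
import json
import os
import sys
from typing import Optional

SUPPORTED = ["0.4.0", "1.0.0"]  # hypothetical list for this sketch


def handle(command: str, config: dict) -> Optional[dict]:
    """Dispatch one CNI operation and return the result to print, if any."""
    version = config.get("cniVersion", "1.0.0")
    if command == "VERSION":
        return {"cniVersion": version, "supportedVersions": SUPPORTED}
    if command == "ADD":
        # A real plugin would create the veth pair and assign an IP here.
        return {"cniVersion": version, "interfaces": [], "ips": []}
    if command in ("DEL", "CHECK"):
        # Success is signalled by exiting cleanly with no result body.
        return None
    raise ValueError(f"unsupported CNI_COMMAND: {command!r}")


def main() -> None:
    """Entry point a plugin binary would use (not called automatically here)."""
    raw = sys.stdin.read()
    config = json.loads(raw) if raw.strip() else {}
    result = handle(os.environ.get("CNI_COMMAND", ""), config)
    if result is not None:
        json.dump(result, sys.stdout)
```

Keeping handle separate from main makes the dispatch testable without setting environment variables or piping input.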
For example, a Calico configuration file (/etc/cni/net.d/10-calico.conflist) specifies the plugin type, IPAM settings, and additional capabilities like bandwidth limiting and port mapping.
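A conflist is a JSON document with a plugins array that the runtime walks in order. The snippet below parses a heavily trimmed, illustrative configuration and lists each plugin with the capabilities it declares. The JSON is not a complete Calico file; real installations carry many more fields, and the network name shown is an assumption.

```python
import json
from typing import List, Tuple

# Trimmed, illustrative conflist; not a complete Calico configuration.
EXAMPLE_CONFLIST = """
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {"type": "calico", "ipam": {"type": "calico-ipam"}},
    {"type": "bandwidth", "capabilities": {"bandwidth": true}},
    {"type": "portmap", "snat": true, "capabilities": {"portMappings": true}}
  ]
}
"""


def summarize(conflist_text: str) -> List[Tuple[str, List[str]]]:
    """Return (plugin type, declared capabilities) for each chained plugin."""
    conf = json.loads(conflist_text)
    return [
        (plugin["type"], sorted(plugin.get("capabilities", {})))
        for plugin in conf["plugins"]
    ]


if __name__ == "__main__":
    for plugin_type, capabilities in summarize(EXAMPLE_CONFLIST):
        print(plugin_type, capabilities)
```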
Pod‑to‑Service Traffic
Services provide a stable virtual IP (VIP) that abstracts the dynamic Pod IPs. When a Pod sends traffic to a Service VIP, Netfilter’s NF_IP_PRE_ROUTING hook triggers iptables rules that DNAT the packet’s destination IP to the selected backend Pod’s IP.
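After routing, the packet reaches the backend Pod. conntrack records the translation when the DNAT rule is applied, so when the backend Pod replies, the kernel reverses it: the reply's source IP is rewritten from the backend Pod's IP back to the Service VIP, and the client sees a response from the address it originally contacted.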
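The rewrite in both directions can be modelled with a small table. This is a simplified sketch, not kernel behaviour: it ignores ports and protocols, keys connections on IP addresses only, and takes backend selection as a caller-supplied function. The class ServiceNat and all addresses are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class ServiceNat:
    """Toy model of DNAT on the way out and the reverse rewrite on replies."""

    def __init__(self, backends: Dict[str, List[str]],
                 pick: Callable[[List[str]], str]) -> None:
        self.backends = backends  # Service VIP -> backend Pod IPs
        self.pick = pick          # backend selection strategy
        self.conntrack: Dict[Tuple[str, str], str] = {}  # (client, backend) -> VIP

    def outbound(self, pkt: Packet) -> Packet:
        """Rewrite a packet addressed to a VIP so it targets a backend Pod."""
        pods = self.backends.get(pkt.dst)
        if not pods:
            return pkt  # not Service traffic, leave untouched
        backend = self.pick(pods)
        self.conntrack[(pkt.src, backend)] = pkt.dst  # remember the translation
        return replace(pkt, dst=backend)

    def reply(self, pkt: Packet) -> Packet:
        """Restore the VIP as the source of a reply from a backend Pod."""
        vip = self.conntrack.get((pkt.dst, pkt.src))
        if vip is None:
            return pkt
        return replace(pkt, src=vip)
```

With a VIP of 10.96.0.10 and a picker that always returns the first backend, a request from 10.0.1.5 is redirected to the first Pod IP, and that Pod's reply arrives back at the client with 10.96.0.10 as its source.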
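The built-in iptables chains are PREROUTING, INPUT, FORWARD, OUTPUT, and POSTROUTING, each attached to the corresponding Netfilter hook (NF_IP_PRE_ROUTING, NF_IP_LOCAL_IN, NF_IP_FORWARD, NF_IP_LOCAL_OUT, and NF_IP_POST_ROUTING). You can view the current rules with iptables-save.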
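The output of iptables-save is line-oriented: a line starting with * names a table, lines starting with : declare chains, and lines starting with -A append a rule to a chain. The sketch below groups rules by table and chain. The SAMPLE text is a shortened, illustrative dump rather than real output from a cluster, and rules_by_chain is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Shortened, illustrative dump; real clusters have far more rules.
SAMPLE = """\
*nat
:PREROUTING ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
:POSTROUTING ACCEPT [0:0]
:KUBE-SERVICES - [0:0]
:KUBE-POSTROUTING - [0:0]
-A PREROUTING -j KUBE-SERVICES
-A OUTPUT -j KUBE-SERVICES
-A POSTROUTING -j KUBE-POSTROUTING
COMMIT
"""


def rules_by_chain(dump: str) -> Dict[Tuple[str, str], List[str]]:
    """Group '-A' rules from an iptables-save dump by (table, chain)."""
    rules: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    table = ""
    for line in dump.splitlines():
        line = line.strip()
        if line.startswith("*"):
            table = line[1:]  # e.g. "*nat" -> "nat"
        elif line.startswith("-A "):
            parts = line.split(maxsplit=2)
            rest = parts[2] if len(parts) > 2 else ""
            rules[(table, parts[1])].append(rest)
    return dict(rules)


if __name__ == "__main__":
    for (table, chain), chain_rules in rules_by_chain(SAMPLE).items():
        print(table, chain, chain_rules)
```

On a node, the same function could be fed the text captured from iptables-save to see which chains jump into the Service-related chains.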
Summary
How containers communicate locally within a Pod.
Pod‑to‑Pod communication on the same node and across nodes.
Pod‑to‑Service traffic and how Service VIPs are rewritten using Netfilter and iptables.
Key concepts: network namespaces, veth pairs, bridges, iptables chains, conntrack, Netfilter, CNI plugins, and overlay networks.
MaGe Linux Operations