
How Do Packets Travel Inside and Outside Kubernetes? A Deep Dive into Pods, Network Namespaces, and CNI

This article explains how Kubernetes forwards packets, following a web request from the initial connection down through container networking. It covers the network model, the pod creation steps, the role of the pause container, same‑node and cross‑node pod‑to‑pod traffic, Service IP translation, and the underlying CNI, iptables, and conntrack mechanisms.


Kubernetes Network Model

Kubernetes requires that every Pod have a unique IP address, that any Pod can reach any other Pod without NAT, and that agents on a node (such as the kubelet and system daemons) can reach all Pods on that node. The container runtime creates a network namespace for the Pod, and the pause container holds that namespace open. All containers in the Pod share the namespace and therefore the same IP address.

Pod Network Namespace Basics

When a multi‑container Pod is created, the runtime allocates a dedicated network namespace on the node, the CNI plugin creates a veth pair, and a single IP address is assigned to the interface eth0 inside the namespace. The example manifest below creates a Pod with a busybox container and an nginx container:

apiVersion: v1
kind: Pod
metadata:
  name: multi-container-pod
spec:
  containers:
  - name: busybox
    image: busybox
    command: ["/bin/sh", "-c", "sleep 1d"]
  - name: nginx
    image: nginx

During creation:

The pod receives its own network namespace.

An interface eth0 is created and the pod IP is assigned.

Both containers share the namespace and can reach each other via localhost.

Listing namespaces on a node shows entries such as cni-xxxx created by the CNI plugin:

$ ip netns list
cni-0f226515-e28b-df13-9f16-dd79456825ac (id: 3)
...

Inspecting the pod namespace reveals the interface state:

$ ip netns exec cni-0f226515-e28b-df13-9f16-dd79456825ac ip a
3: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 ... inet 10.244.4.40/32 ...

Pause Container

The pause container is a minimal sleeping process that holds the network namespace open. Every Pod has exactly one pause container; other containers join the namespace created by it. The runtime (usually containerd or CRI‑O) creates the namespace before any user containers start, so manual ip netns commands are not required.

Pod‑to‑Pod Communication

Communication follows two paths:

Same‑node traffic: The packet leaves the pod via eth0, traverses the veth pair into the node’s root namespace, and the Linux bridge forwards it to the destination pod’s veth after ARP resolves the MAC address.

Cross‑node traffic: After reaching the root namespace, the packet’s destination IP is outside the node’s CIDR, so the node forwards it to its default gateway. The remote node repeats the bridge/veth steps to deliver the packet to the target pod.

Bitwise Routing Decision

The node determines whether a destination IP is local by performing a bitwise AND between the destination IP and the subnet mask and comparing the result with the node's own network address (its pod subnet). If the two differ, the destination is remote and the packet is sent to the default gateway.
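This comparison can be sketched in a few lines of shell; the addresses below are illustrative, reusing the 10.244.0.0/16 pod range from the earlier examples:

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

dst=$(ip_to_int 10.244.5.7)       # destination pod IP
net=$(ip_to_int 10.244.4.0)       # this node's pod subnet
mask=$(ip_to_int 255.255.255.0)   # subnet mask

# AND the destination with the mask and compare with the local network
if [ $(( dst & mask )) -eq "$net" ]; then
  echo "local: deliver via the bridge"
else
  echo "remote: send to the default gateway"
fi
# → remote: send to the default gateway
```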

Container Network Interface (CNI)

CNI plugins implement the network model. Popular plugins include Calico, Cilium, Flannel, and Weave Net. A CNI plugin must support four commands:

ADD – attach a container to the network.

DEL – detach a container from the network.

CHECK – verify that the container's network is configured as expected.

VERSION – report the CNI spec versions the plugin supports.

When a pod is scheduled, kubelet (through the container runtime) passes a JSON configuration and the namespace path to the CNI plugin, which then:

Operates on the network namespace already created for the pause container.

Creates a veth pair and moves one end into the pod namespace.

Assigns an IP address from the cluster pool.

Sets up routing, bridge attachment, and optional NAT rules.

Example Calico CNI configuration (found in /etc/cni/net.d/10-calico.conflist):

{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "datastore_type": "kubernetes",
      "mtu": 0,
      "log_level": "Info",
      "ipam": {"type": "calico-ipam", "assign_ipv4": "true"},
      "policy": {"type": "k8s"},
      "kubernetes": {
        "k8s_api_root": "https://10.96.0.1:443",
        "kubeconfig": "/etc/cni/net.d/calico-kubeconfig"
      }
    },
    {"type": "bandwidth", "capabilities": {"bandwidth": true}},
    {"type": "portmap", "snat": true, "capabilities": {"portMappings": true}}
  ]
}
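The contract between the runtime and a plugin is simple: the runtime sets a handful of CNI_* environment variables (CNI_COMMAND, CNI_CONTAINERID, CNI_NETNS, CNI_IFNAME, among others) and pipes the network configuration JSON to the plugin's stdin. The sketch below illustrates that calling convention with a stand‑in plugin; the script path, container ID, and namespace path are made up for the example:

```shell
# A stand-in "plugin" that just echoes what the runtime asked of it
cat > /tmp/fake-cni-plugin <<'EOF'
#!/bin/sh
echo "command=$CNI_COMMAND container=$CNI_CONTAINERID ifname=$CNI_IFNAME"
EOF
chmod +x /tmp/fake-cni-plugin

# Invoke it the way a runtime would: CNI_* variables set, config on stdin
echo '{"cniVersion":"0.3.1","name":"demo","type":"fake"}' |
  CNI_COMMAND=ADD \
  CNI_CONTAINERID=pod-1234 \
  CNI_NETNS=/var/run/netns/cni-example \
  CNI_IFNAME=eth0 \
  /tmp/fake-cni-plugin
# → command=ADD container=pod-1234 ifname=eth0
```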

veth Pairs and Linux Bridge

Each pod gets a veth pair: one end appears as eth0 inside the pod namespace, while the peer (e.g., vethXXXX) resides in the node's root namespace and is attached to the node's Linux bridge (e.g., cbr0 or weave). The bridge acts as a virtual switch, forwarding frames between all connected veth interfaces.

Manual creation (for illustration) would look like:

# Create a veth pair; the "netns" argument places one end directly in the pod namespace
ip link add veth-pod netns $POD_NS type veth peer name veth-host

# Bring both ends up and give the pod end its IP
ip netns exec $POD_NS ip link set veth-pod up
ip netns exec $POD_NS ip addr add 10.244.4.40/24 dev veth-pod
ip link set veth-host up

# Attach the host end to the bridge
# (brctl is legacy; "ip link set veth-host master cbr0" is the modern equivalent)
brctl addif cbr0 veth-host

Service IP Translation (iptables & conntrack)

Kubernetes Services expose a stable virtual IP (the ClusterIP). When a packet destined for a Service IP arrives, the PREROUTING chain of Netfilter (programmed via iptables) performs DNAT, rewriting the destination IP to that of a selected backend pod. Conntrack records this mapping so that reply traffic has its source rewritten back to the Service IP, reversing the DNAT.

Typical iptables rule created for a Service:

# Example rule for Service 10.96.0.1 (real rules also match protocol and port)
iptables -t nat -A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m tcp --dport 443 -j KUBE-SVC-XYZ

After DNAT, the packet follows the same pod‑to‑pod path described earlier. When the backend pod replies, conntrack rewrites the source IP (SNAT) to the original Service IP, making the client see the response as coming from the Service.
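The two translations can be modeled as a toy sketch, reusing the IPs from the examples above; the real rewriting happens inside the kernel's Netfilter hooks, not in user space:

```shell
#!/bin/sh
# Toy model of DNAT plus the conntrack reverse translation (IPs illustrative)
SERVICE_IP=10.96.0.1
BACKEND_IP=10.244.4.40

# PREROUTING/DNAT: a packet addressed to the Service IP is rewritten
# to a chosen backend pod, and the mapping is remembered.
dst=$SERVICE_IP
conntrack_entry="$dst=$BACKEND_IP"
dst=$BACKEND_IP
echo "forward: dst=$dst"
# → forward: dst=10.244.4.40

# Reply path: conntrack looks up the entry and restores the Service IP,
# so the client sees the response coming from the Service.
reply_src=${conntrack_entry%=*}
echo "reply: src=$reply_src"
# → reply: src=10.96.0.1
```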

Verifying Pod Networking

Retrieve a pod’s IP:

$ kubectl get pod multi-container-pod -o jsonpath='{.status.podIP}'
10.244.4.40

Find the corresponding network namespace on the node (e.g., cni-0f226515-e28b-df13-9f16-dd79456825ac) and inspect the interface:

$ ip netns exec cni-0f226515-e28b-df13-9f16-dd79456825ac ip a
3: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> ... inet 10.244.4.40/32 ...

Check that the nginx container is listening on port 80 inside the pod namespace:

$ ip netns exec cni-0f226515-e28b-df13-9f16-dd79456825ac netstat -lnp
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      692698/nginx: master

Key Takeaways

Each Pod gets a dedicated network namespace, held open by the pause container; all containers in the Pod share the same eth0 and IP address.

Pod‑to‑Pod traffic uses a veth pair and a Linux bridge; same‑node traffic is resolved via ARP, cross‑node traffic is routed through the node’s default gateway.

The node decides local vs. remote destinations with a bitwise AND of IP and subnet mask.

CNI plugins automate namespace creation, veth setup, IP allocation, routing, and optional NAT.

Kubernetes Services provide a stable virtual IP; iptables DNAT rewrites the destination to a backend pod, and conntrack SNAT rewrites the source on the return path.

Tags: Kubernetes, Service, iptables, CNI, conntrack, Network Namespace, Pod Communication
Written by

Cloud Native Technology Community

The Cloud Native Technology Community, part of the CNBPA Cloud Native Technology Practice Alliance, focuses on evangelizing cutting‑edge cloud‑native technologies and practical implementations. It shares in‑depth content, case studies, and event/meetup information on containers, Kubernetes, DevOps, Service Mesh, and other cloud‑native tech, along with updates from the CNBPA alliance.
