How Linux Sends a Packet: From Process to NIC and the Key Metrics to Watch
The article walks through the Linux packet lifecycle—from the send() system call, through the transport and network layers, to the NIC driver—explaining each step, virtual‑network abstractions, and the essential bandwidth, latency, loss, conntrack, and socket buffer metrics to monitor when problems arise.
Packet Lifecycle: From Process to NIC
When an application sends data, it calls send(), which triggers a user‑to‑kernel transition. Each system call costs on the order of hundreds of nanoseconds to a few microseconds, and that overhead adds up when many small sends are issued under high concurrency.
Inside the kernel the data is wrapped in a sk_buff (socket buffer), the universal container for all network traffic. The kernel manipulates pointers to the sk_buff rather than moving the data itself.
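The send() call described above can be tried without any network access. The following is a minimal sketch, assuming a Unix‑like system: it uses socket.socketpair(), which creates a local connected pair rather than a TCP/IP connection, so it exercises only the user‑to‑kernel boundary and the socket buffers, not the routing or NIC steps described later.

```python
import socket

# A connected pair of local sockets stands in for a client and a server.
# Assumption: this is a Unix-domain pair, so no IP routing or NIC is involved.
left, right = socket.socketpair()

payload = b"hello, kernel"

# send() crosses from user space into the kernel, which copies the bytes
# into the socket's buffers. The return value is the number of bytes the
# kernel accepted, which can be smaller than len(payload) on a busy socket.
sent = left.send(payload)

# The peer reads the bytes back out of its receive buffer.
received = right.recv(1024)

left.close()
right.close()

print(sent, received)
```

Because send() may accept fewer bytes than requested, production code usually calls sendall() or loops until the whole payload has been handed to the kernel.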
Step 1 – Transport Layer
If the socket is TCP, the kernel checks the socket state, computes the sequence number, sets flags (SYN/ACK/FIN), adds the TCP header, and splits the payload into segments no larger than the maximum segment size (MSS) if necessary. For UDP, the kernel simply adds a header and passes the packet down, sacrificing reliability for speed.
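The way TCP cuts a large payload into pieces and numbers them can be illustrated with a little arithmetic. This is a simplified sketch, not the kernel's implementation: it assumes IPv4 and TCP headers of 20 bytes each with no options, and the function name segment_payload is hypothetical. Real stacks also negotiate the MSS with the peer and often defer the actual split to the NIC through segmentation offload.

```python
IPV4_HEADER_BYTES = 20  # assumption: no IP options
TCP_HEADER_BYTES = 20   # assumption: no TCP options (timestamps would shrink the MSS)


def segment_payload(payload: bytes, initial_seq: int, mtu: int = 1500):
    """Split payload into (sequence_number, chunk) pairs no larger than the MSS."""
    mss = mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        # TCP sequence numbers count bytes and wrap around at 2**32.
        seq = (initial_seq + offset) % 2**32
        segments.append((seq, chunk))
    return segments


segments = segment_payload(b"x" * 4000, initial_seq=1000)
sizes = [len(chunk) for _, chunk in segments]
seqs = [seq for seq, _ in segments]
print(sizes)  # [1460, 1460, 1080]
print(seqs)   # [1000, 2460, 3920]
```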
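At this point the sk_buff carries the transport header with the source and destination ports; together with the socket's source/destination IPs and the protocol, these identify the flow the packet belongs to.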
Step 2 – Network Layer (Routing)
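The IP layer looks up the kernel routing table (the Forwarding Information Base, which can be viewed with ip route) to decide the outgoing interface and next hop. The most specific matching route wins: if the destination is on a directly connected subnet, the packet is sent straight to it; otherwise it is handed to the gateway of the matching route, which is the default gateway when nothing more specific applies. The IP layer also checks the MTU (typically 1500 bytes). Oversized packets are fragmented unless the Don't Fragment flag is set, in which case the packet is rejected and the sender is expected to reduce its packet size.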
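The routing decision boils down to a longest‑prefix match. The sketch below illustrates the idea with Python's ipaddress module; the routing table entries, addresses, and the lookup function are all hypothetical, and the kernel uses a far more efficient trie structure rather than a linear scan.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, gateway or None, interface).
ROUTES = [
    ("0.0.0.0/0", "192.168.1.1", "eth0"),      # default route
    ("192.168.1.0/24", None, "eth0"),          # directly connected subnet
    ("10.8.0.0/16", "192.168.1.254", "eth0"),  # more specific static route
]


def lookup(destination: str):
    """Return (next_hop, interface) for the most specific matching route."""
    dst = ipaddress.ip_address(destination)
    best = None
    for prefix, gateway, interface in ROUTES:
        network = ipaddress.ip_network(prefix)
        if dst in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, gateway, interface)
    if best is None:
        raise LookupError(f"no route to {destination}")
    _, gateway, interface = best
    # On-link destinations have no gateway: the next hop is the destination itself.
    return (gateway or destination, interface)


print(lookup("192.168.1.42"))  # same subnet, delivered directly
print(lookup("10.8.3.4"))      # matched by the /16 route
print(lookup("8.8.8.8"))       # falls through to the default route
```

On a real host, ip route get followed by an address shows which route the kernel would actually choose for that destination.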
Step 3 – Data‑Link Layer (ARP)
Because the frame must be addressed to the next hop's MAC address, the kernel consults its neighbor (ARP) table, which can be viewed with ip neigh. If the MAC is cached, it is used; otherwise an ARP request is broadcast to discover the MAC for the next hop's IP.
With the MAC address, the packet is encapsulated into an Ethernet frame.
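The encapsulation step amounts to prepending a 14‑byte header. The following sketch packs one with the struct module; the MAC addresses and payload are made up for illustration, and the helper names are hypothetical. It leaves out the preamble, the padding to the minimum frame size, and the frame check sequence, which the hardware normally handles.

```python
import struct

ETHERTYPE_IPV4 = 0x0800


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into its 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_ethernet_frame(dst_mac: str, src_mac: str, payload: bytes,
                         ethertype: int = ETHERTYPE_IPV4) -> bytes:
    # Header layout: 6-byte destination MAC, 6-byte source MAC,
    # 2-byte EtherType in network (big-endian) byte order.
    header = struct.pack("!6s6sH", mac_to_bytes(dst_mac), mac_to_bytes(src_mac), ethertype)
    return header + payload


frame = build_ethernet_frame("aa:bb:cc:dd:ee:ff", "02:00:00:00:00:01", b"ip-packet-bytes")
print(len(frame), frame[:14].hex())
```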
Step 4 – NIC Driver and Ring Buffer
The Ethernet frame is placed into a shared Ring Buffer between the kernel and the NIC. The NIC uses DMA to pull the frame from the Ring Buffer without CPU involvement, freeing the CPU for other work.
A too‑small Ring Buffer can become a bottleneck; the queue fills and packets are dropped. The current size and limits can be inspected with ethtool -g eth0.
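The effect of ring size on drops can be seen in a toy model. This is a deliberately simplified simulation, not how a driver works: the function name, the tick‑based timing, and all the numbers are assumptions chosen for illustration, and real transmit paths usually apply backpressure to the queueing layer before dropping.

```python
from collections import deque


def simulate_ring(ring_size: int, frames_per_tick: int, drained_per_tick: int, ticks: int):
    """Return (sent, dropped) for a fixed-size ring filled and drained once per tick."""
    ring = deque()
    sent = dropped = 0
    for _ in range(ticks):
        # Producer side: frames are enqueued until the ring is full.
        for _ in range(frames_per_tick):
            if len(ring) < ring_size:
                ring.append(object())
            else:
                dropped += 1
        # Consumer side: a bounded number of frames is drained per tick.
        for _ in range(min(drained_per_tick, len(ring))):
            ring.popleft()
            sent += 1
    return sent, dropped


print(simulate_ring(256, 300, 200, 10))   # (2000, 944)
print(simulate_ring(4096, 300, 200, 10))  # (2000, 0)
```

In this model the larger ring absorbs the ten‑tick burst without loss, but it only delays drops: if the producer stays faster than the consumer, any finite ring eventually fills.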
Networking in Virtual Environments
Containers and VMs add extra abstraction layers on top of the Linux network stack:
veth pair: a virtual cable connecting a container’s interface to the host.
bridge (e.g., docker0): a virtual switch that connects all containers.
Network Namespace: each container gets its own network stack, routing table, and iptables rules.
When a container runs curl, the packet leaves the container’s eth0, crosses the veth pair to the host bridge, undergoes routing and NAT on the host, and finally follows the four steps described above out through the host’s physical NIC.
Key Metrics to Diagnose Network Issues
The following metrics cover roughly 80 % of common problems:
Bandwidth: theoretical maximum (1 Gbps, 10 Gbps, 100 Gbps) defined by the NIC and link speed.
Throughput: actual data rate, limited by TCP window, latency, and loss. Test with iperf. Throughput well below the available bandwidth points to stack or upper‑layer bottlenecks.
Latency: round‑trip time measured with ping. It consists of propagation, processing, and queueing delays. High latency combined with loss is especially damaging, because TCP needs at least a round trip to detect and recover each lost segment.
Packet loss rate: monitor with mtr, which reports loss per hop rather than only end to end as ping does. Loss can stem from a full Ring Buffer, a saturated conntrack table, or iptables rules. Hardware‑level drops are visible via ethtool -S eth0.
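Connection Tracking table: the kernel tracks each connection; the limit (nf_conntrack_max) commonly defaults to 65536 or 262144 entries, depending on memory and kernel version. When the table is full, packets that would create new connections are dropped. Compare the current entry count (conntrack -C, or net.netfilter.nf_conntrack_count) against the limit, and check the drop counters with conntrack -S.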
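Socket buffer sizes: controlled by the tcp_rmem (receive) and tcp_wmem (send) sysctls. Values that are too small cap send/receive rates; values that are too large inflate memory usage. Each takes three fields in bytes (min/default/max); a typical tcp_rmem setting is 4096 131072 6291456.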
TCP retransmission rate: a direct sign of packet loss or congestion on the path. Query the counters with nstat -az TcpRetransSegs TcpOutSegs TcpExtTCPSynRetrans and compute retransmitted segments divided by segments sent. A rate above 1% suggests congestion, instability, or lossy links.
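The retransmission rate in the last item can be computed from the kernel's TCP counters. The sketch below parses text in the format of /proc/net/snmp and divides RetransSegs by OutSegs; the sample values are invented for illustration and the function names are hypothetical. These counters are cumulative since boot, so a meaningful rate for the current moment comes from the difference between two samples taken some seconds apart.

```python
# Invented sample in the two-line "header, then values" format of /proc/net/snmp.
SAMPLE_SNMP = """\
Tcp: RtoAlgorithm RtoMin RtoMax MaxConn ActiveOpens PassiveOpens AttemptFails EstabResets CurrEstab InSegs OutSegs RetransSegs InErrs OutRsts InCsumErrors
Tcp: 1 200 120000 -1 100 50 2 3 10 100000 80000 400 0 5 0
"""


def tcp_counters(snmp_text: str) -> dict:
    """Map each Tcp counter name to its integer value."""
    rows = [line.split() for line in snmp_text.splitlines() if line.startswith("Tcp:")]
    names, values = rows[0][1:], rows[1][1:]
    return dict(zip(names, (int(v) for v in values)))


def retransmission_rate(snmp_text: str) -> float:
    """Fraction of sent segments that were retransmissions."""
    counters = tcp_counters(snmp_text)
    if counters["OutSegs"] == 0:
        return 0.0
    return counters["RetransSegs"] / counters["OutSegs"]


rate = retransmission_rate(SAMPLE_SNMP)
print(f"{rate:.2%}")  # 0.50%
```

On a Linux host the same functions can be fed the contents of /proc/net/snmp read with open().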
Common Diagnostic Commands
ss (or netstat) – inspect socket states.
ip (or ifconfig / route) – manage routes and interfaces.
tc – traffic control, queue and bandwidth limits.
ethtool – view NIC hardware details, Ring Buffer size, and driver statistics.
These commands together cover four layers: application‑level connections (ss), network‑layer routing (ip), link‑layer queues (tc), and physical‑layer hardware (ethtool).
Tech Stroll Journey
The philosophy behind "Stroll": continuous learning, curiosity‑driven, and practice‑focused.