
Linux Network Packet Sending Process: Deep Dive into Kernel Implementation

This comprehensive article provides an in-depth analysis of how the Linux kernel sends network packets, covering the complete process from the user-space send() call through protocol stack processing to hardware transmission, with detailed source code examination and performance considerations.


This article provides a comprehensive analysis of the Linux kernel network packet sending process, starting from the user-space send() system call and tracing through the entire transmission pipeline. The author begins by addressing three common questions about network performance monitoring and then systematically explains each stage of packet transmission.

The article covers network interface initialization, including RingBuffer allocation for multiple transmit and receive queues. It explains how accept() creates new sockets for client connections and then dives into the send() system call implementation, which ultimately calls inet_sendmsg() in the protocol stack.
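The user-space side of this flow can be sketched with Python's socket module (standing in for the C socket calls; the function name is illustrative): a listening socket, an accept() that yields a new per-connection socket, and a send() that crosses into the kernel toward inet_sendmsg().

```python
import socket
import threading

def tcp_echo_roundtrip(payload: bytes) -> bytes:
    # Listening socket; accept() creates a *new* kernel socket per client.
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]

    def server():
        conn, _ = srv.accept()        # new socket object for this connection
        conn.sendall(conn.recv(1024)) # echo back
        conn.close()

    t = threading.Thread(target=server)
    t.start()
    cli = socket.create_connection(("127.0.0.1", port))
    cli.sendall(payload)              # send() enters the kernel here
    out = cli.recv(1024)
    cli.close()
    t.join()
    srv.close()
    return out
```

Running this over loopback exercises the same syscall boundary the article traces, even though no physical NIC is involved.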

The transport-layer processing is detailed extensively, showing how tcp_sendmsg() allocates kernel skb structures, copies user data, and manages the send queue. The article explains when actual transmission occurs based on conditions like window size and Nagle's algorithm. It then traces through the network layer (ip_queue_xmit), including routing table lookup, IP header construction, netfilter filtering, and potential packet fragmentation for large packets.
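One of those transmission conditions can be toggled from user space: the TCP_NODELAY socket option disables Nagle's algorithm, so small writes are pushed out immediately instead of waiting for outstanding data to be acknowledged. A minimal sketch:

```python
import socket

def nagle_disabled(sock: socket.socket) -> bool:
    """True if TCP_NODELAY is set, i.e. Nagle's algorithm is off."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# By default Nagle is on: small segments may be held back until
# previously sent data is ACKed. TCP_NODELAY asks the kernel to
# transmit without that coalescing.
s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```

Latency-sensitive request/response protocols commonly set this option, trading a few extra small packets for lower per-message delay.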

The neighbor subsystem is explained as the bridge between the network and data link layers, handling ARP resolution and MAC address encapsulation. The network device subsystem is covered in detail, including transmit queue selection, qdisc (queue discipline) processing, and the distinction between kernel time charged to the sending process (system time) and softirq processing time.
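As a rough illustration of transmit queue selection (a toy model, not kernel code), when no explicit XPS mapping applies the choice boils down to hashing the flow tuple and reducing it modulo the number of hardware queues, which keeps all packets of one flow on one queue:

```python
import hashlib

def pick_tx_queue(src: str, sport: int, dst: str, dport: int,
                  num_queues: int) -> int:
    # Toy stand-in for the kernel's flow hash: same 4-tuple always
    # maps to the same transmit queue, avoiding packet reordering.
    key = f"{src}:{sport}-{dst}:{dport}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_queues
```

The important property is determinism per flow, not the particular hash function; the real kernel uses its own flow-hash machinery.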

The article provides specific implementation details for the Intel igb network driver, showing how skb structures are mapped to DMA-accessible memory and placed in RingBuffers. It explains the interrupt-driven completion process and memory cleanup through softirqs.
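The RingBuffer life cycle the driver section describes can be modeled with a toy bounded queue (illustrative only; the real igb driver manages DMA descriptors in hardware-visible memory): the driver posts buffers until the ring fills, and a later completion interrupt lets it reclaim them in softirq context.

```python
from collections import deque

class TxRing:
    """Toy model of a driver transmit ring (not actual driver code)."""

    def __init__(self, size: int):
        self.size = size
        self.ring = deque()

    def post(self, skb: str) -> bool:
        if len(self.ring) >= self.size:
            # Ring full: a real driver would stop the queue here
            # (netif_stop_queue) until completions free slots.
            return False
        self.ring.append(skb)
        return True

    def clean_completed(self, n: int) -> list:
        # Models the softirq cleanup after a tx-completion interrupt:
        # the oldest n descriptors are reclaimed and their skbs freed.
        freed = []
        for _ in range(min(n, len(self.ring))):
            freed.append(self.ring.popleft())
        return freed
```

This captures the key back-pressure behavior: transmission stalls when completions lag behind postings.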

Throughout the article, the author addresses the three opening questions: why NET_RX softirq counts are higher than NET_TX, whether to monitor system or softirq CPU time for network transmission, and the multiple memory copy operations involved in packet sending. The article concludes with a summary diagram and practical insights for network performance optimization.
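The NET_RX vs. NET_TX imbalance is easy to check by summing the per-CPU counters in /proc/softirqs. The sketch below parses sample text in that file's format (the numbers are made up; on a real host, read the file itself):

```python
# Sample in /proc/softirqs format; counters here are invented for
# illustration. Transmit completions are often accounted under NET_RX,
# which is one reason NET_RX totals dwarf NET_TX.
SAMPLE = """\
          CPU0       CPU1
 NET_TX:  1234        987
 NET_RX: 876543     765432
"""

def softirq_totals(text: str) -> dict:
    """Sum each softirq's per-CPU counters into one total."""
    totals = {}
    for line in text.splitlines():
        if ":" in line:
            name, _, rest = line.partition(":")
            totals[name.strip()] = sum(int(x) for x in rest.split())
    return totals
```

On a live system, `softirq_totals(open("/proc/softirqs").read())` gives the same per-softirq totals for all rows of the file.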

Tags: performance monitoring, DMA, TCP/IP, Linux Kernel, network stack, RingBuffer, igb driver, packet transmission, SoftIRQ
Written by

Refining Core Development Skills

Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.
