How Linux Receives Network Packets: From NIC Interrupts to recvfrom
This article provides a detailed, step‑by‑step walkthrough of the Linux kernel’s packet‑receiving path, covering NIC DMA, hardware and soft interrupts, ksoftirqd threads, NAPI polling, protocol‑stack registration, IP/UDP processing, and the final recvfrom system call that delivers data to user space.
When a Linux server must handle millions of concurrent network connections, understanding the exact path a packet takes from the NIC to the user‑space recvfrom call is essential for performance tuning.
1. Linux Network Stack Overview
The TCP/IP model is commonly described as a stack of physical, link, network, transport and application layers. In Linux, the link‑layer NIC driver, the network layer (IP) and the transport layer (TCP/UDP) are implemented inside the kernel; user programs in the application layer reach them through the socket API.
2. Kernel Preparation Before Receiving Packets
Before any packet can be processed the kernel must initialise several components:
Create the ksoftirqd per‑CPU kernel threads (see spawn_ksoftirqd in kernel/softirq.c).
Register the network subsystem via subsys_initcall(net_dev_init), which allocates a softnet_data structure for each CPU and registers soft‑interrupt handlers such as NET_RX_SOFTIRQ with open_softirq.
Initialise the NIC driver (e.g., Intel igb) by calling pci_register_driver and setting up net_device_ops (open, stop, xmit, etc.).
Start the NIC: allocate TX/RX descriptors, request IRQs/MSI‑X, enable NAPI, and bind the driver’s poll function (igb_poll) to the NIC.
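The registration step can be pictured with a small user‑space model. This is only a sketch: every model_* name is hypothetical, the softirq numbers are illustrative, and the real kernel keeps the pending mask per CPU and handles re‑raised softirqs, which is omitted here. It shows the idea behind open_softirq: a handler is stored in a table indexed by softirq number, and raising a softirq merely sets a bit that a later dispatch pass acts on.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; the numbers below are illustrative. */
enum {
    MODEL_NET_TX_SOFTIRQ = 2,
    MODEL_NET_RX_SOFTIRQ = 3,
    MODEL_NR_SOFTIRQS    = 10
};

typedef void (*model_softirq_fn)(void);

static model_softirq_fn model_softirq_vec[MODEL_NR_SOFTIRQS];
static uint32_t model_pending;   /* one bit per softirq number */
static int model_rx_runs;        /* counts runs of the RX handler */

/* Register a handler for a softirq number (the role open_softirq plays). */
static void model_open_softirq(int nr, model_softirq_fn fn)
{
    model_softirq_vec[nr] = fn;
}

/* Mark a softirq as pending; nothing runs yet. */
static void model_raise_softirq(int nr)
{
    model_pending |= (uint32_t)1 << nr;
}

/* Run each pending handler once, lowest number first.
 * Returns how many handlers were invoked. */
static int model_do_softirq(void)
{
    uint32_t pending = model_pending;
    int ran = 0;

    model_pending = 0;
    for (int nr = 0; nr < MODEL_NR_SOFTIRQS && pending != 0; nr++, pending >>= 1) {
        if ((pending & 1u) && model_softirq_vec[nr] != NULL) {
            model_softirq_vec[nr]();
            ran++;
        }
    }
    return ran;
}

/* Stand-in for net_rx_action. */
static void model_net_rx_action(void)
{
    model_rx_runs++;
}
```

After model_open_softirq(MODEL_NET_RX_SOFTIRQ, model_net_rx_action), a call to model_raise_softirq followed by model_do_softirq runs the handler exactly once and clears the pending bit.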
3. Packet Arrival – Hardware Interrupt
The NIC uses DMA to write the incoming frame into the RX ring buffer in host memory and then raises a hardware interrupt. The interrupt handler does only minimal work:
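    static irqreturn_t igb_msix_ring(int irq, void *data)
    {
        struct igb_q_vector *q_vector = data;

        /* Write the ITR value calculated from the previous interrupt. */
        igb_write_itr(q_vector);

        /* Defer the real work: queue this vector's NAPI instance for polling. */
        napi_schedule(&q_vector->napi);

        return IRQ_HANDLED;
    }
The handler updates the interrupt throttle rate (ITR) register and schedules the NAPI poll via napi_schedule, which adds the queue’s napi_struct to the current CPU’s poll_list and raises the NET_RX_SOFTIRQ soft interrupt.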
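The scheduling step can be illustrated with a user‑space sketch. The toy_* types and functions are hypothetical stand‑ins, not kernel APIs, and the atomic bit operations and interrupt masking the kernel relies on are left out. The point is that a "scheduled" flag makes repeated interrupts for the same queue cheap: only the first one appends the instance to the poll list and marks the RX softirq pending.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct napi_struct. */
struct toy_napi {
    bool scheduled;           /* plays the role of NAPI_STATE_SCHED */
    struct toy_napi *next;    /* link in the poll list */
};

/* Hypothetical stand-in for the per-CPU softnet_data. */
struct toy_softnet {
    struct toy_napi *head;    /* first instance waiting to be polled */
    struct toy_napi *tail;    /* last instance waiting to be polled */
    bool rx_softirq_pending;  /* plays the role of the NET_RX_SOFTIRQ bit */
};

/* Queue an instance for polling unless it is already queued.
 * Returns true if this call queued it. */
static bool toy_napi_schedule(struct toy_softnet *sd, struct toy_napi *n)
{
    if (n->scheduled)
        return false;         /* already queued: nothing more to do */

    n->scheduled = true;
    n->next = NULL;
    if (sd->tail != NULL)
        sd->tail->next = n;
    else
        sd->head = n;
    sd->tail = n;

    sd->rx_softirq_pending = true;
    return true;
}

/* Count the instances currently waiting on the poll list. */
static size_t toy_poll_list_len(const struct toy_softnet *sd)
{
    size_t len = 0;

    for (const struct toy_napi *n = sd->head; n != NULL; n = n->next)
        len++;
    return len;
}
```

Calling toy_napi_schedule twice on the same instance queues it once; the second call returns false and leaves the list unchanged.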
4. Soft‑Interrupt Processing by ksoftirqd
The per‑CPU ksoftirqd thread runs run_ksoftirqd. When NET_RX_SOFTIRQ is pending it calls __do_softirq, which dispatches the registered action net_rx_action:
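    /* Abridged: locking, netpoll and poll-list maintenance are omitted. */
    static void net_rx_action(struct softirq_action *h)
    {
        struct softnet_data *sd = &__get_cpu_var(softnet_data);
        unsigned long time_limit = jiffies + 2;  /* stop after about 2 jiffies */
        int budget = netdev_budget;              /* packet budget for this run */

        local_irq_disable();
        while (!list_empty(&sd->poll_list)) {
            struct napi_struct *n;
            int work = 0, weight;

            /* Yield once the budget or the time slice is used up. */
            if (budget <= 0 || time_after_eq(jiffies, time_limit))
                break;

            n = list_first_entry(&sd->poll_list, struct napi_struct, poll_list);
            weight = n->weight;                  /* per-device poll limit */
            if (test_bit(NAPI_STATE_SCHED, &n->state))
                work = n->poll(n, weight);       /* igb_poll for igb queues */
            budget -= work;
        }
        local_irq_enable();
    }
Each poll call may process at most weight packets, and the loop stops once the budget is spent or the time limit has passed; any remaining work is picked up by a later softirq run. The NAPI poll function for the igb driver is igb_poll, which retrieves packets from the NIC ring buffer and passes them to the generic network stack via napi_gro_receive (GRO aggregation) and finally netif_receive_skb.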
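The interaction between the global budget and the per‑device weight is easier to see in a small simulation. The sketch below is a user‑space model with hypothetical sim_* names; it ignores the time limit and all list handling and simply polls queues round‑robin. The values 300 and 64 used in the note afterwards are assumed here as typical defaults for netdev_budget and the NAPI weight.

```c
#include <stddef.h>

/* Hypothetical model of one RX queue. */
struct sim_queue {
    int backlog;   /* packets waiting in this queue's ring */
    int weight;    /* most packets a single poll call may take */
};

/* Poll one queue: consume up to `weight` packets and report how many. */
static int sim_poll(struct sim_queue *q)
{
    int work = q->backlog < q->weight ? q->backlog : q->weight;

    q->backlog -= work;
    return work;
}

/* Poll the queues round-robin until they are drained or the budget is
 * used up. Returns the total number of packets processed. */
static int sim_net_rx_action(struct sim_queue *qs, size_t nqueues, int budget)
{
    int processed = 0;
    int progress = 1;

    while (budget > 0 && progress) {
        progress = 0;
        for (size_t i = 0; i < nqueues && budget > 0; i++) {
            int work = sim_poll(&qs[i]);

            budget -= work;        /* charged after the poll, as in the loop above */
            processed += work;
            if (work > 0)
                progress = 1;
        }
    }
    return processed;
}
```

With a single queue holding 1000 packets, a weight of 64 and a budget of 300, the model processes 320 packets in five polls and leaves 680 for a later run. Because the budget is charged only after a poll returns, it can be overshot by less than one weight.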
5. Network Stack – IP Layer
netif_receive_skb forwards the packet to __netif_receive_skb, which looks up the protocol handler in the ptype_base hash table. For IPv4 packets the handler is ip_rcv:
    /* Abridged: header validation and statistics are omitted. */
    int ip_rcv(struct sk_buff *skb, struct net_device *dev,
               struct packet_type *pt, struct net_device *orig_dev)
    {
        /* Run the netfilter PRE_ROUTING hook, then continue in ip_rcv_finish. */
        return NF_HOOK(NFPROTO_IPV4, NF_INET_PRE_ROUTING, skb, dev, NULL,
                       ip_rcv_finish);
    }
ip_rcv_finish performs the route lookup (via ip_route_input_noref) and calls dst_input; for packets addressed to the local host this invokes ip_local_deliver. The local‑delivery path then selects the transport‑layer handler from inet_protos[protocol]: udp_rcv for UDP and tcp_v4_rcv for TCP.
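The inet_protos lookup is a table dispatch keyed by the protocol field of the IPv4 header. A minimal user‑space sketch follows; all demo_* names are hypothetical, the handlers are dummies that return their protocol number, and the checks are far less thorough than the kernel's.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PROTOS 256

/* Handler type: receives the transport-layer payload. */
typedef int (*demo_handler)(const uint8_t *payload, size_t len);

/* Plays the role of inet_protos[]: one slot per IP protocol number. */
static demo_handler demo_protos[DEMO_MAX_PROTOS];

/* Dummy handlers that just report which protocol they stand for. */
static int demo_tcp_rcv(const uint8_t *p, size_t len) { (void)p; (void)len; return 6; }
static int demo_udp_rcv(const uint8_t *p, size_t len) { (void)p; (void)len; return 17; }

static void demo_register(uint8_t protocol, demo_handler h)
{
    demo_protos[protocol] = h;
}

/* Parse a raw IPv4 packet and hand its payload to the registered handler.
 * Returns the handler's result, or -1 if the packet is malformed or no
 * handler is registered for its protocol. */
static int demo_local_deliver(const uint8_t *pkt, size_t len)
{
    size_t ihl;
    demo_handler h;

    if (len < 20 || (pkt[0] >> 4) != 4)
        return -1;                        /* too short or not IPv4 */

    ihl = (size_t)(pkt[0] & 0x0f) * 4;    /* header length in bytes */
    if (ihl < 20 || ihl > len)
        return -1;

    h = demo_protos[pkt[9]];              /* byte 9 is the protocol field */
    if (h == NULL)
        return -1;

    return h(pkt + ihl, len - ihl);
}
```

Registering demo_udp_rcv under protocol 17 and demo_tcp_rcv under protocol 6 mirrors the way the UDP and TCP handlers are found by protocol number in the text above.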
6. UDP Processing
    int udp_rcv(struct sk_buff *skb)
    {
        return __udp4_lib_rcv(skb, &udp_table, IPPROTO_UDP);
    }
The helper looks up the matching socket with __udp4_lib_lookup_skb. If a socket is found, the packet is queued with udp_queue_rcv_skb. If no socket matches, an ICMP “port unreachable” is generated.
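The lookup‑then‑queue decision can be sketched as a hash table keyed by destination port. This is a deliberately simplified user‑space model with hypothetical ex_* names: it matches on the port only, whereas the kernel lookup also takes addresses and other socket properties into account.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_UDP_SLOTS 128   /* illustrative table size, a power of two */

/* Hypothetical stand-in for a bound UDP socket. */
struct ex_sock {
    uint16_t port;          /* bound local port, host byte order */
    struct ex_sock *next;   /* chain of sockets sharing a hash slot */
};

static struct ex_sock *ex_udp_table[EX_UDP_SLOTS];

enum ex_verdict { EX_QUEUED, EX_PORT_UNREACHABLE };

/* Map a port to its hash slot. */
static size_t ex_slot(uint16_t port)
{
    return (size_t)port & (EX_UDP_SLOTS - 1);
}

/* Bind a socket to a port by inserting it into the slot's chain. */
static void ex_bind(struct ex_sock *sk, uint16_t port)
{
    size_t slot = ex_slot(port);

    sk->port = port;
    sk->next = ex_udp_table[slot];
    ex_udp_table[slot] = sk;
}

/* Find the socket bound to a destination port, or NULL. */
static struct ex_sock *ex_lookup(uint16_t dport)
{
    for (struct ex_sock *sk = ex_udp_table[ex_slot(dport)]; sk != NULL; sk = sk->next)
        if (sk->port == dport)
            return sk;
    return NULL;
}

/* Decide what happens to a datagram addressed to dport. */
static enum ex_verdict ex_udp_rcv(uint16_t dport)
{
    return ex_lookup(dport) != NULL ? EX_QUEUED : EX_PORT_UNREACHABLE;
}
```

A datagram for a bound port yields EX_QUEUED; one for a port nobody has bound yields EX_PORT_UNREACHABLE, the case in which the text above describes an ICMP error being generated.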
7. Socket Receive Path (recvfrom)
When an application calls recvfrom, the glibc wrapper invokes the system call sys_recvfrom, which ends up in inet_recvmsg (the recvmsg entry of inet_dgram_ops). This function forwards to the protocol‑specific udp_recvmsg via the socket’s sk_prot:
    int inet_recvmsg(struct kiocb *iocb, struct socket *sock,
                     struct msghdr *msg, size_t size, int flags)
    {
        struct sock *sk = sock->sk;
        int addr_len = 0, err;
        /* For a UDP socket, sk->sk_prot->recvmsg is udp_recvmsg. */
        err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
                                   flags & ~MSG_DONTWAIT, &addr_len);
        if (err >= 0)
            msg->msg_namelen = addr_len;  /* report the sender address length */
        return err;
    }
The UDP implementation finally calls __skb_recv_datagram, which extracts a packet from sk->sk_receive_queue. If the queue is empty and the caller allows blocking, the thread sleeps in wait_for_more_packets until the kernel places a new packet there (via the path described above).
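From user space, the end of this path looks like the sketch below, which uses only standard POSIX socket calls. The helper names are made up for illustration. The socket sends a datagram to itself over the loopback interface, so no physical NIC or DMA is involved, but the datagram still ends up on the socket's receive queue, from which recvfrom reads it. The second helper shows the non‑blocking case: with MSG_DONTWAIT and an empty queue the call fails immediately instead of sleeping.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a UDP socket bound to 127.0.0.1 on an ephemeral port.
 * On success returns the fd and stores the bound address in *addr. */
static int open_loopback_udp(struct sockaddr_in *addr)
{
    socklen_t len = sizeof(*addr);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;                       /* let the kernel pick a port */

    if (bind(fd, (struct sockaddr *)addr, sizeof(*addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send msg to our own socket and read it back with a blocking recvfrom.
 * Returns the number of bytes received, or -1 on error. */
static ssize_t udp_loopback_roundtrip(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr, from;
    socklen_t fromlen = sizeof(from);
    ssize_t n = -1;
    int fd = open_loopback_udp(&addr);

    if (fd < 0)
        return -1;
    if (sendto(fd, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) >= 0)
        n = recvfrom(fd, out, outlen, 0, (struct sockaddr *)&from, &fromlen);
    close(fd);
    return n;
}

/* Call recvfrom with MSG_DONTWAIT on a socket whose queue is empty.
 * Returns the resulting errno, 0 if data was unexpectedly available,
 * or -1 if the socket could not be set up. */
static int recv_on_empty_queue(void)
{
    struct sockaddr_in addr;
    char buf[16];
    int err = 0;
    int fd = open_loopback_udp(&addr);

    if (fd < 0)
        return -1;
    if (recvfrom(fd, buf, sizeof(buf), MSG_DONTWAIT, NULL, NULL) < 0)
        err = errno;                          /* EAGAIN or EWOULDBLOCK expected */
    close(fd);
    return err;
}
```

udp_loopback_roundtrip("hello", buf, sizeof(buf)) should return 5 with "hello" in buf, assuming the loopback interface is up. Dropping MSG_DONTWAIT in the second helper would make the call block, which corresponds to the sleeping case described above.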
8. Summary of the Full Receive Path
1. Kernel initialises ksoftirqd, registers soft‑interrupts, and brings up the NIC.
2. NIC DMA‑writes the frame, raises a hardware interrupt, and the ISR schedules NAPI.
3. ksoftirqd processes NET_RX_SOFTIRQ, invoking net_rx_action.
4. NAPI poll (igb_poll) pulls the packet from the ring buffer and calls netif_receive_skb.
5. The packet traverses the protocol hash tables: ptype_base → ip_rcv → routing → ip_local_deliver → udp_rcv (or TCP).
6. The UDP layer queues the packet on the matching socket’s receive queue.
7. The user‑space recvfrom system call reads the packet from that queue, possibly sleeping if no data is available.
Understanding each of these steps helps developers pinpoint CPU overhead, optimise interrupt handling, and tune kernel parameters such as net.core.rmem_max or IRQ affinity to achieve high‑performance networking.
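As one concrete tuning hook, an application can ask for a larger per‑socket receive buffer with SO_RCVBUF; on Linux the kernel limits such requests to net.core.rmem_max and reports back a doubled value to account for bookkeeping overhead. The helper below is a small sketch with a made‑up name that requests a size on a fresh UDP socket and returns what the kernel actually granted.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Request `requested` bytes of receive buffer on a new UDP socket and
 * return the effective size reported by the kernel, or -1 on error. */
static int probe_udp_rcvbuf(int requested)
{
    int effective = -1;
    socklen_t len = sizeof(effective);
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &requested, sizeof(requested)) < 0 ||
        getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) < 0)
        effective = -1;

    close(fd);
    return effective;
}
```

Comparing probe_udp_rcvbuf(n) for increasing n shows where the system limit takes effect: once the request exceeds net.core.rmem_max, the returned value stops growing.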
ITPUB
