What Really Happens Inside Linux When recvfrom Receives a Packet?
This article walks through the complete Linux kernel path a network packet follows—from the NIC’s DMA and hardware interrupt, through soft‑interrupt handling, NAPI polling, protocol‑stack registration, IP and UDP processing, all the way to the user‑space recvfrom system call—revealing the many hidden steps that make packet reception possible.
Linux must handle millions of packets per second for large‑scale services, so understanding the exact kernel flow from a NIC receiving a frame to a user‑space recvfrom call is essential for performance tuning.
1. Kernel preparation
During boot the kernel creates one ksoftirqd thread per CPU (see spawn_ksoftirqd in kernel/softirq.c). The network subsystem is initialized by net_dev_init, which allocates a softnet_data structure for each CPU, initializes its poll list, and uses open_softirq to register net_tx_action and net_rx_action as the handlers for the NET_TX_SOFTIRQ and NET_RX_SOFTIRQ soft‑interrupt vectors.
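The registration step can be pictured with a small user-space model. This is a Python sketch, not kernel code: the names mirror the kernel's, but the data structures are simplified, and NR_CPUS and the size of the handler table are assumptions chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass, field

# Vector numbers as they appear in the kernel's softirq enum.
NET_TX_SOFTIRQ = 2
NET_RX_SOFTIRQ = 3

NR_SOFTIRQ_SLOTS = 10  # assumption: enough slots for this model
NR_CPUS = 4            # assumption: a 4-CPU machine


@dataclass
class SoftnetData:
    """Toy stand-in for the per-CPU struct softnet_data."""
    poll_list: deque = field(default_factory=deque)  # NAPI instances awaiting a poll


# One handler slot per softirq vector.
softirq_vec = [None] * NR_SOFTIRQ_SLOTS


def open_softirq(nr, action):
    """Record the handler for softirq vector `nr`."""
    softirq_vec[nr] = action


def net_rx_action(cpu):
    return f"rx work on cpu {cpu}"


def net_tx_action(cpu):
    return f"tx work on cpu {cpu}"


def net_dev_init():
    """Build the per-CPU state, then register both network handlers."""
    per_cpu = [SoftnetData() for _ in range(NR_CPUS)]
    open_softirq(NET_TX_SOFTIRQ, net_tx_action)
    open_softirq(NET_RX_SOFTIRQ, net_rx_action)
    return per_cpu


softnet = net_dev_init()
print(softirq_vec[NET_RX_SOFTIRQ].__name__)  # net_rx_action
```

The point of the table is that raising NET_RX_SOFTIRQ later only needs the vector number; the handler was bound to it once, at boot.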
2. NIC driver initialization
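Each driver (e.g., Intel igb) registers itself with pci_register_driver. When the PCI core matches a device, the driver's probe function (igb_probe) sets up the network device and registers a NAPI poll function (igb_poll). The TX/RX descriptor rings are allocated and the interrupts are requested when the interface is brought up: igb_open calls igb_request_irq, which uses igb_request_msix for multi‑queue devices.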
3. Hardware interrupt handling
When a frame arrives the NIC writes it via DMA into its ring buffer and raises a hardware interrupt. The interrupt handler does minimal work: it records the interrupt, adds the NAPI struct to the per‑CPU softnet_data.poll_list, and triggers the soft‑interrupt NET_RX_SOFTIRQ with __raise_softirq_irqoff.
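A toy model makes it clear how little the hard interrupt handler does. The Python sketch below is an illustration only: the Cpu and Napi classes and the `scheduled` flag are hypothetical simplifications of the per-CPU softnet_data and of the NAPI scheduling state.

```python
from collections import deque

NET_RX_SOFTIRQ = 3  # vector number from the kernel's softirq enum


class Cpu:
    """Toy per-CPU state: a NAPI poll list and a bitmask of pending softirqs."""
    def __init__(self):
        self.poll_list = deque()
        self.softirq_pending = 0


class Napi:
    """Toy NAPI instance; `scheduled` mimics the 'already queued' state bit."""
    def __init__(self, name):
        self.name = name
        self.scheduled = False


def napi_schedule(cpu, napi):
    """Queue `napi` at most once and mark NET_RX_SOFTIRQ pending on this CPU."""
    if napi.scheduled:       # already queued: a second interrupt adds nothing
        return False
    napi.scheduled = True
    cpu.poll_list.append(napi)
    cpu.softirq_pending |= 1 << NET_RX_SOFTIRQ  # the role of __raise_softirq_irqoff
    return True


def hard_irq_handler(cpu, napi):
    """The whole top half: no packet data is touched here."""
    napi_schedule(cpu, napi)


cpu0 = Cpu()
rx_queue = Napi("igb-rx-0")
hard_irq_handler(cpu0, rx_queue)
hard_irq_handler(cpu0, rx_queue)  # a burst of frames still yields one list entry
print(len(cpu0.poll_list), bin(cpu0.softirq_pending))  # 1 0b1000
```

Because the handler only sets a bit and links one list entry, a burst of frames costs the same in interrupt context as a single frame; the per-packet work is deferred to the soft‑interrupt.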
4. Soft‑interrupt processing (ksoftirqd)
The ksoftirqd thread sees the pending soft‑interrupt, disables local IRQs, and calls __do_softirq. This iterates over pending soft‑interrupts and invokes the registered actions; for network receive it calls net_rx_action.
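The dispatch loop can be sketched as a walk over a bitmask. This Python model is an assumption-laden simplification: it takes the pending mask and the handler table as plain arguments, and it omits the kernel's re-check for soft‑interrupts raised while handlers are running.

```python
NET_TX_SOFTIRQ = 2
NET_RX_SOFTIRQ = 3


def do_softirq(pending, softirq_vec):
    """Run the handler of every set bit in `pending`, lowest vector first."""
    ran = []
    nr = 0
    while pending:
        if pending & 1:
            softirq_vec[nr]()   # e.g. net_rx_action for bit 3
            ran.append(nr)
        pending >>= 1
        nr += 1
    return ran


log = []
softirq_vec = {
    NET_TX_SOFTIRQ: lambda: log.append("net_tx_action"),
    NET_RX_SOFTIRQ: lambda: log.append("net_rx_action"),
}
pending = (1 << NET_RX_SOFTIRQ) | (1 << NET_TX_SOFTIRQ)
print(do_softirq(pending, softirq_vec), log)
# [2, 3] ['net_tx_action', 'net_rx_action']
```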
5. NAPI poll and packet delivery
net_rx_action walks the per‑CPU poll list and invokes each NAPI poll function (e.g., igb_poll), which in turn calls igb_clean_rx_irq. That function fetches descriptors from the ring, builds an sk_buff, performs checksum/VLAN processing, and finally calls napi_gro_receive, which hands the packet to netif_receive_skb.
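The polling loop is easier to follow with its budget made explicit. The sketch below is a Python model, not the kernel's implementation: RxQueue is a hypothetical stand-in for a NAPI instance plus its ring, and the defaults of 64 (per-poll weight) and 300 (overall budget) are used here as illustrative values.

```python
from collections import deque


class RxQueue:
    """Toy NAPI instance backed by a queue of received frames."""
    def __init__(self, name, frames, weight=64):
        self.name = name
        self.ring = deque(frames)
        self.weight = weight

    def poll(self, budget, deliver):
        """Process up to `budget` frames; return how many were handled."""
        done = 0
        while self.ring and done < budget:
            deliver(self.ring.popleft())  # stands in for napi_gro_receive
            done += 1
        return done


def net_rx_action(poll_list, deliver, budget=300):
    """Poll queued instances until the list is empty or the budget is spent."""
    while poll_list and budget > 0:
        napi = poll_list.popleft()
        work = napi.poll(min(napi.weight, budget), deliver)
        budget -= work
        if napi.ring:              # frames remain: put it back for another round
            poll_list.append(napi)
    return budget                  # unused budget


delivered = []
queues = deque([RxQueue("rx-0", range(100)), RxQueue("rx-1", range(10))])
left = net_rx_action(queues, delivered.append)
print(len(delivered), left, len(queues))  # 110 190 0
```

The weight keeps one busy queue from starving the others, and the overall budget bounds how long a single soft‑interrupt run can hold the CPU; anything left on the poll list is picked up by a later run.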
6. Protocol‑stack registration
During boot, inet_init registers the protocol handlers. The IP handler ip_rcv is registered with dev_add_pack and stored in ptype_base, while the UDP and TCP handlers (udp_rcv, tcp_v4_rcv) are stored in inet_protos.
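The two tables form a two-level dispatch: the first is keyed by EtherType, the second by IP protocol number. The Python sketch below models that with plain dictionaries, which is a simplification of the kernel's hash table and array. The frame layout is hypothetical, and inet_add_protocol is the kernel function name used here for the second-level registration, which the text above does not name.

```python
ETH_P_IP = 0x0800    # EtherType for IPv4
IPPROTO_TCP = 6
IPPROTO_UDP = 17

ptype_base = {}      # EtherType -> network-layer receive function
inet_protos = {}     # IP protocol number -> transport-layer receive function


def dev_add_pack(ethertype, func):
    ptype_base[ethertype] = func


def inet_add_protocol(handler, protocol):
    inet_protos[protocol] = handler


def udp_rcv(payload):
    return ("udp", payload)


def tcp_v4_rcv(payload):
    return ("tcp", payload)


def ip_rcv(packet):
    """Second-level dispatch: pick the transport handler by protocol number."""
    handler = inet_protos.get(packet["protocol"])
    return handler(packet["payload"]) if handler else ("dropped", None)


def inet_init():
    inet_add_protocol(udp_rcv, IPPROTO_UDP)
    inet_add_protocol(tcp_v4_rcv, IPPROTO_TCP)
    dev_add_pack(ETH_P_IP, ip_rcv)


def netif_receive_skb(frame):
    """First-level dispatch: pick the network-layer handler by EtherType."""
    handler = ptype_base.get(frame["ethertype"])
    return handler(frame["packet"]) if handler else ("dropped", None)


inet_init()
frame = {"ethertype": ETH_P_IP,
         "packet": {"protocol": IPPROTO_UDP, "payload": b"hi"}}
print(netif_receive_skb(frame))  # ('udp', b'hi')
```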
7. IP layer processing
ip_rcv performs basic validation and runs the Netfilter hooks, then calls ip_route_input_noref. If routing succeeds and the packet is addressed to the local host, dst_input invokes ip_local_deliver, which finally dispatches to the transport handler registered in inet_protos for the protocol field of the IP header (e.g., udp_rcv for UDP).
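The "basic validation" and the protocol-based dispatch can be illustrated on raw bytes. The following Python sketch is a user-space approximation under stated assumptions: parse_ipv4 and build_ipv4 are hypothetical helpers, the checks shown (version, header length, header checksum, total length) are a subset chosen for illustration, and Netfilter and routing are left out entirely.

```python
import struct


def checksum(data: bytes) -> int:
    """Ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4(protocol: int, payload: bytes) -> bytes:
    """Build a minimal 20-byte IPv4 header (127.0.0.1 -> 127.0.0.1) plus payload."""
    addr = bytes([127, 0, 0, 1])
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(payload),
                         0, 0, 64, protocol, 0, addr, addr)
    csum = checksum(header)                  # computed with the field set to zero
    return header[:10] + struct.pack("!H", csum) + header[12:] + payload


def parse_ipv4(packet: bytes):
    """Validate the header and return (protocol number, payload)."""
    if len(packet) < 20:
        raise ValueError("truncated header")
    version = packet[0] >> 4
    ihl = (packet[0] & 0x0F) * 4             # header length in bytes
    if version != 4 or ihl < 20 or len(packet) < ihl:
        raise ValueError("bad version or header length")
    if checksum(packet[:ihl]) != 0:          # a valid header sums to zero
        raise ValueError("bad header checksum")
    total_len = struct.unpack("!H", packet[2:4])[0]
    if total_len < ihl or len(packet) < total_len:
        raise ValueError("bad total length")
    return packet[9], packet[ihl:total_len]  # byte 9 is the protocol field


print(parse_ipv4(build_ipv4(17, b"hello")))  # (17, b'hello')
```

The returned protocol number (17 for UDP) is the key that selects the transport handler.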
8. UDP layer processing
udp_rcv looks up the matching socket with __udp4_lib_lookup_skb. If a socket is found, the packet is queued on its receive queue via udp_queue_rcv_skb; otherwise an ICMP “port unreachable” is generated.
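The lookup-then-queue decision can be sketched as follows. This Python model is deliberately naive: it scans a list instead of the kernel's hash tables, and its scoring rule (an exact local address beats a wildcard bind) is an assumed simplification of the real scoring, which considers more fields.

```python
from collections import deque

ANY = "0.0.0.0"  # wildcard local address


class UdpSock:
    """Toy UDP socket: a bound (address, port) pair and a receive queue."""
    def __init__(self, addr, port):
        self.addr = addr
        self.port = port
        self.receive_queue = deque()


def lookup(sockets, daddr, dport):
    """Return the best-matching socket for a destination, or None."""
    best, best_score = None, -1
    for sk in sockets:
        if sk.port != dport:
            continue
        if sk.addr == daddr:
            score = 2            # bound to exactly this address
        elif sk.addr == ANY:
            score = 1            # wildcard bind
        else:
            continue
        if score > best_score:
            best, best_score = sk, score
    return best


def udp_rcv(sockets, daddr, dport, payload):
    """Queue the payload on the matching socket, or report port unreachable."""
    sk = lookup(sockets, daddr, dport)
    if sk is None:
        return "icmp port unreachable"
    sk.receive_queue.append(payload)   # the role of udp_queue_rcv_skb
    return "queued"


wild = UdpSock(ANY, 53)
exact = UdpSock("10.0.0.1", 53)
socks = [wild, exact]
print(udp_rcv(socks, "10.0.0.1", 53, b"q1"))  # queued
print(udp_rcv(socks, "10.0.0.1", 99, b"q2"))  # icmp port unreachable
```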
9. recvfrom system call
The user‑space recvfrom library call enters the kernel as sys_recvfrom, which eventually calls inet_recvmsg. This dispatches to the socket’s protocol operations (sk->sk_prot->recvmsg), which for UDP is udp_recvmsg. udp_recvmsg invokes __skb_recv_datagram to pull an sk_buff from sk->sk_receive_queue. If the queue is empty and the caller permits blocking, the process sleeps in wait_for_more_packets until the kernel queues a new packet on the socket or the receive timeout expires.
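Both behaviours, a successful receive and an empty queue, are easy to observe from user space. The sketch below uses Python's standard socket module over loopback; the function names, the 2-second timeout, and the 2048-byte buffer size are choices made for this example. Loopback traffic does not go through a physical NIC driver, so this exercises only the socket end of the path.

```python
import socket


def udp_roundtrip(payload: bytes):
    """Send one datagram over loopback and receive it with recvfrom."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
            rx.bind(("127.0.0.1", 0))    # port 0: let the kernel pick a free port
            rx.settimeout(2.0)           # bound the blocking wait
            tx.sendto(payload, rx.getsockname())
            data, sender = rx.recvfrom(2048)   # blocks until a datagram is queued
            return data, sender, tx.getsockname()


def empty_queue_would_block() -> bool:
    """A non-blocking recvfrom on an empty receive queue fails immediately."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx:
        rx.bind(("127.0.0.1", 0))
        rx.setblocking(False)
        try:
            rx.recvfrom(2048)
        except BlockingIOError:
            return True
        return False


data, sender, tx_name = udp_roundtrip(b"ping")
print(data, sender[1] == tx_name[1])
print(empty_queue_would_block())
```

The address returned by recvfrom is the sender's, which is why the example compares its port with the sending socket's own port.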
In summary, a single successful recvfrom depends on a cascade of earlier work across hardware interrupts, soft‑interrupt threads, NAPI polling, protocol registration, routing, and socket queue management, which illustrates why kernel‑level networking is one of the most complex subsystems in Linux.
ITPUB
