Mastering Linux NIC Packet Processing and Ring Buffer Optimization

This guide explains how a network interface card (NIC) receives packets, the role of DMA, interrupt handling, poll functions, and ring buffers, then details multi‑CPU ring buffer handling, key ethtool commands for statistics, buffer size, queue configuration, and hash‑based flow distribution.

Liangxu Linux

1. NIC Packet Processing Flow

A NIC processes incoming network data in the following key steps:

The NIC uses DMA to write each received packet into one or more sk_buff structures; the buffers are filled and consumed in FIFO order.

After DMA finishes, the NIC raises an IRQ, which is served by the NIC driver's interrupt handler.

The interrupt handler schedules the poll function that the NIC driver registered when it was initialized.

The poll function checks the data, possibly merging several sk_buff objects that belong to the same packet.

The merged sk_buff is handed to the upper network stack for further processing.

Complete Process

On system boot, the NIC is initialized and a ring buffer is allocated.

Each slot in the ring buffer initially holds a Packet Descriptor pointing to a ready sk_buff.

DMA writes incoming packets into the sk_buff buffers; each descriptor whose buffer has been written changes from ready to used.

DMA completion triggers an IRQ.

The interrupt handler schedules the poll function that the driver registered at initialization.

The poll function merges fragmented sk_buff objects if needed.

The merged buffers are delivered to the network stack.

The poll function cleans up used sk_buff entries and resets the corresponding descriptors to ready.
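The descriptor life cycle described above can be sketched as a toy model. This is a minimal illustration, not kernel code: the class name RxRing, its methods, and the budget parameter are hypothetical, and plain Python strings stand in for sk_buff objects.

```python
READY, USED = "ready", "used"


class RxRing:
    """Toy RX ring: a fixed number of slots, each with a state and a buffer."""

    def __init__(self, size):
        self.state = [READY] * size
        self.buf = [None] * size
        self.head = 0      # next slot the NIC fills via "DMA"
        self.tail = 0      # next slot the poll function consumes
        self.dropped = 0   # packets lost because no slot was ready

    def dma_write(self, packet):
        """NIC side: fill the next ready slot, or drop if the ring is full."""
        if self.state[self.head] != READY:
            self.dropped += 1
            return False
        self.buf[self.head] = packet
        self.state[self.head] = USED
        self.head = (self.head + 1) % len(self.state)
        return True

    def poll(self, budget):
        """Driver side: deliver up to `budget` packets in FIFO order and
        reset their descriptors to ready."""
        delivered = []
        while len(delivered) < budget and self.state[self.tail] == USED:
            delivered.append(self.buf[self.tail])
            self.buf[self.tail] = None
            self.state[self.tail] = READY
            self.tail = (self.tail + 1) % len(self.state)
        return delivered


ring = RxRing(size=4)
for i in range(6):              # six arrivals, but only four slots
    ring.dma_write(f"pkt{i}")
print("dropped:", ring.dropped)
print("delivered:", ring.poll(budget=3))
```

In this model a packet is dropped only when the slot at head is still marked used, which is the overflow case that the drop counters in section 3.2 report.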

2. Ring Buffer Handling on Multi‑CPU Systems

When packets arrive faster than a single CPU can process them, the ring buffer fills up and subsequent packets are dropped. On multi-core servers, a NIC that supports Receive Side Scaling (RSS), or multiqueue, provides multiple ring buffers, each with its own IRQ, so that different CPUs can service different queues in parallel. This improves throughput; within each queue the data path is the same as described above.
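The idea of spreading flows over several ring buffers can be illustrated with a short sketch. It assumes a simplification: real NICs typically compute a Toeplitz hash in hardware, while this example substitutes zlib.crc32 only to show the principle. The function name pick_queue and the sample addresses are hypothetical.

```python
import zlib
from collections import Counter


def pick_queue(src_ip, dst_ip, src_port, dst_port, num_queues):
    """Map a flow's 4-tuple to a queue index (CRC32 stands in for the NIC hash)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return zlib.crc32(key) % num_queues


# Every packet of one flow has the same 4-tuple, so it lands on the same queue.
first = pick_queue("10.0.0.1", "10.0.0.2", 40000, 443, 8)
again = pick_queue("10.0.0.1", "10.0.0.2", 40000, 443, 8)
print(first == again)

# Many distinct flows (here: varying source ports) spread over the queues.
load = Counter(
    pick_queue("10.0.0.1", "10.0.0.2", port, 443, 8)
    for port in range(40000, 41000)
)
print(sorted(load.items()))
```

Because the queue is chosen per flow rather than per packet, packets within one connection stay in order, but a single heavy flow still ends up on one queue and one CPU.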

3. Ring Buffer Related Commands

3.1 Packet Statistics

[root@test]$ ethtool -S em1 | more
NIC statistics:
    rx_packets: 35874336743
    tx_packets: 35163830212
    rx_bytes: 6337524253985
    tx_bytes: 3686383656436
    rx_broadcast: 15392577
    tx_broadcast: 873436
    rx_multicast: 45849160
    tx_multicast: 1784024

RX denotes received data, TX denotes transmitted data.
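For monitoring scripts it is often convenient to turn this output into a dictionary. The following is a minimal sketch that parses the sample shown above; the function name parse_ethtool_stats is hypothetical, and a real script would obtain the text by running ethtool -S itself.

```python
SAMPLE = """\
NIC statistics:
    rx_packets: 35874336743
    tx_packets: 35163830212
    rx_bytes: 6337524253985
    tx_bytes: 3686383656436
    rx_broadcast: 15392577
    tx_broadcast: 873436
    rx_multicast: 45849160
    tx_multicast: 1784024
"""


def parse_ethtool_stats(text):
    """Return {counter_name: int} for every 'name: number' line."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().rpartition(":")
        value = value.strip()
        if sep and value.isdigit():   # skips the 'NIC statistics:' header
            stats[name.strip()] = int(value)
    return stats


stats = parse_ethtool_stats(SAMPLE)
avg_rx_size = stats["rx_bytes"] / stats["rx_packets"]
print(f"average received packet size: {avg_rx_size:.1f} bytes")
```

Since the counters are cumulative, comparing two snapshots taken a few seconds apart is usually more informative than a single reading.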

3.2 Drop and Error Counters

[root@test]$ ethtool -S em1 | grep -iE "error|drop"
rx_crc_errors: 0
rx_missed_errors: 0
... (other counters) ...
rx_fifo_errors: 79270
rx_queue_0_drops: 16669
rx_queue_1_drops: 21522
... (other queues) ...

On this NIC, the per-queue rx_queue_N_drops counters add up to rx_fifo_errors, so either value gives a quick view of ring-buffer overflow. Counter names and their exact meaning vary by driver.
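The relationship between the per-queue counters and the total can be checked with a few lines of code. This sketch uses a shortened, hypothetical sample: only the two queue values are taken from the output above, and rx_fifo_errors is set to their sum for illustration because the remaining queues are elided in the source. The function name queue_drop_summary is also hypothetical.

```python
import re


def queue_drop_summary(stats):
    """Return (per_queue_drops, total, matches_fifo) for a dict of counters."""
    per_queue = {}
    for name, value in stats.items():
        match = re.fullmatch(r"rx_queue_(\d+)_drops", name)
        if match:
            per_queue[int(match.group(1))] = value
    total = sum(per_queue.values())
    return per_queue, total, total == stats.get("rx_fifo_errors")


# Hypothetical two-queue sample; rx_fifo_errors is chosen to equal the sum.
sample = {
    "rx_crc_errors": 0,
    "rx_missed_errors": 0,
    "rx_fifo_errors": 38191,
    "rx_queue_0_drops": 16669,
    "rx_queue_1_drops": 21522,
}

per_queue, total, matches = queue_drop_summary(sample)
worst = max(per_queue, key=per_queue.get)
print(f"total drops: {total}, equals rx_fifo_errors: {matches}")
print(f"queue with most drops: {worst}")
```

Drops concentrated on one queue suggest an uneven flow distribution or an overloaded CPU, whereas drops spread evenly across queues suggest that the rings are simply too small for the traffic bursts.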

3.3 Query Ring Buffer Size

[root@test]$ ethtool -g em1
Ring parameters for em1:
Pre-set maximums:
RX: 4096
TX: 4096
Current hardware settings:
RX: 256
TX: 256

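The hardware supports up to 4096 descriptors for both RX and TX; the current setting is 256. A larger ring absorbs longer bursts and reduces packet loss, but packets can wait longer in the queue, which increases latency.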
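To see how much headroom is left before changing anything, the two sections of this output can be compared programmatically. The sketch below parses the sample shown above; the function name parse_ring_params and the section keys "max" and "current" are hypothetical.

```python
SAMPLE = """\
Ring parameters for em1:
Pre-set maximums:
RX: 4096
TX: 4096
Current hardware settings:
RX: 256
TX: 256
"""


def parse_ring_params(text):
    """Return {'max': {...}, 'current': {...}} with integer ring sizes."""
    result = {}
    section = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Pre-set maximums"):
            section = "max"
            result[section] = {}
        elif line.startswith("Current hardware settings"):
            section = "current"
            result[section] = {}
        elif section and ":" in line:
            key, _, value = line.partition(":")
            value = value.strip()
            if value.isdigit():       # ignore non-numeric values such as n/a
                result[section][key.strip()] = int(value)
    return result


params = parse_ring_params(SAMPLE)
for direction in ("RX", "TX"):
    cur = params["current"][direction]
    top = params["max"][direction]
    print(f"{direction}: {cur} of {top} descriptors ({100 * cur / top:.2f}%)")
```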

3.4 Adjust Number of Queues

[root@test]$ ethtool -l em1
Channel parameters for em1:
Pre-set maximums:
RX: 0
TX: 0
Other: 1
Combined: 8
Current hardware settings:
RX: 0
TX: 0
Other: 1
Combined: 8

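Combined: 8 means the NIC exposes eight combined channels, each pairing an RX queue and a TX queue behind one interrupt. The count is changed with -L:

[root@test]$ ethtool -L em1 combined 8

The driver may briefly reset the interface to apply the change, and the setting does not persist across reboots unless it is reapplied at boot.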

3.5 Adjust Queue Sizes

[root@test]$ ethtool -G em1 rx 4096
[root@test]$ ethtool -G em1 tx 4096

Setting both RX and TX to 4096 expands the ring buffers.

3.6 Set Queue Weights (RSS Indirection Table)

[root@test]$ ethtool -x em1
RX flow hash indirection table for em1 with 8 RX ring(s):
    0: 0 0 0 0 0 0 0 0
    8: 0 0 0 0 0 0 0 0
   16: 1 1 1 1 1 1 1 1
   ... (continues) ...
RSS hash key:
Operation not supported

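The indirection table maps 128 hash values to the eight queues. Weights are assigned with ethtool -X em1 weight followed by one weight per queue; their sum must be non-zero and must not exceed the table size (128).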
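How weights could translate into table entries is sketched below as a proportional split. This is an illustration under an assumption, not a statement of how ethtool or any particular driver fills the table; the function name build_indirection_table is hypothetical. With equal weights it reproduces the layout shown above, where the first 16 entries point to queue 0 and the next 16 to queue 1.

```python
from collections import Counter


def build_indirection_table(weights, table_size=128):
    """Give each queue a share of table entries proportional to its weight."""
    total = sum(weights)
    if total == 0 or total > table_size:
        raise ValueError("sum of weights must be in 1..table_size")
    table = []
    queue = -1
    partial = 0                      # weight consumed by queues 0..queue
    for i in range(table_size):
        # Advance to the next queue once entry i passes the current share.
        while i * total >= table_size * partial:
            queue += 1
            partial += weights[queue]
        table.append(queue)
    return table


equal = build_indirection_table([1] * 8)
print(Counter(equal))                # 16 entries for each of the 8 queues

# Queue 0 gets twice the share of queue 2; queue 1 receives no traffic.
print(build_indirection_table([2, 0, 1], table_size=6))
```

A weight of zero removes a queue from the table entirely, which is one way to keep a CPU free of network interrupts.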

3.7 Change RSS Hash Fields

[root@test]$ ethtool -n em1 rx-flow-hash tcp4
TCP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

[root@test]$ ethtool -N em1 rx-flow-hash udp4 sdfn

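Each letter in sdfn selects one hash field: s is the source IP address, d the destination IP address, f bytes 0 and 1 of the L4 header (source port), and n bytes 2 and 3 of the L4 header (destination port). The ethtool man page lists all available letters; which combinations are accepted depends on the NIC.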
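The letter codes can be decoded with a small lookup table. This is a convenience sketch, not part of ethtool: the function name decode_hash_fields is hypothetical, and the descriptions paraphrase the ethtool man page entry for rx-flow-hash.

```python
# Letter codes accepted by 'ethtool -N <dev> rx-flow-hash <type> <letters>'.
HASH_FIELDS = {
    "m": "Layer 2 destination address",
    "v": "VLAN tag",
    "t": "Layer 3 protocol field",
    "s": "IP source address",
    "d": "IP destination address",
    "f": "L4 bytes 0 & 1 (source port for TCP/UDP)",
    "n": "L4 bytes 2 & 3 (destination port for TCP/UDP)",
    "r": "discard all packets of this flow type",
}


def decode_hash_fields(letters):
    """Translate a string such as 'sdfn' into readable field names."""
    unknown = [c for c in letters if c not in HASH_FIELDS]
    if unknown:
        raise ValueError(f"unknown hash field letter(s): {unknown}")
    return [HASH_FIELDS[c] for c in letters]


for field in decode_hash_fields("sdfn"):
    print(field)
```

Hashing UDP on sd alone keeps all traffic between two hosts on one queue, while sdfn spreads individual UDP flows across queues.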

3.8 IRQ Statistics

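Reading /proc/interrupts shows how many interrupts each CPU has handled for each IRQ, which helps verify whether multiqueue is spreading load across CPUs and whether NAPI and interrupt coalescing are keeping the interrupt rate down.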
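A short script can pull the NIC's rows out of /proc/interrupts. The sketch below works on a hypothetical two-CPU sample: the IRQ numbers, counts, and queue names such as em1-TxRx-0 are invented for illustration, and real naming depends on the driver. The function name nic_irq_counts is also hypothetical.

```python
# Hypothetical excerpt; a real script would read /proc/interrupts instead.
SAMPLE = """\
           CPU0       CPU1
 45:       1000          2   PCI-MSI 524288-edge      em1-TxRx-0
 46:          3       2000   PCI-MSI 524289-edge      em1-TxRx-1
NMI:          0          0   Non-maskable interrupts
ERR:          0
"""


def nic_irq_counts(text, ifname):
    """Return {irq_name: {cpu: count}} for IRQs whose name starts with ifname."""
    lines = text.strip("\n").splitlines()
    cpus = lines[0].split()                      # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 1 + len(cpus):          # short rows such as 'ERR:'
            continue
        counts = fields[1:1 + len(cpus)]
        if not all(c.isdigit() for c in counts):
            continue
        name = fields[-1]
        if name.startswith(ifname):
            result[name] = dict(zip(cpus, map(int, counts)))
    return result


for irq, per_cpu in nic_irq_counts(SAMPLE, "em1").items():
    busiest = max(per_cpu, key=per_cpu.get)
    print(f"{irq}: {per_cpu} (mostly handled by {busiest})")
```

If every queue's interrupts pile up on the same CPU, multiqueue is configured but the IRQ affinity is not spreading the work.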

References

https://ylgrgyq.github.io/2017/07/23/linux-receive-packet-1/

https://heapdump.cn/article/3947686
