
Why Does Your Network Lose Packets? 30+ Troubleshooting Steps Revealed

This article explains the concept of network packet loss, walks through the sending and receiving mechanisms of Ethernet frames, and provides a comprehensive, layer‑by‑layer checklist—including hardware NIC, driver, kernel, IP, TCP/UDP, and application‑level diagnostics—plus concrete Linux commands and configuration tweaks to locate and fix the root cause.

Liangxu Linux

Network packet loss occurs when packets sent by one host never reach their destination. It typically shows up as missing ping replies, TCP retransmissions, or incomplete data at the receiver.

Packet Transmission Basics

Application data is encapsulated with a TCP header, then an IP header, and finally a 14-byte MAC header to form an Ethernet frame. The NIC prepends the preamble and start-of-frame delimiter for synchronization and appends a CRC (the frame check sequence) before transmitting the frame onto the wire. On reception, the NIC verifies the CRC, discards frames that fail the check, and places valid frames into a ring buffer via DMA. The kernel then processes each frame through the TCP/IP stack and delivers the payload to the application.
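The encapsulation steps can be made concrete with a little arithmetic. The following is a minimal sketch, assuming IPv4 and TCP headers without options (20 bytes each) and ignoring minimum-frame padding and the inter-frame gap; the function name wire_bytes is illustrative only.

```python
# Header sizes in bytes; the TCP and IPv4 values assume no options.
TCP_HEADER = 20
IPV4_HEADER = 20
MAC_HEADER = 14    # destination MAC (6) + source MAC (6) + EtherType (2)
FCS = 4            # frame check sequence (CRC-32) appended by the NIC
PREAMBLE_SFD = 8   # 7-byte preamble + 1-byte start-of-frame delimiter


def wire_bytes(payload_len: int) -> dict:
    """Return the size of the data unit at each layer for one TCP segment."""
    segment = payload_len + TCP_HEADER
    packet = segment + IPV4_HEADER
    frame = packet + MAC_HEADER + FCS
    return {
        "tcp_segment": segment,
        "ip_packet": packet,
        "ethernet_frame": frame,
        "on_wire": frame + PREAMBLE_SFD,
    }


if __name__ == "__main__":
    # A 1460-byte payload exactly fills a 1500-byte IP packet.
    print(wire_bytes(1460))
```

With a 1460-byte payload the IP packet is exactly 1500 bytes, which is why 1460 is the usual TCP maximum segment size on a link with a 1500-byte MTU.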

Diagnostic Philosophy

Analyze packet loss from the bottom up: start with the NIC hardware, then the driver, then the kernel protocol stack, and finally the application, checking key counters at each layer to pinpoint where packets disappear.

Common Loss Scenarios and Solutions

1. Hardware NIC Loss

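Ring Buffer Overflow : When the incoming packet rate exceeds the kernel's processing speed, the NIC's ring buffer fills and new packets are dropped. Check the FIFO counters with ethtool -S eth0 | grep rx_fifo or the fifo column in /proc/net/dev. Query the current and maximum ring sizes with ethtool -g eth0, then increase them with ethtool -G eth0 rx 4096 tx 4096 (the values cannot exceed the hardware maximum reported by ethtool -g).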
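The receive counters in /proc/net/dev can also be read programmatically. The sketch below parses the file's text into per-interface dictionaries; it runs against an embedded SAMPLE string (made-up numbers) so it works anywhere, and on a real host the same function could be fed the contents of /proc/net/dev. The names parse_proc_net_dev and SAMPLE are illustrative.

```python
# Receive-side column order in /proc/net/dev.
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "frame",
             "compressed", "multicast"]

# Made-up sample in the /proc/net/dev layout (two header lines, then data).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:  1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 98765    4321    1    2    3     0          0         7    54321    1234    0    0    0     0       0          0
"""


def parse_proc_net_dev(text: str) -> dict:
    """Map each interface name to its receive counters."""
    stats = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        numbers = [int(v) for v in values.split()]
        stats[name.strip()] = dict(zip(RX_FIELDS, numbers[:len(RX_FIELDS)]))
    return stats


if __name__ == "__main__":
    eth0 = parse_proc_net_dev(SAMPLE)["eth0"]
    # A growing "fifo" value points at ring buffer overflow.
    print({k: eth0[k] for k in ("errs", "drop", "fifo", "frame")})
```

Sampling the counters twice and comparing the difference is more informative than a single reading, because the values are cumulative since boot.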

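Port Negotiation Issues : Verify link speed and duplex with ethtool eth0; a speed or duplex mismatch between the NIC and the switch port causes errors and drops. Re-negotiate using ethtool -r eth0, or force settings with ethtool -s eth0 speed 1000 duplex full autoneg off, in which case the link partner must be configured identically.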

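Flow Control : Show the current pause settings with ethtool -a eth0 and view flow-control statistics via ethtool -S eth0 | grep control (counter names vary by driver). If pause frames are throttling traffic, disable flow control with ethtool -A eth0 tx off rx off.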

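MAC Address Mismatch : In non-promiscuous mode the NIC silently discards frames whose destination MAC does not match its own, so a stale ARP entry on a peer results in loss. Confirm that the interface is not expected to run in promiscuous mode and that ARP tables are up to date; refresh ARP or set static entries if needed.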

Other NIC Anomalies : Check firmware version with ethtool -i eth0 and update if buggy. Verify physical cabling (CRC errors) with ethtool -S eth0.

2. Driver‑Related Loss

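Monitor rx_errors, rx_dropped, rx_overruns, and rx_frame counters via /proc/net/dev or ethtool -S. Backlog overflow is visible in the second column of /proc/net/softnet_stat, which counts packets dropped per CPU because the input queue was full. If that column keeps growing, raise net.core.netdev_max_backlog (default 1000), for example with sysctl -w net.core.netdev_max_backlog=2000.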
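Whether the backlog is overflowing can be read from /proc/net/softnet_stat, where each row holds hexadecimal per-CPU counters: packets processed, packets dropped because the backlog queue was full, and time_squeeze events. The sketch below decodes the first three columns from an embedded SAMPLE (made-up values); it assumes rows appear in CPU order, which may not hold if some CPUs are offline.

```python
# Made-up sample in the /proc/net/softnet_stat layout: one row per CPU,
# every column is a hexadecimal counter.
SAMPLE = """\
0000272d 00000000 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00001a2b 0000000f 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
"""


def parse_softnet_stat(text: str) -> list:
    """Decode the processed, dropped and time_squeeze columns per row."""
    rows = []
    for index, line in enumerate(text.splitlines()):
        cols = [int(c, 16) for c in line.split()]
        if len(cols) < 3:
            continue
        rows.append({
            "cpu": index,             # assumption: row order matches CPU order
            "processed": cols[0],
            "dropped": cols[1],       # backlog queue was full
            "time_squeeze": cols[2],  # softirq budget ran out with work left
        })
    return rows


if __name__ == "__main__":
    for row in parse_softnet_stat(SAMPLE):
        if row["dropped"]:
            print(f"CPU {row['cpu']} dropped {row['dropped']} packets")
```

A non-zero dropped value on only one CPU usually suggests uneven interrupt distribution rather than a backlog that is too small overall.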

Balance IRQs across CPUs: check /proc/interrupts, use irqbalance, or manually set CPU affinity via /proc/irq/*/smp_affinity. Configure RSS queues with ethtool -x eth0 and ethtool -X eth0 ….
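The value written to an smp_affinity file is a hexadecimal CPU bitmask, which is easy to get wrong by hand. A minimal sketch of the conversion follows, assuming the kernel's convention of comma-separated 32-bit groups for masks wider than 32 CPUs; the helper name cpus_to_mask is hypothetical.

```python
def cpus_to_mask(cpus) -> str:
    """Build an smp_affinity-style hex mask from a list of CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu                 # bit N selects CPU N
    digits = format(mask, "x")
    groups = []
    while digits:                        # split into 32-bit (8 hex digit) groups
        groups.insert(0, digits[-8:])
        digits = digits[:-8]
    return ",".join(groups)


if __name__ == "__main__":
    print(cpus_to_mask([0, 1]))        # CPUs 0 and 1
    print(cpus_to_mask([4, 5, 6, 7]))  # CPUs 4 to 7
    print(cpus_to_mask([32]))          # first CPU of the second 32-bit group
```

The resulting string is what would be echoed into the smp_affinity file of the chosen IRQ. Note that irqbalance may overwrite manual settings unless it is stopped or told to skip that IRQ.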

Enable or tune interrupt coalescing with ethtool -c eth0 and ethtool -C eth0 adaptive-rx on.

3. Kernel Protocol‑Stack Loss

ARP/Neighbor Issues : Check sysctl -a | grep arp_ignore and sysctl -a | grep arp_filter. Set appropriate values (e.g., arp_ignore=1, arp_filter=1) to avoid wrong‑MAC replies.

ARP Table Overflow : Monitor /proc/net/stat/arp_cache and adjust net.ipv4.neigh.default.gc_thresh* thresholds (e.g., sysctl -w net.ipv4.neigh.default.gc_thresh3=4096).

Routing Drops : Verify routes with ip r get X.X.X.X and check for "dropped because of missing route" via netstat -s. Correct routing tables or policy routing as needed.

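Reverse-Path Filtering : Adjust /proc/sys/net/ipv4/conf/eth0/rp_filter to match the network environment: 0 disables source validation, 1 enables strict mode (the packet is dropped unless the route back to its source uses the interface it arrived on), and 2 enables loose mode (the source only has to be reachable through some interface). Hosts with asymmetric routing or multiple uplinks usually need 2 or 0.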
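A common surprise with rp_filter is that the per-interface value is not used on its own: the kernel applies the larger of conf/all/rp_filter and conf/<interface>/rp_filter. The sketch below models that rule; the function name effective_rp_filter is illustrative.

```python
# Meaning of each rp_filter value.
RP_MODES = {0: "off", 1: "strict", 2: "loose"}


def effective_rp_filter(all_value: int, iface_value: int) -> tuple:
    """Return the value the kernel applies and its mode name.

    The kernel uses the maximum of conf/all/rp_filter and
    conf/<interface>/rp_filter when validating the source address.
    """
    value = max(all_value, iface_value)
    return value, RP_MODES[value]


if __name__ == "__main__":
    # Setting eth0 to 0 has no effect while "all" is still 1.
    print(effective_rp_filter(all_value=1, iface_value=0))
```

In practice this means that turning the filter off for one interface also requires conf/all/rp_filter to be 0.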

Firewall Drops : Inspect iptables -nvL | grep DROP and modify rules accordingly.

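Connection-Tracking Overflow : Compare /proc/sys/net/netfilter/nf_conntrack_count with /proc/sys/net/netfilter/nf_conntrack_max. When the table is full, new connections are dropped. Increase nf_conntrack_max or shorten conntrack timeouts (e.g., net.netfilter.nf_conntrack_tcp_timeout_established) so that stale entries expire sooner.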

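TCP SYN Queue : Increase net.ipv4.tcp_max_syn_backlog for the half-open (SYN) queue and ensure net.core.somaxconn, which caps the accept queue requested by listen(), is large enough for high-concurrency servers. The application's listen() backlog argument must be raised as well, since the kernel uses the smaller of the two.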
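Queue overflows on listening sockets are counted in /proc/net/netstat as ListenOverflows and ListenDrops. That file stores each group as a header line of names followed by a line of values, and the sketch below pairs them up. The SAMPLE is heavily abbreviated and made up (the real file has many more columns), and parse_proc_net_netstat is an illustrative name.

```python
# Abbreviated, made-up sample in the /proc/net/netstat layout:
# a line of counter names followed by a line of values, per group.
SAMPLE = """\
TcpExt: SyncookiesSent ListenOverflows ListenDrops PAWSEstab
TcpExt: 0 12 15 3
IpExt: InNoRoutes InTruncatedPkts
IpExt: 4 0
"""


def parse_proc_net_netstat(text: str) -> dict:
    """Return {group: {counter_name: value}} from name/value line pairs."""
    result = {}
    lines = text.splitlines()
    for header, values in zip(lines[0::2], lines[1::2]):
        group, names = header.split(":", 1)
        _, numbers = values.split(":", 1)
        result[group] = dict(
            zip(names.split(), (int(n) for n in numbers.split()))
        )
    return result


if __name__ == "__main__":
    tcp_ext = parse_proc_net_netstat(SAMPLE)["TcpExt"]
    print("accept queue overflows:", tcp_ext["ListenOverflows"])
    print("dropped connection attempts:", tcp_ext["ListenDrops"])
```

If these counters increase while the server is under load, the accept queue is the bottleneck and the backlog settings are worth revisiting.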

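TIME_WAIT Exhaustion : Enable net.ipv4.tcp_tw_reuse to let outgoing connections reuse TIME_WAIT sockets. net.ipv4.tcp_tw_recycle was removed in Linux 4.12; on older kernels, enable it only in non-NAT environments, because it drops connections from clients behind NAT. Raise net.ipv4.tcp_max_tw_buckets if memory permits.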
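To judge how close a host is to tcp_max_tw_buckets, it helps to count sockets per state. The sketch below tallies the state column of /proc/net/tcp, where 06 is TIME_WAIT; it works on an embedded, made-up SAMPLE and only names the three states it uses, leaving other codes as raw hex. The names count_tcp_states and SAMPLE are illustrative.

```python
from collections import Counter

# Subset of the kernel's TCP state codes as shown in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}

# Made-up sample in the /proc/net/tcp layout (header line, then sockets).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:0016 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 12345
   1: 0100007F:1F90 0100007F:C350 06 00000000:00000000 03:00000A00 00000000     0        0 0
   2: 0100007F:1F90 0100007F:C351 06 00000000:00000000 03:00000A00 00000000     0        0 0
   3: 0100007F:1F90 0100007F:C352 01 00000000:00000000 00:00000000 00000000  1000        0 23456
"""


def count_tcp_states(text: str) -> Counter:
    """Count sockets per TCP state; unknown codes keep their hex form."""
    counts = Counter()
    for line in text.splitlines()[1:]:   # skip the header line
        fields = line.split()
        if len(fields) < 4:
            continue
        counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


if __name__ == "__main__":
    print(count_tcp_states(SAMPLE))
```

On a real host the count for TIME_WAIT can then be compared against the value of net.ipv4.tcp_max_tw_buckets.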

MTU/Fragmentation Issues : Verify the interface MTU with ifconfig eth0 or ip link show eth0, enable TCP MTU probing (net.ipv4.tcp_mtu_probing=1 probes only after a black hole is detected, 2 probes always), and adjust net.ipv4.ipfrag_time, net.ipv4.ipfrag_high_thresh, or net.ipv4.ipfrag_low_thresh for fragment reassembly problems.
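Fragmentation amplifies loss because a datagram is discarded if any one of its fragments goes missing. The following sketch estimates how many IPv4 fragments a UDP datagram needs, assuming a 20-byte IPv4 header without options and the rule that every fragment except the last carries a multiple of 8 payload bytes; ipv4_fragment_count is a hypothetical helper.

```python
def ipv4_fragment_count(udp_payload: int, mtu: int = 1500,
                        ip_header: int = 20, udp_header: int = 8) -> int:
    """Number of IPv4 fragments needed to carry one UDP datagram."""
    # Fragment payloads (except the last) must be a multiple of 8 bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small for the given IP header size")
    ip_payload = udp_payload + udp_header
    return -(-ip_payload // per_fragment)   # integer ceiling division


if __name__ == "__main__":
    for size in (1472, 1473, 4000):
        print(size, "bytes of UDP payload ->",
              ipv4_fragment_count(size), "fragment(s)")
```

With a 1500-byte MTU, 1472 bytes is the largest UDP payload that avoids fragmentation, so keeping application datagrams at or below that size sidesteps reassembly problems entirely.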

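TCP Congestion Control : Check the current algorithm via sysctl net.ipv4.tcp_congestion_control and the available ones via sysctl net.ipv4.tcp_available_congestion_control. For latency-sensitive workloads where BBR's periodic ProbeRTT phase causes throughput dips, consider switching to Cubic; disabling ProbeRTT itself requires a modified BBR module, since the stock one exposes no sysctl for it.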

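TCP Reordering and PAWS : Tune net.ipv4.tcp_reordering on paths that reorder packets heavily. On kernels that still provide tcp_tw_recycle, disable it in NAT scenarios to avoid timestamp-related (PAWS) drops, and check netstat -s for lines reporting connections or packets rejected because of timestamps.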

UDP Buffering : Increase net.ipv4.udp_mem (three values, measured in memory pages), net.ipv4.udp_rmem_min, and net.ipv4.udp_wmem_min for high-rate UDP traffic.

Socket Buffer Sizes : Adjust net.core.rmem_default, net.core.rmem_max, net.core.wmem_default, and net.core.wmem_max to match the calculated Bandwidth‑Delay Product (BDP) of the link.
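The Bandwidth-Delay Product is the link bandwidth multiplied by the round-trip time, converted to bytes. A minimal sketch of the calculation, using integer arithmetic and a hypothetical helper name bdp_bytes:

```python
def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-Delay Product in bytes.

    bits/s * ms / 1000 gives bits in flight; dividing by 8 gives bytes.
    """
    return bandwidth_bits_per_s * rtt_ms // 8000


if __name__ == "__main__":
    # Example: a 1 Gbit/s path with a 50 ms round-trip time.
    print(bdp_bytes(1_000_000_000, 50), "bytes")
```

A 1 Gbit/s path with 50 ms RTT needs about 6.25 MB in flight to stay full, so buffer maximums well below that value limit throughput on such a path.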

4. Application‑Level Loss

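Inspect socket receive errors with netstat -s | grep "packet receive errors" (in the UDP section, this counter grows when a socket's receive buffer is full). Increase net.core.rmem_max if needed, and note that it is only an upper limit: the application must still request a larger buffer with the SO_RCVBUF socket option, or net.core.rmem_default must be raised.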
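On the application side, a larger receive buffer is requested per socket with SO_RCVBUF. The sketch below asks for 4 MiB on a UDP socket and reads back what the kernel actually granted; the 4 MiB figure and the helper name request_rcvbuf are illustrative. On Linux the reported value is typically double the request (the kernel reserves room for bookkeeping) and is capped by net.core.rmem_max.

```python
import socket


def request_rcvbuf(sock: socket.socket, size: int) -> int:
    """Ask for a receive buffer of `size` bytes and return the granted size."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    wanted = 4 * 1024 * 1024  # illustrative value, not a recommendation
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        granted = request_rcvbuf(sock, wanted)
        print(f"requested {wanted} bytes, kernel reports {granted} bytes")
```

If the granted size is far below the request, net.core.rmem_max is the limiting factor.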

For UDP‑based services, ensure the application is designed for lossy transport (e.g., use retries or forward error correction) and avoid blocking operations between receives.

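When sending too fast, enlarge the send buffer (net.core.wmem_max, together with the SO_SNDBUF socket option). For latency problems caused by small writes being held back, consider enabling TCP_QUICKACK or disabling Nagle's algorithm with the TCP_NODELAY socket option.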
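Nagle's algorithm is turned off per socket with the TCP_NODELAY option. A minimal sketch on an unconnected TCP socket follows; disable_nagle is a hypothetical helper, and whether disabling Nagle helps depends on the application's write pattern.

```python
import socket


def disable_nagle(sock: socket.socket) -> bool:
    """Set TCP_NODELAY and report whether the option is now active."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        print("TCP_NODELAY enabled:", disable_nagle(sock))
```

With TCP_NODELAY set, small writes go out immediately instead of waiting for outstanding data to be acknowledged, which trades some bandwidth efficiency for lower latency.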

Tools for Deep Analysis

dropwatch : Monitors kernel kfree_skb events and reports the kernel function in which each packet was dropped, along with a drop count.

tcpdump / tshark : Capture and filter traffic for detailed inspection; use Wireshark for GUI analysis.

Conclusion

Packet loss can stem from any layer of the network stack, from physical NIC buffers to high‑level application logic. By following a systematic, layered approach—checking hardware counters, driver settings, kernel parameters, routing, and application buffers—engineers can quickly locate the offending component and apply the appropriate configuration changes to restore reliable communication.


Written by

Liangxu Linux

Liangxu, a self‑taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge—fundamentals, applications, tools, plus Git, databases, Raspberry Pi, etc. (Reply “Linux” to receive essential resources.)
