
A Complete Linux Network Latency Troubleshooting Guide: From Ping to Kernel Tuning

When a service's response time spikes from milliseconds to hundreds of milliseconds, this guide walks you through a step‑by‑step Linux toolchain—ping, mtr, iftop, netstat, sysctl, and BBR—to pinpoint the root cause of network latency without jumping straight to packet captures.

Tech Stroll Journey

When users report that a page cannot open and the interface timeout rate jumps from 0.1% to 15%, a quick ping baidu.com often reveals a latency increase from the usual 2 ms to 800 ms, sometimes even a request timeout. The problem could lie in four places: the client, the intermediate path, the server NIC/kernel, or the remote service.

Chapter 1: Don’t Start with tcpdump – Identify the Slow Segment First

Capturing hundreds of megabytes with tcpdump -i eth0 before you know where the delay originates is like feeling around in the dark. The first step is to narrow the problem down layer by layer.

ping – The Basic Door‑Knocker

ping -c 100 -i 0.1 baidu.com

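Send 100 packets at 100 ms intervals (some iputils versions only let root use intervals below 200 ms) and examine three values:

Minimum RTT – the physical lower bound. Light in fiber covers roughly 200 km per millisecond, so a fiber loop around the earth (~40,000 km) takes about 200 ms.

Average RTT – an average far above the minimum, or a large mdev (the jitter figure in ping's summary line), suggests queuing or congestion on the path.

Packet loss – even 1% loss hurts TCP badly, because every lost segment triggers a retransmission and, with loss-based congestion control, shrinks the congestion window.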
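To compare runs over time it can help to reduce ping's summary to one line. The following is a minimal sketch, assuming the iputils summary format ("... 1% packet loss ..." and "rtt min/avg/max/mdev = ..."); the function name ping_summary is made up for this example, and other ping implementations may print the summary differently.

```shell
# ping_summary: read ping output on stdin and print loss and RTT figures.
# Assumes the iputils summary format; the function name is hypothetical.
ping_summary() {
    awk '
        /packet loss/ {
            # The loss figure is the only field that ends in "%".
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { loss = $i; sub(/%/, "", loss) }
        }
        $0 ~ "min/avg/max" {
            # "rtt min/avg/max/mdev = 1.812/2.104/3.950/0.310 ms"
            split($0, halves, " = ")
            split(halves[2], v, "[/ ]")
            min = v[1]; avg = v[2]; max = v[3]; mdev = v[4]
        }
        END {
            printf "loss=%s min=%s avg=%s max=%s mdev=%s\n", loss, min, avg, max, mdev
        }
    '
}

# Usage:
#   ping -c 100 -i 0.1 baidu.com | ping_summary
```

A large gap between min and avg, or an mdev close to avg, points to queuing somewhere on the path rather than to distance.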

Ping tells you that latency is indeed high, but not where it originates. Next, use mtr.

mtr – An X‑Ray of the Path

mtr --report --report-cycles 10 baidu.com

The output shows each hop’s loss, sent packets, last/average/best/worst latency, and standard deviation. If intermediate hops (e.g., hop 3 and hop 4) show 30‑40% loss but later hops drop to 0%, the loss is likely a false positive caused by ICMP rate‑limiting on routers; TCP traffic may still flow normally. The rule of thumb is to trust the loss rate of the final hop.
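The final-hop rule can be applied mechanically. Below is a rough sketch, assuming the default mtr --report layout in which each hop line starts with a marker such as "1.|--" followed by the host and the Loss% column; the function name mtr_final_loss is invented for illustration.

```shell
# mtr_final_loss: read `mtr --report` output on stdin; print the loss at the
# final hop and how many earlier hops reported any loss.
# Assumes the default report columns: hop marker, host, Loss%, Snt, ...
mtr_final_loss() {
    awk '
        $1 ~ /^[0-9]+\.[|]--$/ {
            loss = $3
            sub(/%/, "", loss)
            if (loss + 0 > 0) lossy++
            final = loss + 0          # overwritten per hop; last one wins
        }
        END {
            if (final > 0) lossy--    # do not count the final hop itself
            printf "final_loss=%.1f lossy_intermediate_hops=%d\n", final, lossy
        }
    '
}

# Usage:
#   mtr --report --report-cycles 10 baidu.com | mtr_final_loss
```

A result such as final_loss=0.0 with a few lossy intermediate hops matches the ICMP rate-limiting pattern described above; a non-zero final_loss is the case worth chasing.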

Chapter 2: Link Layer – Where Is Bandwidth Lost?

If mtr confirms loss at the final hop, the issue is probably either bandwidth saturation or cross‑region backbone jitter.

Check Bandwidth

iftop -n -P
nload

iftop shows per-connection bandwidth (-n skips DNS lookups, -P shows ports), while nload displays overall traffic. When outbound traffic exceeds about 90% of the link's capacity, packets queue up in the transmit queue, causing latency spikes.
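Where iftop and nload are not installed, the same utilization figure can be estimated from kernel counters. This is a sketch under assumptions: it reads /sys/class/net/IFACE/statistics/tx_bytes and /sys/class/net/IFACE/speed (link speed in Mbit/s, which virtual interfaces often do not report), and both function names are hypothetical.

```shell
# link_util_pct BYTES_BEFORE BYTES_AFTER SECONDS LINK_MBPS
# Prints outbound utilization as a percentage of link capacity.
link_util_pct() {
    awk -v b0="$1" -v b1="$2" -v s="$3" -v mbps="$4" 'BEGIN {
        printf "%.1f\n", (b1 - b0) * 8 / s / (mbps * 1000000) * 100
    }'
}

# sample_tx_util IFACE [SECONDS]
# Samples the transmit byte counter twice and reports utilization.
# Assumes the interface exposes a usable speed value in sysfs.
sample_tx_util() {
    iface=$1
    secs=${2:-1}
    b0=$(cat "/sys/class/net/$iface/statistics/tx_bytes")
    sleep "$secs"
    b1=$(cat "/sys/class/net/$iface/statistics/tx_bytes")
    link_util_pct "$b0" "$b1" "$secs" "$(cat "/sys/class/net/$iface/speed")"
}

# Usage:
#   sample_tx_util eth0 5
```

For example, 118,750,000 bytes sent in one second on a 1000 Mbit/s link works out to 95.0%, which is above the 90% mark mentioned above.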

Check Connection Count

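If bandwidth is not maxed out, inspect connection statistics:

ss -s

Tens of thousands of TIME-WAIT sockets can exhaust the ephemeral ports available for outbound connections, and a full listen queue makes the kernel drop incoming connection attempts; both slow down new connections. To check for listen-queue overflows and drops, use: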

netstat -s | grep -E "listen|overflow|drop"
cat /proc/interrupts

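Watch the listen-queue overflow counter (ListenOverflows in /proc/net/netstat, reported by netstat -s as "times the listen queue of a socket overflowed"). If it keeps growing, raise net.core.somaxconn and the application's listen backlog; the effective queue length is the smaller of the two. In /proc/interrupts, check whether the NIC's interrupts all land on a single CPU.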
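The overflow counter says that a queue overflowed at some point, not which socket is affected right now. For listening sockets, ss -ltn shows the current accept-queue length in Recv-Q and the configured backlog in Send-Q, so a small filter can flag queues that are full. A sketch, with a made-up function name:

```shell
# full_listen_queues: read `ss -ltn` output on stdin and print listening
# sockets whose accept queue (Recv-Q) has reached the backlog (Send-Q).
full_listen_queues() {
    awk '$1 == "LISTEN" && $3 > 0 && $2 >= $3 {
        print $4, "queued=" $2, "backlog=" $3
    }'
}

# Usage:
#   ss -ltn | full_listen_queues
```

No output means no listen queue is full at the moment of sampling; since queues fill and drain quickly, it is worth running this several times during a latency spike.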

Chapter 3: Kernel Protocol Stack

High latency may stem from the server’s TCP stack rather than the network.

Retransmission Rate – Packets Sent Repeatedly

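TCP retransmissions are the top latency killer. Linux's minimum retransmission timeout (RTO) is 200 ms, and the initial RTO on a new connection is 1 second, so a single lost packet can add hundreds of milliseconds.

netstat -s | grep retrans

Divide the retransmitted segment count by the total number of segments sent out (listed in the Tcp section of the same netstat -s output); a rate above 1% warrants investigation.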
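The division can be scripted against /proc/net/snmp, which exposes the same counters as OutSegs and RetransSegs on the Tcp: lines. A minimal sketch follows; the function name is hypothetical, and the columns are looked up by name rather than by position.

```shell
# tcp_retrans_pct: read /proc/net/snmp on stdin and print
# RetransSegs / OutSegs as a percentage.
tcp_retrans_pct() {
    awk '
        # First "Tcp:" line is the header: remember each column position.
        $1 == "Tcp:" && !seen {
            for (i = 2; i <= NF; i++) col[$i] = i
            seen = 1
            next
        }
        # Second "Tcp:" line carries the values.
        $1 == "Tcp:" {
            out = $(col["OutSegs"]) + 0
            re = $(col["RetransSegs"]) + 0
            if (out > 0) printf "%.2f\n", re / out * 100
        }
    '
}

# Usage:
#   tcp_retrans_pct < /proc/net/snmp
```

The counters are cumulative since boot, so this gives a long-run average. To see the rate during an incident, take two samples a few seconds apart and divide the differences instead.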

A common culprit on long-distance links is a TCP receive buffer that is too small.

sysctl net.ipv4.tcp_rmem
net.ipv4.tcp_rmem = 4096 131072 6291456

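The three numbers are min, default, and max (bytes). The 128 KB default is only the starting size: with receive-buffer autotuning (net.ipv4.tcp_moderate_rcvbuf, on by default) the kernel can grow the buffer up to max. What matters is whether the buffer can cover the bandwidth-delay product (BDP) of the path, that is, bandwidth × RTT. A 10 Mbps link with 200 ms RTT has a BDP of 250 KB, already above the default, and a 1 Gbps link at the same RTT needs 25 MB, well above the 6 MB max shown here.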
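The BDP arithmetic is easy to get wrong by a factor of 8 or 1000, so a tiny helper is useful. This is a sketch with a hypothetical function name; it takes bandwidth in Mbit/s and RTT in milliseconds and prints bytes.

```shell
# bdp_bytes MBPS RTT_MS
# Bandwidth-delay product in bytes: (bits per second / 8) * (RTT in seconds).
bdp_bytes() {
    awk -v mbps="$1" -v rtt_ms="$2" 'BEGIN {
        printf "%d\n", mbps * 1000000 / 8 * rtt_ms / 1000
    }'
}

# Usage:
#   bdp_bytes 10 200     # the 10 Mbps / 200 ms example from the text
```

Comparing the result with the third value of net.ipv4.tcp_rmem shows whether a single connection can ever fill the path.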

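To raise the maximum so the buffers can grow to cover a larger BDP (settings made with sysctl -w are lost on reboot unless they are also written to /etc/sysctl.conf or a file under /etc/sysctl.d/):

sysctl -w net.ipv4.tcp_rmem="4096 131072 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"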

BBR – Google’s Congestion Control

Traditional Cubic relies on packet loss to detect congestion, so on lossy, high-latency links it backs off far more than necessary. BBR (available since Linux 4.9) instead measures the bottleneck bandwidth and the minimum RTT, and uses that model to pace its sending rate rather than waiting for loss. Check the current algorithm:

sysctl net.ipv4.tcp_congestion_control

If it shows cubic, switch to BBR:

sysctl -w net.ipv4.tcp_congestion_control=bbr

The setting applies to new connections and only affects the traffic this host sends. Tests on the same machine over a cross-region link show a 2-3× throughput increase and over 60% RTT jitter reduction after switching from Cubic to BBR.
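Switching fails if the kernel does not offer BBR, so it is worth checking net.ipv4.tcp_available_congestion_control first. The following sketch wraps that check; both function names are invented for this example, and the modprobe tcp_bbr hint assumes BBR was built as a module on the system in question.

```shell
# cc_available NAME "LIST": succeed if NAME appears in the space-separated LIST.
cc_available() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# enable_bbr_if_available: switch to BBR only if the kernel lists it.
# Needs root for the sysctl -w call.
enable_bbr_if_available() {
    avail=$(sysctl -n net.ipv4.tcp_available_congestion_control)
    if cc_available bbr "$avail"; then
        sysctl -w net.ipv4.tcp_congestion_control=bbr
    else
        echo "bbr not listed (available: $avail); try: modprobe tcp_bbr" >&2
        return 1
    fi
}

# Usage (as root):
#   enable_bbr_if_available
```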

Actionable Checklist

ping -c 100 – Verify that latency is indeed high.

mtr --report – Locate the problematic hop and discard ICMP false positives.

iftop / nload – Check if outbound bandwidth is saturated.

netstat -s | grep retrans – Examine retransmission rate.

netstat -s | grep overflow – Detect listen‑queue overflows.

ss -ti – Inspect per‑connection cwnd and RTT.

sysctl net.ipv4.tcp_rmem – Verify receive buffer size.

sysctl net.ipv4.tcp_congestion_control – Ensure BBR is enabled if appropriate.

cat /proc/interrupts – Look for NIC interrupts concentrated on a single CPU (per-CPU soft-interrupt counts are in /proc/softirqs).
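Of the steps above, ss -ti produces the densest output: each established connection is followed by a line of key:value fields, including rtt:AVG/VAR (milliseconds) and cwnd (segments). A rough sketch that keeps only those two per peer, assuming that two-line layout and using a made-up function name:

```shell
# conn_rtt_cwnd: read `ss -tin` output on stdin and print, for each
# established connection, the peer address, smoothed RTT and cwnd.
conn_rtt_cwnd() {
    awk '
        $1 == "ESTAB" { peer = $5; next }
        peer != "" {
            rtt = ""; cwnd = ""
            for (i = 1; i <= NF; i++) {
                # "rtt:0.512/0.25" is average/variance; keep the average.
                if ($i ~ /^rtt:/)  { split(substr($i, 5), r, "/"); rtt = r[1] }
                if ($i ~ /^cwnd:/) { cwnd = substr($i, 6) }
            }
            if (rtt != "") print peer, "rtt_ms=" rtt, "cwnd=" cwnd
            peer = ""
        }
    '
}

# Usage:
#   ss -tin | conn_rtt_cwnd
```

Connections whose cwnd stays small while RTT is high are the ones most likely suffering from loss on the path.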

As a rule of thumb, 80% of latency problems stem from 20% of the possible causes. Running through this nine-step checklist usually pinpoints the root cause within minutes, even without deep networking expertise.

