Backend Development 16 min read

Quantifying TCP Connection Latency: Analysis, Abnormal Cases, and Optimization Strategies

This article provides a detailed quantitative analysis of TCP connection establishment latency, examining normal handshake processes, abnormal scenarios like queue overflows and TIME_WAIT exhaustion, and offering practical optimization strategies for backend systems.

Refining Core Development Skills

Nov 8, 2020

Quantifying TCP Connection Latency: Analysis, Abnormal Cases, and Optimization Strategies

This article provides a comprehensive quantitative analysis of TCP connection establishment latency, moving beyond vague claims of high overhead to deliver precise measurements and practical insights for backend development.

Under normal conditions, the TCP three-way handshake primarily depends on network Round-Trip Time (RTT). While kernel operations like system calls, soft interrupts, and context switches consume only a few microseconds, network transmission dominates the latency. From the client's perspective, connection establishment roughly equals one RTT.

However, abnormal scenarios can drastically increase latency. When a server experiences high TIME_WAIT connections, local port exhaustion can cause the connect system call overhead to surge from microseconds to milliseconds. More critically, if the server's half-open or fully-established connection queues overflow, incoming SYN or ACK packets are dropped, triggering TCP retransmissions that add seconds of delay. This can severely degrade user experience and potentially trigger service avalanches in thread-pool architectures.

Practical benchmarking confirms these theoretical models. The author provides a PHP script to measure connection latency:

<?php
$ip = {服务器ip};
$port = {服务器端口};
$count = 50000;
function buildConnect($ip,$port,$num){
    for($i=0;$i<$num;$i++){
        $socket = socket_create(AF_INET,SOCK_STREAM,SOL_TCP);
        if($socket ==false) {
            echo "$ip $port socket_create() 失败的原因是:".socket_strerror(socket_last_error($socket))."
";
            sleep(5);
            continue;
        }
        if(false == socket_connect($socket, $ip, $port)){
            echo "$ip $port socket_connect() 失败的原因是:".socket_strerror(socket_last_error($socket))."
";
            sleep(5);
            continue;
        }
        socket_close($socket);
    }
}
$t1 = microtime(true);
buildConnect($ip, $port, $count);
echo (($t2-$t1)*1000).'ms';

Monitoring tools are essential for detecting queue drops. Use netstat -s | grep LISTEN and netstat -s | grep overflowed to track drops. Optimization strategies include expanding queue lengths via echo "2048" > /proc/sys/net/ipv4/tcp_max_syn_backlog and echo "256" > /proc/sys/net/core/somaxconn, enabling tcp_abort_on_overflow to fail fast, utilizing connection pools, and co-locating services to minimize physical network distance.

Ultimately, understanding TCP latency quantification allows engineers to make informed architectural decisions, such as deploying regional servers for global users and implementing robust connection management to ensure system stability and optimal performance.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Linux kernel network latency connection-queues TCP Protocol

Written by

Refining Core Development Skills

Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.