Analyzing TCP Handshake Anomalies: Port Exhaustion, Connection Queue Overflow, and Performance Mitigation
This article examines common TCP handshake anomalies that degrade backend API performance. It explains how client port exhaustion and server-side connection queue overflows cause packet loss, CPU spikes, and severe latency increases, and it pairs kernel-level insight with practical configuration strategies for keeping services reliably available.
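On the client side, the connect() system call becomes slow, and eventually fails with EADDRNOTAVAIL, when ephemeral ports run out. For each new connection the kernel walks the range set by ip_local_port_range starting from a random offset; for every candidate port it looks up the corresponding hash bucket and takes that bucket's spinlock before checking whether the port is usable. When most ports are already taken, the loop runs through nearly the whole range before it finds a free port or gives up, which burns CPU and delays the connection, as the kernel source shows: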
//file: net/ipv4/inet_hashtables.c
int __inet_hash_connect(...)
{
    inet_get_local_port_range(&low, &high);
    remaining = (high - low) + 1;
    for (i = 1; i <= remaining; i++) {
        // offset is a random number
        port = low + (i + offset) % remaining;
        head = &hinfo->bhash[inet_bhashfn(net, port,
                                          hinfo->bhash_size)];
        // acquire lock
        spin_lock(&head->lock);
        // extensive port selection logic
        //......
        // success: goto ok
        // failure: goto next_port
    next_port:
        // release lock
        spin_unlock(&head->lock);
    }
}
First-handshake packet loss typically stems from server-side queue overflow. If the half-open (SYN) queue is full and tcp_syncookies is disabled, the incoming SYN is silently dropped. The SYN is also dropped when the queue of fully established connections is full while half-open requests are still pending. The client receives no error; it retransmits the SYN after an initial 1-second timeout that doubles on every retry, so a single dropped SYN adds at least a second to the API response time.
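To make the cost of a dropped SYN concrete, the sketch below adds up the retransmission schedule described above. It assumes a 1-second initial timeout that doubles on every retry and ignores any upper bound the kernel places on the timeout; syn_retransmit_elapsed is a hypothetical helper name, not a kernel function.

```c
/* Seconds elapsed since the first SYN at the moment the given
 * retransmission is sent, assuming the timeout starts at
 * initial_rto_s and doubles after every retry (no upper cap). */
unsigned int syn_retransmit_elapsed(unsigned int initial_rto_s,
                                    unsigned int retransmits)
{
    unsigned int elapsed = 0;
    unsigned int timeout = initial_rto_s;

    for (unsigned int i = 0; i < retransmits; i++) {
        elapsed += timeout;   /* wait out the current timeout */
        timeout *= 2;         /* exponential backoff */
    }
    return elapsed;
}
```

Under these assumptions, a client whose first SYN and first retransmission are both dropped has already waited 3 seconds when its second retransmission goes out: syn_retransmit_elapsed(1, 2) returns 3.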
//file: net/ipv4/tcp_ipv4.c
int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb)
{
    // check if half-open queue is full
    if (inet_csk_reqsk_queue_is_full(sk) && !isn) {
        want_cookie = tcp_syn_flood_action(sk, skb, "TCP");
        if (!want_cookie)
            goto drop;
    }
    // check if established queue is full
    ...
drop:
    NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENDROPS);
    return 0;
}
Third-handshake packet loss occurs when the established connection queue is full at the moment the final ACK arrives. The server discards the ACK, but the client, having already received the SYN-ACK, considers the connection established and may begin transmitting data. The server retransmits the SYN-ACK up to tcp_synack_retries times and then gives up, so the connection fails silently and wastes resources on both sides.
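The check that drops the final ACK can be modeled in user space. The sketch below assumes that the queue limit is the listen() backlog capped by net.core.somaxconn, and that the queue counts as full once the number of queued connections exceeds that limit. Both points are assumptions about typical kernel behavior and are worth checking against the kernel version in use; the function names are hypothetical.

```c
#include <stdbool.h>

/* Effective limit of the established connection queue: the backlog
 * passed to listen(), capped by net.core.somaxconn (assumption). */
unsigned int accept_queue_limit(unsigned int listen_backlog,
                                unsigned int somaxconn)
{
    return listen_backlog < somaxconn ? listen_backlog : somaxconn;
}

/* Models the comparison assumed for sk_acceptq_is_full(): the queue
 * is full once the queued count exceeds the limit. */
bool accept_queue_is_full(unsigned int queued, unsigned int limit)
{
    return queued > limit;
}
```

With a listen() backlog of 1024 and somaxconn at 128, accept_queue_limit returns 128, so raising only the application backlog would have no effect in this model.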
//file: net/ipv4/tcp_ipv4.c
struct sock *tcp_v4_syn_recv_sock(struct sock *sk, ...)
{
    // check if the established connection (accept) queue is full
    if (sk_acceptq_is_full(sk))
        goto exit_overflow;
    ...
exit_overflow:
    NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
    ...
}
To mitigate these TCP handshake anomalies, engineers should enable tcp_syncookies so that a full half-open queue no longer causes SYN drops, and raise the connection queue limits via somaxconn and tcp_max_syn_backlog together with the backlog the application passes to listen(). Applications should call accept() promptly so the established connection queue drains, and services should move to long-lived connections, which avoids frequent handshakes altogether and improves overall system stability.
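To see which of these changes is needed, and whether it helped, the counters incremented in the kernel excerpts above can be watched from user space. The sketch assumes that LINUX_MIB_LISTENOVERFLOWS and LINUX_MIB_LISTENDROPS appear as the ListenOverflows and ListenDrops fields of the TcpExt section in /proc/net/netstat, which is laid out as a line of names followed by a line of values. tcpext_counter and read_tcpext_counter are hypothetical helper names.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the value paired with `wanted`, given one line of counter
 * names and the matching line of values; -1 if the name is absent. */
long long tcpext_counter(const char *names, const char *values,
                         const char *wanted)
{
    size_t wlen = strlen(wanted);

    while (*names && *values) {
        names += strspn(names, " \n");      /* skip separators */
        values += strspn(values, " \n");
        size_t nlen = strcspn(names, " \n"); /* token lengths */
        size_t vlen = strcspn(values, " \n");
        if (nlen == 0 || vlen == 0)
            break;
        if (nlen == wlen && strncmp(names, wanted, wlen) == 0)
            return strtoll(values, NULL, 10);
        names += nlen;
        values += vlen;
    }
    return -1;
}

/* Reads the TcpExt line pair from /proc/net/netstat (Linux only).
 * Returns -1 if the file or the counter is not available. */
long long read_tcpext_counter(const char *wanted)
{
    static char names[8192], values[8192];
    long long result = -1;
    FILE *f = fopen("/proc/net/netstat", "r");

    if (!f)
        return -1;
    while (fgets(names, sizeof names, f) &&
           fgets(values, sizeof values, f)) {
        if (strncmp(names, "TcpExt:", 7) == 0) {
            result = tcpext_counter(names, values, wanted);
            break;
        }
    }
    fclose(f);
    return result;
}
```

Sampling read_tcpext_counter("ListenOverflows") and read_tcpext_counter("ListenDrops") before and after a load test shows whether the queues are still overflowing: values that keep growing suggest the limits are too low or accept() is not being called fast enough.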
Refining Core Development Skills
Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.