Why Do TIME_WAIT Connections Surge in High‑Concurrency Scenarios and How to Fix Them
During high‑concurrency traffic, servers can accumulate large numbers of TCP connections in the TIME_WAIT state, which can exhaust local ports and cause “address already in use” errors. This article explains the phenomenon, the TCP mechanics behind it, and practical configuration and kernel tweaks to mitigate it.
1. Problem Description
In simulated high‑concurrency scenarios, a batch of TCP connections enter the TIME_WAIT state.
After a short time the TIME_WAIT connections disappear as they are reclaimed, and services resume normally; thus TIME_WAIT is a normal phenomenon under high load.
In production, under sustained high load, new TIME_WAIT connections can be created as fast as (or faster than) old ones are reclaimed, so in extreme cases a large number of TIME_WAIT connections accumulate.
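Impact: each TIME_WAIT connection keeps its local port occupied. A host has at most 65535 ports per IP address, and the ephemeral range used for outgoing connections is usually smaller, so when many connections are in TIME_WAIT, new outgoing TCP connections can fail with “address already in use: connect”.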
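A rough way to judge when this becomes a problem is to estimate how many sockets sit in TIME_WAIT at steady state: approximately the rate of actively closed connections multiplied by how long each one lingers. The sketch below assumes a constant close rate and that all connections go to the same remote address and port; the function names and sample figures (the 65535-port ceiling and a 4-minute TIME_WAIT) are illustrative, not measurements.

```python
def time_wait_backlog(closes_per_second: float, time_wait_seconds: float) -> float:
    """Approximate steady-state number of sockets in TIME_WAIT."""
    return closes_per_second * time_wait_seconds


def max_sustainable_rate(usable_ports: int, time_wait_seconds: float) -> float:
    """Highest close rate that keeps the backlog within the port budget."""
    return usable_ports / time_wait_seconds


if __name__ == "__main__":
    # Assumed figures: 65535 usable ports and a 240-second TIME_WAIT.
    ports, linger = 65535, 240.0
    print(f"backlog at 500 closes/s: {time_wait_backlog(500, linger):.0f}")
    print(f"max sustainable rate: {max_sustainable_rate(ports, linger):.0f}/s")
```

Under these assumptions, a few hundred short-lived connections per second to a single destination is already enough to run out of ports.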
2. Problem Analysis
The root causes of massive TIME_WAIT connections are:
Numerous short‑lived connections.
When the HTTP Connection header is set to close, the server actively closes the connection.
The TCP four‑way termination leaves the side that actively closes the connection in TIME_WAIT for twice the Maximum Segment Lifetime (2 × MSL), so that the final ACK can be retransmitted if it is lost and delayed segments from the old connection can expire.
With the 2‑minute MSL from RFC 793, TIME_WAIT lasts 4 minutes; implementations that use a shorter MSL wait less (Linux, for example, keeps sockets in TIME_WAIT for 60 seconds).
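To make it concrete which side ends up in TIME_WAIT, the teardown can be modelled as a small state table. This is a simplified sketch: the event names are made up for illustration, and simultaneous close and error paths are left out.

```python
# Simplified TCP teardown transitions: (state, event) -> next state.
# Event names are illustrative; simultaneous close is deliberately omitted.
TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",     # active closer sends FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",   # peer acknowledged our FIN
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",    # peer's FIN arrives, final ACK sent
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",    # 2 x MSL timer expires
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",  # passive closer ACKs the FIN
    ("CLOSE_WAIT", "close"): "LAST_ACK",        # passive closer sends its own FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",         # final ACK received
}


def walk(events, state="ESTABLISHED"):
    """Return the list of states visited while applying events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


if __name__ == "__main__":
    active = walk(["close", "recv_ack", "recv_fin", "timeout_2msl"])
    passive = walk(["recv_fin", "close", "recv_ack"])
    print("active closer: ", " -> ".join(active))
    print("passive closer:", " -> ".join(passive))
```

Only the active closer passes through TIME_WAIT; the passive closer goes straight from LAST_ACK to CLOSED, which is why the side that closes first carries the cost.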
3. Solutions
General ways to reduce the impact of excessive TIME_WAIT:
Client side: set the HTTP Connection header to keep-alive to keep connections alive.
Server side:
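Enable reuse of sockets in TIME_WAIT (e.g., set SO_REUSEADDR so that a listening socket can bind to a port that still has connections in TIME_WAIT; on Linux, net.ipv4.tcp_tw_reuse covers outgoing connections).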
Reduce TIME_WAIT duration to 1 MSL (≈2 minutes) via kernel parameters, where the operating system exposes such a setting.
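As a minimal sketch of the SO_REUSEADDR suggestion, the following creates a listening socket with the option set before bind(). The helper name make_listener and the loopback address are assumptions for illustration. The option does not shorten TIME_WAIT; it only lets the server bind its port while earlier connections on that port are still lingering.

```python
import socket


def make_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket with SO_REUSEADDR enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the option before bind() so that it applies to this bind.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 lets the OS choose a free port
    sock.listen()
    return sock


if __name__ == "__main__":
    listener = make_listener()
    print("listening on", listener.getsockname())
    listener.close()
```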
Reference: https://www.cnblogs.com/yjf512/p/5327886.html
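The client-side keep-alive advice in this section can be illustrated with Python's standard library. In this sketch a throwaway local server echoes the source port of each request, and the client sends several requests over one http.client connection. The names Handler and fetch_ports are hypothetical, and the local server exists only to make the example self-contained.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # HTTP/1.1 connections are persistent by default

    def do_GET(self):
        # Echo the client's source port so connection reuse is observable.
        body = str(self.client_address[1]).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass


def fetch_ports(requests: int = 3) -> list:
    """Send several GETs over one connection; return the source port seen each time."""
    server = HTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    ports = []
    try:
        for _ in range(requests):
            conn.request("GET", "/", headers={"Connection": "keep-alive"})
            ports.append(conn.getresponse().read().decode())
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports


if __name__ == "__main__":
    print(fetch_ports())
```

Every request reports the same source port, meaning one TCP connection (and at most one eventual TIME_WAIT entry) served all of them instead of one per request.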
4. Appendices
Appendix A – Query TCP connection states
On macOS, use:
<code># Query TIME_WAIT connections
$ netstat -nat | grep TIME_WAIT
# Count connections by state
$ netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'</code>
Appendix B – MSL (Maximum Segment Lifetime)
MSL is the maximum time a TCP segment can exist in the network before it is discarded; RFC 793 defines it as 2 minutes, though implementations may use 30 seconds, 1 minute, or 2 minutes.
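Since TIME_WAIT lasts 2 × MSL, the choice of MSL directly sets how long a closed connection lingers. A small sketch of that arithmetic for the three MSL values mentioned above (the names are illustrative):

```python
# MSL values mentioned in the text, in seconds.
MSL_SECONDS = {"30 seconds": 30, "1 minute": 60, "2 minutes": 120}


def time_wait_seconds(msl_seconds: int) -> int:
    """TIME_WAIT lasts twice the Maximum Segment Lifetime."""
    return 2 * msl_seconds


if __name__ == "__main__":
    for label, msl in MSL_SECONDS.items():
        print(f"MSL = {label:<10} -> TIME_WAIT = {time_wait_seconds(msl)} s")
```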
Appendix C – TCP three‑way handshake and four‑way termination
Illustrations and detailed explanations can be found in external resources.
TCP TIME_WAIT is essential for reliable connection termination and for discarding delayed packets; without it, ACK loss or stray packets could corrupt new connections.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.