Why Do TIME_WAIT Connections Accumulate in High‑Traffic Servers and How to Fix Them?
Under high‑traffic conditions, servers can generate massive numbers of TCP TIME_WAIT sockets, consuming local ports and causing “address already in use” errors; this article explains the underlying causes, impact on services, and practical mitigation strategies such as keep‑alive headers and socket reuse.
Problem Description
In simulated high-concurrency scenarios, a large batch of TCP connections enters the TIME_WAIT state.
After a short period, these TIME_WAIT connections disappear as they are reclaimed, indicating that their temporary presence is normal under heavy load.
In sustained high‑traffic environments, two patterns are observed:
Some TIME_WAIT connections are reclaimed while new ones are created.
In extreme cases, a massive number of TIME_WAIT connections appear.
Think: What business impact can a large number of TIME_WAIT connections cause?
When Nginx acts as a reverse proxy, a flood of short‑lived connections can cause many sockets on the Nginx host to remain in TIME_WAIT:
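Each TIME_WAIT socket keeps its local port occupied, and a host has at most 65,535 ports (the port field is 16 bits); the ephemeral range used for outbound connections is usually smaller still.
Once the available ports are tied up in TIME_WAIT, new TCP connections may fail with “address already in use: connect” errors.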
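To get a feel for how quickly ports can run out, the arithmetic can be sketched as below. The 60-second TIME_WAIT and the 32768-60999 ephemeral range are assumptions based on common Linux defaults, not figures from the scenario described here; substitute the values from your own hosts. The bound applies to connections from one local address to a single destination address and port, when the local side is the one that closes first.

```python
def max_sustained_rate(port_count: int, time_wait_seconds: float) -> float:
    """Upper bound on new connections per second to ONE destination (ip, port)
    before every usable local port is parked in TIME_WAIT."""
    return port_count / time_wait_seconds


# Assumption: Linux default ephemeral range 32768-60999 and a 60 s TIME_WAIT.
linux_ports = 60999 - 32768 + 1
linux_rate = max_sustained_rate(linux_ports, 60)

# Assumption: all 65,535 ports usable and TIME_WAIT = 2 x MSL = 240 s.
worst_case_rate = max_sustained_rate(65535, 240)

print(f"{linux_ports} ports / 60 s  -> about {linux_rate:.0f} connections/s")
print(f"65535 ports / 240 s -> about {worst_case_rate:.0f} connections/s")
```

Either way the ceiling is a few hundred short-lived connections per second, which a busy reverse proxy can exceed easily.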
Example of counting TCP connection states:
<code># Count connections by state
$ netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 1154
TIME_WAIT 1645</code>
Tip: The maximum number of local TCP ports is 65,535 because the TCP header uses a 16‑bit field for the port number.
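The same per-state count can be done in a script, which is handy when the numbers feed a monitoring check. This is a minimal sketch: the function name and the sample text are hypothetical, and the sample assumes the Linux `netstat -n` column layout, where the state is the last field.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP sockets per state, using the last column of each tcp line."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("tcp"):
            states[fields[-1]] += 1
    return states


# Hypothetical sample in Linux `netstat -n` layout.
sample = """\
Proto Recv-Q Send-Q Local Address    Foreign Address   State
tcp        0      0 10.0.0.5:80      10.0.0.9:51234    ESTABLISHED
tcp        0      0 10.0.0.5:80      10.0.0.9:51235    TIME_WAIT
tcp        0      0 10.0.0.5:80      10.0.0.9:51236    TIME_WAIT
"""

for state, count in count_tcp_states(sample).items():
    print(state, count)
```

For live data, the text could come from `subprocess.run(["netstat", "-n"], capture_output=True, text=True).stdout`.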
Problem Analysis
The root causes of a large number of TIME_WAIT sockets are:
Many short‑lived connections.
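In HTTP, when the Connection header is set to close, the server actively closes the connection after sending the response.
TCP's four‑way termination keeps the actively closing socket in TIME_WAIT for twice the Maximum Segment Lifetime (MSL), so that the final ACK can be retransmitted if it is lost and delayed segments from the old connection expire before the same address and port pair is reused.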
TIME_WAIT details:
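It appears on the side that actively closes the connection: that side sends the first FIN, later receives the peer's FIN, sends the final ACK, and then enters TIME_WAIT.
The state lasts for 2 × MSL, which is 4 minutes with the RFC 793 value of MSL = 2 minutes; many implementations use a shorter interval (Linux, for example, holds TIME_WAIT for a fixed 60 seconds).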
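The close sequence can be traced with a small state table. This is an illustrative sketch only: it covers the ordinary four-way termination and leaves out simultaneous close (the CLOSING state), and the event names are made up for the example.

```python
# (current state, event) -> (next state, segment sent in response or None)
TRANSITIONS = {
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "FIN"),
    ("ESTABLISHED", "recv FIN"): ("CLOSE_WAIT", "ACK"),
    ("FIN_WAIT_1", "recv ACK"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_2", "recv FIN"): ("TIME_WAIT", "ACK"),
    ("CLOSE_WAIT", "close"): ("LAST_ACK", "FIN"),
    ("LAST_ACK", "recv ACK"): ("CLOSED", None),
    ("TIME_WAIT", "2MSL timeout"): ("CLOSED", None),
}


def trace(events, state="ESTABLISHED"):
    """Return the list of states visited while applying the events in order."""
    path = [state]
    for event in events:
        state, _sent = TRANSITIONS[(state, event)]
        path.append(state)
    return path


active_closer = trace(["close", "recv ACK", "recv FIN"])
passive_closer = trace(["recv FIN", "close", "recv ACK"])

print("active :", " -> ".join(active_closer))
print("passive:", " -> ".join(passive_closer))
```

Only the active closer passes through TIME_WAIT; the passive closer goes straight from LAST_ACK to CLOSED once its FIN is acknowledged.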
Solution
To mitigate the issue of excessive TIME_WAIT sockets that prevent new connections, consider the following approaches:
Client side: set the HTTP Connection header to keep-alive so the connection stays open for a while (modern browsers already do this).
Server side:
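Allow reuse of sockets that are in TIME_WAIT, for example with the SO_REUSEADDR socket option or, on Linux, the net.ipv4.tcp_tw_reuse setting for outbound connections.
Reduce the TIME_WAIT duration where the operating system allows it, e.g., configure it to 1 MSL (about 2 minutes).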
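The effect of the client-side keep-alive approach can be seen with a local experiment. The sketch below is an assumption-laden demo, not a benchmark: `OkHandler` and `distinct_client_ports` are hypothetical names, the server is Python's built-in `http.server`, and the number of distinct client ports is used as a stand-in for the number of TCP connections opened (each of which ends in TIME_WAIT on whichever side closes first).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class OkHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the server honour keep-alive

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def distinct_client_ports(keep_alive: bool, requests: int = 5) -> int:
    """Issue several GETs and return how many client ports (connections) were used."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), OkHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    header = {"Connection": "keep-alive" if keep_alive else "close"}
    ports = set()
    conn = None
    try:
        for _ in range(requests):
            if conn is None:
                conn = http.client.HTTPConnection(host, port, timeout=5)
            conn.request("GET", "/", headers=header)
            ports.add(conn.sock.getsockname()[1])  # local port of this connection
            conn.getresponse().read()
            if not keep_alive:
                conn.close()  # short-lived: one TCP connection per request
                conn = None
    finally:
        if conn is not None:
            conn.close()
        server.shutdown()
        server.server_close()
    return len(ports)


if __name__ == "__main__":
    print("keep-alive:", distinct_client_ports(True), "connection(s)")
    print("close     :", distinct_client_ports(False), "connection(s)")
```

With keep-alive all requests share one connection, so only one socket eventually reaches TIME_WAIT instead of one per request.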
Key Takeaways
The side that initiates the active close enters TIME_WAIT.
TIME_WAIT lasts 2 × MSL (4 minutes with the RFC 793 MSL of 2 minutes; shorter on many implementations).
By default, ports occupied by TIME_WAIT sockets cannot be reused until the state expires.
The total number of TCP ports is limited to 65,535.
Excessive TIME_WAIT sockets can cause new connection attempts to fail with “address already in use” errors.
In practice, servers often avoid initiating the close, but an HTTP Connection: close header can still trigger it.
Modern browsers usually send Connection: keep-alive, reducing the problem.
In Nginx reverse‑proxy scenarios, many short connections may still generate TIME_WAIT on the backend.
Mitigation: enable socket reuse and shorten the TIME_WAIT timer.
Appendix
Additional topics:
How to query TCP connection states.
Understanding MSL (Maximum Segment Lifetime).
TCP three‑way handshake and four‑way termination.
Example commands on macOS to list TIME_WAIT sockets:
<code># List TIME_WAIT connections on macOS
$ netstat -nat | grep TIME_WAIT
# Same listing, keeping the column header via an extended grep pattern
$ netstat -nat | grep -E "TIME_WAIT|Local Address"
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 0 0 127.0.0.1.1080 127.0.0.1.59061 TIME_WAIT
# Count connections by state (same as earlier)
$ netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 1154
TIME_WAIT 1645</code>
MSL definition: the maximum time a TCP segment can exist in the network before being discarded. RFC 793 specifies MSL as 2 minutes, though implementations often use 30 seconds, 1 minute, or 2 minutes.
During the TIME_WAIT period, the connection's address and port pair cannot be reused by default; after the 2 MSL interval, the port becomes available again. The SO_REUSEADDR socket option can allow earlier reuse.
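A minimal sketch of what SO_REUSEADDR changes, assuming a loopback connection and Linux-like semantics: the server side performs the active close, which leaves its port tied to a TIME_WAIT socket, and a fresh socket then tries to bind the same port. The helper name `can_rebind_after_active_close` is hypothetical, and behavior without the option differs between operating systems.

```python
import socket


def can_rebind_after_active_close(reuse_addr: bool) -> bool:
    """Return True if a new socket can bind the port right after an active close."""

    def new_socket() -> socket.socket:
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        if reuse_addr:
            # Must be set before bind(); accepted sockets inherit it on Linux.
            s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        return s

    listener = new_socket()
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    client = socket.create_connection(("127.0.0.1", port), timeout=5)
    conn, _ = listener.accept()
    conn.close()      # server side sends the first FIN (active close)
    client.recv(1)    # returns b"" once that FIN arrives
    client.close()    # peer's FIN moves the server-side socket to TIME_WAIT
    listener.close()

    retry = new_socket()
    try:
        retry.bind(("127.0.0.1", port))
        return True
    except OSError:   # typically EADDRINUSE
        return False
    finally:
        retry.close()


if __name__ == "__main__":
    print("without SO_REUSEADDR:", can_rebind_after_active_close(False))
    print("with SO_REUSEADDR   :", can_rebind_after_active_close(True))
```

On Linux the first call typically reports a failed bind and the second a successful one; other platforms may allow both. The option has to be set on the listening socket before `bind()`, which is why servers usually enable it unconditionally at startup.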
[Figure: TCP three-way handshake and four-way termination]
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.