Why Do TIME_WAIT Connections Accumulate and How to Fix Them?
This article explains why large numbers of TCP connections in the TIME_WAIT state appear under high concurrency, how they affect services, and which practical mitigations (such as enabling socket reuse and reducing the TIME_WAIT duration) prevent new connection failures.
Problem Description
In high‑concurrency scenarios many TCP connections enter the TIME_WAIT state, which is normal but can become problematic when the number grows large.
Typical symptoms include a burst of TIME_WAIT sockets that later disappear after being reclaimed, and in extreme cases new connections failing with address already in use: connect errors.
Analysis
The root causes are:
A large number of short-lived connections (especially when HTTP Connection: close is used).
The side that actively closes a connection starts the four-way termination handshake and then remains in TIME_WAIT for 2 × MSL.
TIME_WAIT details:
It is the state of the side that actively closes the connection.
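It lasts for 2 × MSL. With the RFC 793 value of MSL = 2 minutes that is 4 minutes, but many implementations use a shorter MSL; Linux, for example, holds TIME_WAIT for 60 seconds.
During this period the local port cannot be reused for a new connection to the same remote address and port, and a host has at most 65 535 local ports (the ephemeral range actually available for outgoing connections is usually smaller).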
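To make the port constraint concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that every connection goes to a single remote address and port and is closed by the local side, so each one pins a local port for the whole TIME_WAIT period. The function name and the two sample scenarios (all 65 535 ports with a 4-minute wait, and a 28 232-port ephemeral range with a 60-second wait, as on a default Linux setup) are illustrative assumptions, not measurements.

```python
def max_sustained_connection_rate(usable_ports: int, time_wait_seconds: float) -> float:
    """Upper bound on new connections per second to one remote endpoint.

    Each closed connection holds its local port for time_wait_seconds, so at
    most usable_ports connections can be opened within any such window.
    """
    if usable_ports <= 0 or time_wait_seconds <= 0:
        raise ValueError("both arguments must be positive")
    return usable_ports / time_wait_seconds


if __name__ == "__main__":
    # Assumed scenario 1: all 65 535 ports, TIME_WAIT = 2 x 2 minutes = 240 s.
    print(round(max_sustained_connection_rate(65535, 240)))  # 273
    # Assumed scenario 2: 28 232 ephemeral ports, TIME_WAIT = 60 s.
    print(round(max_sustained_connection_rate(28232, 60)))   # 471
```

Under these assumptions a client opening more than a few hundred short connections per second to one backend can run out of local ports, which matches the symptom described above.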
Statistics
$ netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
ESTABLISHED 1154
TIME_WAIT 1645
Tip: The maximum number of local TCP ports is 65 535 because the port field is 16 bits.
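For readers who prefer to post-process the output in a script, the same tally can be sketched in Python. This is a minimal equivalent of the awk one-liner, assuming netstat-style text in which TCP lines start with "tcp" and the state is the last column. The function name and the sample text are made up for illustration and are not captured from a real host.

```python
from collections import Counter

# Invented sample in the layout printed by `netstat -n` on Linux.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:443            10.0.0.9:51844          ESTABLISHED
tcp        0      0 10.0.0.5:443            10.0.0.9:51846          TIME_WAIT
tcp6       0      0 ::1:8080                ::1:40112               TIME_WAIT
udp        0      0 0.0.0.0:68              0.0.0.0:*
"""


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state, like awk '/^tcp/ {++S[$NF]}'."""
    counts = Counter()
    for line in netstat_output.splitlines():
        if line.startswith("tcp"):          # matches tcp, tcp4 and tcp6 rows
            counts[line.split()[-1]] += 1   # last column is the state
    return counts


if __name__ == "__main__":
    for state, total in count_tcp_states(SAMPLE).items():
        print(state, total)
```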
Solutions
To mitigate the issue:
Client side: set Connection: keep-alive in HTTP headers to keep connections alive.
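Server side: allow addresses held by TIME_WAIT sockets to be reused, e.g., set SO_REUSEADDR on the listening socket so that it can bind again while old connections on that port are still in TIME_WAIT. On Linux, net.ipv4.tcp_tw_reuse plays a similar role for outgoing connections.
Reduce the TIME_WAIT duration where the operating system exposes such a setting, e.g., wait 1 MSL (≈2 minutes with the RFC 793 value) instead of the default 2 MSL. Not every system allows this: Linux, for example, hard-codes the interval at 60 seconds in the kernel source.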
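The effect of the keep-alive item can be sketched with Python's standard library. The helper below (its name and the throwaway loopback server are assumptions made for this illustration) sends the same number of requests either over one persistent connection or over a fresh connection per request with Connection: close, and reports how many TCP connections the server accepted. Fewer connections means fewer sockets that can end up in TIME_WAIT.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def count_server_connections(requests: int, reuse: bool) -> int:
    """Send `requests` GETs to a local test server; return TCP connections accepted."""
    accepted = []

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # lets the server keep connections open

        def setup(self):
            super().setup()
            accepted.append(self.client_address)  # runs once per TCP connection

        def do_GET(self):
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the demo quiet

    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        if reuse:
            conn = http.client.HTTPConnection(host, port)
            for _ in range(requests):
                conn.request("GET", "/")
                conn.getresponse().read()  # drain the body before reusing
            conn.close()
        else:
            for _ in range(requests):
                conn = http.client.HTTPConnection(host, port)
                conn.request("GET", "/", headers={"Connection": "close"})
                conn.getresponse().read()
                conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return len(accepted)


if __name__ == "__main__":
    print("keep-alive:", count_server_connections(5, reuse=True))
    print("close:     ", count_server_connections(5, reuse=False))
```

In this sketch five requests need one connection with keep-alive and five without it. Real clients usually get the same effect from a connection pool rather than hand-written loops.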
Conclusion
Key points:
TIME_WAIT consumes a local port and prevents its reuse until the timeout expires.
Excessive TIME_WAIT can cause address already in use errors for new connections.
Adjusting socket reuse options and shortening the TIME_WAIT timer are effective mitigations.
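As an illustration of the socket reuse option, the following minimal Python sketch sets SO_REUSEADDR on a listening socket before binding. The function name, the loopback address, and the use of port 0 are assumptions for the example. The semantics described are those of Linux and other Unix-like systems; on Windows the option behaves differently.

```python
import socket


def open_listener(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Create a listening TCP socket that can bind while old connections
    on the same port are still in TIME_WAIT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Must be set before bind(); it has no effect on an already-bound socket.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))  # port 0 asks the OS for any free port
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock


if __name__ == "__main__":
    with open_listener() as listener:
        print("listening on", listener.getsockname())
```

Without the option, restarting a server shortly after it closed client connections can fail at bind() with "address already in use" until the TIME_WAIT entries expire.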
Appendix
Query TCP Connection States (Linux/macOS)
# Linux example
netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}'
# macOS examples
netstat -nat | grep TIME_WAIT
netstat -nat | grep -E "TIME_WAIT|Local Address"
MSL (Maximum Segment Lifetime)
RFC 793 defines MSL as 2 minutes; implementations often use 30 seconds, 1 minute, or 2 minutes.
TCP Handshake and Four‑Way Close
Four-way termination ensures reliable connection teardown. The side that sends the final ACK enters TIME_WAIT, which lets it retransmit that ACK if it is lost and gives delayed segments from the old connection time to expire.
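The state changes during a normal (non-simultaneous) close can be sketched as a small lookup table in Python. This is a simplified model for illustration only: the event labels are invented for the example, and the table covers just the teardown path rather than the full TCP state machine.

```python
# (current state, event) -> next state, for an orderly close only.
TRANSITIONS = {
    ("ESTABLISHED", "close"): "FIN_WAIT_1",       # active closer sends FIN
    ("FIN_WAIT_1", "recv_ack"): "FIN_WAIT_2",     # peer acknowledged our FIN
    ("FIN_WAIT_2", "recv_fin"): "TIME_WAIT",      # we send the final ACK
    ("TIME_WAIT", "timeout_2msl"): "CLOSED",      # timer of 2 x MSL expires
    ("ESTABLISHED", "recv_fin"): "CLOSE_WAIT",    # passive closer ACKs the FIN
    ("CLOSE_WAIT", "close"): "LAST_ACK",          # passive closer sends its FIN
    ("LAST_ACK", "recv_ack"): "CLOSED",           # final ACK received
}


def walk(events, state="ESTABLISHED"):
    """Return the list of states visited while applying events in order."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


if __name__ == "__main__":
    active = walk(["close", "recv_ack", "recv_fin", "timeout_2msl"])
    passive = walk(["recv_fin", "close", "recv_ack"])
    print(" -> ".join(active))
    print(" -> ".join(passive))
```

Only the active closer's path passes through TIME_WAIT; the passive side goes from LAST_ACK straight to CLOSED, which is why the accumulation appears on whichever side closes first.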
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
