Mastering TCP: Handshakes, Flow Control, Congestion Control & More
This comprehensive guide covers TCP fundamentals—including differences from UDP, the three‑way and four‑way handshakes, half‑open queues and SYN flood attacks, header fields, Fast Open, timestamps, RTO calculation, flow and congestion control, Nagle algorithm, delayed ACKs, and keep‑alive mechanisms—providing clear explanations and practical examples for engineers.
TCP vs UDP
TCP is a connection‑oriented, reliable, byte‑stream transport protocol. UDP is connection‑less and provides datagram delivery without reliability guarantees. TCP offers three core properties: connection orientation, reliability (stateful tracking and retransmission), and a byte‑stream abstraction.
Three‑Way Handshake
The client initiates the connection with a SYN and enters SYN‑SENT. The server replies with SYN+ACK and enters SYN‑RCVD. The client acknowledges with an ACK and enters ESTABLISHED; the server enters ESTABLISHED when that ACK arrives.
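The exchange above can be sketched with explicit sequence and acknowledgment numbers. This is a minimal illustration, not a real stack: the function name and tuple layout are assumptions, and it relies only on the fact that a SYN consumes one sequence number, so each side acknowledges the peer's initial sequence number (ISN) plus one.

```python
MOD = 2**32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, flags, seq, ack)."""
    syn = ("client", "SYN", client_isn, None)
    # The SYN consumes one sequence number, so the server ACKs ISN + 1.
    syn_ack = ("server", "SYN+ACK", server_isn, (client_isn + 1) % MOD)
    ack = ("client", "ACK", (client_isn + 1) % MOD, (server_isn + 1) % MOD)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(1000, 5000):
    print(segment)
```

With ISNs 1000 and 5000, the final ACK carries seq 1001 and ack 5001.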
Four‑Way Termination
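When the client wants to close, it sends a FIN and enters FIN‑WAIT‑1. The server acknowledges with an ACK and enters CLOSE‑WAIT; when that ACK arrives, the client moves to FIN‑WAIT‑2. The connection is now half‑closed: the client no longer sends data but can still receive. Once the server has finished sending, it sends its own FIN and enters LAST‑ACK. The client replies with an ACK and stays in TIME‑WAIT for 2 MSL (twice the maximum segment lifetime) before finally closing.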
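The termination sequence can be written as a small transition table. The event names below are illustrative labels, not part of any API, and the table covers only the common case where one side closes first. It includes FIN‑WAIT‑2, the state the active closer occupies after its FIN has been acknowledged and before the peer's FIN arrives.

```python
# (current state, event) -> next state, for the non-simultaneous close only.
CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "close_send_fin"): "FIN-WAIT-1",
    ("ESTABLISHED", "recv_fin_send_ack"): "CLOSE-WAIT",
    ("FIN-WAIT-1", "recv_ack"): "FIN-WAIT-2",
    ("FIN-WAIT-2", "recv_fin_send_ack"): "TIME-WAIT",
    ("CLOSE-WAIT", "close_send_fin"): "LAST-ACK",
    ("LAST-ACK", "recv_ack"): "CLOSED",
    ("TIME-WAIT", "timeout_2msl"): "CLOSED",
}


def run(state, events):
    """Apply events in order and return every state visited."""
    visited = [state]
    for event in events:
        key = (state, event)
        if key not in CLOSE_TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {state}")
        state = CLOSE_TRANSITIONS[key]
        visited.append(state)
    return visited


active = run("ESTABLISHED",
             ["close_send_fin", "recv_ack", "recv_fin_send_ack", "timeout_2msl"])
passive = run("ESTABLISHED", ["recv_fin_send_ack", "close_send_fin", "recv_ack"])
print(" -> ".join(active))
print(" -> ".join(passive))
```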
Half‑Open Queue and SYN Flood
A listening server maintains two queues: a half‑open (SYN) queue and a full‑open (accept) queue. Each incoming SYN places the connection in the SYN queue (state SYN‑RCVD); once the handshake completes, the connection moves to the accept queue. A SYN flood fills the SYN queue with spoofed SYN packets whose handshakes never complete, exhausting resources and causing denial of service. Mitigations include enlarging the SYN queue, reducing SYN‑ACK retries, and deploying SYN cookies, which allocate resources only after a valid ACK containing the cookie arrives.
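The idea behind SYN cookies can be sketched as follows. This is a simplified illustration, not the encoding any real kernel uses: the server derives its initial sequence number from a keyed hash of the connection 4‑tuple and a coarse time counter, stores nothing, and later checks that the client's ACK number minus one reproduces that value. The function names, the secret, and the counter scheme are all assumptions made for the example.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(16)  # hypothetical per-server secret


def make_cookie(src_ip, src_port, dst_ip, dst_port, counter, secret=SECRET):
    """Derive a 32-bit ISN from the 4-tuple and a coarse time counter."""
    msg = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}|{counter}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def ack_is_valid(ack_number, src_ip, src_port, dst_ip, dst_port, counter,
                 secret=SECRET):
    """Accept the ACK if ack - 1 matches the current or previous counter."""
    cookie = (ack_number - 1) % 2**32
    return any(
        cookie == make_cookie(src_ip, src_port, dst_ip, dst_port, c, secret)
        for c in (counter, counter - 1)
    )


conn = ("198.51.100.7", 40000, "203.0.113.1", 443)
isn = make_cookie(*conn, counter=10)           # sent in the SYN+ACK, no state kept
print(ack_is_valid((isn + 1) % 2**32, *conn, counter=10))
```

Only when the check passes does the server create the connection, so spoofed SYNs that never complete the handshake consume no queue slot.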
TCP Header Fields
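The fixed TCP header is 20 bytes long. It contains the 16‑bit source and destination ports, a 32‑bit sequence number, a 32‑bit acknowledgment number, the data offset (header length), flags (SYN, ACK, FIN, RST, PSH, URG), a 16‑bit window size (extended by the Window Scale option), a checksum, and an urgent pointer. Options such as MSS, Window Scale, SACK, and Timestamp follow the fixed header.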
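As a rough illustration of that layout, the sketch below unpacks the 20‑byte fixed header with Python's struct module. The function name and the dictionary keys are assumptions for the example, and options are not parsed.

```python
import struct

FLAG_BITS = [("FIN", 0x01), ("SYN", 0x02), ("RST", 0x04),
             ("PSH", 0x08), ("ACK", 0x10), ("URG", 0x20)]


def parse_tcp_header(data):
    """Decode the fixed 20-byte TCP header (network byte order)."""
    if len(data) < 20:
        raise ValueError("need at least 20 bytes")
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", data[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        # The top 4 bits give the header length in 32-bit words.
        "header_len": (offset_flags >> 12) * 4,
        "flags": [name for name, bit in FLAG_BITS if offset_flags & bit],
        "window": window,
        "checksum": checksum,
        "urgent": urgent,
    }


# A hand-built SYN+ACK header: data offset 5 words, flags 0x12.
sample = struct.pack("!HHIIHHHH", 443, 51000, 1000, 2001,
                     (5 << 12) | 0x12, 65535, 0, 0)
print(parse_tcp_header(sample))
```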
TCP Fast Open (TFO)
TFO reduces latency by letting application data travel in the SYN. On the first connection, the client sends a SYN with a cookie request, and the server returns a server‑generated TFO cookie in an option of the SYN+ACK. The client caches the cookie; on subsequent connections it sends SYN + cookie + application data (e.g., an HTTP request). The server validates the cookie and can return data before the handshake's final ACK arrives, saving one RTT.
Timestamp Option
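Kind = 8, length = 10. The 10‑byte option carries one byte each for kind and length, a 4‑byte timestamp value (TSval), and a 4‑byte timestamp echo reply (TSecr). The receiver copies the sender's TSval into TSecr, so the sender can measure RTT as the time the ACK arrives minus the echoed timestamp, even when a segment was retransmitted. The timestamp also lets the receiver tell old segments from new ones when the 32‑bit sequence number wraps, preventing ambiguity.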
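A minimal sketch of the option's wire format and the RTT calculation follows. The helper names are assumptions, and timestamps are treated as abstract 32‑bit clock ticks rather than any particular unit.

```python
import struct


def build_timestamp_option(tsval, tsecr):
    """Encode kind=8, length=10, TSval, TSecr in network byte order."""
    return struct.pack("!BBII", 8, 10, tsval % 2**32, tsecr % 2**32)


def parse_timestamp_option(raw):
    """Return (TSval, TSecr) from a 10-byte Timestamp option."""
    kind, length, tsval, tsecr = struct.unpack("!BBII", raw[:10])
    if kind != 8 or length != 10:
        raise ValueError("not a Timestamp option")
    return tsval, tsecr


def rtt_ticks(now, tsecr):
    """RTT = arrival time of the ACK minus the echoed timestamp (mod 2**32)."""
    return (now - tsecr) % 2**32


# The sender stamped a segment at tick 1000; the peer's ACK echoes that value
# and arrives when the sender's clock reads 1042.
ack_option = build_timestamp_option(tsval=777, tsecr=1000)
_, echoed = parse_timestamp_option(ack_option)
print(rtt_ticks(1042, echoed))
```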
Retransmission Timeout (RTO) Calculation
Classic method
Maintain a smoothed RTT (SRTT): SRTT = α·SRTT + (1‑α)·RTT (α≈0.8‑0.9). Compute RTO = min(ubound, max(lbound, β·SRTT)) (β≈1.3‑2.0). This method reacts slowly to RTT spikes.
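The classic formula can be tried out with a few lines of code. This is a sketch under assumed parameters: α = 0.875, β = 2.0, and the bounds of 0.2 s and 60 s are illustrative choices, not values taken from the text.

```python
def classic_rto(samples, alpha=0.875, beta=2.0, lbound=0.2, ubound=60.0):
    """Return (SRTT, RTO) after each RTT sample, all in seconds."""
    srtt = samples[0]  # seed the average with the first measurement
    results = []
    for rtt in samples:
        srtt = alpha * srtt + (1 - alpha) * rtt
        rto = min(ubound, max(lbound, beta * srtt))
        results.append((srtt, rto))
    return results


for srtt, rto in classic_rto([0.100, 0.100, 0.500]):
    print(f"SRTT={srtt:.3f}s RTO={rto:.3f}s")
```

After the RTT jumps from 0.1 s to 0.5 s, SRTT moves only to 0.15 s and the RTO to 0.3 s, which is still shorter than the new RTT. That lag is the slow reaction to spikes described above.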
Jacobson/Karels (standard) method
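Update RTT variance first, using the previous SRTT: RTTVAR = (1‑β)·RTTVAR + β·|RTT ‑ SRTT| with β≈0.25.
Update SRTT: SRTT = (1‑α)·SRTT + α·RTT with α≈1/8.
Compute RTO: RTO = μ·SRTT + δ·RTTVAR (μ=1, δ=4). Because of the RTTVAR term, the RTO rises quickly when RTT samples start to fluctuate.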
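A small estimator makes the update rules concrete. This sketch follows the procedure in RFC 6298, which seeds SRTT with the first sample and RTTVAR with half of it, updates RTTVAR before SRTT, and recommends rounding the RTO up to 1 second. The class name and the min_rto parameter are assumptions for the example.

```python
class RtoEstimator:
    """Jacobson/Karels RTO estimator; all times are in seconds."""

    def __init__(self, alpha=1 / 8, beta=1 / 4, min_rto=1.0):
        self.alpha = alpha
        self.beta = beta
        self.min_rto = min_rto
        self.srtt = None
        self.rttvar = None

    def update(self, rtt):
        """Feed one RTT sample and return the new RTO."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R / 2.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated first so that it uses the old SRTT.
            self.rttvar = ((1 - self.beta) * self.rttvar
                           + self.beta * abs(self.srtt - rtt))
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt
        return max(self.min_rto, self.srtt + 4 * self.rttvar)


est = RtoEstimator(min_rto=0.0)  # lower bound disabled to show raw values
rtos = [est.update(r) for r in (0.100, 0.100, 0.500)]
print([round(r, 4) for r in rtos])
```

With the lower bound disabled, a single 0.5 s sample after two 0.1 s samples raises the RTO to about 0.66 s, because the variance term grows much faster than the smoothed average.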
Flow Control
TCP uses a sliding window advertised by the receiver (rwnd). The sender’s effective window is min(rwnd, cwnd). As data are acknowledged, the sender advances SND.UNA and adjusts the window size. If the receiver’s buffer fills, it reduces rwnd, throttling the sender.
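The window bookkeeping can be sketched like this. It is a simplified model: the class and method names are assumptions, sequence numbers start at 0 and never wrap, and nothing is actually transmitted.

```python
class SenderWindow:
    """Tracks how many more bytes the sender may put in flight."""

    def __init__(self, rwnd, cwnd):
        self.snd_una = 0   # oldest unacknowledged byte
        self.snd_nxt = 0   # next byte to send
        self.rwnd = rwnd   # window advertised by the receiver
        self.cwnd = cwnd   # congestion window

    def usable(self):
        """Bytes that can still be sent: window edge minus SND.NXT."""
        edge = self.snd_una + min(self.rwnd, self.cwnd)
        return max(0, edge - self.snd_nxt)

    def send(self, nbytes):
        """Send as much of nbytes as the window allows; return the amount."""
        sent = min(nbytes, self.usable())
        self.snd_nxt += sent
        return sent

    def on_ack(self, ack, rwnd):
        """Advance SND.UNA and record the newly advertised window."""
        if self.snd_una < ack <= self.snd_nxt:
            self.snd_una = ack
        self.rwnd = rwnd


w = SenderWindow(rwnd=4000, cwnd=10000)
print(w.send(5000))           # limited by rwnd
w.on_ack(2000, rwnd=1000)     # receiver's buffer is filling up
print(w.usable())
w.on_ack(4000, rwnd=3000)     # receiver drained its buffer
print(w.usable())
```

In this run the sender may send 4000 bytes, is then throttled to zero when the receiver shrinks its window, and regains 3000 bytes once the receiver reopens it.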
Congestion Control
Two key variables: the congestion window (cwnd) and the slow‑start threshold (ssthresh).
Slow start
cwnd grows by one MSS per ACK, which doubles it roughly every RTT, until it reaches ssthresh.
Congestion avoidance
Measured in MSS, cwnd increases by 1/cwnd per ACK, effectively adding one MSS per RTT.
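The two growth phases can be compared with a per‑RTT simulation. This sketch counts cwnd in whole segments, assumes no loss, and caps slow start at ssthresh for simplicity; the function name and the initial window of one segment are assumptions.

```python
def cwnd_growth(rtts, ssthresh, initial_cwnd=1):
    """Return cwnd (in segments) at the start of each RTT, with no loss."""
    cwnd = initial_cwnd
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)   # slow start: double per RTT
        else:
            cwnd += 1                        # congestion avoidance: +1 per RTT
    return history


print(cwnd_growth(8, ssthresh=16))  # → [1, 2, 4, 8, 16, 17, 18, 19]
```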
Fast retransmit
After three duplicate ACKs, the missing segment is retransmitted immediately.
Selective ACK (SACK)
The receiver informs the sender which blocks have arrived, allowing targeted retransmission.
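Given the cumulative ACK and the SACK blocks, the sender can work out exactly which ranges are missing. The sketch below assumes each block is a half‑open byte range (left edge, right edge) and ignores sequence‑number wraparound; the function name is an assumption.

```python
def missing_ranges(cum_ack, sack_blocks):
    """Return the gaps between cum_ack and the highest SACKed byte."""
    holes = []
    edge = cum_ack
    for left, right in sorted(sack_blocks):
        if left > edge:
            holes.append((edge, left))  # nothing received in [edge, left)
        edge = max(edge, right)
    return holes


# Bytes up to 1000 are acknowledged; 2000-3000 and 4000-5000 arrived early.
print(missing_ranges(1000, [(4000, 5000), (2000, 3000)]))
# → [(1000, 2000), (3000, 4000)]
```

Only the two reported gaps need to be retransmitted, rather than everything after byte 1000.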
Fast recovery
On fast retransmit, set ssthresh = cwnd/2 and cwnd = ssthresh, skipping slow start; cwnd then grows linearly, as in congestion avoidance.
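Fast retransmit and fast recovery can be sketched together as a duplicate‑ACK counter. This follows the simplified rule given above (halve cwnd on the third duplicate ACK) and leaves out the temporary window inflation that Reno‑style implementations perform during recovery. The class name, the floor of 2 segments, and the retransmitted list are assumptions for the example.

```python
class DupAckTracker:
    """Counts duplicate ACKs and reacts on the third one."""

    DUP_THRESHOLD = 3

    def __init__(self, cwnd, ssthresh):
        self.cwnd = cwnd            # in segments
        self.ssthresh = ssthresh
        self.last_ack = 0
        self.dup_acks = 0
        self.retransmitted = []     # sequence numbers resent so far

    def on_ack(self, ack):
        if ack == self.last_ack:
            self.dup_acks += 1
            if self.dup_acks == self.DUP_THRESHOLD:
                # Fast retransmit: resend the segment the receiver is missing.
                self.retransmitted.append(ack)
                # Fast recovery (simplified): halve the window, skip slow start.
                self.ssthresh = max(self.cwnd // 2, 2)
                self.cwnd = self.ssthresh
        elif ack > self.last_ack:
            self.last_ack = ack
            self.dup_acks = 0


sender = DupAckTracker(cwnd=20, ssthresh=64)
for ack in (1000, 1000, 1000, 1000):   # one new ACK, then three duplicates
    sender.on_ack(ack)
print(sender.retransmitted, sender.cwnd, sender.ssthresh)
```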
Nagle Algorithm and Delayed ACK
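Nagle's algorithm batches small segments: the first small segment is sent immediately, and subsequent data are sent only when a full MSS has accumulated or all previously sent data have been ACKed. Delayed ACK postpones an acknowledgment (the delay must stay under 500 ms) so that it can be combined with a later ACK or piggybacked on outgoing data, reducing packet overhead. An ACK is sent immediately when more than one full‑sized segment is awaiting acknowledgment, when the socket is in quick‑ack mode, or when an out‑of‑order segment arrives.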
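The Nagle rule reduces to a short decision function. This is a sketch of the rule as described above, not of any kernel's implementation; the function and parameter names are assumptions.

```python
def nagle_should_send(queued_bytes, unacked_bytes, mss):
    """Decide whether queued data may be sent now under Nagle's rule."""
    if queued_bytes <= 0:
        return False          # nothing to send
    if queued_bytes >= mss:
        return True           # a full segment is ready
    return unacked_bytes == 0  # small data goes out only if nothing is in flight


print(nagle_should_send(10, 0, 1460))      # first small write: send
print(nagle_should_send(10, 100, 1460))    # small write, data in flight: wait
print(nagle_should_send(1460, 100, 1460))  # full segment: send
```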
TCP Keep‑Alive
Keep‑alive probes detect dead connections. Linux defaults are net.ipv4.tcp_keepalive_time = 7200 s, net.ipv4.tcp_keepalive_intvl = 75 s, and net.ipv4.tcp_keepalive_probes = 9. Keep‑alive is off unless the application enables SO_KEEPALIVE on the socket, and many applications rely on their own heartbeats instead because the default idle time (2 hours) is too long for timely detection of dead sockets.
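An application that does want TCP keep‑alive can enable it per socket and shorten the timers. The sketch below uses Python's socket module; the helper names and the values 60/10/5 are assumptions, and the TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT options are applied only where the platform exposes them (they are available on Linux).

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, probes=5):
    """Turn on keep-alive and, where supported, override the timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    overrides = (("TCP_KEEPIDLE", idle),
                 ("TCP_KEEPINTVL", interval),
                 ("TCP_KEEPCNT", probes))
    for name, value in overrides:
        option = getattr(socket, name, None)  # missing on some platforms
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


def detection_time(idle, interval, probes):
    """Worst-case seconds before a silent peer is declared dead."""
    return idle + interval * probes


print(detection_time(7200, 75, 9))  # Linux defaults → 7875

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    enable_keepalive(s)
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
```

With the Linux defaults, a dead peer is detected only after 7200 + 9 × 75 = 7875 seconds, a little over two hours and eleven minutes.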
Source: https://juejin.cn/post/6844904070889603085
Liangxu Linux
