Understanding the TCP Communication Protocol: Features, Handshakes, and Optimizations
This article explains TCP as a connection‑oriented, reliable byte‑stream transport protocol, detailing its header fields, state machine, three‑way handshake, four‑way termination, TIME_WAIT handling, Linux inspection commands, optimization techniques, and a comparison with UDP, all illustrated with concrete examples and diagrams.
TCP Overview
TCP is a connection‑oriented, reliable, byte‑stream transport‑layer protocol. It guarantees ordered delivery, automatic retransmission of lost packets, and discards duplicate segments.
Connection‑oriented: communication occurs between exactly two endpoints.
Reliability: TCP retransmits unacknowledged segments, so data either reaches the receiver or the connection is reported as failed.
Byte‑stream: Data has no inherent boundaries; the stream is ordered and duplicates are removed.
TCP Header Fields
Sequence number: a random initial value generated during the SYN exchange; it increments by the number of data bytes transmitted and is used to resolve out‑of‑order delivery.
Acknowledgment number: the next sequence number the receiver expects; it confirms receipt of all prior bytes, which lets the sender detect loss and retransmit.
Control bits: ACK (validates acknowledgment field), RST (forces connection reset), SYN (initiates connection and sets initial sequence number), FIN (indicates no more data will be sent).
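As a rough illustration of the fields above, the following Python sketch packs and unpacks the fixed 20 byte TCP header with the standard struct module. The helper names build_header and parse_header and the sample port and sequence values are assumptions made for this example; options and checksum calculation are left out.

```python
import struct

# Fixed 20-byte TCP header layout (RFC 793), network byte order:
# src port, dst port, seq, ack, data offset + flags, window, checksum, urgent pointer
TCP_HEADER = struct.Struct("!HHIIHHHH")

FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def build_header(src_port, dst_port, seq, ack, flags, window=65535):
    """Pack a header without options (data offset = 5 words); checksum left at 0."""
    flag_value = 0
    for name in flags:
        flag_value |= FLAG_BITS[name]
    offset_and_flags = (5 << 12) | flag_value
    return TCP_HEADER.pack(src_port, dst_port, seq, ack, offset_and_flags, window, 0, 0)


def parse_header(raw):
    """Unpack the first 20 bytes of a TCP segment into a dict."""
    src, dst, seq, ack, off_flags, window, _checksum, _urgent = TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src,
        "dst_port": dst,
        "seq": seq,
        "ack": ack,
        "header_len": (off_flags >> 12) * 4,  # data offset is counted in 32-bit words
        "flags": sorted(n for n, bit in FLAG_BITS.items() if off_flags & bit),
        "window": window,
    }


# Hypothetical SYN+ACK answering a SYN that carried seq = 500.
syn_ack = parse_header(build_header(80, 50000, seq=1000, ack=501, flags=["SYN", "ACK"]))
print(syn_ack["flags"], syn_ack["header_len"])
```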
Network Models
The article contrasts the OSI seven‑layer model with the TCP/IP five‑layer model, showing their correspondence with a diagram.
TCP Connection States
CLOSED – initial state.
LISTEN – server socket waiting for connections.
SYN_RCVD – SYN received.
SYN_SENT – client has sent SYN.
ESTABLISHED – connection established.
TIME_WAIT – waiting for 2 MSL after FIN/ACK exchange.
CLOSING – simultaneous FIN exchange.
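CLOSE_WAIT – the peer's FIN has been received; waiting for the local application to close.
FIN_WAIT_1 – FIN sent by the side that closes first; waiting for its ACK.
FIN_WAIT_2 – FIN acknowledged; waiting for the peer's FIN.
LAST_ACK – the passive side has sent its FIN and is waiting for the final ACK.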
Viewing TCP State on Linux
Use netstat -napt to list current TCP connections and their states.
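On Linux the same per-socket state is also exposed in /proc/net/tcp, where the st column holds a hexadecimal state code. The sketch below counts sockets per state from lines in that layout. The function name count_states and the sample lines are made up for illustration; on a real host the lines would come from reading /proc/net/tcp.

```python
from collections import Counter

# State codes as used in the "st" column of /proc/net/tcp on Linux.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_states(lines):
    """Count sockets per TCP state from /proc/net/tcp-style lines (header skipped)."""
    counts = Counter()
    for line in lines[1:]:
        fields = line.split()
        if len(fields) > 3:
            counts[TCP_STATES.get(fields[3].upper(), "UNKNOWN")] += 1
    return counts


# Made-up sample in the /proc/net/tcp layout; on a Linux host you could pass
# open("/proc/net/tcp").read().splitlines() instead.
sample = [
    "  sl  local_address rem_address   st tx_queue rx_queue",
    "   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000",
    "   1: 0100007F:1F90 0100007F:C350 01 00000000:00000000",
    "   2: 0100007F:C351 0100007F:1F90 06 00000000:00000000",
]
print(dict(count_states(sample)))
```

A large TIME_WAIT or CLOSE_WAIT count in such a summary is usually the first hint that connections are being closed faster than they expire, or that an application is not closing its sockets.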
TIME_WAIT Explanation
TIME_WAIT occurs only on the side that actively closes the connection. It prevents old packets with the same four‑tuple from being accepted and ensures the passive side receives the final ACK.
Problems caused by excessive TIME_WAIT:
Memory consumption.
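Port exhaustion on the side that opens outgoing connections (each such connection consumes a local port; the default range, net.ipv4.ip_local_port_range, is 32768 to 60999 on current kernels and was 32768 to 61000 on older ones).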
Linux defines the TIME_WAIT duration as 2 MSL (default 60 seconds, with MSL = 30 seconds). The kernel constant is
#define TCP_TIMEWAIT_LEN (60*HZ) /* about 60 seconds */
Changing the duration requires recompiling the kernel.
Optimization options (each with trade‑offs):
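Enable net.ipv4.tcp_tw_reuse together with net.ipv4.tcp_timestamps; this only helps the side that initiates outgoing connections.
Adjust net.ipv4.tcp_max_tw_buckets; once the limit is exceeded, the kernel destroys additional TIME_WAIT sockets immediately.
Use SO_LINGER with l_onoff = 1 and l_linger = 0 so that close sends RST instead of FIN and skips TIME_WAIT; any unsent data is discarded.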
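The SO_LINGER option can be set from Python as sketched below. The helper name enable_abortive_close is an assumption for this example, and the two-int layout of struct linger is the one used on Linux; other platforms may differ.

```python
import socket
import struct


def enable_abortive_close(sock):
    """Set SO_LINGER with l_onoff=1, l_linger=0.

    On a connected TCP socket, close() then sends RST instead of FIN.
    """
    # struct linger is two C ints on Linux: { l_onoff, l_linger }.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_abortive_close(sock)
# Read the option back to confirm it was applied.
onoff, linger = struct.unpack("ii", sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8))
print(onoff, linger)
sock.close()
```

Because an abortive close drops queued data and the peer sees a connection reset, this is generally a last resort rather than a routine tuning step.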
Three‑Way Handshake
1. Client sends SYN (seq = x) → SYN_SENT.
2. Server replies with SYN+ACK (seq = y, ack = x+1) → SYN_RCVD.
3. Client sends ACK (ack = y+1) → ESTABLISHED on both sides. This third segment may carry data.
Why three steps? They prevent stale connections, synchronize both initial sequence numbers, and avoid unnecessary resource usage. A two‑step scheme cannot detect an old duplicate SYN; a four‑step scheme adds an unnecessary round, because the server's SYN and ACK can travel in one segment.
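The sequence and acknowledgment arithmetic of the three steps can be sketched as follows. This is a toy model that only computes the numbers each segment would carry; the function name three_way_handshake and the tuple layout are assumptions for this example, and no packets are sent.

```python
import random


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three segments of a handshake as (sender, flags, seq, ack) tuples."""
    x = random.getrandbits(32) if client_isn is None else client_isn
    y = random.getrandbits(32) if server_isn is None else server_isn
    mod = 2 ** 32  # sequence numbers wrap at 32 bits

    syn = ("client", "SYN", x, None)                       # client -> SYN_SENT
    syn_ack = ("server", "SYN+ACK", y, (x + 1) % mod)      # server -> SYN_RCVD
    ack = ("client", "ACK", (x + 1) % mod, (y + 1) % mod)  # both -> ESTABLISHED
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

Each SYN consumes one sequence number, which is why the acknowledgments are x+1 and y+1 even though no data bytes have been sent yet.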
Four‑Way Termination
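1. Client sends FIN → FIN_WAIT_1.
2. Server replies with ACK → CLOSE_WAIT; the client enters FIN_WAIT_2 when that ACK arrives.
3. Server sends FIN once it has finished sending → LAST_ACK.
4. Client ACKs the server's FIN → TIME_WAIT. The server moves to CLOSED when this ACK arrives; the client moves to CLOSED after 2 MSL.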
The four‑step process ensures that each side can finish sending pending data before fully closing.
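The close sequence can also be viewed as two small state machines, one for the side that closes first and one for its peer. The sketch below encodes the normal (non-simultaneous) transitions from RFC 793 in a lookup table; the event labels and the function name run are assumptions chosen for this example.

```python
# (current state, event) -> next state, for a normal active/passive close.
CLOSE_TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = CLOSE_TRANSITIONS[(state, event)]
        path.append(state)
    return path


active = run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"])
passive = run(["recv FIN", "send FIN", "recv ACK"])
print(" -> ".join(active))
print(" -> ".join(passive))
```

Only the active closer passes through TIME_WAIT; the passive side reaches CLOSED as soon as its FIN is acknowledged.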
TCP vs UDP
TCP: connection‑oriented, reliable, flow‑ and congestion‑controlled, larger header (20 bytes + options).
UDP: connectionless, best‑effort delivery, small fixed header (8 bytes), suitable for low‑latency scenarios.
Initial Sequence Number (ISN)
ISN is generated per connection to avoid confusion with delayed packets. It is based on a timer (incremented every 4 microseconds) plus a hash of source/destination IPs and ports combined with a secret key (RFC 1948, since superseded by RFC 6528, suggests MD5 for the hash).
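A minimal sketch of the "timer plus hash" scheme, ISN = M + F(four-tuple, secret), is shown below. It uses MD5 only because the text mentions it; the function name generate_isn, the way the four-tuple is serialized, and the example secret are all assumptions, and a real kernel implementation differs in detail.

```python
import hashlib
import struct
import time


def generate_isn(src_ip, src_port, dst_ip, dst_port, secret, now_us=None):
    """ISN = M + F(four-tuple, secret), truncated to 32 bits."""
    if now_us is None:
        now_us = time.monotonic_ns() // 1000  # current time in microseconds
    m = now_us // 4  # M: counter that ticks once every 4 microseconds
    material = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode() + secret
    # F: first 32 bits of a keyed hash over the connection identifiers
    f = struct.unpack("!I", hashlib.md5(material).digest()[:4])[0]
    return (m + f) % 2 ** 32


secret = b"example-secret"  # hypothetical per-boot secret
a = generate_isn("10.0.0.1", 40000, "10.0.0.2", 80, secret, now_us=1_000_000)
b = generate_isn("10.0.0.1", 40000, "10.0.0.2", 80, secret, now_us=1_000_004)
print((b - a) % 2 ** 32)
```

For a fixed four-tuple the value still advances with the clock, while the hash term gives different four-tuples unrelated offsets that an outside observer cannot predict without the secret.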
Reliability Mechanisms
Checksum: 16‑bit one's complement of the one's complement sum of the pseudo‑header, TCP header, and data, taken as 16‑bit words.
Sequence numbers: order and duplicate detection.
Acknowledgment (ACK): informs sender of received bytes.
Timeout retransmission: resend if ACK not received.
Flow control: window size advertised in the TCP header.
Congestion control: slow start, congestion avoidance, fast retransmit, fast recovery.
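The checksum item in the list above can be made concrete with the Internet checksum algorithm described in RFC 1071. The sketch below computes it over an arbitrary byte string; for a real TCP segment the input would also include the pseudo-header, which is omitted here. The function name internet_checksum is an assumption for this example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")  # example words from RFC 1071
print(hex(internet_checksum(sample)))  # prints 0x220d
```

A receiver runs the same sum over the data with the checksum field included; a result of zero from this function means the segment passed the check.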
Efficiency Improvements
Sliding window: send multiple segments without waiting for each ACK.
Fast retransmit: duplicate ACKs trigger immediate resend.
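Delayed ACK: hold back an ACK briefly (up to about 200 ms in many implementations) so that it can be piggybacked on outgoing data or cover several segments, reducing the number of pure ACK packets.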
Piggyback ACK: combine ACK with data payload (e.g., during the third handshake).
Congestion Control Phases
Slow start.
Congestion avoidance.
Fast retransmit.
Fast recovery.
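How the congestion window moves through these phases can be sketched with a toy simulation. The model below is a simplified Reno-style approximation in whole segments per round trip: every loss is treated as detected by duplicate ACKs, so the window is halved rather than reset to 1. The function name simulate_cwnd and its parameters are assumptions for this example.

```python
def simulate_cwnd(rounds, ssthresh=8, loss_rounds=()):
    """Return cwnd (in segments) at the start of each round trip.

    Simplified Reno-style model: loss is treated as three duplicate ACKs,
    so fast retransmit/fast recovery halves cwnd instead of resetting to 1.
    """
    cwnd = 1
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            ssthresh = max(cwnd // 2, 2)    # multiplicative decrease
            cwnd = ssthresh                 # fast recovery: resume from ssthresh
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: double per RTT
        else:
            cwnd += 1                       # congestion avoidance: +1 per RTT
    return history


print(simulate_cwnd(10, ssthresh=8, loss_rounds={6}))
```

With these inputs the window doubles up to the threshold of 8, grows by one segment per round trip until the loss in round 6, and then resumes linear growth from half its previous size.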
Socket Programming Overview
Create socket → obtain file descriptor.
Server: bind → listen (backlog) → accept.
Client: connect.
Data transfer: either side may call write and read; in the simplest case the client writes and the server reads.
Close: client close → server reads EOF → server close.
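The call sequence above can be sketched with Python's socket module, which wraps the same system calls. The example runs a one-shot echo server and a client in the same process over the loopback interface; the function names serve_once and echo_roundtrip, the backlog of 1, and the 1024 byte read size are assumptions for illustration.

```python
import socket
import threading


def serve_once(listener):
    """Accept one client, echo everything until EOF, then close."""
    conn, _addr = listener.accept()        # returns once the handshake completes
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:                  # b"" means the peer sent FIN (EOF)
                break
            conn.sendall(chunk)


def echo_roundtrip(payload):
    """Run a one-shot echo server on loopback and send payload through it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    listener.listen(1)                     # backlog of 1
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    received = b""
    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)    # send FIN but keep reading
        while True:
            chunk = client.recv(1024)
            if not chunk:                  # server closed its side
                break
            received += chunk
    server.join()
    listener.close()
    return received


print(echo_roundtrip(b"hello tcp"))
```

The client uses shutdown(SHUT_WR) rather than close so that it can still read the echoed bytes after signalling EOF, which mirrors the half-close behaviour of the four-step termination.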
Additional Q&A
In Linux, listen(socketfd, backlog) historically used backlog for the SYN queue size; since kernel 2.2 it defines the accept queue length, limited by somaxconn. The client’s connect returns after the second handshake, while the server’s accept returns after the third handshake.
When a client calls close, it sends FIN, enters FIN_WAIT_1, and the server eventually receives EOF, processes remaining data, and closes, completing the four‑step termination described above.
Linux Tech Enthusiast
Focused on sharing practical Linux technology content, covering Linux fundamentals, applications, tools, as well as databases, operating systems, network security, and other technical knowledge.
