
Mastering TCP: Header Structure, Handshake, Flow & Congestion Control Explained

This comprehensive guide delves into TCP’s core mechanisms—including header fields, segmentation and reassembly, the three‑way handshake and four‑step termination, sliding‑window flow and congestion control, retransmission strategies, and connection design patterns—providing a solid foundation for network engineers and developers.

AI Cyberspace

TCP Transmission Control Protocol

TCP (Transmission Control Protocol) is a connection-oriented, reliable transport protocol. It delivers an in-order byte stream without loss or duplication and detects corrupted segments with a checksum. It favors reliability over latency: connection setup, acknowledgments, and retransmissions add overhead and can reduce throughput compared with an unreliable datagram service.

TCP Header Format

Source Port (2 Byte) : Sender port number.

Destination Port (2 Byte) : Receiver port number.

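Sequence Number (4 Byte) : Sequence number of the first data byte in this segment. When SYN is set, the field carries the initial sequence number (ISN), and the first data byte is ISN + 1.

Acknowledgment Number (4 Byte) : Next sequence number the sender of this segment expects to receive; valid only when the ACK flag is set.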

Offset (4 bit) : Header length in 4‑byte units.

Reserved (4 bit) : Reserved for future use; must be zero.

TCP flags (1 Byte) : Control bits.

C (ECN CWR) : Congestion Window Reduced.

E (ECN‑Echo) : Echoes congestion indication.

U (URGENT) : Urgent pointer valid.

A (ACK) : Acknowledgment field valid.

P (PUSH) : Prompt delivery to application.

R (RESET) : Reset the connection.

S (SYN) : Synchronize sequence numbers to establish a connection.

F (FIN) : Finish – request connection termination.

Window Size (2 Byte) : Receiver’s advertised window.

CheckSum (2 Byte) : Error detection over the header, the payload, and a pseudo-header built from the IP addresses, the protocol number, and the TCP length.

Urgent Pointer (2 Byte) : Valid when U flag is set, points to urgent data.

Options : Variable‑length extensions.
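
The field layout above can be sketched as a small parser. This is a minimal illustration, not a production decoder: the names `TcpHeader`, `FLAG_NAMES`, and `parse_tcp_header` are hypothetical, and the code assumes the input starts at the first byte of the TCP header and does not verify the checksum.

```python
import struct
from typing import Dict, NamedTuple

# Flag names from the most significant bit (0x80) to the least (0x01).
FLAG_NAMES = ["CWR", "ECE", "URG", "ACK", "PSH", "RST", "SYN", "FIN"]


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    data_offset: int        # header length in 4-byte units
    flags: Dict[str, bool]
    window: int
    checksum: int
    urgent_ptr: int
    options: bytes


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the fixed 20-byte header plus any options (hypothetical helper)."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 bytes")
    # Network byte order: 2+2+4+4+1+1+2+2+2 = 20 bytes.
    src, dst, seq, ack, off_res, flag_byte, win, csum, urg = struct.unpack(
        "!HHIIBBHHH", segment[:20]
    )
    data_offset = off_res >> 4          # upper 4 bits; lower 4 are reserved
    header_len = data_offset * 4
    if header_len < 20 or header_len > len(segment):
        raise ValueError("invalid data offset")
    flags = {name: bool(flag_byte & (0x80 >> i)) for i, name in enumerate(FLAG_NAMES)}
    return TcpHeader(src, dst, seq, ack, data_offset, flags, win, csum, urg,
                     segment[20:header_len])
```

Anything after `data_offset * 4` bytes is payload, which is why the Offset field is needed to find where the options end.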

Segmentation and Reassembly

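TCP uses a byte-stream model: the application writes bytes and TCP decides where to cut them into segments. This is segmentation at the transport layer, not IP fragmentation, and segments are not fixed-size. Each segment carries at most the Maximum Segment Size (MSS) of payload. Each side announces its MSS in an option on its SYN during the three-way handshake, usually derived from the interface MTU minus the IP and TCP header sizes (for example, 1500 − 20 − 20 = 1460 bytes on Ethernet with IPv4).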

Example: with MSS = 1460 bytes, a 2000‑byte stream is sent as two IP packets: one carrying 1460 bytes of payload and another carrying 540 bytes.
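
The same arithmetic can be sketched in a few lines. The function name `segment_stream` and the `isn` parameter are assumptions for illustration; a real stack also considers the send window and congestion window when deciding how much to send.

```python
def segment_stream(data: bytes, mss: int, isn: int = 0):
    """Split a byte stream into (sequence_number, payload) pairs of at most mss bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    first_seq = isn + 1  # the SYN consumes one sequence number
    return [
        ((first_seq + offset) % 2**32, data[offset:offset + mss])
        for offset in range(0, len(data), mss)
    ]


segments = segment_stream(bytes(2000), mss=1460)
print([(seq, len(payload)) for seq, payload in segments])
# prints [(1, 1460), (1461, 540)]
```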

Sticky Packets and Unpacking

TCP does not preserve application message boundaries. A message larger than the MSS is split across several segments, and several small messages may be coalesced into one segment (often called the “sticky packet” problem), so a single read may return part of a message or several messages at once. The application protocol must mark message boundaries itself. Common approaches:

Define message delimiters (e.g., EOF, newline).

Use fixed‑length messages with padding.

Prefix each message with a header that carries its length (and, if needed, its type).
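
A minimal sketch of the length-based approach, assuming a 4-byte big-endian length prefix. The names `encode_frame` and `FrameDecoder` are hypothetical, and a real protocol would also cap the maximum accepted length.

```python
import struct
from typing import List


def encode_frame(payload: bytes) -> bytes:
    """Prefix the payload with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(payload)) + payload


class FrameDecoder:
    """Reassembles whole messages from arbitrary chunks read off a socket."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes and return every message that is now complete."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= 4:
            (length,) = struct.unpack_from("!I", self._buf, 0)
            if len(self._buf) < 4 + length:
                break  # body not fully received yet; wait for more bytes
            messages.append(bytes(self._buf[4:4 + length]))
            del self._buf[:4 + length]
        return messages
```

The decoder returns the same messages regardless of how the bytes were split or merged in transit, which is the property the application needs.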

Three‑Way Handshake and Four‑Step Termination

NOTE: Understanding the handshake requires familiarity with the TCP header.

1. Three‑Way Handshake

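Client sends SYN with initial sequence number Seq = x and enters SYN_SENT.

Server replies with SYN and ACK set, Seq = y (its own initial sequence number), ACK = x+1, and enters SYN_RCVD.

Client replies with ACK set, Seq = x+1, ACK = y+1, and enters ESTABLISHED; the server enters ESTABLISHED when this ACK arrives.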

Why three steps?

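The handshake confirms that segments can travel in both directions and synchronizes the initial sequence numbers (ISN) of both sides: each side must send its ISN and have it acknowledged by the other. Because the server sends its acknowledgment and its own SYN in a single segment, three segments are enough. This prevents sequence ambiguity and makes ordered delivery possible.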
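
The sequence and acknowledgment numbers exchanged during the handshake can be traced with a short sketch. The function `three_way_handshake` and the dictionary layout are illustrative assumptions; real ISNs are chosen by the stack, not passed in.

```python
SEQ_MOD = 2**32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake segments as dictionaries."""
    syn = {"flags": "SYN", "seq": client_isn, "ack": None}
    syn_ack = {
        "flags": "SYN,ACK",
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_MOD,  # the SYN consumed one sequence number
    }
    ack = {
        "flags": "ACK",
        "seq": (client_isn + 1) % SEQ_MOD,
        "ack": (server_isn + 1) % SEQ_MOD,
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```

With `client_isn=100` and `server_isn=300`, the server acknowledges 101 and the client acknowledges 301.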

2. Data Transfer

Client sends n bytes of data with Seq = x+1, ACK = y+1.

Server acknowledges with ACK = x+1+n, the sequence number of the next byte it expects.

3. Four‑Step Termination

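Client sends FIN and enters FIN_WAIT_1.

Server acknowledges the FIN and enters CLOSE_WAIT; the client enters FIN_WAIT_2 when this ACK arrives. The server may still send data in this state.

Server sends its own FIN once it has finished sending and enters LAST_ACK.

Client acknowledges and enters TIME_WAIT, where it waits 2·MSL (twice the maximum segment lifetime) before moving to CLOSED; the server moves to CLOSED as soon as it receives the ACK.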

Why four steps?

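TCP is full-duplex, so each direction must be closed independently to ensure that all data is transmitted and acknowledged. The server's ACK and FIN are usually separate segments because the server may still have data to send after acknowledging the client's FIN.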
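
The close sequence can be sketched as a transition table covering the standard states, including FIN_WAIT_2 and TIME_WAIT. The event labels and the `run` helper are hypothetical names, and simultaneous close is left out to keep the table small.

```python
# (current state, event) -> next state
TRANSITIONS = {
    # Active closer (the side that sends the first FIN)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",   # the closer also sends the final ACK here
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    # Passive closer (the side that receives the first FIN)
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]  # KeyError means an invalid transition
        path.append(state)
    return path


print(run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"]))
print(run(["recv FIN", "send FIN", "recv ACK"]))
```

The first call traces the side that closes first; the second traces its peer.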

TCP Connection State Machines

Client State Machine

Server State Machine

Connection Design Patterns

TCP itself does not define “short” or “long” connections; those are application‑level patterns. Short connections perform a single request/response before closing, while long connections keep the socket open for multiple exchanges, reducing handshake overhead and latency.

Short Connection

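Advantages: simple management; every open connection is doing useful work, so no idle connections need to be tracked. Disadvantages: each request pays the cost of connection setup and teardown, which adds latency and load when requests are frequent.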

Long Connection

Advantages: lower latency, reduced bandwidth usage, fewer system resources per request. Disadvantage: requires the server to track idle connections and may need limits on concurrent connections.

ACK Confirmation Mechanism

Seq & ACK Calculation

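The acknowledgment number advances by the number of bytes received, not by the number of segments. A SYN or FIN also consumes one sequence number even though it carries no data.

ACK = Seq + payload_bytes (plus 1 if the segment carries SYN or FIN)
next Seq = ACK received from the peer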
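
As a worked example, the acknowledgment number a receiver returns for a segment is the segment's sequence number plus the sequence space it consumes: one per payload byte, plus one each for SYN and FIN. The helper name `expected_ack` is an assumption for illustration.

```python
def expected_ack(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """ACK number that acknowledges everything in the given segment."""
    consumed = payload_len + int(syn) + int(fin)
    return (seq + consumed) % 2**32  # sequence numbers wrap at 2**32


print(expected_ack(1000, 0, syn=True))   # SYN with ISN 1000 -> 1001
print(expected_ack(1001, 100))           # 100 data bytes    -> 1101
print(expected_ack(1101, 0, fin=True))   # FIN               -> 1102
```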

Retransmission Mechanisms

Timeout Retransmission

The sender starts a timer when it transmits a segment; if no ACK arrives before the timer expires, the segment is resent. This happens when either the segment or its ACK is lost.

RTT (Round‑Trip Time)

Time for a packet to travel to the receiver and back.

RTO (Retransmission Timeout)

The timeout should be somewhat larger than the RTT: too large a value slows loss recovery, and too small a value causes unnecessary retransmissions.

Dynamic RTT and RTO

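TCP samples the RTT, maintains a smoothed RTT (SRTT) and an RTT deviation estimate (DevRTT, called RTTVAR in RFC 6298), and computes RTO = SRTT + 4·DevRTT. For each new sample R, RFC 6298 updates DevRTT = (1 − β)·DevRTT + β·|SRTT − R| and then SRTT = (1 − α)·SRTT + α·R, with recommended values α = 1/8 (0.125) and β = 1/4 (0.25), which Linux also uses.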
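
A minimal sketch of the RFC 6298 estimator follows. The class name `RtoEstimator` and its parameters are assumptions; the 1-second floor is the RFC's recommendation, and implementations may use a different minimum, so it is left configurable here.

```python
class RtoEstimator:
    ALPHA = 1 / 8   # gain for SRTT
    BETA = 1 / 4    # gain for the deviation estimate
    K = 4           # deviation multiplier

    def __init__(self, min_rto: float = 1.0, max_rto: float = 60.0,
                 clock_granularity: float = 0.0) -> None:
        self.srtt = None      # smoothed RTT, in seconds
        self.rttvar = None    # deviation estimate (DevRTT), in seconds
        self.rto = 1.0        # initial RTO before any sample
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.g = clock_granularity

    def sample(self, rtt: float) -> float:
        """Feed one RTT measurement in seconds and return the new RTO."""
        if self.srtt is None:
            # First measurement initializes both estimates.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The deviation is updated first, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        rto = self.srtt + max(self.g, self.K * self.rttvar)
        self.rto = min(self.max_rto, max(self.min_rto, rto))
        return self.rto
```

With `min_rto=0.0`, a first sample of 100 ms gives an RTO of about 300 ms (0.1 + 4 × 0.05); a second identical sample shrinks the deviation and brings the RTO down to about 250 ms.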

Fast Retransmit

When three duplicate ACKs are received, the sender infers a lost segment and retransmits it immediately, without waiting for timeout.
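
The duplicate-ACK rule can be sketched as a counter. `DupAckDetector` is a hypothetical name, and the sketch is simplified: a real stack only counts an ACK as a duplicate if it carries no data, does not change the advertised window, and there is outstanding unacknowledged data.

```python
class DupAckDetector:
    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.last_ack = None
        self.dup_count = 0

    def on_ack(self, ack: int):
        """Return the sequence number to retransmit, or None."""
        if ack == self.last_ack:
            self.dup_count += 1
            if self.dup_count == self.threshold:
                return ack  # the receiver is still waiting for this byte
        else:
            # A new cumulative ACK resets the counter.
            self.last_ack = ack
            self.dup_count = 0
        return None


detector = DupAckDetector()
print([detector.on_ack(a) for a in [1000, 2000, 2000, 2000, 2000]])
# prints [None, None, None, None, 2000]
```

The first ACK for 2000 is the original; the retransmission is triggered by the third repeat after it.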

SACK (Selective Acknowledgment)

SACK adds option fields that allow the receiver to inform the sender exactly which blocks of data have been received, enabling selective retransmission.
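
To illustrate how a sender might use this information, the sketch below derives the holes between the cumulative ACK and the reported SACK blocks. The function `missing_ranges` is hypothetical, blocks are half-open `(left, right)` pairs, and sequence-number wraparound is ignored for simplicity.

```python
def missing_ranges(cum_ack: int, sack_blocks):
    """Return [start, end) byte ranges the receiver has not reported as received."""
    holes = []
    cursor = cum_ack  # everything below the cumulative ACK is already received
    for left, right in sorted(sack_blocks):
        if left > cursor:
            holes.append((cursor, left))
        cursor = max(cursor, right)
    return holes


print(missing_ranges(1000, [(1500, 2000), (2500, 3000)]))
# prints [(1000, 1500), (2000, 2500)]
```

Only the two holes need retransmission; the data in the SACKed blocks does not need to be sent again.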

D‑SACK (Duplicate SACK)

D‑SACK reports duplicate data received, helping the sender distinguish between loss and delay.

ARQ Protocols

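TCP's reliability is a form of Automatic Repeat Request (ARQ). ARQ schemes are either stop-and-wait (one unacknowledged segment at a time) or continuous (many segments in flight). TCP uses continuous ARQ over a sliding window: with cumulative ACKs, timeout, and fast retransmit alone it resembles Go-Back-N, and with SACK it behaves more like Selective Repeat.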

TCP Sliding Window

The sliding window controls the sender’s rate and enables batch transmission without waiting for each ACK.

Sender Window

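The sender's sequence space is divided into four regions: bytes sent and acknowledged (which can be released from the send buffer), bytes sent but not yet acknowledged, bytes not yet sent that still fit within the window (the usable window), and bytes beyond the window that must wait. The send window covers the second and third regions.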
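
The sender-side bookkeeping can be sketched with three variables. The names `snd_una`, `snd_nxt`, and `snd_wnd` follow the TCP specification's conventions; the `SendWindow` class itself is an illustrative assumption that ignores wraparound and the congestion window.

```python
class SendWindow:
    def __init__(self, isn: int, window: int) -> None:
        self.snd_una = isn     # oldest unacknowledged byte
        self.snd_nxt = isn     # next byte to be sent
        self.snd_wnd = window  # window advertised by the receiver

    def usable(self) -> int:
        """Bytes that may still be sent without waiting for an ACK."""
        return max(0, self.snd_una + self.snd_wnd - self.snd_nxt)

    def send(self, nbytes: int) -> int:
        """Send up to nbytes, limited by the usable window; return bytes sent."""
        sent = min(nbytes, self.usable())
        self.snd_nxt += sent
        return sent

    def on_ack(self, ack: int, window: int) -> None:
        """Slide the left edge forward and record the new advertised window."""
        if self.snd_una < ack <= self.snd_nxt:
            self.snd_una = ack
        self.snd_wnd = window


w = SendWindow(isn=0, window=3000)
print(w.send(2000), w.usable())   # prints 2000 1000
print(w.send(2000), w.usable())   # prints 1000 0
w.on_ack(1460, window=3000)
print(w.usable())                 # prints 1460
```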

Receiver Window

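The receiver's sequence space is divided into bytes received and acknowledged, the receive window (free buffer space that it advertises to the sender), and bytes beyond the window that it cannot yet accept.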

Window Probe

When the receiver advertises a zero window, the sender periodically sends probe packets to discover when the window reopens.

TCP Flow Control

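Flow control builds on the sliding window: the sender limits the amount of unacknowledged data to the receiver's advertised window, which prevents the receiver's buffer from overflowing. It protects the receiver, not the network; network congestion is handled separately by congestion control.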

Increase flow: enlarge sender window.

Decrease flow: shrink sender window.

TCP Congestion Control

When network congestion is detected, TCP reduces its sending rate using algorithms such as slow start, congestion avoidance, fast retransmit, and fast recovery.

1. Slow Start

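cwnd starts at a small initial window (one MSS in the classic algorithm; modern stacks commonly start larger) and grows by one MSS per ACK, which roughly doubles it each RTT, until it reaches the slow-start threshold (ssthresh).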

2. Congestion Avoidance

cwnd grows linearly: cwnd += 1/cwnd per ACK (with cwnd measured in MSS), which adds about one MSS per RTT.

3. Fast Retransmit

Triggered by three duplicate ACKs: the sender retransmits the missing segment, sets ssthresh to half the current cwnd, and enters fast recovery instead of falling back to slow start.

4. Fast Recovery

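After fast retransmit, cwnd = ssthresh + 3 MSS (one for each duplicate ACK already received) and grows by one MSS for each further duplicate ACK. When an ACK for new data arrives, cwnd is set to ssthresh and the sender returns to congestion avoidance.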
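
The four phases can be tied together in a small Reno-style sketch, with cwnd and ssthresh measured in MSS units. The class `RenoCwnd` and its method names are assumptions for illustration; production stacks use byte counting, different initial windows, and often other algorithms entirely.

```python
class RenoCwnd:
    def __init__(self, ssthresh: float = 64.0) -> None:
        self.cwnd = 1.0            # classic initial window of one MSS
        self.ssthresh = ssthresh
        self.in_fast_recovery = False

    def on_new_ack(self) -> None:
        """An ACK that acknowledges new data."""
        if self.in_fast_recovery:
            self.cwnd = self.ssthresh      # deflate the window, leave fast recovery
            self.in_fast_recovery = False
        elif self.cwnd < self.ssthresh:
            self.cwnd += 1.0               # slow start: +1 MSS per ACK
        else:
            self.cwnd += 1.0 / self.cwnd   # congestion avoidance: ~+1 MSS per RTT

    def on_triple_dup_ack(self) -> None:
        """Third duplicate ACK: fast retransmit, then fast recovery."""
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = self.ssthresh + 3
        self.in_fast_recovery = True

    def on_extra_dup_ack(self) -> None:
        """Each further duplicate ACK during fast recovery inflates cwnd."""
        if self.in_fast_recovery:
            self.cwnd += 1.0

    def on_timeout(self) -> None:
        """Retransmission timeout: fall back to slow start."""
        self.ssthresh = max(self.cwnd / 2, 2.0)
        self.cwnd = 1.0
        self.in_fast_recovery = False
```

Starting with `ssthresh=8`, seven new ACKs take cwnd from 1 to 8 in slow start, and the next ACK adds only 1/8 MSS because the sender is now in congestion avoidance.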
