Understanding URL Input, HTTP Protocol, and TCP Connection: Handshake, Teardown, and TIME_WAIT
This article explains the complete process from entering a URL to rendering a web page, covering URL parsing, DNS resolution, TCP three‑way handshake and four‑way teardown, HTTP request/response structure, and browser rendering, while also clarifying TIME_WAIT and related socket states.
URL Parsing
The URL follows the pattern scheme://host.domain:port/path/filename, where scheme denotes the application‑layer protocol (http, https, ftp, etc.), host.domain the server's host and domain name, port the service port (when omitted it defaults to 80 for http and 443 for https), path the resource location on the server, and filename the actual file name.
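As a minimal sketch of the breakdown above, Python's standard urllib.parse can split a URL into these parts. The helper name split_url, the DEFAULT_PORTS table, and the example URL are assumptions made for illustration, not something the article defines.

```python
from urllib.parse import urlsplit

# Default ports for the schemes named in the text (ftp's 21 is added for completeness).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def split_url(url):
    """Break a URL into scheme, host, port, path, and filename."""
    parts = urlsplit(url)
    # parts.port is None when the URL omits the port, so fall back to the scheme default.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    # Everything after the last "/" is treated as the filename.
    directory, _, filename = parts.path.rpartition("/")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "path": directory + "/",
        "filename": filename,
    }

print(split_url("https://www.example.com/docs/index.html"))
```

For the example URL the port comes out as 443 because none is given explicitly; a URL such as http://example.com:8080/a.txt would keep its explicit 8080.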
DNS Query
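A domain name has to be resolved to an IP address before the browser can connect. The lookup typically checks the browser cache first, then the operating‑system cache and the local hosts file, then the router cache, and then the ISP's DNS resolver. If none of them holds the record, the resolver queries the root, top‑level‑domain, and authoritative name servers in turn until it obtains the IP address. The client's query to the resolver is recursive, while the resolver's own queries to those servers are iterative.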
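The layered lookup can be sketched as a chain of caches that falls through to an upstream resolver. Everything here is a stand-in: the cache dictionaries, the fake_resolver function, and the 192.0.2.10 address (a documentation-range placeholder) are hypothetical, and no real DNS traffic is involved.

```python
def resolve(name, caches, upstream):
    """Return (ip, source), checking each cache in order before asking upstream."""
    for label, cache in caches:
        if name in cache:
            return cache[name], label
    ip = upstream(name)
    # Populate every cache so that the next lookup is answered locally.
    for _, cache in caches:
        cache[name] = ip
    return ip, "upstream"

# Hypothetical cache layers, nearest first.
browser_cache, os_cache, router_cache = {}, {}, {}
caches = [("browser", browser_cache), ("os", os_cache), ("router", router_cache)]

def fake_resolver(name):
    # Stands in for the ISP resolver and the name servers behind it.
    return {"www.example.com": "192.0.2.10"}[name]

print(resolve("www.example.com", caches, fake_resolver))  # answered upstream
print(resolve("www.example.com", caches, fake_resolver))  # answered by the browser cache
```

Real caches also expire entries according to the record's TTL, which this sketch leaves out.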
TCP Connection Establishment and Teardown
After DNS resolution, a TCP connection is created using the classic three‑way handshake (SYN → SYN‑ACK → ACK) to establish a reliable full‑duplex connection. The article details the state transitions on both sides, assuming the client is the side that closes first: the client passes through SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, and TIME_WAIT, and the server through LISTEN, SYN_RCVD, ESTABLISHED, CLOSE_WAIT, and LAST_ACK. It explains why TIME_WAIT exists (so that the final ACK can be retransmitted if it is lost, and so that old duplicate segments expire before the same address and port pair is reused), and mentions common issues such as large numbers of sockets stuck in TIME_WAIT or CLOSE_WAIT.
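The handshake and teardown are carried out by the operating system, so application code only sees connect, accept, and close. A minimal loopback sketch, assuming a local machine that allows binding to 127.0.0.1 (the function name loopback_roundtrip is made up for this example):

```python
import socket

def loopback_roundtrip(payload=b"ping"):
    """Open a TCP connection to ourselves, send payload, and return what arrived."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
    server.listen(1)                # server side is now in LISTEN
    port = server.getsockname()[1]

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(("127.0.0.1", port))  # the kernel exchanges SYN, SYN-ACK, ACK here
    conn, _ = server.accept()            # both ends are ESTABLISHED

    client.sendall(payload)
    received = conn.recv(1024)

    client.close()   # active close: the client side sends the first FIN
    conn.close()     # passive close: the server side sends its own FIN
    server.close()
    return received

print(loopback_roundtrip())
```

After the function returns, the client side of the connection typically remains visible in TIME_WAIT for a while in tools such as netstat or ss.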
Four‑Way Handshake (Connection Termination)
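The termination sequence consists of four segments: a FIN from the client (usually carrying the ACK flag as well), an ACK from the server, a FIN from the server, and a final ACK from the client. The client then stays in TIME_WAIT for twice the maximum segment lifetime (2MSL) before the connection is fully closed.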
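The teardown states can be walked through with a small transition table. This is a simplified sketch that covers only the ordinary four-segment close with the client as the active closer; simultaneous close and resets are left out, and the event labels are invented for readability.

```python
# (current state, event) -> next state, for the ordinary teardown path only.
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",    # the final ACK is sent here
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",  # the peer's FIN is acknowledged here
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

def walk(events, state="ESTABLISHED"):
    """Apply events in order and return the list of states visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path

active = walk(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"])
passive = walk(["recv FIN", "send FIN", "recv ACK"])
print(" -> ".join(active))
print(" -> ".join(passive))
```

A socket that lingers in CLOSE_WAIT corresponds to the passive side never reaching the "send FIN" step, which usually means the application has not called close.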
HTTP Message Format
HTTP operates on top of TCP. An HTTP request consists of a start line (method, URL, version), header fields (name: value pairs), an empty line, and an optional body. Common methods include GET, POST, HEAD, PUT, DELETE, OPTIONS, TRACE, and CONNECT. The response follows a similar structure, with a status line (version, status code, reason phrase) in place of the request line. Header field names are case‑insensitive and conventionally use hyphens between words. Each header line ends with CRLF, and an empty line (a bare CRLF) separates the headers from the body.
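To make the layout concrete, the following sketch builds a raw HTTP/1.1 request and splits it back into its parts. The function names and the example host, path, and body are assumptions for illustration; a real client would normally rely on a library rather than hand-rolled parsing.

```python
def build_request(method, target, host, body=b""):
    """Assemble start line, headers, empty line, and optional body."""
    lines = [f"{method} {target} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # Each line ends with CRLF; the extra CRLF is the empty line before the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body

def parse_request(raw):
    """Split a raw request into (method, target, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("ascii").split("\r\n")
    method, target, version = start_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers, body

raw = build_request("POST", "/login", "www.example.com", b"user=a")
print(raw)
print(parse_request(raw))
```

Lowercasing the names on the way in means a lookup such as headers["content-length"] works regardless of how the sender capitalized the field.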
Browser Rendering Process
Once the browser receives the HTTP response containing HTML, it parses the HTML into a DOM tree, parses the CSS into a rule tree, combines the two into a render tree, computes layout information (position and size) for each node, and finally paints the pixels on the screen.
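The step that combines the DOM with the CSS rules can be illustrated with a toy model. This is a heavily simplified sketch, not how any real browser engine is implemented: the Node class, the tag-only rule matching, and the sample data are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    tag: str
    children: list = field(default_factory=list)

def build_render_tree(node, rules):
    """Return a nested (tag, style, children) tuple, or None if the node is not rendered."""
    style = rules.get(node.tag, {})
    if style.get("display") == "none":
        return None  # hidden subtrees never reach layout or paint
    kids = [build_render_tree(child, rules) for child in node.children]
    return (node.tag, style, [k for k in kids if k is not None])

# A tiny DOM and rule set; rules are matched by tag name only in this sketch.
dom = Node("body", [Node("h1"), Node("script"), Node("p")])
rules = {"h1": {"font-size": "32px"}, "script": {"display": "none"}}

tree = build_render_tree(dom, rules)
print([child[0] for child in tree[2]])  # ['h1', 'p']
```

The script element exists in the DOM but is absent from the render tree, which is the main difference between the two structures.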
Additional Notes
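The article also provides practical tips for interview preparation, such as mentioning SYN flood attacks, configuring tcp_synack_retries and tcp_max_syn_backlog, and using tcp_tw_reuse / tcp_tw_recycle to mitigate excessive TIME_WAIT sockets. Note that tcp_tw_recycle is known to break connections from clients behind NAT and was removed in Linux 4.12, so it is not available on current kernels.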
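On Linux these settings are exposed as files under /proc/sys/net/ipv4, so their current values can be inspected without changing anything. A minimal read-only sketch, assuming a Linux host (the function name read_tcp_params is made up for this example); on other systems, or for parameters the running kernel does not provide, the value is reported as None.

```python
from pathlib import Path

# The kernel parameters named in the text.
PARAMS = ["tcp_synack_retries", "tcp_max_syn_backlog", "tcp_tw_reuse", "tcp_tw_recycle"]

def read_tcp_params(base="/proc/sys/net/ipv4"):
    """Return {parameter: value string, or None if the file cannot be read}."""
    values = {}
    for name in PARAMS:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Missing file: non-Linux system, or a parameter this kernel lacks.
            values[name] = None
    return values

for name, value in read_tcp_params().items():
    print(f"{name} = {value}")
```

On a kernel at version 4.12 or later, tcp_tw_recycle is expected to show up as None because the parameter no longer exists.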
Full-Stack Internet Architecture
Introducing full-stack Internet architecture technologies centered on Java