Understanding HTTP: Versions, Headers, Methods, and Message Flow
This article provides a comprehensive overview of the HTTP protocol, covering its evolution from 0.9 to 2.0, the structure of request and response messages, common methods and status codes, header fields, MIME types, URI/URL distinctions, and the underlying mechanisms of web resource handling and concurrency.
HTTP Protocol Introduction
HTTP (Hypertext Transfer Protocol) is the most widely used application-layer protocol for Web services. It was designed to transfer text formatted as HTML and today carries any type of content.
HTTP protocol versions
HTTP/0.9 – only transfers HTML documents.
HTTP/1.0
Introduced the MIME mechanism, allowing multimedia transmission.
Introduced the keep-alive mechanism (via the Connection header field).
Introduced caching support.
HTTP/1.1 – supports more request methods, finer cache control, and persistent connections by default.
HTTP/2 – provides optimized transport of HTTP semantics through binary framing, multiplexed streams over one connection, and header compression.
SPDY – Google's technology to accelerate HTTP over SSL; it has been superseded by HTTP/2.
Currently the commonly used versions are HTTP/1.1 and HTTP/2.
HTML document introduction
HTML document structure
<html>
<head>
<title>TITLE</title>
</head>
<body>
<h1>H1</h1>
<p></p>
<h2>H2</h2>
<p><a href="admin.html">ToGoogle</a> </p>
</body>
</html>
HTML document generation methods
Static – pre‑written files.
Dynamic – generated at request time by server-side programs (PHP, JSP, ASP, .NET).
HTTP protocol
HTTP message format
HTTP messages consist of request messages and response messages.
Request Message – client → server, used to request resources.
Response Message – server → client, used to return resources.
Request message format
Request line + request header + blank line + request entity
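<method> – the request method, that is, how the resource is being requested.
<request-URL> – the resource being requested. It can be a relative path such as /images/log.jpg or an absolute URL such as http://www.magedu.com/images.banner.jpg.
<version> – the HTTP protocol version of the request, in the form HTTP/<major>.<minor>, for example HTTP/1.0 or HTTP/1.1.
<HEADERS> – the header fields; there may be more than one.
<entity-body> – the request entity, that is, the content carried by the request.
Request line – composed of <method>, <request-URL>, and <version>, separated by spaces.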
Request header – name: value pairs.
Blank line – separates header from entity.
Request entity – the payload of the request.
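To make the four parts concrete, here is a minimal Python sketch that splits a raw request into request line, headers, and entity. The function name parse_request and the sample request are assumptions made for illustration; a production parser would also validate its input and handle repeated header names.

```python
def parse_request(raw: bytes):
    """Split a raw HTTP request into its parts (illustrative only)."""
    # The blank line (CRLF CRLF) separates the head from the entity body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Request line: <method> <request-URL> <version>, separated by spaces.
    method, request_url, version = lines[0].split(" ")

    # Header lines are "name: value" pairs; names are case-insensitive.
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, request_url, version, headers, body


# Hypothetical request, used only for illustration.
raw_request = (
    b"POST /login HTTP/1.1\r\n"
    b"Host: www.magedu.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"user=tom1"
)
method, request_url, version, headers, body = parse_request(raw_request)
print(method, request_url, version)      # POST /login HTTP/1.1
print(headers["content-length"], body)   # 9 b'user=tom1'
```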
Response message format
Start line + response header + blank line + response entity
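<version> – the HTTP version of the response; the server answers with the version the client used in its request.
<status> – the status code of the response, such as 202 or 403.
<reason-phrase> – the reason phrase, a human-readable explanation of what the status code means.
<HEADERS> – the response header fields.
<entity-body> – the response body.
Start line – <version> <status> <reason-phrase> (e.g., "HTTP/1.1 200 OK").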
Response header – name: value pairs (e.g., Content-Type, Content-Length).
Blank line – separates header from entity.
Response entity – body content (text or binary).
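The response layout can be sketched the same way, this time by assembling a message rather than parsing one. The helper name build_response is an assumption for illustration; the reason phrase comes from Python's standard http.HTTPStatus enum.

```python
from http import HTTPStatus


def build_response(status: int, headers: dict, body: bytes) -> bytes:
    """Assemble start line + headers + blank line + entity (illustrative only)."""
    phrase = HTTPStatus(status).phrase            # e.g. 200 -> "OK"
    lines = [f"HTTP/1.1 {status} {phrase}"]       # start line

    all_headers = dict(headers)
    all_headers["Content-Length"] = str(len(body))
    lines += [f"{name}: {value}" for name, value in all_headers.items()]

    head = "\r\n".join(lines) + "\r\n\r\n"        # the blank line ends the head
    return head.encode("iso-8859-1") + body


response = build_response(200, {"Content-Type": "text/html"}, b"<h1>H1</h1>")
print(response.decode("iso-8859-1"))
```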
HTTP request methods
Common methods include GET, HEAD, POST, PUT, DELETE, OPTIONS, and TRACE; MOVE comes from the WebDAV extension rather than core HTTP.
GET – Retrieve the specified resource.
HEAD – Same as GET, but the server returns only the headers, not the body.
POST – Submit data to the server for processing, typically from an HTML form; the data is often stored in a database.
PUT – Upload a resource to the server at the given URL (often stored in the file system).
DELETE – Request that the server delete the specified resource.
MOVE – Request that the server move a resource to another address (WebDAV).
OPTIONS – Query which request methods are supported for a URL.
TRACE – Trace the path of a request through proxies, firewalls, etc.
Commonly used methods are GET, POST, HEAD.
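As a rough illustration of how the methods differ, the following Python sketch dispatches on the method name without any network code. The resource table, the set of allowed methods, and the handle function are all hypothetical; a real server does considerably more.

```python
# Hypothetical in-memory "site" and the methods it chooses to support.
RESOURCES = {"/index.html": b"<h1>H1</h1>"}
ALLOWED = ("GET", "HEAD", "OPTIONS")


def handle(method: str, path: str):
    """Return (status, headers, body) for a request (illustrative only)."""
    if method == "OPTIONS":
        # OPTIONS reports which methods the server supports.
        return 200, {"Allow": ", ".join(ALLOWED)}, b""
    if method not in ALLOWED:
        # Anything else (POST, PUT, DELETE, ...) is refused in this sketch.
        return 405, {"Allow": ", ".join(ALLOWED)}, b""
    if path not in RESOURCES:
        return 404, {}, b""

    body = RESOURCES[path]
    headers = {"Content-Length": str(len(body))}
    if method == "HEAD":
        # HEAD: same headers as GET, but no body.
        return 200, headers, b""
    return 200, headers, body


print(handle("GET", "/index.html"))
print(handle("HEAD", "/index.html"))
print(handle("DELETE", "/index.html"))
```

GET and HEAD return identical headers here, which is the property that makes HEAD useful for checking a resource without downloading it.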
HTTP status codes
1XX – Informational: indicates provisional responses.
2XX – Success: the request was received, understood, and accepted.
3XX – Redirection: further action needs to be taken to complete the request.
4XX – Client error: the request contains bad syntax or cannot be fulfilled.
5XX – Server error: the server failed to fulfill a valid request.
Common status codes:
200 – OK: request succeeded.
201 – Created: resource successfully created.
301 – Moved Permanently: resource has a new permanent URL.
302 – Found: temporary redirect.
304 – Not Modified: resource unchanged.
403 – Forbidden: request denied.
404 – Not Found: resource does not exist.
405 – Method Not Allowed: method not supported.
500 – Internal Server Error.
502 – Bad Gateway.
503 – Service Unavailable.
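The class of a status code is determined by its first digit, which a few lines of Python can show. The describe function below is a hypothetical helper; the reason phrases come from the standard http.HTTPStatus enum, which raises ValueError for codes it does not know.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code: int) -> str:
    """Return 'code phrase (class)' for a known status code."""
    status_class = STATUS_CLASSES[code // 100]   # first digit selects the class
    phrase = HTTPStatus(code).phrase             # standard reason phrase
    return f"{code} {phrase} ({status_class})"


for code in (200, 301, 304, 404, 503):
    print(describe(code))
```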
HTTP header introduction
General header
Request header
Response header
Entity header
Extension header
General header
Connection – defines options such as keep‑alive.
Connection: keep-alive
Cache-Control – fine‑grained cache control (common in HTTP/1.1).
Request header
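Client-IP – client IP address (a non-standard extension header).
Host – requested host name.
Referer – URL of the page from which the request was made.
User-Agent – identifier of the client software, such as the browser.
Accept – media types the client can accept; the related Accept-Charset, Accept-Encoding, and Accept-Language headers cover character sets, content encodings, and languages.
Conditional request headers (e.g., If-Modified-Since, If-None-Match) – make the request conditional on the state of the resource; used in HTTP/1.1.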
Authorization, Cookie – security‑related headers.
Response header
Age – how long, in seconds, the response has been held in a cache since the origin server generated or validated it.
Server – software name and version.
Vary – list of request headers that affect the representation.
WWW-Authenticate, Set-Cookie – security‑related headers.
Entity header
Location – new resource location (used with redirects such as 301 and 302).
Allow – allowed request methods.
Content-* – Content-Encoding, Content-Language, Content-Length, Content-Location, Content-Type.
ETag, Expires, Last-Modified – cache‑related headers.
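The grouping above can be expressed as a small lookup table. This Python sketch follows the article's categories for a subset of headers; the table, the categorize function, and the X-Trace-Id header are assumptions made for illustration, and any header not in the table is treated as an extension header.

```python
# Subset of the headers listed above, keyed by lower-cased name
# because header names are case-insensitive.
HEADER_CATEGORIES = {
    "connection": "general", "cache-control": "general",
    "host": "request", "referer": "request", "user-agent": "request",
    "accept": "request", "authorization": "request", "cookie": "request",
    "age": "response", "server": "response", "vary": "response",
    "www-authenticate": "response", "set-cookie": "response",
    "location": "entity", "allow": "entity", "content-type": "entity",
    "content-length": "entity", "etag": "entity", "expires": "entity",
    "last-modified": "entity",
}


def categorize(header_block: str) -> dict:
    """Group the header names in a block of 'name: value' lines by category."""
    groups = {}
    for line in header_block.splitlines():
        if not line.strip():
            continue
        name = line.partition(":")[0].strip()
        category = HEADER_CATEGORIES.get(name.lower(), "extension")
        groups.setdefault(category, []).append(name)
    return groups


sample = (
    "Connection: keep-alive\n"
    "Host: www.magedu.com\n"
    "Content-Type: text/html\n"
    "X-Trace-Id: 42\n"          # hypothetical extension header
)
print(categorize(sample))
```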
HTTP transaction
An HTTP transaction consists of a request and its corresponding response. In HTTP/1.0 each transaction by default opens and closes its own TCP connection, which is inefficient; persistent connections, the default in HTTP/1.1, let several transactions share one connection and reduce that overhead.
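Connection reuse can be observed with Python's standard library alone. The sketch below starts a throwaway server on 127.0.0.1 (the Handler class and the paths /a and /b are hypothetical), sends two requests over one http.client connection, and records the client port the server saw for each request.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

client_ports = []  # client-side port seen by the server for each request


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"   # allows persistent connections

    def do_GET(self):
        client_ports.append(self.client_address[1])
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request log line


server = HTTPServer(("127.0.0.1", 0), Handler)   # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_address[1])
for path in ("/a", "/b"):            # two transactions on one connection
    conn.request("GET", path)
    conn.getresponse().read()        # read fully before reusing the connection
conn.close()

server.shutdown()
server.server_close()
print("same connection reused:", client_ports[0] == client_ports[1])
```

If both requests arrive from the same client port, they travelled over the same TCP connection, so only one handshake was needed.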
HTTP resources
Resources are the content that can be requested via HTTP, such as HTML documents, images, etc.
Resource types are identified by MIME types.
Format: major/minor.
text/html – HTML documents
text/plain – Plain text
image/jpeg – JPEG images
image/gif – GIF images
video/mp4 – MPEG-4 video
application/vnd.ms-powerpoint – Microsoft PowerPoint presentations
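Servers commonly choose the MIME type from the file extension. As a small illustration, Python's standard mimetypes module performs that lookup; the describe_type helper and the file names are assumptions for the example.

```python
import mimetypes


def describe_type(filename: str):
    """Return (major, minor) for a file name, or None if the type is unknown."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type is None:
        return None
    major, minor = mime_type.split("/", 1)   # MIME format: major/minor
    return major, minor


for name in ("index.html", "log.jpg", "banner.gif"):
    print(name, describe_type(name))
```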
URI and URL
URI – Uniform Resource Identifier, uniquely identifies a resource.
URL – Uniform Resource Locator, describes the location of a resource (scheme, host, path).
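The components of a URL can be inspected with Python's standard urllib.parse module. This sketch reuses the example URL from the request message section; the base page index.html is an assumption for illustration.

```python
from urllib.parse import urljoin, urlsplit

parts = urlsplit("http://www.magedu.com/images.banner.jpg")
print(parts.scheme)    # http
print(parts.hostname)  # www.magedu.com
print(parts.path)      # /images.banner.jpg

# A relative reference is resolved against the URL of the current page.
absolute = urljoin("http://www.magedu.com/index.html", "/images/log.jpg")
print(absolute)        # http://www.magedu.com/images/log.jpg
```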
CGI
Common Gateway Interface allows a web server to execute external scripts and return their output as part of the HTTP response.
Other knowledge
Specific process of a web resource request
Client enters the URL in the browser.
Browser queries the DNS server to resolve the domain.
TCP three‑way handshake establishes a connection.
Client sends an HTTP request.
Server processes the request.
Server retrieves the requested resource.
Server builds the response message.
Server sends the response message to the client.
Server logs the transaction.
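The steps above can be traced at socket level. The following Python sketch is an assumption-laden demo, not a real browser or server: HelloHandler is a hypothetical handler running on 127.0.0.1, and fetch resolves the address, opens the TCP connection, sends an HTTP/1.0 request, and reads until the server closes the connection.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello\n"                      # "retrieve" the resource
        self.send_response(200)                # build the response message
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # a real server would log the transaction here


def fetch(host: str, port: int, path: str = "/") -> bytes:
    # Name resolution, then the TCP three-way handshake inside connect().
    family, socktype, proto, _, addr = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM)[0]
    with socket.socket(family, socktype, proto) as sock:
        sock.connect(addr)
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))  # send the HTTP request
        chunks = []
        while True:                            # read until the server closes
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
raw = fetch("127.0.0.1", server.server_address[1])
server.shutdown()
server.server_close()

status_line = raw.split(b"\r\n", 1)[0].decode("ascii")
print(status_line)
```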
How HTTP handles concurrent requests
By default a web server handles requests in a blocking fashion, one at a time; concurrency is achieved via multi-process, multi-thread, or event-driven models, often using a process pool to reuse idle child processes.
Connection sockets
Client IP, cport ↔ server IP, sport
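The listening socket is bound to the well-known port (e.g., 80). Each accepted client gets its own connection socket, which keeps the server's IP and port on one side and has the client's IP and temporary (ephemeral) port on the other.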
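The difference between the two kinds of socket can be seen on the loopback interface. This Python sketch binds to port 0 so the operating system picks a free port, which is an assumption made so the demo does not need port 80.

```python
import socket

# Listening socket: bound to a fixed address and waiting for clients.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0: let the OS choose a free port
listener.listen()
server_port = listener.getsockname()[1]

# The client connects from an ephemeral port chosen by the OS.
client = socket.create_connection(("127.0.0.1", server_port))

# accept() returns a new connection socket dedicated to this client.
connection, client_address = listener.accept()

print("listening socket :", listener.getsockname())
print("connection socket:", connection.getsockname(), "<->", connection.getpeername())

# The connection socket shares the server port; only the client side differs.
same_server_port = connection.getsockname()[1] == server_port
client_side_matches = client_address == client.getsockname()

for sock in (connection, client, listener):
    sock.close()
```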
Web server I/O structures
Single‑process model – one request at a time.
Multi‑process model – each process handles one request.
Threaded I/O – multiple threads per process.
Event‑driven I/O – a single thread handles many connections.
Process reuse (process pool)
Using a pool of idle child processes reduces the overhead of creating and destroying processes and limits the maximum number of concurrent requests.
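The effect of a fixed-size pool can be sketched in Python. For portability this example uses a pool of threads from concurrent.futures as a stand-in for a pool of child processes; the pool size, handle_request, and the simulated work are all assumptions for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 3                        # upper bound on concurrent requests
stats = {"active": 0, "peak": 0}
stats_lock = threading.Lock()


def handle_request(request_id: int) -> str:
    """Simulated request handler that tracks how many run at once."""
    with stats_lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.01)                 # stand-in for real request processing
    with stats_lock:
        stats["active"] -= 1
    return f"response {request_id}"


# Workers are created once and reused for all ten requests.
with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    responses = list(pool.map(handle_request, range(10)))

print(responses[0], "...", responses[-1])
print("peak concurrency:", stats["peak"])
```

Ten requests are served, yet the peak never exceeds the pool size, which is the limiting behaviour the paragraph above describes.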
Website traffic metrics and concurrency concepts
IP
Unique IP addresses are used to measure traffic; each distinct IP is counted once per day.
PV
Page Views count each request for a page, including refreshes.
UV
Unique Visitors count each distinct client once per day.
Concurrent connections
Maximum number of connections a server can keep open and serve at the same time.
Calculating IP, PV, UV, concurrency
IP calculation
Analyze logs to remove duplicate IPs, use third‑party tools, or embed counting code.
PV calculation
Analyze logs to count HTML and dynamic pages, use tools, or embed counting code.
UV calculation
Analyze request headers or use cookies; cookies are more accurate but can be disabled.
Concurrency calculation
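The number of concurrent users is estimated from requests per second, concurrent connections, and average user think time. These quantities have different units, so they cannot simply be added; a common estimate is concurrent users ≈ requests per second × (average response time + average think time).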
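The log-based IP, PV, and UV counts can be sketched in a few lines of Python. The log format here (client IP, visitor cookie ID, requested path) and the sample entries are hypothetical, as is the rule that only .html and .php requests count as page views; real access logs need a proper parser and a per-day split.

```python
# Hypothetical one-day log: "<client-ip> <visitor-cookie-id> <path>"
LOG = """\
10.0.0.1 u1 /index.html
10.0.0.1 u1 /images/log.jpg
10.0.0.1 u2 /index.html
10.0.0.2 u3 /news.php
10.0.0.2 u3 /news.php
"""

PAGE_SUFFIXES = (".html", ".php")    # static and dynamic pages count as PVs


def traffic_metrics(log_text: str) -> dict:
    ips, visitors, page_views = set(), set(), 0
    for line in log_text.splitlines():
        ip, visitor_id, path = line.split()
        ips.add(ip)                  # IP: each distinct address counted once
        visitors.add(visitor_id)     # UV: each distinct cookie counted once
        if path.endswith(PAGE_SUFFIXES):
            page_views += 1          # PV: every page request, refreshes included
    return {"IP": len(ips), "PV": page_views, "UV": len(visitors)}


metrics = traffic_metrics(LOG)
print(metrics)   # {'IP': 2, 'PV': 4, 'UV': 3}
```

Two visitors share the address 10.0.0.1 in this sample, which is why the cookie-based UV count is higher than the IP count.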
MaGe Linux Operations
