Understanding Nginx: Core Architecture, Modules, and How to Dive into the Source Code
This article provides a systematic overview of Nginx’s high‑performance architecture, covering its modular design, master/worker process model, event‑driven loop, HTTP request processing phases, location matching rules, upstream load‑balancing, FastCGI and proxy configurations, rate‑limiting mechanisms, and common 502 error troubleshooting, while offering practical tips for reading the source code.
Nginx is the most widely used web server, powering roughly one in three websites worldwide; understanding its internals is essential for developers who need to configure, debug, or extend it.
How to start reading the Nginx source – Begin with the main() function and use GDB extensively (b, p, bt, c, and n, short for break, print, backtrace, continue, and next) to step through the code. Adopt a problem‑driven approach: pick a concrete question (e.g., how the event loop works), then use the official development guide to locate the relevant function, such as ngx_process_events_and_timers().
Modular programming – Nginx’s functionality is split into modules (core, event, HTTP, etc.). Each module defines an ngx_module_s structure with lifecycle hook callbacks (e.g., init_master, init_process), a command array describing the configuration directives it parses, and a context pointer for module‑specific data. HTTP modules additionally register their request handlers into the phases array kept in the HTTP core module's main configuration.
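The module layout described above can be illustrated with a deliberately simplified, standalone sketch. All names here (demo_module_t, demo_command_t, demo_dispatch, and so on) are hypothetical and only mirror the general shape of ngx_module_s; the real structure has many more fields and a different handler signature.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a directive table entry. */
typedef struct {
    const char *name;                               /* directive name */
    int       (*set)(void *conf, const char *val);  /* directive handler */
} demo_command_t;

/* Hypothetical, simplified stand-in for ngx_module_s. */
typedef struct {
    const char           *name;
    void                 *ctx;                      /* module-specific data */
    const demo_command_t *commands;                 /* ends with a NULL name */
    int                 (*init_process)(void);      /* per-worker hook, may be NULL */
} demo_module_t;

typedef struct {
    int workers;
} demo_core_conf_t;

/* Parse a non-negative decimal number into conf->workers. */
static int demo_set_workers(void *conf, const char *val)
{
    demo_core_conf_t *c = conf;
    int               n = 0;

    if (*val == '\0') {
        return -1;
    }
    for (; *val != '\0'; val++) {
        if (*val < '0' || *val > '9') {
            return -1;
        }
        n = n * 10 + (*val - '0');
    }
    c->workers = n;
    return 0;
}

static const demo_command_t demo_core_commands[] = {
    { "worker_processes", demo_set_workers },
    { NULL, NULL }
};

demo_core_conf_t demo_core_conf = { 1 };

demo_module_t demo_core_module = {
    "core", &demo_core_conf, demo_core_commands, NULL
};

/* Walk every module's command table until one claims the directive. */
int demo_dispatch(demo_module_t **modules, size_t n,
                  const char *directive, const char *val)
{
    for (size_t i = 0; i < n; i++) {
        const demo_command_t *cmd;

        for (cmd = modules[i]->commands; cmd->name != NULL; cmd++) {
            if (strcmp(cmd->name, directive) == 0) {
                return cmd->set(modules[i]->ctx, val);
            }
        }
    }
    return -1;  /* unknown directive */
}
```

The point of the design is that the configuration parser knows nothing about individual directives: it only walks the tables that the modules expose.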
Master/worker process model – The master process forks worker processes, monitors them via SIGCHLD, and respawns any that exit. Experiments show that killing the master terminates all workers, while killing a worker causes the master to start a new one. Signals such as SIGUSR1, SIGHUP, SIGQUIT, and SIGTERM control log reopening, configuration reload, graceful shutdown, and immediate stop respectively.
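As a compact restatement of the signal table above, the following sketch maps each signal named in the text to the action the master takes. The enum and function names are hypothetical and are not taken from the Nginx source. The respawn rule (no new workers while shutting down) reflects our reading of the master's child-reaping logic and should be treated as an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Hypothetical action codes for this sketch only. */
typedef enum {
    ACT_NOT_COVERED,    /* signal not discussed in this article */
    ACT_REOPEN_LOGS,
    ACT_RELOAD_CONFIG,
    ACT_GRACEFUL_QUIT,
    ACT_FAST_STOP,
    ACT_REAP_CHILD
} master_action_t;

/* Map a signal received by the master to the action described in the text. */
master_action_t master_action_for_signal(int signo)
{
    switch (signo) {
    case SIGUSR1: return ACT_REOPEN_LOGS;
    case SIGHUP:  return ACT_RELOAD_CONFIG;
    case SIGQUIT: return ACT_GRACEFUL_QUIT;
    case SIGTERM: return ACT_FAST_STOP;
    case SIGCHLD: return ACT_REAP_CHILD;    /* a worker exited */
    default:      return ACT_NOT_COVERED;
    }
}

/* Assumption: an exited worker is replaced only if no shutdown is pending. */
int master_should_respawn(master_action_t pending)
{
    return pending != ACT_GRACEFUL_QUIT && pending != ACT_FAST_STOP;
}
```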
Event‑driven model – Workers run an infinite loop that finds the nearest timer, calls epoll_wait(), processes I/O events, and then handles timers. The core loop is implemented in ngx_worker_process_cycle() and ngx_process_events_and_timers():
for (;;) {
    // find the nearest timer; it bounds how long the I/O wait may block
    timer = ngx_event_find_timer();
    // try to take the accept mutex; only the holder keeps the listening sockets in epoll
    ngx_trylock_accept_mutex(cycle);
    // wait for I/O events (epoll_wait() on Linux)
    (void) ngx_process_events(cycle, timer, flags);
    // run the handlers of timers that have expired
    ngx_event_expire_timers();
    // process posted events
    ngx_event_process_posted(cycle, &ngx_posted_events);
}
HTTP processing phases – Nginx defines 11 phases (e.g., NGX_HTTP_REWRITE_PHASE, NGX_HTTP_ACCESS_PHASE, NGX_HTTP_CONTENT_PHASE). Modules register handlers for the appropriate phases, and the content phase ultimately generates the response. proxy_pass and fastcgi_pass are handled specially: each sets a custom content_handler for its location.
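The phase mechanism can be sketched as a flat array of handlers, each carrying the index at which the next phase starts. This is a simplified model of a generic phase: a handler that returns "OK" skips to the next phase, "declined" tries the next handler in the same phase, and any other value ends processing with that status. The names and the reduced request structure are hypothetical; the real engine also has per-phase checker functions that are omitted here.

```c
#include <stddef.h>

#define DEMO_OK         0
#define DEMO_DECLINED (-5)

/* Hypothetical, heavily reduced request object. */
typedef struct {
    int blocked;        /* set by the caller to simulate an access rule */
    int status;         /* HTTP status chosen by a handler */
    int handlers_run;   /* how many handlers were invoked */
} demo_request_t;

typedef int (*demo_handler_pt)(demo_request_t *r);

typedef struct {
    demo_handler_pt handler;
    size_t          next;   /* index of the first handler of the next phase */
} demo_phase_handler_t;

static int demo_rewrite(demo_request_t *r)
{
    r->handlers_run++;
    return DEMO_DECLINED;               /* nothing to rewrite */
}

static int demo_access(demo_request_t *r)
{
    r->handlers_run++;
    return r->blocked ? 403 : DEMO_OK;  /* deny or allow */
}

static int demo_content(demo_request_t *r)
{
    r->handlers_run++;
    r->status = 200;                    /* pretend a response was generated */
    return DEMO_OK;
}

const demo_phase_handler_t demo_phases[] = {
    { demo_rewrite, 1 },    /* rewrite phase */
    { demo_access,  2 },    /* access phase  */
    { demo_content, 3 },    /* content phase */
};

/* Run handlers in order until the array ends or a handler finalizes. */
int demo_run_phases(const demo_phase_handler_t *ph, size_t n,
                    demo_request_t *r)
{
    size_t i = 0;

    while (i < n) {
        int rc = ph[i].handler(r);

        if (rc == DEMO_OK) {
            i = ph[i].next;     /* phase done, jump to the next phase */
        } else if (rc == DEMO_DECLINED) {
            i++;                /* try the next handler */
        } else {
            r->status = rc;     /* finalize with an error status */
            return rc;
        }
    }
    return DEMO_OK;
}
```

With this model, a denied request never reaches the content handler, which is why access modules can stay unaware of how the response is produced.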
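Location matching – The location directive supports exact (=), prefix (no modifier, or ^~ to skip the regex check when that prefix is the longest match), regex (~ case‑sensitive, ~* case‑insensitive), and named (@) locations. Nginx builds a three‑way (ternary) tree for exact and prefix locations to achieve roughly O(log n) lookup, while regex locations remain in a separate array and are tested in configuration order.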
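The selection order can be sketched as follows: an exact match wins immediately; otherwise the longest matching prefix is remembered; if that prefix was declared with ^~ it is used directly; otherwise the regex locations are tried in order and the first hit wins; failing that, the remembered prefix is used. This sketch makes two simplifying assumptions: it scans a flat array linearly instead of walking a tree, and it uses POSIX regular expressions where Nginx uses PCRE. All type and function names are hypothetical, and named (@) locations are left out.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    LOC_EXACT,           /* location = /uri   */
    LOC_PREFIX,          /* location /uri     */
    LOC_PREFIX_NOREGEX,  /* location ^~ /uri  */
    LOC_REGEX,           /* location ~ regex  */
    LOC_REGEX_ICASE      /* location ~* regex */
} demo_loc_kind_t;

typedef struct {
    demo_loc_kind_t  kind;
    const char      *pattern;
} demo_location_t;

/* Return the index of the selected location, or -1 if none matches. */
int demo_find_location(const demo_location_t *locs, size_t n, const char *uri)
{
    int    best = -1;
    size_t best_len = 0;

    /* Pass 1: exact match wins at once; remember the longest prefix. */
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(locs[i].pattern);

        if (locs[i].kind == LOC_EXACT) {
            if (strcmp(uri, locs[i].pattern) == 0) {
                return (int) i;
            }
        } else if (locs[i].kind == LOC_PREFIX
                   || locs[i].kind == LOC_PREFIX_NOREGEX) {
            if (strncmp(uri, locs[i].pattern, len) == 0
                && (best < 0 || len > best_len)) {
                best = (int) i;
                best_len = len;
            }
        }
    }

    /* A ^~ prefix suppresses the regex pass. */
    if (best >= 0 && locs[best].kind == LOC_PREFIX_NOREGEX) {
        return best;
    }

    /* Pass 2: regex locations in configuration order, first match wins. */
    for (size_t i = 0; i < n; i++) {
        regex_t re;
        int     flags = REG_EXTENDED | REG_NOSUB;
        int     matched;

        if (locs[i].kind == LOC_REGEX_ICASE) {
            flags |= REG_ICASE;
        } else if (locs[i].kind != LOC_REGEX) {
            continue;
        }
        if (regcomp(&re, locs[i].pattern, flags) != 0) {
            continue;   /* skip patterns that fail to compile */
        }
        matched = (regexec(&re, uri, 0, NULL, 0) == 0);
        regfree(&re);
        if (matched) {
            return (int) i;
        }
    }

    return best;    /* fall back to the longest prefix, if any */
}
```

A production implementation would compile each regex once at configuration time rather than on every lookup; the sketch compiles on the fly only to stay short.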
Upstream and load balancing – The ngx_http_upstream_s structure holds connection handlers, peer information, and callbacks for request creation, header processing, and finalization. Default load‑balancing is round‑robin (ngx_http_upstream_init_round_robin), with alternatives such as ip_hash, hash, and least_conn. Configuration directives like proxy_next_upstream and proxy_next_upstream_tries control retry behavior.
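Weighted round-robin selection can be sketched with the "smooth" variant, in which every peer's running weight grows by its configured weight on each pick and the chosen peer is then penalized by the sum of all weights. To our understanding this is the idea behind the Nginx round-robin balancer, but the code below is a simplified illustration with hypothetical names: it ignores failed peers, max_fails, backup servers, and locking.

```c
#include <stddef.h>

/* Hypothetical, reduced peer record. */
typedef struct {
    const char *name;
    int         weight;          /* configured weight */
    int         current_weight;  /* running value, starts at 0 */
} demo_peer_t;

/* Pick the next peer using smooth weighted round-robin. */
demo_peer_t *demo_pick_peer(demo_peer_t *peers, size_t n)
{
    demo_peer_t *best = NULL;
    int          total = 0;

    for (size_t i = 0; i < n; i++) {
        peers[i].current_weight += peers[i].weight;
        total += peers[i].weight;

        /* strict '>' keeps the earliest peer on ties */
        if (best == NULL || peers[i].current_weight > best->current_weight) {
            best = &peers[i];
        }
    }

    if (best != NULL) {
        best->current_weight -= total;
    }
    return best;
}
```

With weights 5, 1, and 1 for peers a, b, and c, seven consecutive picks yield a, a, b, a, c, a, a: the heavy peer gets five of seven requests, yet the light peers are interleaved rather than served in a burst at the end.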
FastCGI protocol – FastCGI messages start with an 8‑byte header (ngx_http_fastcgi_header_t) followed by type‑specific payloads. Example header struct:
typedef struct {
    u_char version;            // protocol version
    u_char type;               // message type
    u_char request_id_hi;      // request ID, high byte
    u_char request_id_lo;      // request ID, low byte
    u_char content_length_hi;  // payload length, high byte
    u_char content_length_lo;  // payload length, low byte
    u_char padding_length;     // number of padding bytes after the payload
    u_char reserved;
} ngx_http_fastcgi_header_t;
Typical message types are defined as NGX_HTTP_FASTCGI_BEGIN_REQUEST, NGX_HTTP_FASTCGI_PARAMS, NGX_HTTP_FASTCGI_STDIN, NGX_HTTP_FASTCGI_STDOUT, and NGX_HTTP_FASTCGI_END_REQUEST.
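To show how the header fields are filled in, here is a small sketch that serializes a record header into 8 bytes. The two-byte fields are written high byte first, as the _hi/_lo field names indicate. The function and constant names are hypothetical; the numeric type values follow the FastCGI specification, and padding the payload to a multiple of 8 bytes is a common convention that this sketch assumes rather than a requirement.

```c
#include <stddef.h>
#include <stdint.h>

/* Record types from the FastCGI specification (demo names). */
enum {
    DEMO_FCGI_VERSION_1     = 1,
    DEMO_FCGI_BEGIN_REQUEST = 1,
    DEMO_FCGI_END_REQUEST   = 3,
    DEMO_FCGI_PARAMS        = 4,
    DEMO_FCGI_STDIN         = 5,
    DEMO_FCGI_STDOUT        = 6
};

/* Padding bytes needed to align the payload to an 8-byte boundary. */
size_t demo_fcgi_padding(size_t content_length)
{
    return (8 - content_length % 8) % 8;
}

/* Serialize an 8-byte record header into out[0..7]. */
void demo_fcgi_write_header(uint8_t out[8], uint8_t type,
                            uint16_t request_id, uint16_t content_length)
{
    out[0] = DEMO_FCGI_VERSION_1;
    out[1] = type;
    out[2] = (uint8_t) (request_id >> 8);        /* request_id_hi */
    out[3] = (uint8_t) (request_id & 0xff);      /* request_id_lo */
    out[4] = (uint8_t) (content_length >> 8);    /* content_length_hi */
    out[5] = (uint8_t) (content_length & 0xff);  /* content_length_lo */
    out[6] = (uint8_t) demo_fcgi_padding(content_length);
    out[7] = 0;                                  /* reserved */
}

/* Read the payload length back from a serialized header. */
uint16_t demo_fcgi_content_length(const uint8_t h[8])
{
    return (uint16_t) ((h[4] << 8) | h[5]);
}
```

Because the length field is 16 bits wide, a single record carries at most 65535 bytes of payload, so larger request bodies have to be split across several STDIN records.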
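Proxy_pass configuration – To enable HTTP keep‑alive with upstream servers, define an upstream block with the keepalive directive, set proxy_http_version 1.1, and clear the Connection header with proxy_set_header Connection "" so that Nginx does not send its default "Connection: close" to the upstream (the official documentation recommends clearing the header rather than sending "keep-alive" explicitly). Be aware that an upstream server may close an idle connection just as Nginx tries to reuse it, causing occasional 502 errors.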
Rate limiting – Nginx provides ngx_http_limit_req_module (request‑rate limiting, which the official documentation describes as the leaky‑bucket method) and ngx_http_limit_conn_module (a limit on concurrent connections). Configuration example:
limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
limit_req zone=one burst=5 nodelay;
Experiments with ab show how burst and nodelay affect request queuing and response times.
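The effect of rate and burst can be reproduced with a small accounting sketch. Each client has an "excess" counter, kept in thousandths of a request: it drains at the configured rate as time passes and grows by one request on every arrival, and a request is rejected when the excess would exceed the burst. This is loosely modeled on how limit_req does its bookkeeping, under the assumption of nodelay behavior (accepted requests are never queued). The names and the exact arithmetic are illustrative, not copied from the module.

```c
/* Per-client state; all request quantities are in 1/1000 of a request. */
typedef struct {
    int  seen;      /* 0 until the first request arrives */
    long excess;    /* requests currently above the allowed rate */
    long last_ms;   /* time of the last accepted request */
} demo_bucket_t;

/*
 * Return 1 if the request is allowed, 0 if it must be rejected.
 * rate_milli:  allowed rate in requests per second times 1000 (1r/s = 1000)
 * burst_milli: allowed burst times 1000 (burst=5 = 5000)
 */
int demo_limit_req(demo_bucket_t *b, long now_ms,
                   long rate_milli, long burst_milli)
{
    long elapsed, excess;

    if (!b->seen) {                 /* first request starts an empty bucket */
        b->seen = 1;
        b->excess = 0;
        b->last_ms = now_ms;
        return 1;
    }

    elapsed = now_ms - b->last_ms;
    if (elapsed < 0) {
        elapsed = 0;                /* guard against clock steps */
    }

    /* drain for the elapsed time, then add this request */
    excess = b->excess - rate_milli * elapsed / 1000 + 1000;
    if (excess < 0) {
        excess = 0;
    }

    if (excess > burst_milli) {
        return 0;                   /* over the burst: reject, keep state */
    }

    b->excess = excess;
    b->last_ms = now_ms;
    return 1;
}
```

Under this model, with rate=1r/s and burst=5, six requests arriving in the same millisecond are accepted (one at the base rate plus five of burst) and the seventh is rejected; one second later exactly one more request fits.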
502 error troubleshooting – Common causes include upstream not listening, upstream closing the connection (e.g., due to timeout), or a full Unix socket listen queue returning EAGAIN. Logs such as “upstream prematurely closed connection while reading response header” pinpoint the failure point.
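When scanning an error log for these three causes, a simple substring classifier is often enough as a first pass. The sketch below assumes that the log line contains either the phrase quoted above or the Linux strerror() text for ECONNREFUSED ("Connection refused") or EAGAIN ("Resource temporarily unavailable"). The function name and cause codes are hypothetical, and the substrings should be adjusted to whatever your own logs actually contain.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cause codes for this sketch. */
typedef enum {
    CAUSE_UNKNOWN,
    CAUSE_NOT_LISTENING,    /* connect() was refused */
    CAUSE_UPSTREAM_CLOSED,  /* upstream closed before sending a response */
    CAUSE_BACKLOG_FULL      /* Unix socket listen queue was full (EAGAIN) */
} demo_502_cause_t;

/* Classify one error-log line by the substrings it contains. */
demo_502_cause_t demo_classify_502(const char *log_line)
{
    if (log_line == NULL) {
        return CAUSE_UNKNOWN;
    }
    if (strstr(log_line, "Connection refused") != NULL) {
        return CAUSE_NOT_LISTENING;
    }
    if (strstr(log_line, "prematurely closed connection") != NULL) {
        return CAUSE_UPSTREAM_CLOSED;
    }
    if (strstr(log_line, "Resource temporarily unavailable") != NULL) {
        return CAUSE_BACKLOG_FULL;
    }
    return CAUSE_UNKNOWN;
}
```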
Conclusion – While this article cannot cover every Nginx detail, it outlines the major design concepts and points readers to the relevant source files and data structures for deeper exploration.
Xueersi Online School Tech Team
The Xueersi Online School Tech Team, dedicated to innovating and promoting internet education technology.
