
Understanding the NGINX Process Model and Architecture

This article explains how NGINX’s master, worker, cache loader, and cache manager processes cooperate on a multi‑core server, describing the event‑driven state machine, non‑blocking worker design, configuration reloads, and graceful binary upgrades for high‑performance, scalable web serving.

Architect

Setting the Scene – the NGINX Process Model

To better understand this design, you need to understand how NGINX runs. NGINX has a master process (which performs privileged operations such as reading configuration and binding to ports) and a number of worker and helper processes.

On a 4‑core server, the NGINX master process creates 4 worker processes and a couple of cache helper processes which manage the on‑disk content cache.

Why Is Architecture Important?

The fundamental basis of any Unix application is the thread or process. A thread or process is a self‑contained set of instructions that the operating system can schedule to run on a CPU core. Most complex applications run multiple threads or processes in parallel for two reasons:

They can use more compute cores at the same time.

Threads and processes make it very easy to do operations in parallel (for example, to handle multiple connections at the same time).

Processes and threads consume resources. They each use memory and other OS resources, and they need to be swapped on and off the cores (an operation called a context switch). Most modern servers can handle hundreds of small, active threads or processes simultaneously, but performance degrades seriously once memory is exhausted or when high I/O load causes a large volume of context switches.

The common way to design network applications is to assign a thread or process to each connection. This architecture is simple and easy to implement, but it does not scale when the application needs to handle thousands of simultaneous connections.
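The thread-per-connection model described above can be sketched in a few lines of standard-library Python (this is an illustration, not NGINX code; the `handle` and `serve` names are made up for the example):

```python
import socket
import threading

def handle(conn: socket.socket) -> None:
    """One OS thread per connection: blocks on recv() until the client sends."""
    with conn:
        while True:
            data = conn.recv(4096)   # thread sleeps here while the client is idle
            if not data:
                break
            conn.sendall(data)       # echo the payload back

def serve(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Accept loop that spawns a fresh thread for every connection."""
    srv = socket.socket()
    srv.bind((host, port))
    srv.listen()

    def acceptor() -> None:
        while True:
            conn, _ = srv.accept()
            # A full OS thread (its own stack and scheduler state) per
            # connection -- this is the per-connection cost the article
            # is pointing at.
            threading.Thread(target=handle, args=(conn,), daemon=True).start()

    threading.Thread(target=acceptor, daemon=True).start()
    return srv
```

The code is simple, but every idle connection pins down a whole thread, which is exactly why this design stops scaling at thousands of simultaneous connections.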

How Does NGINX Work?

NGINX uses a predictable process model that is tuned to the available hardware resources:

The master process performs privileged operations such as reading configuration and binding to ports, and then creates a small number of child processes.

The cache loader process runs at startup to load metadata about the disk‑based cache into memory, then exits.

The cache manager process runs periodically and prunes entries from the disk caches to keep them within the configured sizes.

The worker processes do all of the work! They handle network connections, read and write content to disk, and communicate with upstream servers.
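The behavior of the cache loader and cache manager is driven by the cache configuration. A hedged sketch (the path, zone name, and sizes below are illustrative placeholders, not recommendations):

```nginx
# max_size and inactive are enforced by the cache manager process;
# loader_files and loader_sleep throttle the one-shot cache loader.
proxy_cache_path /var/cache/nginx keys_zone=mycache:10m
                 max_size=10g inactive=60m
                 loader_files=100 loader_sleep=50ms;
```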

The NGINX configuration recommended in most cases – running one worker process per CPU core – makes the most efficient use of hardware resources. You configure it by including the worker_processes auto directive in the configuration:

worker_processes auto;

When an NGINX server is active, only the worker processes are busy. Each worker process handles multiple connections in a non‑blocking fashion, reducing the number of context switches.
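In configuration terms, this pairing of one worker per core with many connections per worker looks roughly like the following (the connection count is an illustrative value, not a recommendation):

```nginx
# One worker per core, each multiplexing many connections.
worker_processes auto;

events {
    worker_connections 10240;   # per-worker connection cap
}
```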

Each worker process is single‑threaded and runs independently, grabbing new connections and processing them. The processes can communicate using shared memory for shared cache data, session persistence data, and other shared resources.
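Shared-memory zones are declared in the configuration; for example (zone names and sizes below are illustrative), request-limiting state and upstream run-time state are both kept in memory visible to every worker:

```nginx
# Shared-memory zones accessible from all worker processes.
limit_req_zone $binary_remote_addr zone=perip:10m rate=10r/s;

upstream backend {
    zone backend_state 64k;   # shared run-time state for this upstream group
    server 192.0.2.10:8080;   # placeholder address
}
```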

Inside the NGINX Worker Process

Each NGINX worker process is initialized with the NGINX configuration and is provided with a set of listen sockets by the master process.

The NGINX worker processes begin by waiting for events on the listen sockets (new connections are distributed among workers using accept_mutex or kernel socket sharding). Events are initiated by new incoming connections. These connections are assigned to a state machine – the HTTP state machine is the most commonly used, but NGINX also implements state machines for stream (raw TCP) traffic and for several mail protocols (SMTP, IMAP, POP3).

The state machine is essentially the set of instructions that tell NGINX how to process a request. Most web servers that perform the same functions as NGINX use a similar state machine – the difference lies in the implementation.
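The non-blocking accept/read/write loop that drives such a state machine can be sketched in single-threaded, standard-library Python (again an illustration, not NGINX source; a real worker runs a full protocol state machine per connection rather than a simple echo):

```python
import selectors
import socket

def serve_events(srv: socket.socket, max_events: int = 100) -> None:
    """Single-threaded event loop: one selector multiplexes every socket,
    so the process never blocks waiting on any single connection."""
    sel = selectors.DefaultSelector()
    srv.setblocking(False)
    sel.register(srv, selectors.EVENT_READ, data="accept")
    for _ in range(max_events):                 # bounded loop for the sketch
        for key, _mask in sel.select(timeout=1):
            if key.data == "accept":            # event: new connection
                conn, _ = key.fileobj.accept()
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ, data="client")
            else:                               # event: client sent data
                buf = key.fileobj.recv(4096)
                if buf:
                    key.fileobj.sendall(buf)    # advance this connection's state
                else:                           # client closed the connection
                    sel.unregister(key.fileobj)
                    key.fileobj.close()
```

The key property is that the loop only ever touches sockets the kernel has reported as ready, so one process can interleave thousands of connections without a thread per connection.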

Scheduling the State Machine

Think of the state machine like the rules for chess. Each HTTP transaction is a chess game. The web server is the grandmaster making rapid decisions, while the remote client (the web browser) is the opponent on a relatively slow network.

A Blocking State Machine

Most web servers and web applications use a process‑per‑connection or thread‑per‑connection model. Each process or thread contains the instructions to play one game through to the end. During its execution, the process spends most of its time ‘blocked’ – waiting for the client to complete its next move.

This architecture is simple and easy to extend with third‑party modules, but it is massively wasteful because a lightweight HTTP connection maps to a heavyweight OS process or thread.

NGINX is a True Grandmaster

Each NGINX worker (usually one per CPU core) is a grandmaster that can play hundreds of thousands of games simultaneously. Workers wait for events on the listen and connection sockets, handle them promptly, and never block on network traffic.

Why Is This Faster than a Blocking, Multi‑Process Architecture?

NGINX scales very well to support hundreds of thousands of connections per worker process. Each new connection creates another file descriptor and consumes a small amount of additional memory in the worker process. Context switches are relatively infrequent and occur only when there is no work to be done.

In a blocking, connection‑per‑process approach, each connection requires a large amount of additional resources and frequent context switches.

With appropriate system tuning, NGINX can handle hundreds of thousands of concurrent HTTP connections per worker process and absorb traffic spikes without missing a beat.
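The "appropriate system tuning" typically includes raising the file-descriptor limits alongside the connection cap; a hedged sketch with illustrative numbers:

```nginx
# Illustrative tuning for very high connection counts.
worker_processes auto;
worker_rlimit_nofile 200000;     # raise the per-worker file-descriptor limit

events {
    worker_connections 100000;   # must fit under the descriptor limit
}
```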

Updating Configuration and Upgrading NGINX

NGINX’s process architecture, with a small number of worker processes, makes updating the configuration and even the NGINX binary itself very efficient.

Updating NGINX configuration is a simple, lightweight, and reliable operation. It typically means running the nginx -s reload command, which checks the configuration on disk and sends the master process a SIGHUP signal.

When the master process receives a SIGHUP, it reloads the configuration and forks a new set of worker processes, while signaling the old workers to gracefully exit after completing their current requests.

This reload can cause a small spike in CPU and memory usage, but it is generally imperceptible compared to the load from active connections. The binary upgrade process follows a similar graceful approach, allowing on‑the‑fly upgrades without dropped connections or downtime.
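The reload and binary upgrade are both driven by signals to the master process. As a sketch (the PID-file path is the conventional default and may differ on your system):

```shell
# Validate, then reload configuration (sends SIGHUP to the master).
nginx -t && nginx -s reload

# On-the-fly binary upgrade via signals:
kill -USR2 $(cat /var/run/nginx.pid)          # start a new master from the new binary
kill -WINCH $(cat /var/run/nginx.pid.oldbin)  # gracefully stop the old workers
kill -QUIT $(cat /var/run/nginx.pid.oldbin)   # shut down the old master when satisfied
```

If the new binary misbehaves, the old master is still running until the final QUIT, so the upgrade can be rolled back without dropping connections.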

For a more detailed explanation, see the original article by Andrew Alexeev, VP of Corporate Development and Co‑Founder at NGINX, Inc.

Tags: backend architecture, scalability, nginx, worker_processes, process model, configuration reload
Written by Architect

Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
