
Mastering Nginx Access Layer Rate Limiting: Practical Configurations and Code Samples

This article explains how to implement access‑layer rate limiting in Nginx using built‑in modules, OpenResty Lua extensions, and token‑bucket or leaky‑bucket algorithms, with detailed configuration snippets, execution flow, testing procedures, and log analysis for robust traffic control.

21CTO

Jun 14, 2016

Access Layer Rate Limiting

The access layer is the entry point for request traffic. It is responsible for load balancing, filtering out illegitimate requests, request aggregation, caching, graceful degradation, rate limiting, A/B testing, and service quality monitoring.

ngx_http_limit_conn_module

The limit_conn directive limits the number of concurrent connections for a given key (e.g., a client IP address or a server name). A connection is counted only while the server is processing a request on it and the whole request header has already been read.

Configuration example:

http {
    limit_conn_zone $binary_remote_addr zone=addr:10m;
    limit_conn_log_level error;
    limit_conn_status 503;
    ...
    server {
        ...
        location /limit {
            limit_conn addr 1;
        }
    }
}

Key directives:

limit_conn_zone: defines the shared memory zone and the key used for counting.

limit_conn: sets the maximum concurrent connections for the specified key.

limit_conn_status: HTTP status code returned when the limit is exceeded (default 503).

limit_conn_log_level: log level for limit events (default error).

Execution flow:

When a request arrives, Nginx checks the current connection count in the zone.

If the count has already reached the configured maximum, the request is rejected with limit_conn_status.

Otherwise the count is incremented and a callback is registered to decrement the count after the request finishes.
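The flow above can be sketched in plain Lua. This is only an in-memory model of the counting logic, not Nginx's actual implementation; the names ConnLimiter, enter, and leave are hypothetical and chosen for illustration.

```lua
-- Minimal in-memory model of the limit_conn flow (illustrative only).
local ConnLimiter = {}
ConnLimiter.__index = ConnLimiter

-- max: maximum concurrent connections allowed per key (hypothetical name)
function ConnLimiter.new(max)
    return setmetatable({ max = max, counts = {} }, ConnLimiter)
end

-- Called when a request arrives. Returns true if admitted, false if rejected.
function ConnLimiter:enter(key)
    local current = self.counts[key] or 0
    if current >= self.max then
        return false            -- the caller would answer with limit_conn_status
    end
    self.counts[key] = current + 1
    return true
end

-- Called from the "request finished" callback to release the slot.
function ConnLimiter:leave(key)
    local current = self.counts[key] or 0
    if current <= 1 then
        self.counts[key] = nil  -- drop idle keys so the table does not grow forever
    else
        self.counts[key] = current - 1
    end
end

local lim = ConnLimiter.new(1)   -- same limit as "limit_conn addr 1"
print(lim:enter("10.0.0.1"))     -- first connection is admitted
print(lim:enter("10.0.0.1"))     -- second concurrent connection is rejected
lim:leave("10.0.0.1")            -- first request finishes
print(lim:enter("10.0.0.1"))     -- a new connection is admitted again
```

The important design point is that the decrement is tied to request completion, so a slot is held for the full lifetime of the request rather than for a fixed time window.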

ngx_http_limit_req_module

The limit_req directive uses the leaky‑bucket method to limit the request rate for a given key (commonly the client IP). It supports a burst capacity and either delayed or non‑delayed processing of requests within that burst.

Configuration example:

http {
    limit_req_zone $binary_remote_addr zone=one:10m rate=1r/s;
    limit_req_log_level error;
    limit_req_status 503;
    ...
    server {
        ...
        location /limit {
            limit_req zone=one burst=5 nodelay;
        }
    }
}

Key directives:

limit_req_zone: defines the shared memory zone, key, and fixed request rate.

limit_req: sets the zone, burst size, and whether to delay excess requests.

limit_req_status and limit_req_log_level: behave like their limit_conn counterparts.

Execution flow:

The current time is compared with the time of the last request for the same key to work out how far the key is over its configured rate, and therefore whether the request should be limited.

If no burst is configured, excess requests are immediately rejected with the configured status code.

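If a burst is configured, excess requests up to the burst size are either delayed so that they are processed at the configured rate or, when nodelay is set, processed immediately. Requests beyond the burst size are rejected.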

After processing, Nginx may perform cleanup of the limit keys.
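The accounting behind these steps can be modelled in a few lines of plain Lua. The sketch below is an approximation of the leaky‑bucket bookkeeping (an "excess" counter that drains at the configured rate), not Nginx's C code; the LeakyBucket name and the explicit now argument are assumptions made so the example runs without a clock.

```lua
-- Illustrative leaky-bucket model: "excess" is how many requests the key
-- is ahead of its configured rate; it drains as time passes.
local LeakyBucket = {}
LeakyBucket.__index = LeakyBucket

-- rate: requests per second; burst: excess requests tolerated
function LeakyBucket.new(rate, burst)
    return setmetatable({ rate = rate, burst = burst, states = {} }, LeakyBucket)
end

-- now: current time in seconds (passed in to keep the sketch deterministic).
-- Returns the delay in seconds for an accepted request, or nil, "rejected".
function LeakyBucket:incoming(key, now)
    local st = self.states[key]
    local excess = 0
    if st then
        local elapsed = now - st.last
        -- drain by rate * elapsed, then add this request
        excess = math.max(st.excess - self.rate * elapsed + 1, 0)
        if excess > self.burst then
            return nil, "rejected"   -- state is left untouched on rejection
        end
    end
    self.states[key] = { excess = excess, last = now }
    -- Delayed mode sleeps for this long; nodelay mode simply ignores it.
    return excess / self.rate
end

local lim = LeakyBucket.new(1, 5)    -- like rate=1r/s with burst=5
for i = 1, 7 do
    local delay, err = lim:incoming("10.0.0.1", 0)   -- 7 requests at the same instant
    print(i, delay, err)
end
```

In this model, with rate 1 and burst 5, the first six simultaneous requests are accepted (the first plus five burst slots, with delays growing from 0 to 5 seconds in delayed mode) and the seventh is rejected.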

Testing with ab

Typical test commands:

ab -n 5 -c 5 http://localhost/limit
ab -c 6 -n 6 http://localhost/limit

Log output shows a mix of 200 (allowed) and 503 (limited) responses, demonstrating the effect of the configured limits.

Lua‑based Dynamic Limiting (lua‑resty‑limit‑traffic)

For more complex, dynamic policies, OpenResty’s lua‑resty‑limit‑traffic library can be used. The example Lua script (limit_req.lua) creates a limiter with a fixed rate of 2 r/s and a burst of 3, then applies it in the access phase:

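local limit_req = require "resty.limit.req"
local rate = 2            -- sustained rate: 2 requests per second
local burst = 3           -- excess requests tolerated before rejecting
local error_status = 503
-- "limit_req_store" must match a lua_shared_dict in the Nginx configuration
local lim, err = limit_req.new("limit_req_store", rate, burst)
if not lim then
    ngx.log(ngx.ERR, "failed to create limiter: ", err)
    return ngx.exit(error_status)
end
local key = ngx.var.binary_remote_addr
local delay, err = lim:incoming(key, true)
if not delay then
    if err == "rejected" then return ngx.exit(error_status) end
    ngx.log(ngx.ERR, "failed to limit request: ", err)
    return ngx.exit(500)  -- internal error, not a rate-limit hit
end
-- within the burst: sleep so that requests leave at the configured rate
if delay > 0 then ngx.sleep(delay) end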

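The script requires a shared dictionary, declared with lua_shared_dict limit_req_store 100m; in the http block of the Nginx configuration, and can be attached to a location with access_by_lua_file so that it runs in the access phase.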

Additional Tips

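Use limit_rate to throttle the response transmission rate to a client (e.g., limit_rate 50k;). The limit applies per request, so a client that opens several connections can exceed it in total.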

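When the shared memory of a limit_conn_zone is exhausted, Nginx returns an error to all further requests. When a limit_req_zone is exhausted, Nginx first removes the least recently used state and terminates the request with an error only if a new state still cannot be created.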

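In multi‑node deployments, the built‑in limits apply per node. Consistent hashing on the limit key (so that a given key always reaches the same node) or a distributed Lua‑based limiter can provide global rate limiting.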
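One way to read the consistent‑hashing tip: if every request for a given key is routed to the same node, that node's local limit effectively becomes the global limit for the key. The plain‑Lua sketch below shows such a key‑to‑node mapping; the Ring name, the node names, the hash function, and the replica count are all assumptions for illustration, not part of Nginx or OpenResty.

```lua
-- Simple string hash (djb2 variant), kept within 32 bits using plain
-- arithmetic so it behaves the same on Lua 5.1, LuaJIT, and newer versions.
local function hash(s)
    local h = 5381
    for i = 1, #s do
        h = (h * 33 + s:byte(i)) % 4294967296
    end
    return h
end

local Ring = {}
Ring.__index = Ring

-- nodes: list of node names; replicas: virtual points placed per node
function Ring.new(nodes, replicas)
    local points = {}
    for _, node in ipairs(nodes) do
        for r = 1, replicas do
            points[#points + 1] = { pos = hash(node .. "#" .. r), node = node }
        end
    end
    table.sort(points, function(a, b)
        if a.pos ~= b.pos then return a.pos < b.pos end
        return a.node < b.node       -- deterministic order on hash collisions
    end)
    return setmetatable({ points = points }, Ring)
end

-- Returns the node owning `key`: the first point at or after hash(key).
function Ring:lookup(key)
    local h = hash(key)
    for _, p in ipairs(self.points) do
        if p.pos >= h then return p.node end
    end
    return self.points[1].node       -- wrap around the ring
end

local ring = Ring.new({ "node-a", "node-b", "node-c" }, 50)
print(ring:lookup("10.0.0.1"))       -- the same key always maps to the same node
```

The linear scan keeps the sketch short; a real ring would typically use a binary search. The property that matters for rate limiting is that removing one node only remaps the keys that node owned, so the counters for all other keys stay where they are.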

Overall, the choice of algorithm (token bucket vs. leaky bucket) and parameters (burst size, delay mode) should be driven by the specific business scenario; the most sophisticated algorithm is not always the most appropriate.
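For comparison with the leaky‑bucket behaviour of limit_req, a token bucket can be sketched as follows. This is a generic textbook model in plain Lua, not code from Nginx or lua‑resty‑limit‑traffic; the TokenBucket name and the explicit now argument are assumptions made to keep the example deterministic.

```lua
-- Illustrative token bucket: tokens refill at a fixed rate up to a capacity;
-- each request spends one token and is refused when none is available.
local TokenBucket = {}
TokenBucket.__index = TokenBucket

-- rate: tokens added per second; capacity: maximum tokens the bucket holds;
-- now: creation time in seconds. The bucket starts full.
function TokenBucket.new(rate, capacity, now)
    return setmetatable({
        rate = rate, capacity = capacity, tokens = capacity, last = now,
    }, TokenBucket)
end

-- Try to take one token at time `now`. Returns true if the request is allowed.
function TokenBucket:allow(now)
    local elapsed = now - self.last
    -- refill for the elapsed time, never beyond capacity
    self.tokens = math.min(self.capacity, self.tokens + elapsed * self.rate)
    self.last = now
    if self.tokens >= 1 then
        self.tokens = self.tokens - 1
        return true
    end
    return false
end

local tb = TokenBucket.new(2, 3, 0)   -- 2 tokens/s, bucket of 3, created at t=0
print(tb:allow(0), tb:allow(0), tb:allow(0))  -- the full bucket absorbs a burst of 3
print(tb:allow(0))                            -- bucket is empty, request refused
print(tb:allow(0.5))                          -- one token has been refilled by t=0.5
```

A token bucket lets a burst through at full speed as long as tokens remain, whereas a leaky bucket in delayed mode smooths the same burst to the configured rate; which behaviour is preferable depends on whether the backend can absorb short spikes.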