How to Build a High‑Performance Nginx API Gateway from Scratch
This article walks through designing and implementing an enterprise‑grade API gateway with Nginx and OpenResty, covering architecture, dynamic routing, load balancing, rate limiting, circuit breaking, authentication, request transformation, caching strategies, observability, high‑availability deployment, and real‑world performance results.
Building an Enterprise‑Grade API Gateway with Nginx: From Zero to Production
Introduction: Why Your Microservice Architecture Needs a Powerful Gateway
Recent production incidents showed that request floods with no rate limiting, endpoints missing authentication, and logs scattered across services can cripple a system; a unified API gateway addresses all three in one place.
1. Architecture Design: More Than Just a Reverse Proxy
1.1 Overall Architecture
Our gateway is divided into four layers:
├── Access Layer (DNS + CDN)
├── Gateway Layer (Nginx + OpenResty)
├── Service Layer (Microservice cluster)
└── Data Layer (Redis + MySQL + MongoDB)
High Availability: Multi‑active deployment with automatic failover
High Performance: Event‑driven Nginx model
Scalable: Lua extensions via OpenResty
Observable: Full monitoring and logging
1.2 Technology Comparison
| Solution | Advantages | Disadvantages | Suitable Scenarios |
|---|---|---|---|
| Nginx + OpenResty | Very high performance, stable, mature operations | Feature‑light, requires custom development | High‑concurrency, low‑latency services |
| Kong | Rich plugins, strong ecosystem | Higher overhead, complex ops | Quick setups for small‑to‑medium services |
| Spring Cloud Gateway | Java‑friendly, full feature set | Average performance, high resource usage | Java stacks |
| Envoy | Cloud‑native, powerful features | Steep learning curve, complex config | Kubernetes environments |
2. Core Feature Implementation: From Configuration to Code
2.1 Dynamic Routing Configuration
Traditional Nginx needs a reload to apply changes; we use Lua to update routes on the fly.
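As a rough illustration of what a lookup such as `router.route()` might do, the sketch below picks an upstream by longest matching path prefix. The route table shape and the `match_route` name are assumptions made for this example; in the gateway the table would presumably be loaded from `routes_cache` rather than hard-coded.

```lua
-- Hypothetical route table: path prefix -> upstream name (assumed shape).
local routes = {
    { prefix = "/api/users",  upstream = "user_service" },
    { prefix = "/api/orders", upstream = "order_service" },
    { prefix = "/api",        upstream = "default_api" },
}

-- Return the upstream whose prefix is the longest match for `uri`, or nil.
local function match_route(route_list, uri)
    local best, best_len = nil, -1
    for _, r in ipairs(route_list) do
        local len = #r.prefix
        -- plain string comparison, no Lua pattern matching involved
        if len > best_len and uri:sub(1, len) == r.prefix then
            -- accept only matches that end on a path-segment boundary
            local next_char = uri:sub(len + 1, len + 1)
            if next_char == "" or next_char == "/" or r.prefix:sub(-1) == "/" then
                best, best_len = r.upstream, len
            end
        end
    end
    return best
end

print(match_route(routes, "/api/users/42"))
print(match_route(routes, "/api/usersettings"))
```

In the access phase the result would be assigned to `ngx.var.upstream`; a `nil` result would typically become a 404.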
# nginx.conf core configuration
http {
    lua_package_path "/usr/local/openresty/lualib/?.lua;;";
    lua_shared_dict routes_cache 100m;
    lua_shared_dict upstream_cache 100m;

    # load the route table once, before workers are forked
    init_by_lua_block {
        local route_manager = require "gateway.route_manager"
        route_manager.init()
    }

    # re-sync routes every 10 seconds without reloading Nginx
    init_worker_by_lua_block {
        local route_manager = require "gateway.route_manager"
        ngx.timer.every(10, route_manager.sync_routes)
    }

    server {
        listen 80;
        server_name api.example.com;

        location / {
            # must be declared here so that Lua can assign ngx.var.upstream
            set $upstream "";

            access_by_lua_block {
                local router = require "gateway.router"
                router.route()
            }

            proxy_pass http://$upstream;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Request-Id $request_id;
        }
    }
}
2.2 Intelligent Load Balancing
Beyond round‑robin, we adjust weights based on response time, error rate, and static weight.
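To make the weighting concrete, here is a small standalone sketch of the formula used in the listing that follows, `1000 / (avg_response_time + 1) * (1 - error_rate) * weight`. It assumes response times are in milliseconds and error rates lie in [0, 1]; the function names and the sample numbers are illustrative only, not measurements.

```lua
-- Effective weight of one server under the formula above.
local function effective_weight(avg_response_time, error_rate, static_weight)
    local latency_factor = 1000 / (avg_response_time + 1)
    return latency_factor * (1 - error_rate) * static_weight
end

-- Fraction of traffic each server would receive on average.
local function traffic_shares(servers)
    local weights, total = {}, 0
    for i, s in ipairs(servers) do
        weights[i] = effective_weight(s.avg_response_time, s.error_rate, s.weight)
        total = total + weights[i]
    end
    local shares = {}
    for i, w in ipairs(weights) do
        shares[i] = total > 0 and w / total or 0
    end
    return shares
end

-- Illustrative inputs: a fast healthy server and a slow, half-failing one.
local shares = traffic_shares({
    { avg_response_time = 9,  error_rate = 0,   weight = 1 },  -- weight 100
    { avg_response_time = 49, error_rate = 0.5, weight = 1 },  -- weight 10
})
print(string.format("%.3f %.3f", shares[1], shares[2]))
```

With these inputs the first server ends up with a weight of 100 and the second with 10, so roughly 91% of requests would go to the first.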
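local cjson = require "cjson.safe"
local upstream_cache = ngx.shared.upstream_cache

local function get_server(upstream_name)
    local servers_data = upstream_cache:get("servers:" .. upstream_name)
    if not servers_data then return nil end
    local servers = cjson.decode(servers_data)
    if not servers or #servers == 0 then return nil end

    local total_weight = 0
    local weighted_servers = {}
    for _, server in ipairs(servers) do
        -- stats are assumed to be stored as JSON; use neutral defaults when absent
        local stats_key = "stats:" .. server.host .. ":" .. server.port
        local stats = cjson.decode(upstream_cache:get(stats_key) or "{}") or {}
        -- faster servers get more weight (avg_response_time assumed to be in ms)
        local weight = 1000 / ((stats.avg_response_time or 0) + 1)
        -- scale by success rate and by the statically configured weight
        weight = weight * (1 - (stats.error_rate or 0)) * (server.weight or 1)
        table.insert(weighted_servers, {
            server = server,
            range_start = total_weight,
            range_end = total_weight + weight,
        })
        total_weight = total_weight + weight
    end

    -- pick a point in [0, total_weight) and return the server whose range holds it
    local random_weight = math.random() * total_weight
    for _, ws in ipairs(weighted_servers) do
        if random_weight >= ws.range_start and random_weight < ws.range_end then
            return ws.server
        end
    end
    return servers[1]  -- fallback, e.g. when every weight is zero
end
2.3 Rate Limiting and Circuit Breaking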
We implement a token‑bucket algorithm with Redis and a simple circuit‑breaker that opens after ten failures within ten seconds.
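Before the Redis version, the refill arithmetic can be seen in isolation. The sketch below is an in-memory token bucket with an injected clock, mirroring the same steps (refill by `elapsed * rate`, cap at `capacity`, then try to take tokens). The names `new_bucket` and `take` are assumptions for this example.

```lua
-- In-memory token bucket; `now` is passed in (seconds) so the logic is testable.
local function new_bucket(rate, capacity, now)
    return { rate = rate, capacity = capacity, tokens = capacity, last_refill = now }
end

-- Returns true and consumes tokens when `requested` tokens are available.
local function take(bucket, now, requested)
    requested = requested or 1
    local elapsed = math.max(0, now - bucket.last_refill)
    bucket.tokens = math.min(bucket.capacity, bucket.tokens + elapsed * bucket.rate)
    bucket.last_refill = now
    if bucket.tokens >= requested then
        bucket.tokens = bucket.tokens - requested
        return true
    end
    return false
end

-- 2 tokens per second, burst capacity of 3.
local b = new_bucket(2, 3, 0)
local r1 = take(b, 0)
local r2 = take(b, 0)
local r3 = take(b, 0)
local r4 = take(b, 0)     -- bucket is empty, request is rejected
local r5 = take(b, 0.5)   -- 0.5 s * 2 tokens/s refills exactly one token
print(r1, r2, r3, r4, r5)
```

Moving this state into Redis, as the listing that follows does, lets every gateway node share one bucket per key.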
-- rate limiter (Lua)
local redis = require "resty.redis"

-- Runs atomically inside Redis: refill by elapsed time, then try to take tokens.
local TOKEN_BUCKET_SCRIPT = [[
    local key = KEYS[1]
    local rate = tonumber(ARGV[1])
    local capacity = tonumber(ARGV[2])
    local now = tonumber(ARGV[3])
    local requested = tonumber(ARGV[4] or 1)
    local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
    local tokens = tonumber(bucket[1] or capacity)
    local last_refill = tonumber(bucket[2] or now)
    local elapsed = math.max(0, now - last_refill)
    tokens = math.min(capacity, tokens + elapsed * rate)
    local allowed = 0
    if tokens >= requested then
        tokens = tokens - requested
        allowed = 1
    end
    redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
    -- EXPIRE only accepts integers, so round the refill time up
    redis.call('EXPIRE', key, math.ceil(capacity / rate) + 1)
    return allowed
]]

local function token_bucket_limit(key, rate, capacity)
    local red = redis:new()
    red:set_timeout(1000)
    local ok, err = red:connect("127.0.0.1", 6379)
    if not ok then
        -- fail open: requests are allowed while Redis is unreachable
        ngx.log(ngx.ERR, "rate limiter: redis connect failed: ", err)
        return true
    end
    local res = red:eval(TOKEN_BUCKET_SCRIPT, 1, key, rate, capacity, ngx.now(), 1)
    red:set_keepalive(10000, 100)
    return res == 1
end
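The listing that follows covers the gate (open, half-open, closed) but not how failures are counted. As one possible reading of "ten failures within ten seconds", the sketch below keeps an in-memory breaker with an injected clock and a fixed counting window. The fixed window, the 30-second open period taken from the gate code, and all function names are assumptions for illustration; a production version would keep this state in a shared dict.

```lua
local FAILURE_THRESHOLD, WINDOW, OPEN_SECONDS = 10, 10, 30

local function new_breaker()
    return { state = "closed", failures = 0, window_start = 0, opened_at = 0 }
end

-- May a request pass at time `now` (seconds)?
local function allow(b, now)
    if b.state == "open" then
        if now - b.opened_at > OPEN_SECONDS then
            b.state = "half_open"          -- cool-down elapsed, allow trial requests
            return true
        end
        return false
    end
    return true
end

local function record_failure(b, now)
    if b.state == "open" then return end
    if b.state == "half_open" then
        b.state, b.opened_at = "open", now -- trial request failed, reopen
        return
    end
    if now - b.window_start >= WINDOW then -- start a fresh counting window
        b.window_start, b.failures = now, 0
    end
    b.failures = b.failures + 1
    if b.failures >= FAILURE_THRESHOLD then
        b.state, b.opened_at = "open", now
    end
end

local function record_success(b)
    if b.state == "half_open" then
        b.state, b.failures = "closed", 0  -- trial request succeeded, close again
    end
end

-- Ten failures spread over five seconds trip the breaker.
local b = new_breaker()
for i = 1, 10 do record_failure(b, 100 + i * 0.5) end
print(b.state)
```

`record_failure` and `record_success` would be called from the log phase, based on the upstream status of each proxied request.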
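-- circuit breaker (Lua); needs "lua_shared_dict breaker_cache" in nginx.conf
local breaker_cache = ngx.shared.breaker_cache

-- Returns true when a request to service_name may proceed
-- (the function name is illustrative).
local function allow_request(service_name)
    local breaker_key = "breaker:" .. service_name
    local state = breaker_cache:get(breaker_key .. ":state") or "closed"
    if state == "open" then
        local open_time = breaker_cache:get(breaker_key .. ":open_time") or 0
        if ngx.now() - open_time > 30 then
            -- cool-down elapsed: let a limited number of trial requests through
            breaker_cache:set(breaker_key .. ":state", "half_open")
            breaker_cache:set(breaker_key .. ":half_open_count", 0)
            state = "half_open"
        else
            return false, "Circuit breaker is open"
        end
    end
    if state == "half_open" then
        local count = breaker_cache:incr(breaker_key .. ":half_open_count", 1, 0)
        if count and count > 5 then
            return false, "Circuit breaker half-open limit exceeded"
        end
    end
    return true
end
2.4 Unified Authentication and Authorization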
JWT verification, blacklist checking, permission matching with wildcard support, and request signing to prevent replay attacks.
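Permission matching is not shown in the listing that follows, so here is a minimal sketch of what wildcard matching could look like. It assumes permissions are colon-separated strings such as `orders:read`, where `*` matches any single segment and a trailing `*` matches one or more remaining segments. Both the format and the function names are assumptions, not the gateway's actual scheme.

```lua
local function split(s)
    local parts = {}
    for part in s:gmatch("[^:]+") do parts[#parts + 1] = part end
    return parts
end

-- Does one granted permission cover the required one?
local function permission_matches(granted, required)
    local g, r = split(granted), split(required)
    for i = 1, #g do
        if r[i] == nil then return false end      -- granted is more specific
        if g[i] == "*" then
            if i == #g then return true end       -- trailing wildcard covers the rest
        elseif g[i] ~= r[i] then
            return false
        end
    end
    return #g == #r
end

-- Does any permission in the list cover the required one?
local function has_permission(granted_list, required)
    for _, granted in ipairs(granted_list) do
        if permission_matches(granted, required) then return true end
    end
    return false
end

print(has_permission({ "users:read", "orders:*" }, "orders:delete"))
```

The granted list would typically come from the verified JWT payload stored in `ngx.ctx.user`.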
-- JWT verification (Lua)
local jwt = require "resty.jwt"
local redis = require "resty.redis"

local function authenticate()
    local auth_header = ngx.var.http_authorization
    if not auth_header then return false, "Missing authorization header" end
    local token = auth_header:match("^Bearer%s+(.+)$")
    if not token then return false, "Invalid authorization header format" end

    -- needs "env JWT_SECRET;" in nginx.conf; never fall back to a default secret
    local jwt_secret = os.getenv("JWT_SECRET")
    if not jwt_secret then return false, "JWT secret is not configured" end

    local jwt_obj = jwt:verify(jwt_secret, token)
    if not jwt_obj.verified then return false, jwt_obj.reason end

    -- reject revoked tokens; the check is skipped when Redis is unreachable
    local red = redis:new()
    red:set_timeout(1000)
    local ok, err = red:connect("127.0.0.1", 6379)
    if ok then
        local blacklisted = red:get("blacklist:" .. token)
        red:set_keepalive(10000, 100)
        -- a missing key comes back as ngx.null, which is truthy in Lua
        if blacklisted and blacklisted ~= ngx.null then
            return false, "Token has been revoked"
        end
    end
    ngx.ctx.user = jwt_obj.payload
    return true
end
2.5 Request/Response Transformation
We support API version conversion, GraphQL‑to‑REST mapping, and automatic header injection.
-- request version conversion
local version = ngx.var.http_x_api_version or "v2"
if version == "v1" then transform_v1_to_v2_request() end
ngx.req.set_header("X-Request-Id", ngx.var.request_id)
ngx.req.set_header("X-Gateway-Time", ngx.now())
ngx.req.set_header("X-Forwarded-Host", ngx.var.host)
ngx.req.set_header("X-Forwarded-Proto", ngx.var.scheme)
3. Performance Optimizations: Making the Gateway Fly
3.1 Multi‑Level Caching
Local Nginx cache plus Redis fallback with stale‑while‑revalidate logic.
# Nginx cache definition (proxy_cache_path belongs in the http context)
proxy_cache_path /var/cache/nginx/api_cache levels=1:2 keys_zone=api_cache:100m max_size=10g inactive=60m use_temp_path=off;

location /api/ {
    # $request_uri already contains the query string
    set $cache_key "$scheme$request_method$host$request_uri";

    access_by_lua_block {
        local cache = require "gateway.cache"
        cache.handle_cache()
    }

    proxy_cache api_cache;
    proxy_cache_key $cache_key;
    proxy_cache_valid 200 304 5m;
    proxy_cache_valid 404 1m;
    # serve stale entries on upstream errors and while an entry is being refreshed
    proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
    proxy_cache_background_update on;
    # collapse concurrent misses for the same key into one upstream request
    proxy_cache_lock on;
    proxy_cache_lock_timeout 5s;
    add_header X-Cache-Status $upstream_cache_status;
    proxy_pass http://backend;
}
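The stale-while-revalidate decision mentioned above can be pictured as a three-way classification by entry age. The sketch below is one way a staleness helper might be written for the Redis layer; the field names `stored_at`, `ttl`, and `stale_ttl` are assumptions for this example, not the gateway's actual cache schema.

```lua
-- Classify a cached entry at time `now` (seconds).
local function cache_state(entry, now)
    if not entry then return "miss" end
    local age = now - entry.stored_at
    if age <= entry.ttl then
        return "fresh"      -- serve directly
    elseif age <= entry.ttl + entry.stale_ttl then
        return "stale"      -- serve the old copy and refresh in the background
    end
    return "expired"        -- too old to serve, fetch from the upstream
end

-- Entry cached at t=1000 with a 300 s TTL and a 60 s stale window.
local entry = { stored_at = 1000, ttl = 300, stale_ttl = 60 }
print(cache_state(entry, 1100), cache_state(entry, 1330), cache_state(entry, 1400))
```

On the Nginx side, the same behavior comes from `proxy_cache_use_stale updating` combined with `proxy_cache_background_update on`.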
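-- Lua cache handler (simplified); generate_cache_key, get_from_redis and
-- is_stale are assumed to be helpers defined elsewhere in the module
local function handle_cache()
    if ngx.var.request_method ~= "GET" then return end
    local cache_key = generate_cache_key()
    local cached = get_from_redis(cache_key)
    if cached and not is_stale(cached) then
        -- fresh Redis hit: answer directly and skip the upstream
        ngx.header["Content-Type"] = cached.content_type
        ngx.header["X-Cache-Hit"] = "redis"
        ngx.print(cached.body)  -- print rather than say, which would append a newline
        return ngx.exit(ngx.HTTP_OK)
    end
    -- miss or stale: let the request continue and mark the response for caching
    ngx.ctx.cache_key = cache_key
    ngx.ctx.should_cache = true
end
3.2 Connection‑Pool Tuning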
Keep‑alive settings, least‑conn load balancing, and timeout tuning for high concurrency.
upstream backend {
    server 192.168.1.10:8080 max_fails=2 fail_timeout=10s;
    server 192.168.1.11:8080 max_fails=2 fail_timeout=10s;
    server 192.168.1.12:8080 max_fails=2 fail_timeout=10s backup;
    least_conn;
    # idle keep-alive connections to upstream servers, cached per worker
    keepalive 256;
    keepalive_requests 1000;
    keepalive_timeout 60s;
}
http {
    # client-side keep-alive
    keepalive_timeout 65;
    keepalive_requests 100;
    proxy_connect_timeout 5s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;
    proxy_buffer_size 32k;
    proxy_buffers 4 64k;
    proxy_busy_buffers_size 128k;
    proxy_temp_file_write_size 256k;
    # obsolete since nginx 1.19.7; newer versions use large_client_header_buffers
    http2_max_field_size 16k;
    http2_max_header_size 32k;
    # both lines are required for upstream keep-alive to take effect
    proxy_http_version 1.1;
    proxy_set_header Connection "";
}
3.3 Memory Management
Periodic cleanup of expired cache entries and LRU eviction when memory usage exceeds thresholds.
local function cleanup_expired_cache()
    local dict = ngx.shared.routes_cache
    -- get_keys(0) returns every key and locks the dict while it runs
    local keys = dict:get_keys(0)
    for _, key in ipairs(keys) do
        local ttl = dict:ttl(key)
        -- ttl == 0 means "never expires", so only drop keys that are about to expire
        if ttl and ttl > 0 and ttl < 10 then
            dict:delete(key)
        end
    end
end

local function monitor_memory()
    local caches = {"routes_cache", "upstream_cache", "permissions_cache", "breaker_cache"}
    for _, name in ipairs(caches) do
        local c = ngx.shared[name]
        if c then
            -- free_space() counts only fully free pages, so usage is approximate
            local usage = (c:capacity() - c:free_space()) / c:capacity()
            if usage > 0.8 then
                ngx.log(ngx.WARN, string.format("Memory usage warning: %s %.2f%% full", name, usage * 100))
                c:flush_expired()
                -- drops an arbitrary 10% of keys; get_keys order is not LRU order
                local keys = c:get_keys(0)
                local to_del = math.floor(#keys * 0.1)
                for i = 1, to_del do c:delete(keys[i]) end
            end
        end
    end
end

-- register both timers from init_worker_by_lua*
ngx.timer.every(60, cleanup_expired_cache)
ngx.timer.every(300, monitor_memory)
4. Observability: Logging, Metrics, and Health Checks
4.1 Structured Logging to Kafka
Log request/response details as JSON; fallback to local file on failure.
local cjson = require "cjson.safe"
-- producer is assumed to be a Kafka producer object created elsewhere in the module
local log_data = {
    timestamp = ngx.now(),
    request_id = ngx.var.request_id,
    method = ngx.var.request_method,
    uri = ngx.var.uri,
    args = ngx.var.args,
    host = ngx.var.host,
    client_ip = ngx.var.remote_addr,
    user_agent = ngx.var.http_user_agent,
    referer = ngx.var.http_referer,
    status = ngx.var.status,
    bytes_sent = ngx.var.bytes_sent,
    request_time = ngx.var.request_time,
    upstream_response_time = ngx.var.upstream_response_time,
    upstream_addr = ngx.var.upstream_addr,
    upstream_status = ngx.var.upstream_status,
    cache_status = ngx.var.upstream_cache_status,
    user_id = ngx.ctx.user and ngx.ctx.user.user_id,
    trace_id = ngx.var.http_x_trace_id,
    span_id = ngx.var.http_x_span_id,
}
local ok, err = producer:send("gateway-logs", nil, cjson.encode(log_data))
if not ok then write_local_log(log_data) end  -- fall back to a local file
4.2 Prometheus Metrics
Counters for requests, histograms for latency, gauges for active connections, and circuit‑breaker state.
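Labelling metrics with the raw `ngx.var.uri`, as the listing that follows does, creates one time series per distinct URL, which can grow without bound when paths contain IDs. A common mitigation is to normalize the path before using it as a label. The sketch below is one way to do that; the `normalize_path` name and the choice of patterns (numeric IDs and UUIDs) are assumptions about typical REST paths.

```lua
local function hex(n) return string.rep("%x", n) end
-- Lua pattern for an 8-4-4-4-12 hexadecimal UUID.
local UUID = "^" .. hex(8) .. "%-" .. hex(4) .. "%-" .. hex(4) .. "%-" .. hex(4) .. "%-" .. hex(12) .. "$"

-- Replace variable path segments with a fixed placeholder.
local function normalize_path(path)
    local out = {}
    for segment in path:gmatch("[^/]+") do
        if segment:match("^%d+$") or segment:match(UUID) then
            out[#out + 1] = ":id"
        else
            out[#out + 1] = segment
        end
    end
    return "/" .. table.concat(out, "/")
end

print(normalize_path("/api/users/42"))
print(normalize_path("/api/orders/123e4567-e89b-12d3-a456-426614174000/items"))
```

The normalized value would then be passed as the `path` label instead of `ngx.var.uri`.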
-- init() returns the registry object on which metrics are created
local prometheus = require("nginx.prometheus").init("prometheus_metrics")
local request_count = prometheus:counter("gateway_requests_total", "Total number of requests", {"method", "path", "status"})
local request_duration = prometheus:histogram("gateway_request_duration_seconds", "Request duration in seconds", {"method", "path"})
local upstream_duration = prometheus:histogram("gateway_upstream_duration_seconds", "Upstream response time", {"upstream", "method", "path"})
local active_connections = prometheus:gauge("gateway_active_connections", "Number of active connections")
local rate_limit_hits = prometheus:counter("gateway_rate_limit_hits_total", "Number of rate limit hits", {"client", "rule"})
local circuit_breaker_state = prometheus:gauge("gateway_circuit_breaker_state", "Circuit breaker state (0=closed,1=open,2=half-open)", {"service"})

-- called from log_by_lua* once per request
local function log()
    local method = ngx.var.request_method
    local path = ngx.var.uri
    local status = ngx.var.status
    request_count:inc(1, {method, path, status})
    request_duration:observe(tonumber(ngx.var.request_time) or 0, {method, path})
    -- upstream variables are nil or "-" when no upstream was contacted
    upstream_duration:observe(tonumber(ngx.var.upstream_response_time) or 0, {ngx.var.upstream_addr or "none", method, path})
    active_connections:set(tonumber(ngx.var.connections_active) or 0)
end
4.3 Health Check Endpoints
Combined Redis, upstream, and memory health checks returning JSON status.
-- health check (Lua)
local cjson = require "cjson.safe"
local health = require "gateway.health"

local function check()
    local checks = {}
    local healthy = true
    -- each probe returns a table with at least a boolean `healthy` field
    local r = health.check_redis()
    checks.redis = r
    healthy = healthy and r.healthy
    local u = health.check_upstreams()
    checks.upstreams = u
    healthy = healthy and u.healthy
    local m = health.check_memory()
    checks.memory = m
    healthy = healthy and m.healthy
    -- 503 lets load balancers take an unhealthy node out of rotation
    ngx.status = healthy and 200 or 503
    ngx.header["Content-Type"] = "application/json"
    ngx.say(cjson.encode({status = healthy and "UP" or "DOWN", timestamp = ngx.now(), checks = checks}))
end
5. High‑Availability Deployment
5.1 Multi‑Active Architecture
The Docker Compose file below defines two gateway nodes, an HAProxy load balancer in front of them, a Redis instance, Consul for service discovery, and Prometheus and Grafana for monitoring.
version: '3.8'
services:
  gateway-1:
    image: openresty/openresty:alpine
    volumes:
      - ./conf/nginx.conf:/usr/local/openresty/nginx/conf/nginx.conf
      - ./lua:/usr/local/openresty/lualib/gateway
    ports:
      - "8080:80"
    environment:
      - GATEWAY_NODE_ID=node-1
      - REDIS_HOST=redis
      - CONSUL_HOST=consul
    depends_on: [redis, consul]
  gateway-2:
    image: openresty/openresty:alpine
    volumes:
      - ./conf/nginx.conf:/usr/local/openresty/nginx/conf/nginx.conf
      - ./lua:/usr/local/openresty/lualib/gateway
    ports:
      - "8081:80"
    environment:
      - GATEWAY_NODE_ID=node-2
      - REDIS_HOST=redis
      - CONSUL_HOST=consul
    depends_on: [redis, consul]
  haproxy:
    image: haproxy:2.4-alpine
    volumes:
      - ./conf/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    ports:
      - "80:80"
      - "443:443"
    depends_on: [gateway-1, gateway-2]
  redis:
    image: redis:6-alpine
    command: redis-server --appendonly yes
    ports: ["6379:6379"]
  consul:
    image: consul:1.10
    command: agent -server -bootstrap-expect=1 -ui -client=0.0.0.0
    ports: ["8500:8500", "8600:8600/udp"]
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./conf/prometheus.yml:/etc/prometheus/prometheus.yml
    ports: ["9090:9090"]
  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin
5.2 Canary Release Strategy
Routing decisions based on headers, cookies, user ID, or traffic percentage, with APIs to adjust canary rules at runtime.
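The percentage rule in the listing that follows draws `math.random(100)` on every request, so the same user can alternate between canary and stable. Where stickiness matters, one alternative (a swapped-in technique, not the article's method) is to hash the user ID into a fixed bucket. The sketch below uses a simple djb2-style hash purely for illustration; inside OpenResty, `ngx.crc32_short` could serve the same purpose. The function names are assumptions.

```lua
-- Deterministic bucket in [0, 99] derived from the user ID.
local function bucket_of(user_id)
    local h = 5381
    for i = 1, #user_id do
        -- keep the hash within 32 bits so results match across Lua versions
        h = (h * 33 + user_id:byte(i)) % 4294967296
    end
    return h % 100
end

-- A user is in the canary group when their bucket falls below the percentage.
local function in_canary(user_id, percentage)
    return bucket_of(tostring(user_id)) < percentage
end

print(bucket_of("user-1001"), in_canary("user-1001", 10))
```

Because the bucket depends only on the ID, raising the percentage from 10 to 20 keeps every existing canary user in the canary group and adds new ones.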
local function route_canary()
    local uri = ngx.var.uri
    local hdr = ngx.req.get_headers()
    -- explicit opt-in through a header or a cookie
    if hdr["X-Canary"] == "true" then return get_canary_upstream() end
    if ngx.var.cookie_canary == "true" then return get_canary_upstream() end
    -- allow-listed users
    local user = ngx.ctx.user
    if user and is_canary_user(user.user_id) then return get_canary_upstream() end
    -- percentage-based split, evaluated per request
    local pct = get_canary_percentage(uri) or 0
    if pct > 0 and math.random(100) <= pct then return get_canary_upstream() end
    return get_stable_upstream()
end
6. Real‑World Results and Lessons Learned
6.1 Performance Test Results
| Metric | Value | Test Conditions |
|---|---|---|
| QPS | 100,000+ | 8‑core 16 GB single node |
| P99 Latency | < 10 ms | Excludes backend processing |
| P95 Latency | < 5 ms | Excludes backend processing |
| CPU Utilization | 40‑60 % | Peak load |
| Memory Usage | 2‑4 GB | Including cache |
| Active Connections | 50,000+ | Concurrent connections |
6.2 Failure Cases
Case 1: Downstream Service Avalanche – Circuit breaker opened automatically, returning degraded responses and keeping overall availability at 99.9 %.
Case 2: DDoS Attack – Multi‑layer rate limiting and IP blacklist blocked millions of malicious requests with zero impact on legitimate traffic.
Conclusion
Building an API gateway with Nginx and OpenResty is a system‑level effort that touches architecture design, feature implementation, performance tuning, high‑availability deployment, and observability. The presented solution has been battle‑tested in production for years, handling billions of daily requests while providing a solid foundation for secure, scalable microservice communication.
MaGe Linux Operations
