How to Build a High‑Performance Nginx API Gateway from Scratch

This article walks through designing and implementing an enterprise‑grade API gateway with Nginx and OpenResty, covering architecture, dynamic routing, load balancing, rate limiting, circuit breaking, authentication, request transformation, caching strategies, observability, high‑availability deployment, and real‑world performance results.

MaGe Linux Operations

Aug 30, 2025

Building an Enterprise‑Grade API Gateway with Nginx: From Zero to Production

Introduction: Why Your Microservice Architecture Needs a Powerful Gateway

Recent production incidents showed how quickly services can be crippled by unthrottled request floods, missing authentication, and logs scattered across hosts. A unified API gateway addresses all three by centralizing rate limiting, authentication, and logging.

1. Architecture Design: More Than Just a Reverse Proxy

1.1 Overall Architecture

Our gateway is divided into four layers:

├── Access Layer (DNS + CDN)
├── Gateway Layer (Nginx + OpenResty)
├── Service Layer (Microservice cluster)
└── Data Layer (Redis + MySQL + MongoDB)

High Availability: Multi-active deployment with automatic failover

High Performance: Event-driven Nginx model

Scalable: Lua extensions via OpenResty

Observable: Full monitoring and logging

1.2 Technology Comparison

| Solution | Advantages | Disadvantages | Suitable Scenarios |
|---|---|---|---|
| Nginx + OpenResty | Very high performance, stable, mature operations | Feature-light, requires custom development | High-concurrency, low-latency services |
| Kong | Rich plugins, strong ecosystem | Higher overhead, complex ops | Quick setups for small-to-medium services |
| Spring Cloud Gateway | Java-friendly, full feature set | Average performance, high resource usage | Java stacks |
| Envoy | Cloud-native, powerful features | Steep learning curve, complex config | Kubernetes environments |

2. Core Feature Implementation: From Configuration to Code

2.1 Dynamic Routing Configuration

Traditional Nginx needs a reload to apply changes; we use Lua to update routes on the fly.

# nginx.conf core configuration
http {
    lua_package_path "/usr/local/openresty/lualib/?.lua;;";
    lua_shared_dict routes_cache 100m;
    lua_shared_dict upstream_cache 100m;

    init_by_lua_block {
        local route_manager = require "gateway.route_manager"
        route_manager.init()
    }

    init_worker_by_lua_block {
        local route_manager = require "gateway.route_manager"
        ngx.timer.every(10, route_manager.sync_routes)
    }

    server {
        listen 80;
        server_name api.example.com;

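        location / {
            # Declare $upstream so Lua can assign it; router.route() is
            # presumably what sets ngx.var.upstream to the chosen target.
            set $upstream "";
            access_by_lua_block {
                local router = require "gateway.router"
                router.route()
            }
            proxy_pass http://$upstream;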
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Request-Id $request_id;
        }
    }
}
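
The route modules themselves (gateway.route_manager and gateway.router) are not shown in the article. As a rough illustration of the idea, here is a standalone Python model, written in Python only so it can be run outside OpenResty. The class name RouteTable, its sync and match methods, and the longest-prefix matching rule are all assumptions made for this sketch, not the author's implementation.

```python
import threading


class RouteTable:
    """In-memory route table that is swapped as a whole on each sync."""

    def __init__(self):
        self._routes = {}              # path prefix -> upstream name
        self._lock = threading.Lock()

    def sync(self, fresh_routes):
        # Build the new table off to the side, then swap the reference,
        # so readers never observe a half-updated table.
        new_table = dict(fresh_routes)
        with self._lock:
            self._routes = new_table

    def match(self, path):
        # Longest-prefix match: "/api/users/1" prefers "/api/users" over "/api".
        routes = self._routes
        best = None
        for prefix in routes:
            if path.startswith(prefix) and (best is None or len(prefix) > len(best)):
                best = prefix
        return routes[best] if best is not None else None


table = RouteTable()
table.sync({"/api": "default_backend", "/api/users": "user_service"})
print(table.match("/api/users/1"))        # user_service
table.sync({"/api": "default_backend"})   # route removed without a reload
print(table.match("/api/users/1"))        # default_backend
```

In the gateway, the periodic sync_routes timer would play the role of sync, and the access phase would play the role of match.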

2.2 Intelligent Load Balancing

Beyond round‑robin, we adjust weights based on response time, error rate, and static weight.

local cjson = require "cjson.safe"
local upstream_cache = ngx.shared.upstream_cache

-- Pick a server at random, weighted by latency, error rate, and static weight.
-- Assumes server lists and per-server stats are stored as JSON strings.
local function get_server(upstream_name)
    local servers_data = upstream_cache:get("servers:" .. upstream_name)
    if not servers_data then return nil end
    local servers = cjson.decode(servers_data)
    if not servers or #servers == 0 then return nil end

    local total_weight = 0
    local weighted_servers = {}
    for _, server in ipairs(servers) do
        local stats_key = "stats:" .. server.host .. ":" .. server.port
        -- Shared dicts hold strings, so decode; fall back to neutral stats if absent.
        local stats = cjson.decode(upstream_cache:get(stats_key) or "") or {}
        local avg_rt = tonumber(stats.avg_response_time) or 0
        local error_rate = tonumber(stats.error_rate) or 0
        local weight = 1000 / (avg_rt + 1) * (1 - error_rate) * (server.weight or 1)
        total_weight = total_weight + weight
        table.insert(weighted_servers, {server = server, range_end = total_weight})
    end

    -- Walk the cumulative ranges until the random point falls inside one.
    local point = math.random() * total_weight
    for _, ws in ipairs(weighted_servers) do
        if point < ws.range_end then
            return ws.server
        end
    end
    return servers[1]
end
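
To see how the weighting formula behaves, the following standalone Python sketch reproduces the same arithmetic (1000 / (avg_response_time + 1), scaled by the success rate and the static weight) and the cumulative-range selection. The tuple layout and the function names are assumptions for illustration; only the formula comes from the Lua snippet.

```python
import random


def effective_weight(static_weight, avg_response_ms, error_rate):
    # Same formula as the Lua snippet: faster, more reliable servers score higher.
    return 1000.0 / (avg_response_ms + 1) * (1 - error_rate) * static_weight


def pick_server(servers, rng=random):
    """servers: list of (name, static_weight, avg_response_ms, error_rate)."""
    weights = [effective_weight(w, rt, er) for _, w, rt, er in servers]
    total = sum(weights)
    if total <= 0:
        return servers[0][0]        # every server scored zero: fall back to the first
    point = rng.random() * total
    cumulative = 0.0
    for (name, *_), weight in zip(servers, weights):
        cumulative += weight
        if point < cumulative:
            return name
    return servers[-1][0]           # guard against floating-point rounding


servers = [("fast", 1, 9, 0.0), ("slow", 1, 99, 0.0)]
print(effective_weight(1, 9, 0.0))    # 100.0
print(effective_weight(1, 99, 0.0))   # 10.0
print(pick_server(servers))
```

With these numbers the 9 ms server gets ten times the weight of the 99 ms server, so it should receive roughly 91% of the traffic over many picks.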

2.3 Rate Limiting and Circuit Breaking

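We implement a token-bucket limiter whose state lives in Redis, so all gateway nodes share one budget per key, and a simple circuit breaker that opens after ten failures within ten seconds. Once open, the breaker rejects requests for 30 seconds, then admits up to five probe requests in the half-open state. The breaker snippet covers only the admission check; the failure-counting side is not shown.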

-- rate limiter (Lua)
local redis = require "resty.redis"

-- Runs atomically inside Redis: refill by elapsed time, then try to take tokens.
local TOKEN_BUCKET_SCRIPT = [[
    local key = KEYS[1]
    local rate = tonumber(ARGV[1])
    local capacity = tonumber(ARGV[2])
    local now = tonumber(ARGV[3])
    local requested = tonumber(ARGV[4] or 1)
    local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
    local tokens = tonumber(bucket[1] or capacity)
    local last_refill = tonumber(bucket[2] or now)
    local elapsed = math.max(0, now - last_refill)
    tokens = math.min(capacity, tokens + elapsed * rate)
    local allowed = 0
    if tokens >= requested then
        tokens = tokens - requested
        allowed = 1
    end
    redis.call('HMSET', key, 'tokens', tokens, 'last_refill', now)
    -- EXPIRE needs an integer: keep the key until the bucket would be full again.
    redis.call('EXPIRE', key, math.ceil(capacity / rate) + 1)
    return allowed
]]

local token_bucket_limit = function(key, rate, capacity)
    local red = redis:new()
    red:set_timeout(1000)
    local ok, err = red:connect("127.0.0.1", 6379)
    if not ok then
        -- Fail open: if Redis is unreachable, let the request through.
        ngx.log(ngx.ERR, "rate limiter: redis connect failed: ", err)
        return true
    end
    local res, eval_err = red:eval(TOKEN_BUCKET_SCRIPT, 1, key, rate, capacity, ngx.now(), 1)
    if not res then
        -- Treat script errors like connection errors and fail open.
        ngx.log(ngx.ERR, "rate limiter: eval failed: ", eval_err)
        return true
    end
    red:set_keepalive(10000, 100)
    return res == 1
end
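
The refill arithmetic inside the Redis script is easier to follow in isolation. Below is a minimal in-memory Python sketch of the same token-bucket logic with an injected clock; the class name and the choice to pass `now` explicitly are assumptions made so the behavior can be checked deterministically, and nothing here talks to Redis.

```python
class TokenBucket:
    """Token bucket with an explicit clock so the refill logic is easy to test."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.tokens = float(capacity)   # a new bucket starts full
        self.last_refill = now

    def allow(self, now, requested=1):
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now
        if self.tokens >= requested:
            self.tokens -= requested
            return True
        return False


bucket = TokenBucket(rate=2, capacity=5)
burst = [bucket.allow(now=0.0) for _ in range(6)]
print(burst)                    # five allowed, the sixth rejected
print(bucket.allow(now=0.5))    # 0.5 s at 2 tokens/s refills one token
```

The capacity sets how large a burst is tolerated, while the rate sets the sustained throughput once the burst is spent.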

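-- circuit breaker (Lua)
-- Assumes `lua_shared_dict breaker_cache 10m;` is declared in nginx.conf.
local breaker_cache = ngx.shared.breaker_cache

-- Returns true if a request to service_name may proceed.
local function allow_request(service_name)
    local breaker_key = "breaker:" .. service_name
    local state = breaker_cache:get(breaker_key .. ":state") or "closed"
    if state == "open" then
        local open_time = breaker_cache:get(breaker_key .. ":open_time") or 0
        if ngx.now() - open_time > 30 then
            -- Cool-down elapsed: move to half-open and restart the probe counter.
            breaker_cache:set(breaker_key .. ":state", "half_open")
            breaker_cache:set(breaker_key .. ":half_open_count", 0)
            state = "half_open"
        else
            return false, "Circuit breaker is open"
        end
    end
    if state == "half_open" then
        -- Admit at most five probe requests while half-open.
        local count = breaker_cache:incr(breaker_key .. ":half_open_count", 1, 0)
        if count and count > 5 then
            return false, "Circuit breaker half-open limit exceeded"
        end
    end
    return true
end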
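
The Lua above covers only the admission check. For the full life cycle, here is a hedged Python sketch of a breaker state machine using the thresholds stated in this section (ten failures within ten seconds, a 30-second open period, five half-open probes). How failures are recorded and the rule that a single successful probe closes the breaker are assumptions; the article does not show that part.

```python
class CircuitBreaker:
    """closed -> open after too many recent failures; open -> half_open after a cool-down."""

    def __init__(self, failure_threshold=10, window=10.0, open_seconds=30.0, probe_limit=5):
        self.failure_threshold = failure_threshold
        self.window = window
        self.open_seconds = open_seconds
        self.probe_limit = probe_limit
        self.state = "closed"
        self.failures = []        # timestamps of recent failures
        self.opened_at = None
        self.probes = 0

    def allow(self, now):
        if self.state == "open":
            if now - self.opened_at > self.open_seconds:
                self.state, self.probes = "half_open", 0
            else:
                return False
        if self.state == "half_open":
            self.probes += 1
            return self.probes <= self.probe_limit
        return True

    def record_failure(self, now):
        if self.state == "half_open":
            self._open(now)       # a failed probe re-opens the breaker
            return
        # Keep only failures inside the sliding window, then add this one.
        self.failures = [t for t in self.failures if now - t < self.window]
        self.failures.append(now)
        if len(self.failures) >= self.failure_threshold:
            self._open(now)

    def record_success(self, now):
        if self.state == "half_open":
            self.state, self.failures = "closed", []

    def _open(self, now):
        self.state, self.opened_at, self.failures = "open", now, []


cb = CircuitBreaker()
for i in range(10):
    cb.record_failure(now=i * 0.5)   # ten failures within five seconds
print(cb.state)                      # open
print(cb.allow(now=10.0))            # False: still cooling down
print(cb.allow(now=40.0))            # True: first half-open probe
cb.record_success(now=40.1)
print(cb.state)                      # closed
```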

2.4 Unified Authentication and Authorization

The gateway verifies JWTs, checks a token blacklist, matches permissions with wildcard support, and validates request signatures to prevent replay attacks.

-- JWT verification (Lua)
local jwt = require "resty.jwt"
local redis = require "resty.redis"

local function authenticate()
    local auth_header = ngx.var.http_authorization
    if not auth_header then return false, "Missing authorization header" end
    local token = string.match(auth_header, "^Bearer%s+(.+)$")
    if not token then return false, "Invalid authorization header format" end

    -- Refuse to run with a default secret; nginx.conf needs `env JWT_SECRET;`
    -- for os.getenv to see the variable.
    local jwt_secret = os.getenv("JWT_SECRET")
    if not jwt_secret then return false, "JWT secret is not configured" end

    local jwt_obj = jwt:verify(jwt_secret, token)
    if not jwt_obj.verified then return false, jwt_obj.reason end

    local red = redis:new()
    red:set_timeout(1000)
    local ok, err = red:connect("127.0.0.1", 6379)
    if ok then
        local blacklisted = red:get("blacklist:" .. token)
        red:set_keepalive(10000, 100)
        -- A missing key comes back as ngx.null, which is truthy in Lua.
        if blacklisted and blacklisted ~= ngx.null then
            return false, "Token has been revoked"
        end
    end
    ngx.ctx.user = jwt_obj.payload
    return true
end
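
Wildcard permission matching and signed requests are mentioned in this section but not shown. The Python sketch below illustrates one common way to do both: glob-style permission patterns, and an HMAC signature over the method, path, timestamp, and a nonce. The permission string format, the signed fields, the 300-second window, and the in-memory nonce set are all assumptions for illustration; the article does not specify its scheme.

```python
import fnmatch
import hashlib
import hmac


def has_permission(granted, required):
    # Wildcard matching: a grant such as "orders:*" covers "orders:read".
    return any(fnmatch.fnmatchcase(required, pattern) for pattern in granted)


def sign(secret, method, path, timestamp, nonce):
    message = "\n".join([method, path, str(timestamp), nonce]).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_signed_request(secret, method, path, timestamp, nonce, signature,
                          now, seen_nonces, max_skew=300):
    # Reject stale timestamps, then reused nonces, then bad signatures.
    if abs(now - timestamp) > max_skew:
        return False, "timestamp outside allowed window"
    if nonce in seen_nonces:
        return False, "nonce already used"
    expected = sign(secret, method, path, timestamp, nonce)
    if not hmac.compare_digest(expected, signature):
        return False, "bad signature"
    seen_nonces.add(nonce)      # only remember nonces from valid requests
    return True, None


secret = b"demo-secret"
seen = set()
sig = sign(secret, "POST", "/api/orders", 1000, "n-1")
print(verify_signed_request(secret, "POST", "/api/orders", 1000, "n-1", sig, 1010, seen))
print(verify_signed_request(secret, "POST", "/api/orders", 1000, "n-1", sig, 1020, seen))
print(has_permission(["orders:*"], "orders:read"))
```

The second call is rejected as a replay because the nonce has already been seen. In a multi-node gateway the nonce set would need shared storage with an expiry at least as long as the allowed time window.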

2.5 Request/Response Transformation

We support API version conversion, GraphQL‑to‑REST mapping, and automatic header injection.

-- request version conversion
-- transform_v1_to_v2_request() is the gateway's own helper and is not shown here.
local version = ngx.var.http_x_api_version or "v2"
if version == "v1" then transform_v1_to_v2_request() end
ngx.req.set_header("X-Request-Id", ngx.var.request_id)
ngx.req.set_header("X-Gateway-Time", tostring(ngx.now()))
ngx.req.set_header("X-Forwarded-Host", ngx.var.host)
ngx.req.set_header("X-Forwarded-Proto", ngx.var.scheme)
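
Since the body of transform_v1_to_v2_request is not given, the following Python sketch only illustrates what such a conversion could look like. The field renames (user_name to username, page_size to per_page) are invented for the example and should not be read as the gateway's real mapping; the header names mirror the Lua snippet.

```python
# Hypothetical v1 -> v2 field renames; the article does not specify the real mapping.
V1_TO_V2_FIELDS = {"user_name": "username", "page_size": "per_page"}


def transform_v1_to_v2_request(body):
    # Rename known fields and pass everything else through unchanged.
    return {V1_TO_V2_FIELDS.get(key, key): value for key, value in body.items()}


def gateway_headers(request_id, now, host, scheme):
    # Headers the gateway injects before proxying, mirroring the Lua snippet.
    return {
        "X-Request-Id": request_id,
        "X-Gateway-Time": str(now),
        "X-Forwarded-Host": host,
        "X-Forwarded-Proto": scheme,
    }


print(transform_v1_to_v2_request({"user_name": "ada", "page_size": 20, "sort": "asc"}))
print(gateway_headers("req-1", 1700000000.5, "api.example.com", "https"))
```

Keeping the rename table as data rather than code makes it straightforward to load version mappings from configuration.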

3. Performance Optimizations: Making the Gateway Fly

3.1 Multi‑Level Caching

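Responses are cached at two levels: a Redis lookup in the access phase and the local Nginx proxy cache, which can serve stale entries while it revalidates them in the background.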

# Nginx cache definition
proxy_cache_path /var/cache/nginx/api_cache levels=1:2 keys_zone=api_cache:100m max_size=10g inactive=60m use_temp_path=off;

location /api/ {
    # $request_uri already includes the query string, so $is_args$args is not repeated.
    set $cache_key "$scheme$request_method$host$request_uri";
    access_by_lua_block {
        local cache = require "gateway.cache"
        cache.handle_cache()
    }
    proxy_cache api_cache;
    proxy_cache_key $cache_key;
    proxy_cache_valid 200 304 5m;
    proxy_cache_valid 404 1m;
    proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
    proxy_cache_background_update on;
    proxy_cache_lock on;
    proxy_cache_lock_timeout 5s;
    add_header X-Cache-Status $upstream_cache_status;
    proxy_pass http://backend;
}

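-- Lua cache handler (simplified); generate_cache_key, get_from_redis,
-- and is_stale are gateway helpers that are not shown.
local function handle_cache()
    if ngx.var.request_method ~= "GET" then return end
    local cache_key = generate_cache_key()
    local cached = get_from_redis(cache_key)
    if cached and not is_stale(cached) then
        ngx.status = ngx.HTTP_OK
        ngx.header["Content-Type"] = cached.content_type
        ngx.header["X-Cache-Hit"] = "redis"
        ngx.print(cached.body)   -- ngx.say would append a newline to the body
        return ngx.exit(ngx.HTTP_OK)
    end
    -- Miss or stale: remember the key so a later phase can store the response.
    ngx.ctx.cache_key = cache_key
    ngx.ctx.should_cache = true
end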
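
The stale-while-revalidate behavior can be modeled in a few lines. This Python sketch is a simplified, single-process illustration, not the gateway's cache module: entries are fresh for `ttl` seconds, may be served stale for a further `stale_ttl` seconds while the caller refreshes them, and count as a miss after that. The 300-second TTL matches the `proxy_cache_valid 200 304 5m` setting; the 60-second stale window is an assumption.

```python
class SwrCache:
    """Serve fresh entries directly, serve stale ones while flagging a refresh."""

    def __init__(self, ttl, stale_ttl):
        self.ttl = ttl                # seconds an entry counts as fresh
        self.stale_ttl = stale_ttl    # extra seconds it may be served stale
        self.entries = {}             # key -> (value, stored_at)

    def put(self, key, value, now):
        self.entries[key] = (value, now)

    def get(self, key, now):
        """Return (value, status) where status is 'hit', 'stale', or 'miss'."""
        entry = self.entries.get(key)
        if entry is None:
            return None, "miss"
        value, stored_at = entry
        age = now - stored_at
        if age <= self.ttl:
            return value, "hit"
        if age <= self.ttl + self.stale_ttl:
            return value, "stale"     # caller should refresh in the background
        del self.entries[key]
        return None, "miss"


cache = SwrCache(ttl=300, stale_ttl=60)
cache.put("GET /api/items", "[1, 2, 3]", now=0)
print(cache.get("GET /api/items", now=100))   # fresh hit
print(cache.get("GET /api/items", now=330))   # stale, still served
print(cache.get("GET /api/items", now=400))   # expired, miss
```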

3.2 Connection‑Pool Tuning

Keep‑alive settings, least‑conn load balancing, and timeout tuning for high concurrency.

upstream backend {
    server 192.168.1.10:8080 max_fails=2 fail_timeout=10s;
    server 192.168.1.11:8080 max_fails=2 fail_timeout=10s;
    server 192.168.1.12:8080 max_fails=2 fail_timeout=10s backup;
    least_conn;
    keepalive 256;
    keepalive_requests 1000;
    keepalive_timeout 60s;
}

http {
    keepalive_timeout 65;
    keepalive_requests 100;
    proxy_connect_timeout 5s;
    proxy_send_timeout 60s;
    proxy_read_timeout 60s;
    proxy_buffer_size 32k;
    proxy_buffers 4 64k;
    proxy_busy_buffers_size 128k;
    proxy_temp_file_write_size 256k;
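    # http2_max_field_size and http2_max_header_size are obsolete since
    # nginx 1.19.7; newer versions use large_client_header_buffers instead.
    http2_max_field_size 16k;
    http2_max_header_size 32k;
    # HTTP/1.1 with an empty Connection header is required for upstream keepalive.
    proxy_http_version 1.1;
    proxy_set_header Connection "";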
}

3.3 Memory Management

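A timer periodically removes cache entries that are about to expire. When a shared dict is more than 80% full, the gateway flushes expired entries and deletes roughly 10% of the remaining keys; the keys are taken in the order get_keys returns them, which only approximates LRU eviction.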

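-- Remove entries that expire within the next 10 seconds.
-- Note: get_keys(0) returns every key and locks the dict while it runs,
-- so keep this off hot paths for large dicts.
local function cleanup_expired_cache()
    local dict = ngx.shared.routes_cache
    local keys = dict:get_keys(0)
    for _, key in ipairs(keys) do
        local ttl = dict:ttl(key)
        -- ttl == 0 means "never expires", so those keys must be kept.
        if ttl and ttl > 0 and ttl < 10 then
            dict:delete(key)
        end
    end
end

local function monitor_memory()
    local caches = {"routes_cache", "upstream_cache", "permissions_cache", "breaker_cache"}
    for _, name in ipairs(caches) do
        local c = ngx.shared[name]
        if c then
            -- free_space() counts only fully free pages, so usage is an estimate.
            local usage = (c:capacity() - c:free_space()) / c:capacity()
            if usage > 0.8 then
                ngx.log(ngx.WARN, string.format("Memory usage warning: %s %.2f%% full", name, usage*100))
                c:flush_expired()
                -- Drop about 10% of the remaining keys, in get_keys order.
                local keys = c:get_keys(0)
                local to_del = math.floor(#keys * 0.1)
                for i=1,to_del do c:delete(keys[i]) end
            end
        end
    end
end

-- Timers must be created in init_worker_by_lua*; each worker runs its own copy.
ngx.timer.every(60, cleanup_expired_cache)
ngx.timer.every(300, monitor_memory)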
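
For comparison with the eviction loop above, this is what strict least-recently-used eviction looks like in a minimal Python sketch. It is a generic illustration built on `collections.OrderedDict`, not code from the gateway; the class name and the item-count limit (rather than a byte limit) are assumptions.

```python
from collections import OrderedDict


class LruCache:
    """Evicts the least recently used entry once max_items is exceeded."""

    def __init__(self, max_items):
        self.max_items = max_items
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def set(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        while len(self.items) > self.max_items:
            self.items.popitem(last=False)   # evict the least recently used entry


cache = LruCache(max_items=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")              # touch "a" so "b" becomes the oldest
cache.set("c", 3)           # evicts "b"
print(list(cache.items))    # ['a', 'c']
```

OpenResty shared dicts already evict least recently used entries on `set` when they run out of space, so manual eviction mainly serves to keep headroom before that point.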

4. Observability: Logging, Metrics, and Health Checks

4.1 Structured Logging to Kafka

Request and response details are logged as JSON and sent to Kafka; if the send fails, the gateway falls back to a local file.

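-- Assumes the lua-resty-kafka library; the broker address is a placeholder.
local cjson = require "cjson.safe"
local kafka_producer = require "resty.kafka.producer"

-- The async producer buffers messages and sends them from a timer, which is
-- needed in log_by_lua* because cosockets are unavailable in that phase.
local broker_list = {{host = "127.0.0.1", port = 9092}}
local producer = kafka_producer:new(broker_list, {producer_type = "async"})

local log_data = {
    timestamp = ngx.now(),
    request_id = ngx.var.request_id,
    method = ngx.var.request_method,
    uri = ngx.var.uri,
    args = ngx.var.args,
    host = ngx.var.host,
    client_ip = ngx.var.remote_addr,
    user_agent = ngx.var.http_user_agent,
    referer = ngx.var.http_referer,
    status = ngx.var.status,
    bytes_sent = ngx.var.bytes_sent,
    request_time = ngx.var.request_time,
    upstream_response_time = ngx.var.upstream_response_time,
    upstream_addr = ngx.var.upstream_addr,
    upstream_status = ngx.var.upstream_status,
    cache_status = ngx.var.upstream_cache_status,
    user_id = ngx.ctx.user and ngx.ctx.user.user_id,
    trace_id = ngx.var.http_x_trace_id,
    span_id = ngx.var.http_x_span_id,
}
-- write_local_log() is the gateway's file fallback and is not shown.
local ok, err = producer:send("gateway-logs", nil, cjson.encode(log_data))
if not ok then write_local_log(log_data) end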
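
The send-then-fall-back pattern is independent of Kafka. The Python sketch below shows the control flow with a pluggable `send` callable standing in for the producer; the function name, the newline-delimited JSON file format, and the temporary file path are assumptions for the example.

```python
import json
import os
import tempfile


def emit_log(record, send, fallback_path):
    """Try the primary sink first; append to a local file if it fails."""
    line = json.dumps(record, sort_keys=True)
    try:
        send(line)
        return "sent"
    except Exception:
        # One JSON object per line keeps the fallback file easy to replay later.
        with open(fallback_path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
        return "fallback"


def broken_sink(line):
    raise ConnectionError("broker unavailable")


path = os.path.join(tempfile.mkdtemp(), "gateway-fallback.log")
print(emit_log({"status": 200, "uri": "/api/items"}, broken_sink, path))   # fallback
print(emit_log({"status": 200, "uri": "/api/items"}, lambda line: None, path))   # sent
```

Writing the fallback as one JSON object per line means a separate shipper can later replay the file into the log pipeline without custom parsing.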

4.2 Prometheus Metrics

Counters for requests, histograms for latency, gauges for active connections, and circuit‑breaker state.

-- Assumes the nginx-lua-prometheus library, where the module is required as
-- "prometheus" and init() returns the metrics object, plus a
-- `lua_shared_dict prometheus_metrics 10m;` declaration in nginx.conf.
local prometheus = require("prometheus").init("prometheus_metrics")
local request_count = prometheus:counter("gateway_requests_total", "Total number of requests", {"method", "path", "status"})
local request_duration = prometheus:histogram("gateway_request_duration_seconds", "Request duration in seconds", {"method", "path"})
local upstream_duration = prometheus:histogram("gateway_upstream_duration_seconds", "Upstream response time", {"upstream", "method", "path"})
local active_connections = prometheus:gauge("gateway_active_connections", "Number of active connections")
local rate_limit_hits = prometheus:counter("gateway_rate_limit_hits_total", "Number of rate limit hits", {"client", "rule"})
local circuit_breaker_state = prometheus:gauge("gateway_circuit_breaker_state", "Circuit breaker state (0=closed,1=open,2=half-open)", {"service"})

local function log()
    local method = ngx.var.request_method
    -- Raw URIs as label values can create unbounded label cardinality;
    -- prefer a route name or template where one is available.
    local path = ngx.var.uri
    local status = ngx.var.status
    request_count:inc(1, {method, path, status})
    request_duration:observe(tonumber(ngx.var.request_time) or 0, {method, path})
    upstream_duration:observe(tonumber(ngx.var.upstream_response_time) or 0, {ngx.var.upstream_addr or "none", method, path})
    -- $connections_active requires the stub_status module.
    active_connections:set(tonumber(ngx.var.connections_active) or 0)
end
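
Latency histograms are what make P95 and P99 figures computable. As a rough illustration of what a Prometheus-style histogram stores, the Python sketch below counts observations into cumulative buckets keyed by upper bound. The bucket bounds and class name are assumptions; the real library manages buckets internally.

```python
import bisect


class Histogram:
    """Prometheus-style histogram: cumulative bucket counts plus sum and count."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)                        # upper bounds in seconds
        self.bucket_counts = [0] * (len(self.bounds) + 1)   # last slot is +Inf
        self.total = 0.0
        self.count = 0

    def observe(self, value):
        # bisect_left finds the first bucket whose upper bound is >= value.
        self.bucket_counts[bisect.bisect_left(self.bounds, value)] += 1
        self.total += value
        self.count += 1

    def cumulative(self):
        # Each bucket reports how many observations were <= its upper bound.
        running, out = 0, []
        for bound, n in zip(self.bounds + [float("inf")], self.bucket_counts):
            running += n
            out.append((bound, running))
        return out


h = Histogram([0.005, 0.01, 0.05])
for latency in (0.003, 0.008, 0.008, 0.2):
    h.observe(latency)
print(h.cumulative())   # [(0.005, 1), (0.01, 3), (0.05, 3), (inf, 4)]
```

Because each bucket is cumulative, a quantile is estimated by finding the first bucket whose count reaches the target fraction of the total, which is why bucket bounds should bracket the latencies that matter.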

4.3 Health Check Endpoints

Combined Redis, upstream, and memory health checks returning JSON status.

-- health check (Lua)
local cjson = require "cjson.safe"

local function check()
    local health = require "gateway.health"
    local checks = {}
    local healthy = true
    local r = health.check_redis()
    checks.redis = r
    healthy = healthy and r.healthy
    local u = health.check_upstreams()
    checks.upstreams = u
    healthy = healthy and u.healthy
    local m = health.check_memory()
    checks.memory = m
    healthy = healthy and m.healthy
    -- Report 503 if any dependency is unhealthy so load balancers can react.
    ngx.status = healthy and 200 or 503
    ngx.header["Content-Type"] = "application/json"
    ngx.say(cjson.encode({status = healthy and "UP" or "DOWN", timestamp = ngx.now(), checks = checks}))
end

5. High‑Availability Deployment

5.1 Multi‑Active Architecture

Docker Compose defines two gateway nodes, an HAProxy load balancer, a single Redis instance, a Consul server for service discovery, and Prometheus with Grafana for monitoring.

version: '3.8'
services:
  gateway-1:
    image: openresty/openresty:alpine
    volumes:
      - ./conf/nginx.conf:/usr/local/openresty/nginx/conf/nginx.conf
      - ./lua:/usr/local/openresty/lualib/gateway
    ports:
      - "8080:80"
    environment:
      - GATEWAY_NODE_ID=node-1
      - REDIS_HOST=redis
      - CONSUL_HOST=consul
    depends_on: [redis, consul]

  gateway-2:
    image: openresty/openresty:alpine
    volumes:
      - ./conf/nginx.conf:/usr/local/openresty/nginx/conf/nginx.conf
      - ./lua:/usr/local/openresty/lualib/gateway
    ports:
      - "8081:80"
    environment:
      - GATEWAY_NODE_ID=node-2
      - REDIS_HOST=redis
      - CONSUL_HOST=consul
    depends_on: [redis, consul]

  haproxy:
    image: haproxy:2.4-alpine
    volumes:
      - ./conf/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
    ports:
      - "80:80"
      - "443:443"
    depends_on: [gateway-1, gateway-2]

  redis:
    image: redis:6-alpine
    command: redis-server --appendonly yes
    ports: ["6379:6379"]

  consul:
    image: consul:1.10
    command: agent -server -bootstrap-expect=1 -ui -client=0.0.0.0
    ports: ["8500:8500", "8600:8600/udp"]

  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./conf/prometheus.yml:/etc/prometheus/prometheus.yml
    ports: ["9090:9090"]

  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=admin

5.2 Canary Release Strategy

Routing decisions based on headers, cookies, user ID, or traffic percentage, with APIs to adjust canary rules at runtime.

local function route_canary()
    local uri = ngx.var.uri
    local hdr = ngx.req.get_headers()
    if hdr["X-Canary"] == "true" then return get_canary_upstream() end
    if ngx.var.cookie_canary == "true" then return get_canary_upstream() end
    local user = ngx.ctx.user
    if user and is_canary_user(user.user_id) then return get_canary_upstream() end
    local pct = get_canary_percentage(uri)
    if pct > 0 and math.random(100) <= pct then return get_canary_upstream() end
    return get_stable_upstream()
end
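
One caveat with percentage routing via `math.random` is that the same user can land on canary for one request and stable for the next. A common alternative, sketched here in Python as an assumption rather than the article's method, hashes the user ID into a stable bucket so that assignment is sticky. The salt value and function names are hypothetical.

```python
import hashlib


def canary_bucket(user_id, salt="canary-v1"):
    # Hash the user id into a stable bucket from 0 to 99.
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def choose_upstream(headers, user_id, canary_percent):
    if headers.get("X-Canary") == "true":
        return "canary"                       # explicit opt-in always wins
    if user_id is not None and canary_bucket(user_id) < canary_percent:
        return "canary"
    return "stable"


users = [f"user-{i}" for i in range(10000)]
on_canary = sum(choose_upstream({}, u, 10) == "canary" for u in users)
print(on_canary / len(users))                 # close to 0.10
print(choose_upstream({"X-Canary": "true"}, None, 0))
```

Changing the salt reshuffles which users fall into the canary group, which is useful when starting a new rollout.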

6. Real‑World Results and Lessons Learned

6.1 Performance Test Results

| Metric | Value | Test Conditions |
|---|---|---|
| QPS | 100,000+ | Single node, 8 cores, 16 GB RAM |
| P99 Latency | < 10 ms | Excludes backend processing |
| P95 Latency | < 5 ms | Excludes backend processing |
| CPU Utilization | 40-60% | Peak load |
| Memory Usage | 2-4 GB | Including cache |
| Active Connections | 50,000+ | Concurrent connections |

6.2 Failure Cases

Case 1: Downstream Service Avalanche – Circuit breaker opened automatically, returning degraded responses and keeping overall availability at 99.9 %.

Case 2: DDoS Attack – Multi‑layer rate limiting and IP blacklist blocked millions of malicious requests with zero impact on legitimate traffic.

Conclusion

Building an API gateway with Nginx and OpenResty is a system‑level effort that touches architecture design, feature implementation, performance tuning, high‑availability deployment, and observability. The presented solution has been battle‑tested in production for years, handling billions of daily requests while providing a solid foundation for secure, scalable microservice communication.
