
Choosing the Right Nginx Load‑Balancing Strategy: Real‑World Comparison and Best Practices

A seasoned ops engineer recounts a production incident caused by improper Nginx load‑balancing, then compares weighted round‑robin and IP‑hash strategies with detailed configurations, performance test results, common pitfalls, dynamic weight scripts, and practical recommendations for reliable, high‑performance deployments.

Raymond Ops

Background

During a major sales event an e‑commerce platform suffered cart data loss and unstable login sessions. Investigation revealed that an inappropriate Nginx load‑balancing configuration was the root cause, highlighting the critical impact of strategy selection.

Load‑Balancing Strategies

1. Weighted Round‑Robin

Working principle: Distributes requests to each server in proportion to its configured weight.

upstream backend {
    server 192.168.1.10:8080 weight=3;
    server 192.168.1.11:8080 weight=2;
    server 192.168.1.12:8080 weight=1;
}
server {
    listen 80;
    server_name example.com;
    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Applicable scenarios:

Servers with clearly different performance characteristics

Stateless applications such as APIs

When fine‑grained traffic control is needed
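Nginx's round-robin balancer uses a "smooth" weighted algorithm: every pick raises each server's running score by its weight, selects the highest scorer, then deducts the weight total from the winner, so the 3:2:1 split is interleaved rather than bursty. A minimal bash simulation of that selection loop for the weights above (server names A/B/C stand in for the three backends; this is an illustration, not nginx source):

```shell
#!/bin/bash
# Smooth weighted round-robin for weights 3, 2, 1.
servers=(A B C)
weights=(3 2 1)
current=(0 0 0)
total=6   # sum of weights = length of one scheduling cycle
order=""
for _ in 1 2 3 4 5 6; do
    best=0
    for i in 0 1 2; do
        current[$i]=$(( current[i] + weights[i] ))   # raise every score by its weight
        [ "${current[$i]}" -gt "${current[$best]}" ] && best=$i
    done
    order+="${servers[$best]}"
    current[$best]=$(( current[best] - total ))      # penalize the chosen server
done
echo "$order"   # one full cycle; A gets 3 slots, B gets 2, C gets 1, spread out
```

Running it yields the cycle ABACBA: the heavier server never receives a long uninterrupted burst, which is exactly why smooth weighting is gentler on backends than naive "3 in a row to A" scheduling.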

2. IP Hash

Working principle: Hashes the client IP so the same client is consistently routed to the same backend, providing session affinity.

upstream backend {
    ip_hash;
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    server 192.168.1.12:8080;
}
server {
    listen 80;
    server_name example.com;
    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

Applicable scenarios:

Stateful applications that require session persistence

Services heavily dependent on local caching
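The affinity property can be illustrated with a toy hash. This is not nginx's exact hash function, but it mirrors one real detail: for IPv4, `ip_hash` keys on only the first three octets, so every client in the same /24 lands on the same backend.

```shell
#!/bin/bash
# Toy illustration of ip_hash-style affinity (NOT nginx's actual hash).
backends=(192.168.1.10 192.168.1.11 192.168.1.12)
pick() {
    local prefix=${1%.*}   # keep the first three octets, as ip_hash does
    local sum=0 part
    for part in ${prefix//./ }; do
        sum=$(( sum * 31 + part ))   # simple polynomial hash over the octets
    done
    echo "${backends[sum % 3]}"
}
pick 203.0.113.7     # some client
pick 203.0.113.250   # different host, same /24 -> same backend
```

The flip side of the /24 keying is the imbalance discussed under "Common Pitfalls" below: many distinct clients behind one NAT or CDN prefix all hash to a single backend.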

Performance Test

Three identical servers were set up in a test environment and subjected to a week‑long load test.

# Server configuration
CPU: 4 cores
Memory: 8GB
Network: 1Gbps

# Test tool: 12 threads, 400 open connections, 30-second runs, with latency stats
wrk -t12 -c400 -d30s --latency http://test.domain.com/api/test

Results:

Average response time – Weighted RR: 156 ms, IP Hash: 189 ms

Throughput – Weighted RR: 8,432 RPS, IP Hash: 7,156 RPS

99 % latency – Weighted RR: 445 ms, IP Hash: 567 ms

Server load‑balance rating – Weighted RR: ★★★★★, IP Hash: ★★★

Session consistency – Weighted RR: ❌, IP Hash: ✅

Production Best Practices

Hybrid Strategy (recommended)

# Static resources use weighted round robin
upstream static_backend {
    server 192.168.1.10:8080 weight=3;
    server 192.168.1.11:8080 weight=2;
}

# User‑related APIs use IP hash
upstream user_backend {
    ip_hash;
    server 192.168.1.20:8080;
    server 192.168.1.21:8080;
}

server {
    listen 80;
    server_name example.com;

    location ~* \.(css|js|png|jpg|jpeg|gif|ico)$ {
        proxy_pass http://static_backend;
        expires 1y;
        add_header Cache-Control "public, immutable";
    }

    location /api/user/ {
        proxy_pass http://user_backend;
        proxy_set_header Host $host;
    }

    location /api/ {
        proxy_pass http://static_backend;
        proxy_set_header Host $host;
    }
}

Dynamic Weight Adjustment

#!/bin/bash
# monitor.sh – adjust weights based on CPU load
while true; do
    for server in server1 server2 server3; do
        # \$2 is escaped so awk expands it on the remote host, not locally;
        # field 2 of top's "Cpu(s)" line is the user-CPU percentage
        cpu=$(ssh "$server" "top -bn1 | grep 'Cpu(s)' | awk '{print \$2}'")
        cpu=${cpu%.*}   # truncate the float so [ -lt ] can compare integers
        if [ "${cpu:-100}" -lt 30 ]; then   # default to lowest weight if the probe fails
            weight=3
        elif [ "${cpu:-100}" -lt 70 ]; then
            weight=2
        else
            weight=1
        fi
        # Update the Nginx upstream weight here (implementation dependent)
    done
    nginx -s reload   # apply the rewritten config
    sleep 30
done
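The "implementation dependent" step can be sketched as a sed rewrite of the upstream include file. Open-source nginx has no runtime API for weights, so editing the file and reloading is the usual approach; the file path and single-file layout here are assumptions, not a standard:

```shell
#!/bin/bash
# Sketch: rewrite the weight= parameter for one server line in an
# upstream include file. Path and layout are hypothetical.
set_weight() {   # usage: set_weight <conf-file> <host:port> <new-weight>
    local conf=$1 server=$2 weight=$3
    # escape the dots so the address is matched literally
    sed -i -E "s|(server ${server//./\\.} weight=)[0-9]+|\1${weight}|" "$conf"
}
# Example (hypothetical include file):
# set_weight /etc/nginx/conf.d/backend.conf 192.168.1.10:8080 2
# nginx -t && nginx -s reload   # only reload if the new config validates
```

Gating the reload on `nginx -t` matters: a bad automated rewrite should never be allowed to take down the proxy.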

Common Pitfalls

Pitfall 1: Blindly using IP hash

Placing IP hash behind a CDN makes many requests appear from the same IP, causing severe imbalance.

upstream backend {
    ip_hash;
    server web1:8080;
    server web2:8080;
}

Fix: Hash the real client IP instead of the CDN edge address.

upstream backend {
    hash $http_x_forwarded_for consistent; # hash the forwarded client IP (trust this header only when your CDN sets it)
    server web1:8080;
    server web2:8080;
}
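X-Forwarded-For can carry a comma-separated chain of addresses when several proxies are involved, so hashing the raw header mixes the whole chain into the key. A map (placed in the http context) can extract just the left-most entry, the original client, before hashing. A sketch, with `$client_real_ip` as a made-up variable name:

```nginx
# Extract the first X-Forwarded-For entry (the original client)
map $http_x_forwarded_for $client_real_ip {
    default             $remote_addr;   # fall back to the peer address
    "~^(?<first>[^,]+)" $first;         # left-most IP in the chain
}

upstream backend {
    hash $client_real_ip consistent;
    server web1:8080;
    server web2:8080;
}
```

Since the header is client-supplied, this is only safe when the CDN or edge proxy overwrites or validates it.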

Pitfall 2: Incorrect weight settings

A new server with three times the old one's capacity was given only double the weight, leaving it underutilized while the old server overloaded.

upstream backend {
    server old_server:8080 weight=1;
    server new_server:8080 weight=3; # match the actual 3× capacity difference
}

Optimization Tips

1. Enable keep‑alive connections

upstream backend {
    server 192.168.1.10:8080 weight=3;
    keepalive 32; # cache up to 32 idle upstream connections per worker process
}
server {
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}

2. Health check configuration

upstream backend {
    server 192.168.1.10:8080 weight=3 max_fails=2 fail_timeout=10s;
    server 192.168.1.11:8080 weight=2 max_fails=2 fail_timeout=10s;
}
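With these passive checks, a server that fails twice within 10 seconds is marked unavailable for the next 10 seconds. Adding a `backup` server gives traffic somewhere to go if every primary is down at once; the third address here is an assumption for illustration:

```nginx
upstream backend {
    server 192.168.1.10:8080 weight=3 max_fails=2 fail_timeout=10s;
    server 192.168.1.11:8080 weight=2 max_fails=2 fail_timeout=10s;
    server 192.168.1.12:8080 backup;   # receives traffic only when all primaries are marked down
}
```

Note that `backup` works with the default round-robin balancer but is not permitted in `ip_hash` upstreams.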

3. Monitoring script

#!/bin/bash
# nginx_status.sh – monitor Nginx and backend health
echo "=== Nginx Status ==="
# stub_status only reports connection counts; per-upstream stats
# require NGINX Plus or a third-party status module
curl -s http://localhost/nginx_status

echo "=== Backend Health Check ==="
for server in 192.168.1.10 192.168.1.11; do
    resp=$(curl -o /dev/null -s -w "%{http_code}" "http://$server:8080/health")
    if [ "$resp" = "200" ]; then
        echo "✅ $server - OK"
    else
        echo "❌ $server - Failed ($resp)"
    fi
done

Conclusion

Prefer weighted round‑robin for stateless services – it delivers better performance and scalability.

Use IP hash cautiously for stateful services; consider external session stores such as Redis.

Hybrid strategies let you match the optimal method to each workload.

Continuous monitoring and periodic analysis of logs and metrics are essential for stable operations.
