
Advanced Nginx Load Balancing: How to Choose and Tune Layer 4 vs Layer 7

This guide walks through the differences between Layer 4 (TCP) and Layer 7 (HTTP) load balancing in Nginx, explains when to use each, and provides step-by-step configuration examples, health-check setups, performance tuning, SSL handling, WebSocket support, and common pitfalls.

Raymond Ops

Overview

Layer 4 (TCP/UDP) load balancing forwards traffic based only on IP and port, without inspecting the application protocol. Layer 7 (HTTP/HTTPS) can inspect URLs, headers, cookies and perform SSL termination, URL‑based routing, header rewriting, etc.

When to use Layer 4 vs Layer 7

HTTP/HTTPS websites – need URL routing, header rewrite, SSL offload → Layer 7

Raw TCP services (MySQL, Redis, custom protocols) – no protocol awareness needed → Layer 4

WebSocket connections – can be handled by either layer; Layer 7 requires proper Upgrade handling.

Extreme performance requirements – avoid protocol parsing → Layer 4

Request‑level routing (URL, Cookie, Header) → Layer 7
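The two modes also live in different configuration contexts, which is worth keeping in mind before diving into the examples below. A minimal skeleton (addresses are placeholders) looks like:

```nginx
# nginx.conf skeleton: Layer 7 lives in the http{} context, Layer 4 in stream{}
http {
    upstream web_pool { server 10.0.0.1:8080; }
    server {
        listen 80;
        location / { proxy_pass http://web_pool; }   # HTTP-aware proxying
    }
}

stream {
    upstream tcp_pool { server 10.0.0.2:3306; }
    server {
        listen 3306;
        proxy_pass tcp_pool;   # opaque TCP forwarding, no protocol parsing
    }
}
```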

Basic Layer 7 Configuration

# /etc/nginx/conf.d/lb.conf
upstream backend {
    least_conn;               # balancing method goes first, before keepalive
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    server 192.168.1.12:8080 weight=2;
    server 192.168.1.13:8080 backup;
    keepalive 64;             # idle connections kept open per worker
    keepalive_requests 1000;
    keepalive_timeout 60s;
}

server {
    listen 443 ssl http2;
    server_name www.example.com;
    ssl_certificate /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 10s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
        proxy_buffering on;
        proxy_buffer_size 4k;
        proxy_buffers 8 32k;
        proxy_busy_buffers_size 64k;
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 10s;
    }
}

Basic Layer 4 Configuration

# /etc/nginx/nginx.conf
stream {
    upstream mysql_backend {
        server 192.168.1.20:3306 weight=5;
        server 192.168.1.21:3306 weight=3;
    }
    server {
        listen 3306;
        proxy_pass mysql_backend;
        proxy_connect_timeout 10s;
        proxy_timeout 3600s;   # long‑running DB sessions
        proxy_socket_keepalive on;
    }
}

Health Checks

The open‑source Nginx only supports passive health checks (a server is marked down after max_fails failures). For active checks you need the third‑party nginx_upstream_check_module compiled with --add-module=../nginx_upstream_check_module.

# Active health‑check example (HTTP)
upstream backend {
    server 192.168.1.10:8080 max_fails=3 fail_timeout=30s;
    check interval=3000 rise=5 fall=2 timeout=1000 type=http;
    check_http_send "GET /health HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}

# Status page (place inside a server block)
location /upstream_status {
    check_status;
    allow 127.0.0.1;
    deny all;
}

SSL/TLS Handling

SSL termination (Layer 7) – Nginx decrypts traffic and forwards to backends over HTTP.

SSL pass‑through (Layer 4) – enable ssl_preread on in a stream block and route based on SNI.

# SSL termination example (see Layer 7 config above)
# SSL pass‑through example
stream {
    map $ssl_preread_server_name $backend {
        www.example.com  https_backend1;
        api.example.com  https_backend2;
        default          https_backend1;
    }
    upstream https_backend1 { server 192.168.1.10:443; }
    upstream https_backend2 { server 192.168.1.20:443; }
    server {
        listen 443;
        proxy_pass $backend;
        ssl_preread on;
    }
}

WebSocket Support

WebSocket requires preserving the Upgrade and Connection headers and using HTTP/1.1.

# WebSocket proxy (Layer 7)
# $connection_upgrade is not built in; define it in the http context:
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

upstream websocket_backend {
    ip_hash;  # optional session affinity
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    keepalive 32;
}

server {
    listen 80;
    location /ws/ {
        proxy_pass http://websocket_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
    }
}

Performance Tuning & Best Practices

Worker processes : worker_processes auto; (usually equals CPU cores).

Worker connections : set high value, e.g. worker_connections 65535;.
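These two directives go at the top level of nginx.conf and in the events block respectively; a minimal sketch:

```nginx
# Top of /etc/nginx/nginx.conf
worker_processes auto;          # one worker per CPU core
worker_rlimit_nofile 65535;     # raise the per-process fd limit to match

events {
    worker_connections 65535;   # max simultaneous connections per worker
    use epoll;                  # efficient event model on Linux
}
```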

System TCP parameters (in /etc/sysctl.conf):

net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.core.somaxconn = 65535
fs.file-max = 655350

Keep‑alive pool : enable keepalive in the upstream block and set keepalive_requests and keepalive_timeout to avoid per‑request TCP handshakes.

Timeouts : Layer 4 proxy_timeout must be long enough for long‑lived DB connections (e.g. 3600s). Layer 7 proxy_connect_timeout, proxy_send_timeout, proxy_read_timeout should match application needs.

Buffering : tune proxy_buffer_size, proxy_buffers, proxy_busy_buffers_size for large responses.

Common Pitfalls & Troubleshooting

Keep‑alive not working – ensure proxy_http_version 1.1 and proxy_set_header Connection "" are both set.

Layer 4 connections dropping – check proxy_timeout and proxy_socket_keepalive. A typical mistake is leaving the default 10‑minute timeout for long‑running DB sessions.

502 bursts – may be caused by too small max_fails or slow backends. Increase max_fails or optimise the backend.

WebSocket disconnects – verify long proxy_read_timeout and proxy_send_timeout.

Real client IP missing – add proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;.

Useful one‑liners for log analysis (run on the host):

# Requests per backend (field positions assume the upstream_log format defined
# in the Monitoring section; adjust the field number to your own log_format)
awk '{print $11}' /var/log/nginx/upstream.log | tr -d '"' | sort | uniq -c | sort -rn

# Slow responses (>1s)
awk -F'upstream_response_time: ' '$2+0 > 1 {print $0}' /var/log/nginx/upstream.log

# Status code distribution
awk '{print $8}' /var/log/nginx/upstream.log | sort | uniq -c | sort -rn

# Recent 502 errors
grep "502" /var/log/nginx/error.log | tail -20
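The field positions in these one-liners depend on the active log_format. Assuming the upstream_log format defined in the Monitoring section, the per-backend count can be sanity-checked against a small synthetic log (the entries below are made up):

```shell
# Build a tiny synthetic log in the upstream_log layout (hypothetical entries)
cat > /tmp/upstream_sample.log <<'EOF'
10.0.0.1 - [01/Jan/2025:00:00:00 +0000] "GET / HTTP/1.1" 200 512 "upstream: 192.168.1.10:8080" "upstream_status: 200" "upstream_response_time: 0.012" "request_time: 0.015"
10.0.0.2 - [01/Jan/2025:00:00:01 +0000] "GET /api HTTP/1.1" 502 0 "upstream: 192.168.1.11:8080" "upstream_status: 502" "upstream_response_time: 0.005" "request_time: 0.006"
10.0.0.3 - [01/Jan/2025:00:00:02 +0000] "GET / HTTP/1.1" 200 512 "upstream: 192.168.1.10:8080" "upstream_status: 200" "upstream_response_time: 1.250" "request_time: 1.300"
EOF

# Requests per backend: with this layout the upstream address is field 11
awk '{print $11}' /tmp/upstream_sample.log | tr -d '"' | sort | uniq -c | sort -rn
```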

Advanced Features

Backup servers : add backup flag to a server line; it is used only when all primary servers are down.

Slow start (Nginx Plus) : server 10.0.0.1:8080 slow_start=30s; to ramp up traffic for a newly started backend. The slow_start parameter is a commercial Nginx Plus feature and is not available in open-source Nginx.

URL‑based routing :

location /api/ { proxy_pass http://api_backend; }
location /static/ { proxy_pass http://static_backend; }

Header / Cookie routing using map:

map $http_x_version $backend_name {
    default          backend_v1;
    "v2"            backend_v2;
}
server {
    location / { proxy_pass http://$backend_name; }
}
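The same map pattern works for cookie-based routing; a sketch using a hypothetical version cookie (the cookie name and upstream names are illustrative):

```nginx
# Route requests carrying "Cookie: version=v2" to the v2 pool
map $cookie_version $cookie_backend {
    default  backend_v1;
    "v2"     backend_v2;
}

server {
    listen 80;
    location / { proxy_pass http://$cookie_backend; }
}
```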

SNI routing (Layer 4 SSL pass‑through) – see the SSL/TLS handling example above.

Full Example Configurations

Complete Layer 7 Configuration

# /etc/nginx/conf.d/webapp.conf
upstream webapp_backend {
    least_conn;
    server 192.168.1.10:8080 weight=5 max_fails=3 fail_timeout=30s;
    server 192.168.1.11:8080 weight=5 max_fails=3 fail_timeout=30s;
    server 192.168.1.12:8080 weight=3 max_fails=3 fail_timeout=30s;
    server 192.168.1.13:8080 backup;
    keepalive 64;
    keepalive_requests 1000;
    keepalive_timeout 60s;
}

server {
    listen 443 ssl http2;
    server_name www.example.com;
    ssl_certificate /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
    ssl_prefer_server_ciphers on;
    ssl_session_cache shared:SSL:10m;
    ssl_session_timeout 10m;

    access_log /var/log/nginx/webapp_access.log main;
    error_log /var/log/nginx/webapp_error.log;

    location / {
        proxy_pass http://webapp_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_connect_timeout 10s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
        proxy_buffering on;
        proxy_buffer_size 4k;
        proxy_buffers 8 32k;
        proxy_busy_buffers_size 64k;
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 10s;
    }

    location /static/ {
        alias /var/www/static/;
        expires 7d;
        add_header Cache-Control "public, no-transform";
    }

    location /health {
        access_log off;
        return 200 "OK\n";
        add_header Content-Type text/plain;
    }
}

# HTTP → HTTPS redirect
server {
    listen 80;
    server_name www.example.com;
    return 301 https://$host$request_uri;
}

Complete Layer 4 Configuration (MySQL & Redis)

# /etc/nginx/stream.d/mysql.conf
upstream mysql_master { server 192.168.1.20:3306; }
upstream mysql_slave {
    least_conn;
    server 192.168.1.21:3306 weight=5;
    server 192.168.1.22:3306 weight=3;
}

# Write traffic (port 3307) → master
server {
    listen 3307;
    proxy_pass mysql_master;
    proxy_connect_timeout 10s;
    proxy_timeout 7200s;   # long transactions
    proxy_socket_keepalive on;
}

# Read traffic (port 3308) → slaves
server {
    listen 3308;
    proxy_pass mysql_slave;
    proxy_connect_timeout 10s;
    proxy_timeout 3600s;
    proxy_socket_keepalive on;
}

# Redis cluster (port 6379)
upstream redis_cluster {
    server 192.168.1.30:6379;
    server 192.168.1.31:6379;
    server 192.168.1.32:6379;
}
server {
    listen 6379;
    proxy_pass redis_cluster;
    proxy_connect_timeout 5s;
    proxy_timeout 600s;
    proxy_socket_keepalive on;
}

WebSocket Load Balancing

# /etc/nginx/conf.d/ws.conf
# $connection_upgrade is not built in; define it in the http context:
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

upstream websocket_backend {
    ip_hash;  # optional session affinity
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
    keepalive 32;
}

server {
    listen 80;
    location /ws/ {
        proxy_pass http://websocket_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Gray‑Release (Canary) Example

# Upstreams
upstream backend_stable {
    server 192.168.1.10:8080;
    server 192.168.1.11:8080;
}
upstream backend_canary {
    server 192.168.1.20:8080;
}

# 5% of traffic to canary, rest to stable
# (split_clients lives at the http level, outside server blocks)
split_clients "$remote_addr$uri" $backend_version {
    5%   backend_canary;
    *    backend_stable;
}

server {
    listen 80;
    location / {
        proxy_pass http://$backend_version;
    }
}

Best Practices & Important Notes

Reload safely : always run nginx -t before nginx -s reload.

Unique upstream names : avoid duplicate names across included files.

Trailing slash in proxy_pass matters – without slash the original URI is kept; with slash the location part is replaced.
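A quick illustration of the trailing-slash rule (upstream name and paths are examples):

```nginx
# For a request to /api/users (use one variant or the other):
location /api/ {
    # Variant A: no URI part – the upstream receives /api/users unchanged
    proxy_pass http://backend;
    # Variant B: with a trailing slash the matched /api/ prefix is replaced
    # by /, so the upstream would receive /users instead:
    # proxy_pass http://backend/;
}
```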

Keep‑alive configuration : both proxy_http_version 1.1 and proxy_set_header Connection "" are required for connection reuse.

Timeout tuning for Layer 4 : proxy_timeout must be long enough for the expected session length (e.g., 3600s for MySQL).

Backup servers : define a server with the backup flag; it is used only when all primary servers are marked down.

Slow start (commercial Nginx Plus) can protect newly started backends from a traffic surge; open-source Nginx lacks the slow_start parameter.

Monitoring & Alerts

Define a detailed log format to capture upstream metrics:

log_format upstream_log '$remote_addr - [$time_local] "$request" $status $body_bytes_sent "upstream: $upstream_addr" "upstream_status: $upstream_status" "upstream_response_time: $upstream_response_time" "request_time: $request_time"';
access_log /var/log/nginx/upstream.log upstream_log;

A Prometheus exporter (e.g., nginx-prometheus-exporter) can scrape metrics such as nginx_connections_active; upstream-level server state and request-latency histograms are only exposed by Nginx Plus or third-party modules such as nginx-module-vts. Example alert rules (metric names vary by exporter):

# High active connections
alert: NginxHighConnections
expr: nginx_connections_active > 10000
for: 5m
labels:
  severity: warning
annotations:
  summary: "Nginx active connections exceed threshold"

# Backend down
alert: NginxUpstreamDown
expr: nginx_upstream_server_state != 1
for: 1m
labels:
  severity: critical
annotations:
  summary: "One or more upstream servers are down"

# High 95th‑percentile latency (>2s)
alert: NginxHighLatency
expr: histogram_quantile(0.95, rate(nginx_http_request_duration_seconds_bucket[5m])) > 2
for: 5m
labels:
  severity: warning
annotations:
  summary: "Request latency exceeds 2 seconds"

Summary

Use Layer 7 when you need HTTP‑level features (URL routing, header rewrite, SSL termination).

Use Layer 4 for raw TCP services, maximum throughput, or when protocol awareness is unnecessary.

Select the appropriate load‑balancing algorithm: round‑robin for stateless services, weighted round‑robin for heterogeneous capacity, ip_hash for session affinity, least_conn for long‑lived connections, and consistent hash for cache‑friendly workloads.
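In configuration terms, each algorithm is a single directive in the upstream block (server addresses are placeholders):

```nginx
upstream pool_rr       { server 10.0.0.1; server 10.0.0.2; }              # default: round-robin
upstream pool_weighted { server 10.0.0.1 weight=3; server 10.0.0.2; }     # weighted round-robin
upstream pool_affinity { ip_hash;    server 10.0.0.1; server 10.0.0.2; }  # client-IP session affinity
upstream pool_least    { least_conn; server 10.0.0.1; server 10.0.0.2; }  # fewest active connections
upstream pool_cache    { hash $request_uri consistent; server 10.0.0.1; server 10.0.0.2; }  # consistent hash
```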

Enable keep‑alive pools and tune timeouts to avoid unexpected connection drops.

Implement health checks – passive by default, active via nginx_upstream_check_module for faster failure detection.

Follow the performance‑tuning checklist (worker processes, sysctl, Nginx buffers, keep‑alive settings).

Monitor with detailed logs, Prometheus metrics, and alert rules to catch high connection counts, backend failures, and latency spikes early.

Load Balancing · Configuration · Performance Tuning · Nginx · Health Check · Layer 4 · Layer 7
Written by

Raymond Ops

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
