Zero‑Downtime HAProxy Load Balancing: Complete L4/L7 Deployment Guide
This guide walks through installing HAProxy 2.x, configuring L4 TCP and L7 HTTP/HTTPS load balancing for web, MySQL, and Redis, setting up health checks, session persistence, monitoring, high‑availability with Keepalived, performance tuning, security hardening, and step‑by‑step zero‑downtime deployment and rollback procedures.
Applicable Scenarios & Prerequisites
Applicable services: web cluster entry point, microservice gateway, MySQL/Redis read load balancing, SSL termination proxy.
Prerequisites:
HAProxy ≥ 2.0 (recommended 2.4+ with HTTP/2 and dynamic backend updates)
OS: RHEL 7/8, Ubuntu 18.04/20.04/22.04
At least two NICs (preferred front‑end/back‑end separation) or a single NIC with multiple IPs
Root or sudo privileges, ability to bind ports 80/443
Minimum two healthy backend instances
Environment & Version Matrix
Key components and their recommended versions:
HAProxy 2.0+ (prefer 2.4‑2.8) – supports HTTP/2, dynamic backends, Runtime API
OpenSSL 1.1.1+ (TLS 1.3) – ALPN, SNI, OCSP stapling
Keepalived 2.0+ – VIP failover for high availability
System resources: 2 CPU / 4 GB RAM / 20 GB disk (minimum) – can handle ~10 000 concurrent connections
Quick Checklist
Install HAProxy and verify version
Configure L4 TCP load balancing (MySQL/Redis)
Configure L7 HTTP/HTTPS load balancing (Web services)
Configure backend health checks (TCP/HTTP/SSL)
Choose load‑balancing algorithm (round‑robin, least‑conn, consistent hash, etc.)
Configure session persistence (Cookie or source‑IP)
Set up SSL/TLS termination and SNI
Enable statistics page and monitoring
Test failover and service drain
Configure high‑availability with Keepalived VIP
Implementation Steps
Step 1 – Install HAProxy and Verify Version
RHEL/CentOS:
# RHEL 8
sudo dnf install -y haproxy
# RHEL 7 (EPEL required)
sudo yum install -y epel-release
sudo yum install -y haproxy
Ubuntu/Debian:
sudo apt update
sudo apt install -y haproxy
Install the latest version from a PPA (Ubuntu) or compile from source if needed, then verify:
haproxy -v
Expected output: HAProxy version 2.8.3 2023/11/23
Check compile options:
haproxy -vv | grep -E "OpenSSL|PCRE|epoll"
Step 2 – Configure L4 TCP Load Balancing (MySQL read‑only)
Create /etc/haproxy/haproxy.cfg with a global section, defaults, and a listen mysql-read block. Key parameters:
global
log /dev/log local0 info
chroot /var/lib/haproxy
stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
stats timeout 30s
user haproxy
group haproxy
daemon
maxconn 40000
# nbproc was removed in HAProxy 2.5; use threads instead
nbthread 4
cpu-map auto:1/1-4 0-3
ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384
ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11
tune.ssl.default-dh-param 2048
defaults
log global
mode tcp
option tcplog
option dontlognull
timeout connect 5s
timeout client 50s
timeout server 50s
timeout check 5s
retries 3
maxconn 30000
listen mysql-read
bind 0.0.0.0:3307
mode tcp
balance leastconn
option tcp-check
tcp-check connect port 3306
tcp-check send-binary 0a
tcp-check expect binary 0a
server mysql-slave-01 10.0.1.101:3306 check inter 3s rise 2 fall 3 maxconn 1000
server mysql-slave-02 10.0.1.102:3306 check inter 3s rise 2 fall 3 maxconn 1000
server mysql-slave-03 10.0.1.103:3306 check inter 3s rise 2 fall 3 maxconn 1000 backup
Key parameter explanations:
balance leastconn – selects the backend with the fewest active connections (ideal for long‑lived MySQL connections)
check inter 3s – health‑check interval
rise 2 fall 3 – consecutive successes/failures required to mark a server up/down
maxconn 1000 – per‑backend connection limit
backup – standby server used only when all primary backends are down
Validate the configuration syntax:
haproxy -c -f /etc/haproxy/haproxy.cfg
Expected output: Configuration file is valid
Start HAProxy and test the TCP load balancer:
systemctl enable haproxy
systemctl start haproxy
systemctl status haproxy
# Test MySQL connections
mysql -h 127.0.0.1 -P 3307 -u test -ppassword -e "SELECT @@hostname;"
# Loop to observe how queries are distributed across the replicas
for i in {1..10}; do mysql -h 127.0.0.1 -P 3307 -u test -ppassword -e "SELECT @@hostname;" 2>/dev/null | tail -1; done
Step 3 – Configure L7 HTTP/HTTPS Load Balancing (Web services)
Add a frontend for HTTP on port 80 and another for HTTPS on port 443 with SSL termination. Use ACLs to route by host name and path.
# HTTP frontend
frontend http-in
bind *:80
mode http
option httplog
option forwardfor
option http-server-close
acl is_api hdr(host) -i api.example.com
acl is_web hdr(host) -i www.example.com
acl is_admin hdr(host) -i admin.example.com
acl is_static path_beg /static /images /css /js
use_backend api-backend if is_api
use_backend web-backend if is_web
use_backend admin-backend if is_admin
use_backend static-backend if is_static
default_backend web-backend
# HTTPS frontend (SSL termination)
frontend https-in
bind *:443 ssl crt /etc/haproxy/certs/example.com.pem alpn h2,http/1.1
mode http
option httplog
option forwardfor
http-request set-header X-Forwarded-Proto https
http-request set-header X-Forwarded-Port 443
acl is_api hdr(host) -i api.example.com
use_backend api-backend if is_api
default_backend web-backend
# Web backend – round‑robin + cookie persistence
backend web-backend
mode http
balance roundrobin
cookie SERVERID insert indirect nocache httponly secure
option httpchk GET /health
http-check expect status 200
server web-01 10.0.2.11:8080 check cookie web01 maxconn 500
server web-02 10.0.2.12:8080 check cookie web02 maxconn 500
server web-03 10.0.2.13:8080 check cookie web03 maxconn 500
# API backend – least‑conn + source‑IP stick table
backend api-backend
mode http
balance leastconn
stick-table type ip size 100k expire 30m
stick on src
option httpchk POST /api/health
http-check expect status 200
server api-01 10.0.2.21:8080 check maxconn 1000
server api-02 10.0.2.22:8080 check maxconn 1000
server api-03 10.0.2.23:8080 check maxconn 1000
# Static backend – URI hash (consistent hashing)
backend static-backend
mode http
balance uri
hash-type consistent
option httpchk HEAD /favicon.ico
http-check expect status 200
server static-01 10.0.2.31:8080 check
server static-02 10.0.2.32:8080 check
# Admin backend – single‑node with backup
backend admin-backend
mode http
balance roundrobin
option httpchk GET /admin/health
http-check expect status 200
server admin-01 10.0.2.41:8080 check
server admin-02 10.0.2.42:8080 check backup
Key parameter notes:
option forwardfor – adds the original client IP in X-Forwarded-For
cookie SERVERID insert – enables cookie‑based session persistence
stick-table + stick on src – provides source‑IP persistence for stateless APIs
balance uri with hash-type consistent – gives consistent hashing for CDN‑like workloads
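Note that the use_backend rules are evaluated top to bottom and the first match wins, so a request for www.example.com/static/app.css lands on web-backend, not static-backend (the host ACL matches before the path ACL). A throwaway shell sketch of that decision order (a hypothetical helper for illustration, not HAProxy itself):

```shell
#!/bin/sh
# route HOST PATH – mimic the frontend's use_backend evaluation order:
# host ACLs are checked before the path ACL, first match wins.
route() {
  host=$1; path=$2
  case "$host" in
    api.example.com)   echo api-backend;   return ;;
    www.example.com)   echo web-backend;   return ;;
    admin.example.com) echo admin-backend; return ;;
  esac
  case "$path" in
    /static*|/images*|/css*|/js*) echo static-backend; return ;;
  esac
  echo web-backend   # default_backend
}

route www.example.com /static/app.css   # -> web-backend (host rule matched first)
route cdn.example.com /static/app.css   # -> static-backend
```

If static assets should always hit static-backend regardless of host, move the is_static rule above the host rules in the frontend.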
Step 4 – Advanced Health Checks
Custom HTTP health check with headers and response validation:
backend advanced-health-check
mode http
balance roundrobin
option httpchk
http-check send meth GET uri /health ver HTTP/1.1 hdr Host health.example.com hdr User-Agent HAProxy-Health-Check
http-check expect status 200
http-check expect string "healthy"
server app-01 10.0.3.11:8080 check port 8080 inter 5s rise 2 fall 3
server app-02 10.0.3.12:8080 check port 8080 inter 5s rise 2 fall 3
SSL health check using HAProxy’s built‑in SSL‑hello test:
backend ssl-backend
mode tcp
balance roundrobin
option ssl-hello-chk
server secure-01 10.0.3.21:443 check check-ssl verify none
server secure-02 10.0.3.22:443 check check-ssl verify none
Step 5 – Load‑Balancing Algorithm Comparison
balance roundrobin – simple even distribution, no load awareness.
balance leastconn – prefers the server with the fewest active connections; good for long‑lived connections.
balance source – source‑IP hash; provides sticky sessions but can lead to uneven load.
balance uri – hashes the request URI; useful for CDN caching.
balance uri + hash-type consistent – consistent hashing; minimal remapping when backends change.
balance random (HAProxy 1.9+) – random draw between two sampled servers; spreads load well across large, dynamic server farms.
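To see why hash-type consistent matters, the sketch below counts how many of 100 keys change servers when a naive hash-mod-N scheme grows from 3 to 4 backends — most keys remap, which is exactly the cache-busting behaviour consistent hashing avoids. (Illustrative only, using cksum as a stand-in hash; this is not HAProxy’s actual hash implementation.)

```shell
#!/bin/sh
# Count keys that map to a different server when the pool grows 3 -> 4
# under naive modulo hashing.
moved=0
for i in $(seq 1 100); do
  h=$(printf '/static/img-%d.png' "$i" | cksum | cut -d' ' -f1)
  [ $((h % 3)) -ne $((h % 4)) ] && moved=$((moved + 1))
done
echo "keys remapped out of 100: $moved"   # roughly three quarters of all keys
```

With a consistent-hash ring, only the keys owned by the new server move (about 1/N of them), so backend caches stay mostly warm.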
Step 6 – Session Persistence Options
Method 1 – Cookie insertion (recommended)
backend cookie-persistence
mode http
balance roundrobin
cookie SERVERID insert indirect nocache httponly secure
server web-01 10.0.5.11:8080 check cookie srv01
server web-02 10.0.5.12:8080 check cookie srv02
Key flags:
insert – HAProxy creates the cookie.
indirect – the backend does not see the cookie.
httponly secure – mitigates XSS and MITM attacks.
Method 2 – Source‑IP stickiness
backend source-ip-persistence
mode http
balance roundrobin
stick-table type ip size 100k expire 30m
stick on src
server web-01 10.0.5.11:8080 check
server web-02 10.0.5.12:8080 check
Method 3 – URL‑parameter stickiness (API token scenario)
backend url-param-persistence
mode http
balance roundrobin
stick-table type string len 32 size 100k expire 1h
stick on url_param(session_id)
server api-01 10.0.5.21:8080 check
server api-02 10.0.5.22:8080 check
Step 7 – Statistics Page & Monitoring
Enable HAProxy’s built‑in stats page:
listen stats
bind *:8404
mode http
stats enable
stats uri /haproxy-stats
stats refresh 30s
stats realm "HAProxy Statistics"
stats auth admin:StrongPassword123
stats admin if TRUE
Access the page via http://<haproxy-ip>:8404/haproxy-stats or curl with basic auth.
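The stats endpoint also serves a machine-readable CSV export (append ;csv to the URI), which is convenient in scripts. A small sketch that filters out any server not reporting UP — the URI and credentials are the ones configured above:

```shell
#!/bin/sh
# List pxname/svname of servers whose status is not UP from the stats CSV.
# Column 1 = proxy name, 2 = server name, 18 = status (see the CSV header line).
not_up() {
  awk -F, 'NR > 1 && $2 != "FRONTEND" && $2 != "BACKEND" && $18 != "UP" { print $1 "/" $2, $18 }'
}

# Live usage (assumes the stats listener defined above):
#   curl -s -u admin:StrongPassword123 "http://127.0.0.1:8404/haproxy-stats;csv" | not_up
```

The same CSV is available from the Runtime API via "show stat", so the filter works on either source.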
Runtime API examples for server state manipulation:
# Show all server states
echo "show servers state" | socat stdio /run/haproxy/admin.sock
# Set a server to maintenance mode (drain)
echo "set server web-backend/web-01 state maint" | socat stdio /run/haproxy/admin.sock
Integrate with Prometheus using haproxy_exporter (note: HAProxy 2.4+ also ships a native Prometheus endpoint via "http-request use-service prometheus-exporter", which avoids the separate exporter):
# Download and install exporter
wget https://github.com/prometheus/haproxy_exporter/releases/download/v0.15.0/haproxy_exporter-0.15.0.linux-amd64.tar.gz
tar xzf haproxy_exporter-0.15.0.linux-amd64.tar.gz
sudo cp haproxy_exporter-0.15.0.linux-amd64/haproxy_exporter /usr/local/bin/
# Systemd service (simplified)
[Unit]
Description=HAProxy Exporter
After=network.target
[Service]
Type=simple
User=haproxy
ExecStart=/usr/local/bin/haproxy_exporter --haproxy.scrape-uri="unix:/run/haproxy/admin.sock"
Restart=on-failure
[Install]
WantedBy=multi-user.target
# Enable and start
systemctl daemon-reload
systemctl enable haproxy_exporter
systemctl start haproxy_exporter
Step 8 – Failover & Service Drain Testing
Simulate backend failure by stopping the service or blocking the port, then observe HAProxy logs:
# Stop Nginx on a web node
ssh web-01 "systemctl stop nginx"
# Or drop traffic with iptables
ssh web-01 "iptables -A INPUT -p tcp --dport 8080 -j DROP"
# Tail HAProxy logs for the failure
journalctl -u haproxy -f | grep web-01Expected log entry:
Server web-backend/web-01 is DOWN, reason: Layer4 connection problem
Test graceful drain (zero‑downtime maintenance):
# Set server to DRAIN (no new connections)
echo "set server web-backend/web-01 state drain" | socat stdio /run/haproxy/admin.sock
# Wait for existing connections to finish
watch -n 1 'echo "show stat" | socat stdio /run/haproxy/admin.sock | grep web-01'
# Finally set to maintenance
echo "set server web-backend/web-01 state maint" | socat stdio /run/haproxy/admin.sock
Step 9 – High Availability with Keepalived VIP
Install Keepalived on both HAProxy nodes:
# RHEL/CentOS
sudo yum install -y keepalived
# Ubuntu/Debian
sudo apt install -y keepalived
Master /etc/keepalived/keepalived.conf (priority 100):
global_defs {
router_id HAProxy-Master
}
vrrp_script chk_haproxy {
script "/usr/bin/killall -0 haproxy"
interval 2
weight -20
}
vrrp_instance VI_1 {
state MASTER
interface eth0
virtual_router_id 51
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass SecurePassword123
}
virtual_ipaddress {
10.0.0.100/24
}
track_script { chk_haproxy }
notify_master "/usr/local/bin/haproxy_master.sh"
}
The backup node configuration is identical except for state BACKUP and priority 90.
Start Keepalived on both nodes and verify the VIP moves when the master fails.
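The notify_master hook referenced above (/usr/local/bin/haproxy_master.sh) is left to the operator; a minimal sketch that logs the transition and makes sure HAProxy is running once the node owns the VIP — paths and behaviour are assumptions to adapt:

```shell
#!/bin/sh
# Keepalived notify hook sketch (install as /usr/local/bin/haproxy_master.sh).
# keepalived invokes notify scripts as: <script> <TYPE> <NAME> <STATE>
transition_msg() {
  # $1 = VRRP instance name, $2 = new state
  printf 'keepalived: instance %s entered state %s on %s\n' "$1" "$2" "$(uname -n)"
}

main() {
  transition_msg "${2:-VI_1}" "${3:-MASTER}" | logger -t keepalived 2>/dev/null || true
  # Ensure HAProxy is running now that this node holds the VIP (systemd assumed).
  if command -v systemctl >/dev/null 2>&1; then
    systemctl is-active --quiet haproxy || systemctl start haproxy || true
  fi
}
main "$@"
```

A matching notify_backup script (e.g. to page on failover) follows the same argument convention.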
Monitoring & Alerting
Key HAProxy metrics (accessible via the Runtime API or Prometheus):
haproxy_backend_up – backend availability (alert when 0)
haproxy_backend_response_time_average_seconds – average response time (alert when > 1 s)
5xx error rate – rate(haproxy_backend_http_responses_total{code="5xx"}[5m]) / rate(haproxy_backend_http_responses_total[5m]) (alert when > 5 %)
Queue depth – haproxy_backend_current_queue (alert when > 10)
Suggested Grafana panels:
Backend availability – query haproxy_backend_up, threshold < 1
Current sessions – haproxy_frontend_current_sessions, threshold > 25 000
P95 backend latency – histogram_quantile(0.95, haproxy_backend_response_time), threshold > 500 ms
HTTP error rate – rate(haproxy_frontend_http_responses_total{code=~"4..|5.."}[5m]), threshold > 5 %
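The error-rate expressions above are just ratios of counter rates; as a quick sanity check of the alert threshold arithmetic, the same calculation in shell (sample numbers are hypothetical):

```shell
#!/bin/sh
# Percentage of error responses, given error and total request counts
# observed over the same window.
error_rate_pct() {
  awk -v err="$1" -v total="$2" 'BEGIN { printf "%.2f\n", (total ? 100 * err / total : 0) }'
}

error_rate_pct 150 3000   # 150 errors out of 3000 requests -> 5.00
```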
Performance & Capacity
Benchmark commands for a 4 CPU / 8 GB test box:
# TCP test (MySQL read) – wrk only speaks HTTP, so use a MySQL-aware tool
mysqlslap -h 10.0.0.100 -P 3307 -u test -ppassword --concurrency=100 --iterations=10 --auto-generate-sql
# HTTP test (Web)
wrk -t 8 -c 2000 -d 60s --latency http://10.0.0.100/
# HTTPS test
wrk -t 8 -c 1000 -d 60s --latency https://10.0.0.100/
Typical results on a 2 C / 4 G server:
TCP: 50 000 QPS, P99 < 5 ms, CPU ≈ 40 %
HTTP: 30 000 QPS, P99 < 10 ms, CPU ≈ 60 %
HTTPS: 15 000 QPS, P99 < 20 ms, CPU ≈ 80 %
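Before raising maxconn, check the process file-descriptor limit: HAProxy needs roughly two descriptors per proxied connection (client side plus server side). A quick sketch of that check — the helper name and the +100 slack for listeners/checks are assumptions:

```shell
#!/bin/sh
# Rough check that the fd limit can cover 2 descriptors per connection.
fd_headroom_ok() {
  # $1 = intended global maxconn, $2 = fd limit (defaults to ulimit -n)
  limit="${2:-$(ulimit -n)}"
  [ "$limit" -ge $((2 * $1 + 100)) ]
}

if fd_headroom_ok 100000; then
  echo "fd limit sufficient for maxconn 100000"
else
  echo "raise LimitNOFILE (systemd) or ulimit -n before setting maxconn 100000"
fi
```

Under systemd, the effective limit comes from LimitNOFILE in the haproxy unit, not the shell’s ulimit.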
Recommended tuning parameters (add to the global section):
global
maxconn 100000
nbthread 8
tune.bufsize 32768
tune.maxrewrite 8192
tune.ssl.cachesize 100000
Security & Compliance
SSL/TLS hardening – disable weak ciphers and enable OCSP stapling:
global
ssl-default-bind-ciphers ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-CHACHA20-POLY1305
ssl-default-bind-options no-sslv3 no-tlsv10 no-tlsv11 no-tls-tickets
ssl-default-bind-ciphersuites TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256
tune.ssl.ocsp-update.mode on
DDoS protection – rate‑limit connections and requests per IP:
frontend http-in
stick-table type ip size 1m expire 30s store conn_rate(10s),http_req_rate(10s)
tcp-request connection track-sc0 src
tcp-request connection reject if { sc_conn_rate(0) gt 100 }
http-request deny deny_status 429 if { sc_http_req_rate(0) gt 200 }
Audit logging – enable detailed HTTP logs and forward them to syslog:
global
log /dev/log local0 info
log /dev/log local1 notice
frontend https-in
option httplog
log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"
Configure syslog-ng to store HAProxy logs in /var/log/haproxy/haproxy.log.
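With this log-format, the HTTP status (%ST) is the 6th whitespace-separated field and the timing block (%TR/%Tw/%Tc/%Tr/%Ta) is the 5th, so simple awk one-liners cover most ad-hoc analysis. A sketch — field positions assume the raw payload with any syslog header stripped, and the log path is the one configured above:

```shell
#!/bin/sh
# Count responses by HTTP status for the custom log-format above.
# Field 6 = %ST (status code).
status_histogram() {
  awk '{ count[$6]++ } END { for (s in count) print s, count[s] }' | sort
}

# Live usage:
#   status_histogram < /var/log/haproxy/haproxy.log
```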
Common Issues & Troubleshooting
Backend marked DOWN – check health‑check configuration and network connectivity; adjust check inter, rise, fall as needed.
503 Service Unavailable – all backends are down; bring at least one backend up or add a backup server.
SSL handshake failure – verify certificate validity and HAProxy crt path.
Connection timeouts – increase timeout client/server or maxconn, or scale backend capacity.
VIP unreachable – ensure Keepalived is running and VRRP traffic is not blocked by firewalls.
Session persistence not working – confirm cookie directive is present and not overridden by the application.
Change & Rollback Playbook
Maintenance window: 02:00–04:00.
Pre‑change checklist:
Backup /etc/haproxy/haproxy.cfg
Validate the new configuration in a staging environment
Prepare rollback command
Notify stakeholders
Canary‑release steps:
Validate the new config: haproxy -c -f /etc/haproxy/haproxy.cfg.new
Hot‑reload HAProxy: systemctl reload haproxy (zero downtime)
Monitor for 5 minutes:
watch -n 1 'echo "show stat" | socat stdio /run/haproxy/admin.sock | grep DOWN'
Rollback trigger conditions (any of the following): all backends down, 5xx error rate > 10 %, client connection failure rate > 5 %.
Rollback command:
cp /etc/haproxy/haproxy.cfg.backup /etc/haproxy/haproxy.cfg
systemctl reload haproxy
Best Practices
Health checks must probe real application endpoints (e.g., /health) rather than static files.
Deploy at least two HAProxy instances with a Keepalived VIP to avoid a single point of failure.
Prefer cookie‑based session persistence; source‑IP stickiness can break behind NAT.
Set conservative timeout values that exceed the slowest backend response.
Enable full HTTP logging ( option httplog) for troubleshooting.
Limit per‑backend connections with maxconn (recommend 80 % of backend capacity).
Automate SSL certificate renewal with Let’s Encrypt and certbot.
Alert on P95/P99 latency to catch performance regressions early.
Regularly practice failover drills by manually draining backends.
Version‑control HAProxy configuration in Git and require PR review for changes.
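The failover drill above can be scripted; a sketch of a drain-wait-maint helper built on the Runtime API commands from Step 8 (socket path and backend/server naming follow this guide; the timeout default is an assumption):

```shell
#!/bin/sh
# Gracefully drain a server, wait for its active sessions to reach zero,
# then put it into maintenance. Usage: drain_server backend/server [timeout_s]
SOCK="${SOCK:-/run/haproxy/admin.sock}"

hap_cmd() { echo "$1" | socat stdio "$SOCK"; }

current_sessions() {
  # scur is column 5 of the "show stat" CSV output
  hap_cmd "show stat" | awk -F, -v px="$1" -v sv="$2" '$1 == px && $2 == sv { print $5 }'
}

drain_server() {
  px="${1%/*}"; sv="${1#*/}"; timeout="${2:-300}"
  hap_cmd "set server $1 state drain"
  while [ "$timeout" -gt 0 ]; do
    [ "$(current_sessions "$px" "$sv")" = "0" ] && break
    sleep 1; timeout=$((timeout - 1))
  done
  hap_cmd "set server $1 state maint"
}

# drain_server web-backend/web-01 120
```

Reversing the drill is a single command: set the server state back to ready and confirm it passes its health checks before the next node is drained.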
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.