How to Maximize HAProxy Performance with CPU, NIC, and System Tuning

This guide explains how to select optimal hardware, configure CPU affinity, adjust kernel parameters for short and long connections, enable SSL offload, and use HAProxy multi‑process mode to achieve the highest possible throughput and stability.

ITPUB

Hardware and System Selection

In the process model this guide covers, each HAProxy process is single‑threaded, non‑blocking, and event‑driven, so one process can fully utilize one CPU core but no more. Per‑core speed therefore matters more than core count: prioritize high‑frequency CPUs with large caches over simply adding cores.

Use NICs that support multiple queues, and pin each queue's interrupt to a specific core (e.g., Intel I350AM4 with 8 RX and 8 TX queues per port). Bind HAProxy and the NIC interrupts to different cores of the same physical CPU, so they share the L3 cache without competing for the same core.
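The interrupt pinning described above can be sketched as follows. This is a minimal illustration, not a definitive script: it assumes the interface is called eth0, that its queue interrupts are listed in /proc/interrupts under names containing the interface name (this depends on the driver), and that cores 2 and 3 are the ones reserved for interrupts. plan_irq_affinity is a hypothetical helper name, not a standard tool. It only prints a plan; the commented lines show how the plan might be applied as root.

```shell
#!/bin/sh
# plan_irq_affinity IFACE INTERRUPTS_FILE CORE...
# Prints one "IRQ CORE" pair per interrupt whose description mentions IFACE,
# handing out the given cores round-robin. Hypothetical helper for illustration.
plan_irq_affinity() {
    iface=$1
    table=$2
    shift 2
    # The IRQ number is the text before the first colon on each matching line.
    grep -- "$iface" "$table" |
        awk -F: '{ gsub(/ /, "", $1); print $1 }' |
        awk -v cores="$*" 'BEGIN { n = split(cores, c, " ") }
                           { print $1, c[(NR - 1) % n + 1] }'
}

# Dry run (assumed interface and cores): show the plan without changing anything.
# plan_irq_affinity eth0 /proc/interrupts 2 3

# Apply as root: write each core number to the IRQ's smp_affinity_list.
# plan_irq_affinity eth0 /proc/interrupts 2 3 |
#     while read -r irq core; do
#         echo "$core" > "/proc/irq/$irq/smp_affinity_list"
#     done
```

Manual affinity of this kind only holds if nothing else rewrites it, which is why the irqbalance service is normally stopped first.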

Kernel Parameter Tuning for Short Connections

Adjust the following sysctl settings to handle high connection rates:

# widen the local port range
net.ipv4.ip_local_port_range = 1025 65534
net.ipv4.tcp_max_syn_backlog = 100000
net.core.netdev_max_backlog = 100000
# allow reuse of TIME_WAIT sockets
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.core.somaxconn = 65534
# raise the file descriptor limit
fs.file-max = 65535

Note that net.ipv4.tcp_tw_recycle is known to break clients behind NAT and was removed in Linux 4.12; leave it out on newer kernels.
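Settings like these are usually saved in a file under /etc/sysctl.d/ and loaded with sysctl --system. To confirm that the kernel actually accepted them, the file can be compared against the live values under /proc/sys. The sketch below assumes the settings live in /etc/sysctl.d/90-haproxy.conf (a hypothetical file name) and uses a hypothetical helper, check_sysctl; keys that the running kernel does not provide show up as mismatches with an empty current value.

```shell
#!/bin/sh
# check_sysctl CONF [ROOT]
# Compare "key = value" lines in CONF with the live values under ROOT
# (default /proc/sys). Returns non-zero if any value differs.
# Hypothetical helper for illustration.
check_sysctl() {
    conf=$1
    root=${2:-/proc/sys}
    status=0
    while IFS='=' read -r key want; do
        key=$(printf '%s' "$key" | tr -d ' \t')
        # Skip blank lines and comments.
        case $key in ''|'#'*|';'*) continue ;; esac
        want=$(echo $want)                          # collapse whitespace
        path=$root/$(printf '%s' "$key" | tr . /)   # net.core.x -> net/core/x
        have=$(echo $(cat "$path" 2>/dev/null))
        if [ "$have" = "$want" ]; then
            echo "ok       $key = $have"
        else
            echo "MISMATCH $key: want '$want', have '$have'"
            status=1
        fi
    done < "$conf"
    return $status
}

# Assumed file name; load the settings as root, then verify:
# sysctl --system
# check_sysctl /etc/sysctl.d/90-haproxy.conf
```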

Disable the irqbalance service so that it does not override manual interrupt affinity, and avoid running HAProxy in a virtual machine when connection rates exceed 5,000 per second. Also avoid iptables connection tracking (conntrack), as it degrades performance.

Kernel Parameter Tuning for Long Connections

For persistent connections (e.g., SSL offload), apply these settings:

net.ipv4.tcp_rmem = 10000000 10000000 10000000
net.ipv4.tcp_wmem = 10000000 10000000 10000000
net.ipv4.tcp_mem = 10000000 10000000 10000000
net.core.rmem_max = 11960320
net.core.wmem_max = 11960320
# disable selective ACK and timestamps
net.ipv4.tcp_sack = 0
net.ipv4.tcp_timestamps = 0
# prevent the congestion window from being reduced on idle connections
net.ipv4.tcp_slow_start_after_idle = 0

Note that net.ipv4.tcp_mem is counted in memory pages rather than bytes, unlike tcp_rmem and tcp_wmem, so the value above sets a very high ceiling.
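Because net.ipv4.tcp_mem is expressed in pages while tcp_rmem and tcp_wmem are in bytes, it can help to convert the figure before applying it. The sketch below assumes a 4096-byte page when one is passed explicitly and otherwise asks getconf for the page size; tcp_mem_gib is a hypothetical helper name.

```shell
#!/bin/sh
# tcp_mem_gib PAGES [PAGE_SIZE]
# Convert a tcp_mem value (counted in pages) to whole GiB, rounding down.
# Hypothetical helper for illustration.
tcp_mem_gib() {
    pages=$1
    page_size=${2:-$(getconf PAGESIZE)}
    echo $(( pages * page_size / 1024 / 1024 / 1024 ))
}

# 10000000 pages of 4096 bytes each:
tcp_mem_gib 10000000 4096    # prints 38
```

With 4 KiB pages, the tcp_mem value of 10000000 corresponds to roughly 38 GiB of TCP buffer memory system-wide, which on many machines exceeds installed RAM.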

HAProxy Multi‑Process Configuration

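HAProxy can run multiple processes, though the official recommendation is to use a single process. The benefits of multi‑process mode include a dedicated core per process, SSL key computation that scales roughly linearly with the number of processes, and easier horizontal scaling. The drawbacks are higher memory usage, stick‑tables that cannot be shared between processes, and more complex configuration. Note that the nbproc directive was removed in HAProxy 2.5 in favor of multithreading (nbthread), so this section applies to older releases.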

Example configuration to run four processes and bind each to a specific CPU core:

global
    nbproc 4
    cpu-map 1 0   # process 1 → CPU 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3

Bind frontends to specific processes:

frontend access_http
    bind 0.0.0.0:80
    bind-process 1

frontend access_https
    bind 0.0.0.0:443 ssl crt /etc/yourdomain.pem
    bind-process 2 3 4
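One way to confirm that the cpu-map bindings took effect is to read each process's allowed CPU list from /proc. The sketch below assumes Linux, processes named haproxy, and a configuration file at /etc/haproxy/haproxy.cfg (an assumed path); cpus_allowed is a hypothetical helper name.

```shell
#!/bin/sh
# cpus_allowed PID...
# Print each PID with the CPU list the kernel allows it to run on,
# as reported in /proc/PID/status. Hypothetical helper for illustration.
cpus_allowed() {
    for pid in "$@"; do
        list=$(awk '/^Cpus_allowed_list:/ { print $2 }' "/proc/$pid/status")
        printf '%s\t%s\n' "$pid" "$list"
    done
}

# Check the configuration syntax, then inspect the running processes:
# haproxy -c -f /etc/haproxy/haproxy.cfg
# cpus_allowed $(pidof haproxy)
```

With the four-process example above, each haproxy PID would be expected to report a single, distinct core.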

NIC Driver Settings

For Intel 10‑GbE NICs (e.g., 82599EB), disable Large Receive Offload (LRO) to reduce latency:

# ethtool -K eth0 lro off
# ethtool -K eth1 lro off
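After changing the setting, it is worth confirming that the driver accepted it. The sketch below filters the output of ethtool -k (lowercase, which lists the current offload features) for the LRO line. The interface names eth0 and eth1 follow the example above; lro_state is a hypothetical helper name.

```shell
#!/bin/sh
# lro_state
# Read `ethtool -k IFACE` output on stdin and print the LRO setting,
# for example "off" or "off [fixed]". Hypothetical helper for illustration.
lro_state() {
    awk -F': ' '$1 == "large-receive-offload" { print $2 }'
}

# Disable LRO on each interface, then confirm (needs root and ethtool):
# for iface in eth0 eth1; do
#     ethtool -K "$iface" lro off
#     printf '%s: %s\n' "$iface" "$(ethtool -k "$iface" | lro_state)"
# done
```

Settings made with ethtool -K do not persist across reboots on their own, so they usually need to be reapplied from the distribution's network configuration or a boot script.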

Optionally adjust PCIe settings with setpci as needed.

Additional Process Isolation

Use taskset to bind auxiliary services (e.g., backup clients, Munin, Nagios, SNMP, syslog, Zabbix) to cores separate from HAProxy to avoid CPU contention.
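A minimal sketch of that isolation, assuming HAProxy and the NIC interrupts occupy cores 0-3 and cores 4-7 are free for everything else. The process names in the usage lines (snmpd, rsyslogd, zabbix_agentd) are assumptions about what is installed, and pin_pids is a hypothetical helper name. Setting DRY_RUN prints the taskset commands instead of running them.

```shell
#!/bin/sh
# pin_pids CORES PID...
# Restrict each PID to the given core list with taskset.
# With DRY_RUN set, only print the commands. Hypothetical helper for illustration.
pin_pids() {
    cores=$1
    shift
    for pid in "$@"; do
        ${DRY_RUN:+echo} taskset -cp "$cores" "$pid"
    done
}

# Preview, then apply as root (assumed process names and core range):
# DRY_RUN=1 pin_pids 4-7 $(pgrep -x snmpd) $(pgrep -x rsyslogd)
# pin_pids 4-7 $(pgrep -x snmpd) $(pgrep -x rsyslogd) $(pgrep -x zabbix_agentd)
```

Affinity set this way applies only to the running processes and is lost when a service restarts, so a permanent setup would need to reapply it at service start.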
