Understanding Load Balancing: LVS, Nginx, and HAProxy Overview
This article explains the principles of server clustering and load balancing, compares the three most widely used software load balancers—LVS, Nginx, and HAProxy—covering their architectures, NAT/DR modes, advantages, disadvantages, and typical deployment scenarios for modern web services.
Server Clustering and Load Balancing Basics
Most Internet systems today use server clusters, deploying identical services on multiple machines to provide a unified service. A load‑balancing server sits in front of the web‑server cluster, selects the most suitable backend server, and forwards client requests transparently.
Cloud computing and distributed architectures essentially encapsulate the backend servers as a single service, so the client perceives one server of almost unlimited capacity while the real work is done by the cluster.
The three most popular software load balancers are LVS, Nginx, and HAProxy. The choice depends on site scale: DNS round-robin suffices for small server farms, Nginx is adequate for sites under roughly 10 million page views (PV), and LVS suits large sites with many backend servers.
LVS (Linux Virtual Server)
LVS has been part of the Linux kernel since version 2.4 and provides layer-4 (transport-layer) load balancing for TCP and UDP traffic.
LVS Architecture
The LVS cluster consists of three parts:
Load Balancer layer (frontend)
Server Array layer (backend servers)
Shared Storage layer (data sharing)
Load‑Balancing Mechanisms
LVS operates at layer 4, so it cannot parse HTTP URLs like HAProxy does. It forwards packets by modifying IP addresses (NAT mode) or MAC addresses (DR mode).
NAT Mode
In NAT mode, LVS acts as a gateway for real servers (RS). Incoming packets undergo destination‑NAT (DNAT) to the RS IP; responses undergo source‑NAT (SNAT) back to the virtual IP, making the client think it communicated directly with LVS.
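As a minimal sketch, a NAT-mode virtual service could be set up with ipvsadm roughly as follows (the VIP 10.0.0.100 and real-server addresses are hypothetical):

```sh
# Hypothetical VIP 10.0.0.100 on the LVS director; real servers 192.168.1.10/.11
# Create a virtual TCP service on the VIP with round-robin scheduling
ipvsadm -A -t 10.0.0.100:80 -s rr
# Add real servers in masquerading (NAT) mode (-m)
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.10:80 -m
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -m
# The real servers must use the director as their default gateway,
# so replies flow back through LVS for the SNAT step
```

Note the gateway requirement: without it, responses would bypass the director and the SNAT rewrite would never happen.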
DR Mode
In Direct‑Routing (DR) mode, LVS and the real servers share the same virtual IP. LVS only rewrites the destination MAC address, leaving IP headers unchanged. The RS replies directly to the client, eliminating the load‑balancer bottleneck and offering higher performance.
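A DR-mode setup differs mainly in the forwarding flag and in the ARP handling on the real servers. A hedged sketch with hypothetical addresses:

```sh
# Hypothetical VIP 10.0.0.100 shared by the director and the real servers
ipvsadm -A -t 10.0.0.100:80 -s wrr
# -g selects direct-routing (gatewaying) mode; -w sets the weight
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.10:80 -g -w 2
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -g -w 1

# On each real server: bind the VIP to loopback and suppress ARP
# replies for it, so only the director answers ARP for the VIP
ip addr add 10.0.0.100/32 dev lo
sysctl -w net.ipv4.conf.all.arp_ignore=1
sysctl -w net.ipv4.conf.all.arp_announce=2
```

Because only the destination MAC is rewritten, the real server sees the VIP as the packet's destination and can answer the client directly.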
Advantages
Strong load‑handling capability with low CPU/memory consumption.
Simple configuration reduces human error.
Stable with built‑in hot‑standby (e.g., LVS + Keepalived).
Largely insensitive to traffic volume, since it only forwards packets at layer 4; it rarely becomes a bottleneck itself.
Works for virtually any TCP/UDP service (HTTP, databases, chat, etc.).
Disadvantages
Cannot process regular expressions or perform content‑based routing.
Complex to set up for very large web applications compared to Nginx/HAProxy + Keepalived.
Nginx
Nginx is a high‑performance web server and reverse‑proxy that excels at handling massive concurrent HTTP/HTTPS connections.
Architecture Design
It uses an event-driven, asynchronous, non-blocking model. A master process manages one or more single-threaded worker processes; each worker multiplexes many connections via epoll (on Linux), avoiding a thread per connection.
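This process model maps directly onto a few top-level directives in nginx.conf; a minimal sketch (values are illustrative):

```nginx
# Process-model sketch for nginx.conf
worker_processes auto;         # typically one worker per CPU core
events {
    use epoll;                 # event notification mechanism on Linux
    worker_connections 10240;  # max connections handled per worker
}
```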
Nginx Load Balancing
Nginx balances at the application layer (layer 7) for HTTP/HTTPS. It acts as a reverse proxy, receiving client requests and forwarding them to backend servers.
Supported upstream strategies include:
Round‑robin (default)
Weight‑based round‑robin
IP‑hash (session affinity)
Fair (third‑party, based on response time)
URL‑hash (third‑party, directs same URL to same backend)
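The built-in strategies above can be expressed in an upstream block; a sketch with hypothetical backend addresses:

```nginx
http {
    upstream backend {
        # Default strategy is round-robin; weights skew the distribution
        server 192.168.1.10:8080 weight=3;
        server 192.168.1.11:8080;   # weight defaults to 1
        # Uncomment for session affinity by client IP instead:
        # ip_hash;
    }
    server {
        listen 80;
        location / {
            proxy_pass http://backend;  # reverse-proxy to the pool
        }
    }
}
```

Weighted round-robin and ip_hash are mutually exclusive choices per upstream; the third-party fair and url_hash strategies require extra modules.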
Advantages
Cross‑platform (Unix‑like OS and Windows)
Simple, readable configuration syntax
Non‑blocking, supports tens of thousands of concurrent connections
Event‑driven (epoll) model
Master/worker process model
Low memory usage (e.g., 10 workers consume ~150 MB under 30k connections)
Built‑in health checks
GZIP compression and bandwidth saving
High stability for reverse‑proxy scenarios
Disadvantages
Only supports HTTP/HTTPS and email protocols
Health checks limited to port probing; no URL‑level checks
Session persistence not native (requires ip_hash or other tricks)
HAProxy
HAProxy provides both layer‑4 (TCP) and layer‑7 (HTTP) proxying, supporting virtual hosting.
Its strengths complement Nginx: it offers session persistence, cookie‑based routing, and URL‑level health checks.
HAProxy generally delivers higher throughput than Nginx for pure load‑balancing tasks and can balance MySQL read traffic via TCP mode.
Common HAProxy scheduling algorithms include round‑robin, weighted round‑robin, source‑IP persistence, request‑URL, and rdp‑cookie.
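These features come together in a typical haproxy.cfg; a sketch (server names and addresses are hypothetical):

```haproxy
# Layer-7 HTTP balancing with cookie persistence and URL-level health checks
frontend www
    bind *:80
    default_backend web_servers

backend web_servers
    balance roundrobin
    cookie SRV insert indirect nocache   # sticky sessions via inserted cookie
    option httpchk GET /health           # URL-level health check, not just a port probe
    server web1 192.168.1.10:8080 check cookie web1
    server web2 192.168.1.11:8080 check cookie web2
```

For MySQL read balancing, a separate `listen` or `backend` section in `mode tcp` would be used instead, since layer-7 HTTP options do not apply to database traffic.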