Why 90% of Ops Teams Choose the Wrong LVS Mode – A Deep Dive into Performance
This article examines the four Linux Virtual Server (LVS) clustering modes: NAT, Direct Routing, Tunneling, and FULLNAT. It covers their architectures, data flows, configuration steps, advantages, disadvantages, and ideal use cases, to help operations engineers select the most suitable load-balancing solution for high-performance, scalable web services.
Introduction
Linux Virtual Server (LVS) is a kernel-level load-balancing solution widely used for large-scale web services and high-concurrency workloads. It offers four cluster modes: NAT, DR, and TUN ship with the mainline kernel, while FULLNAT requires a patched build. Each has its own working principle, advantages, and suitable scenarios. This article analyzes the four modes in depth to help operations engineers choose the most appropriate deployment scheme.
1. LVS‑NAT (Network Address Translation) Mode
1.1 Working Principle
LVS‑NAT mode is based on network address translation. The Load Balancer (LB) acts as a gateway, forwarding client requests to real servers (RS).
Data Flow:
Client request reaches LB's VIP (Virtual IP).
LB selects a real server (RS) according to the scheduling algorithm.
LB changes the destination IP of the request packet to the RS's RIP (Real IP), keeping the source IP unchanged.
RS processes the request and sends the response back to LB.
LB changes the source IP of the response packet to VIP and returns it to the client.
1.2 Network Topology
          Internet
             |
             | (VIP)
      [Load Balancer]
             |
             | (DIP - Director IP)
   +------+--+---+------+
   |      |      |      |
 [RS1]  [RS2]  [RS3]  [RS4]
 (RIP)  (RIP)  (RIP)  (RIP)
1.3 Configuration Points
LB typically has two network interfaces: an external NIC carrying the VIP and an internal NIC carrying the DIP.
RS only needs an internal IP; its default gateway points to LB's DIP.
RS normally share a subnet with LB's DIP, because every reply must route back through LB.
LB must enable IP forwarding.
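The configuration points above can be sketched with ipvsadm roughly as follows. This is a minimal sketch, not a production setup: the VIP (203.0.113.10), DIP (192.168.10.1), RIPs (192.168.10.11 and 192.168.10.12), the ports, and the round-robin scheduler are all placeholder assumptions.

```shell
# --- On the LB (director); run as root ---
# Let the kernel forward packets between the external and internal NICs.
sysctl -w net.ipv4.ip_forward=1

# Create a virtual TCP service on the VIP with round-robin scheduling.
ipvsadm -A -t 203.0.113.10:80 -s rr

# Add real servers in NAT (masquerading) mode with -m.
# NAT mode allows port mapping, shown here as VIP:80 -> RIP:8080.
ipvsadm -a -t 203.0.113.10:80 -r 192.168.10.11:8080 -m
ipvsadm -a -t 203.0.113.10:80 -r 192.168.10.12:8080 -m

# --- On each RS ---
# Send replies back through the LB's DIP so the LB can rewrite the source address.
ip route replace default via 192.168.10.1
```

If the RS default route points anywhere else, replies reach the client with the RIP as source and the client discards them.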
1.4 Advantages
Simple configuration: RS only needs an internal IP and a gateway.
OS agnostic: No special requirements for RS.
Port mapping: Supports port translation.
High security: RS are hidden behind the internal network.
1.5 Disadvantages
Performance bottleneck: Both request and response traffic pass through LB.
Single point of failure: Without a standby, an LB failure affects the whole cluster.
Scalability limits: Constrained by LB processing capacity.
Network requirements: RS must route replies through LB, which in practice means sharing its subnet.
1.6 Applicable Scenarios
Small to medium‑scale web applications.
Internal systems with high security requirements.
Scenarios requiring port mapping.
Clusters with a relatively small number of RS.
2. LVS‑DR (Direct Routing) Mode
2.1 Working Principle
LVS-DR balances load by rewriting the destination MAC address of each request frame while leaving the IP header untouched. Requests pass through LB, but responses are sent directly from RS to the client.
Data Flow:
Client request reaches LB's VIP.
LB selects an RS and changes the request packet's destination MAC to the RS's MAC.
RS processes the request and sends the response directly to the client (source IP remains VIP).
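The flow above might be configured along these lines. The sketch assumes a VIP of 203.0.113.10 and two RS at 203.0.113.21 and 203.0.113.22 on the same layer-2 segment as the LB; the addresses, weights, and scheduler are illustrative only. ARP suppression on the RS is still required and is covered in section 2.3.

```shell
# --- On the LB (director); run as root ---
# Create the virtual service with weighted round-robin scheduling.
ipvsadm -A -t 203.0.113.10:80 -s wrr

# -g selects direct routing (gatewaying). Only the MAC address is rewritten,
# so the RS port must match the VIP port.
ipvsadm -a -t 203.0.113.10:80 -r 203.0.113.21:80 -g -w 1
ipvsadm -a -t 203.0.113.10:80 -r 203.0.113.22:80 -g -w 1

# --- On each RS ---
# Bind the VIP to loopback with a /32 mask so the RS accepts packets
# addressed to the VIP and replies with the VIP as source.
ip addr add 203.0.113.10/32 dev lo
```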
2.2 Network Topology
          Internet
             |
             | (VIP)
      [Load Balancer]
             |
   +------+--+---+------+
   |      |      |      |
 [RS1]  [RS2]  [RS3]  [RS4]
 (VIP)  (VIP)  (VIP)  (VIP)
2.3 ARP Problem Solution
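# On each RS: keep the RS from answering or advertising ARP for the VIP,
# so that only the LB receives client traffic addressed to it.
# arp_ignore=1: reply only if the target IP is configured on the interface the request arrived on.
echo 1 > /proc/sys/net/ipv4/conf/lo/arp_ignore
echo 1 > /proc/sys/net/ipv4/conf/all/arp_ignore
# arp_announce=2: use the best local address for the target as ARP source, rather than the packet's source IP (the VIP).
echo 2 > /proc/sys/net/ipv4/conf/lo/arp_announce
echo 2 > /proc/sys/net/ipv4/conf/all/arp_announce
2.4 Advantages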
High performance: Response traffic bypasses LB.
Strong scalability: Supports a large number of RS.
Light LB load: LB handles only inbound packets; it still keeps a connection table but never touches response traffic.
Fault isolation: RS failure does not affect other RS.
2.5 Disadvantages
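Complex configuration: Requires ARP suppression on every RS.
Network limitation: LB and all RS must share one layer-2 segment, since forwarding relies on MAC rewriting.
OS requirement: RS must be able to hold the VIP on a non-ARP interface (typically loopback) and suppress ARP for it.
Debug difficulty: Asymmetric request and response paths make network troubleshooting more complex.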
2.6 Applicable Scenarios
Large‑scale high‑concurrency web applications.
Performance‑critical environments.
Applications with large response data volumes.
Clusters deployed within the same subnet.
3. LVS‑TUN (Tunneling) Mode
3.1 Working Principle
LVS‑TUN uses IP tunneling to encapsulate client requests in a new IP packet, which is then sent to the RS. The RS decapsulates, processes the request, and returns the response directly to the client.
Data Flow:
Client request reaches LB's VIP.
LB selects an RS and encapsulates the request in a new IP packet.
RS receives the encapsulated packet, decapsulates it, processes the request.
RS sends the response directly to the client.
3.2 Network Topology
          Internet
             |
             | (VIP)
      [Load Balancer]
             |
         IP Tunnel
             |
   +------+--+---+------+
   |      |      |      |
 [RS1]  [RS2]  [RS3]  [RS4]
 (RIP)  (RIP)  (RIP)  (RIP)
3.3 Configuration Points
Both LB and RS need to configure VIP.
RS must support IP tunneling.
RS's VIP is configured on the tunnel interface.
Can be deployed across different subnets.
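One way to realize these points with IP-in-IP tunneling is sketched below. The VIP (203.0.113.10) and the RIPs in another subnet (198.51.100.21 and 198.51.100.22) are placeholder assumptions, and the sketch uses the kernel's default tunl0 device rather than a dedicated tunnel per LB.

```shell
# --- On the LB (director); run as root ---
ipvsadm -A -t 203.0.113.10:80 -s rr
# -i selects IP-in-IP tunneling; the RS may sit in another subnet.
ipvsadm -a -t 203.0.113.10:80 -r 198.51.100.21:80 -i
ipvsadm -a -t 203.0.113.10:80 -r 198.51.100.22:80 -i

# --- On each RS ---
# Load the IPIP module; this creates the tunl0 fallback device.
modprobe ipip
ip link set tunl0 up
# Bind the VIP to the tunnel interface so decapsulated packets are accepted.
ip addr add 203.0.113.10/32 dev tunl0
# Relax reverse-path filtering, which can otherwise drop decapsulated
# packets whose source address is the remote client.
sysctl -w net.ipv4.conf.tunl0.rp_filter=0
sysctl -w net.ipv4.conf.all.rp_filter=0
```

As in DR mode, the RS should also suppress ARP for the VIP if it shares a segment with the LB.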
3.4 Advantages
Cross-subnet: RS can be placed in different network segments.
High performance: Response traffic bypasses LB.
Geographic distribution: Supports distributed deployment.
Flexibility: More flexible network topology.
3.5 Disadvantages
Tunnel overhead: IP encapsulation adds network overhead.
Complex configuration: Requires tunnel interface setup.
OS requirement: RS must support IP tunneling.
Debug difficulty: Network troubleshooting is harder.
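The tunnel overhead can be made concrete with a little arithmetic. The sketch below assumes IPv4 IP-in-IP encapsulation with no IP or TCP options; `ipip_mss` is a hypothetical helper name, not part of any LVS tooling. On a 1500-byte link the usable TCP payload drops from 1460 to 1440 bytes, which is why MTU or MSS mismatches are a common source of stalled connections in this mode.

```shell
# Hypothetical helper: largest TCP payload (MSS) that fits in one
# IPIP-encapsulated packet without fragmentation (IPv4, no options).
ipip_mss() {
    link_mtu=$1
    outer_ip=20   # extra IPv4 header added by the tunnel
    inner_ip=20   # original IPv4 header
    tcp=20        # TCP header without options
    echo $(( link_mtu - outer_ip - inner_ip - tcp ))
}

ipip_mss 1500   # prints 1440
```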
3.6 Applicable Scenarios
Cross‑subnet distributed deployments.
Geographically distributed CDN systems.
Applications needing high network flexibility.
4. LVS‑FULLNAT Mode
4.1 Working Principle
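LVS-FULLNAT is an enhanced mode developed by Taobao that supports cross-subnet deployment without special routing on the RS. The LB rewrites both the source and destination IP addresses of requests and responses. Because the RS sees one of the LB's own addresses as the source, its reply returns to the LB through ordinary routing, wherever the RS is located.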
Data Flow:
Client request reaches LB's VIP.
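LB selects an RS and rewrites both addresses of the request: the source becomes one of the LB's own local IPs and the destination becomes the RIP.
RS processes the request and returns the response to that LB local IP.
LB reverses the rewrite, setting the source to the VIP and the destination to the client IP, and sends the response back to the client.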
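To make the address rewriting concrete, the following sketch prints the source and destination a packet would carry at each hop. It is purely illustrative and does not touch the network; `fullnat_flow`, the `lip` name for the LB's local address, and all IP values are assumptions made for the example.

```shell
# Hypothetical illustration: prints the addresses carried by a packet
# at each hop in FULLNAT mode.
fullnat_flow() {
    cip=$1   # client IP
    vip=$2   # virtual IP on the LB
    lip=$3   # local IP the LB uses as the new source
    rip=$4   # real server IP
    echo "client->LB  src=$cip dst=$vip"
    echo "LB->RS      src=$lip dst=$rip"
    echo "RS->LB      src=$rip dst=$lip"
    echo "LB->client  src=$vip dst=$cip"
}

fullnat_flow 198.51.100.7 203.0.113.10 10.0.0.1 10.1.2.21
```

The second line of output shows why the RS needs no special gateway: it only ever sees the LB's local address, never the client's.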
4.2 Network Topology
          Internet
             |
             | (VIP)
      [Load Balancer]
             |
     Different Subnets
             |
   +------+--+---+------+
   |      |      |      |
 [RS1]  [RS2]  [RS3]  [RS4]
 (RIP)  (RIP)  (RIP)  (RIP)
4.3 Configuration Points
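Not part of the mainline kernel; relies on patched LVS builds such as those used by cloud providers (Alibaba Cloud, Tencent Cloud, etc.).
Supports cross-subnet deployment.
RS requires no special network configuration (no VIP binding, gateway change, or tunnel).
Supports real-IP retrieval: since the LB replaces the source address, the client IP is passed in a TCP option and read on the RS by the TOA kernel module.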
4.4 Advantages
Cross-subnet support: RS can be deployed in any subnet.
Simple configuration: RS needs no special network setup.
Real-IP acquisition: The client's real IP can still be recovered on the RS, typically with the TOA module.
Flexible deployment: Very flexible network topology.
4.5 Disadvantages
Performance overhead: All traffic passes through LB.
Vendor dependency: Requires a patched LVS build; not available in the mainline kernel.
Scheduling complexity: LB must keep connection state to map each response back to its client.
Scalability limits: Constrained by LB processing capacity.
4.6 Applicable Scenarios
Cloud environment deployments.
Complex network environments with cross‑subnet requirements.
Applications that must still see the client's real IP even though the LB rewrites the source address.
Scenarios demanding high network flexibility.
5. Comparison of the Four Modes
5.1 Performance Comparison
NAT: Low performance, medium scalability, same-subnet requirement, simple configuration.
DR: High performance, high scalability, same-subnet requirement, complex configuration.
TUN: High performance, high scalability, cross-subnet capability, complex configuration.
FULLNAT: Medium performance, medium scalability, cross-subnet capability, moderate configuration complexity.
5.2 Technical Characteristics Comparison
Traffic path: NAT and FULLNAT route both request and response through LB; DR and TUN route only the request through LB, with the response going directly from RS to the client.
Network requirements: NAT and DR need the same subnet; TUN and FULLNAT can work across subnets.
5.3 Selection Recommendations
Choose NAT when:
Small‑scale deployment (<10 RS).
High security is required.
Network environment is simple.
Port mapping is needed.
Choose DR when:
Large‑scale, high‑concurrency applications.
Extreme performance requirements.
Same‑subnet deployment.
Large response data volume.
Choose TUN when:
Cross‑subnet deployment is required.
Geographically distributed systems.
High network flexibility needed.
Tunnel technology is supported.
Choose FULLNAT when:
Cloud environment deployment.
Complex network environments.
Real-IP acquisition is needed despite source address rewriting.
Simplicity of configuration is a priority.
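As a rough aid, the recommendations above can be collapsed into a small decision helper. This is a simplified reading, not an authoritative rule set: `choose_lvs_mode` and its three yes/no arguments are hypothetical, the cross-subnet question is checked first, and factors such as cluster size and security are left out.

```shell
# Hypothetical helper encoding a simplified reading of the recommendations.
# Arguments (each yes or no):
#   $1 cross_subnet  - are the RS in a different subnet from the LB?
#   $2 lb_bypass     - must responses bypass the LB for performance?
#   $3 port_mapping  - must the RS port differ from the VIP port?
choose_lvs_mode() {
    cross_subnet=$1
    lb_bypass=$2
    port_mapping=$3
    if [ "$cross_subnet" = yes ]; then
        if [ "$lb_bypass" = yes ]; then echo TUN; else echo FULLNAT; fi
    elif [ "$port_mapping" = yes ] || [ "$lb_bypass" = no ]; then
        echo NAT
    else
        echo DR
    fi
}

choose_lvs_mode no yes no    # prints DR
choose_lvs_mode yes no no    # prints FULLNAT
```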
6. Deployment Recommendations
6.1 Hardware Requirements
LB server: High-performance CPU, large memory, multiple NICs.
RS servers: Configured according to application needs.
Network equipment: Must support the required network features.
6.2 Monitoring Points
LB CPU, memory, and network utilization.
RS health status and response time.
Network connection count and concurrency.
Error and timeout rates.
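Several of these figures can be read straight from the LB with ipvsadm. The commands below are a starting point to run as root on the LB; how they are scraped into a monitoring system is left open.

```shell
# Virtual services and real servers with active/inactive connection counts.
ipvsadm -Ln
# Cumulative connection, packet, and byte counters per service and real server.
ipvsadm -Ln --stats
# Current connections, packets, and bytes per second.
ipvsadm -Ln --rate
# Size of the connection table (the count includes the header lines).
ipvsadm -Lnc | wc -l
# Raw counters exposed by the kernel.
cat /proc/net/ip_vs_stats
```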
6.3 High Availability Design
LB deployed in active‑standby or cluster mode.
RS implements automatic failover.
Health‑check mechanisms.
Automatic scaling.
6.4 Performance Optimization
Select appropriate scheduling algorithms.
Optimize kernel parameters.
Network tuning.
Application‑level optimizations.
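A few of these knobs are shown below as a sketch. The VIP, the choice of weighted least-connection scheduling, and the timeout values are illustrative assumptions and should be validated against real traffic before use.

```shell
# Run on the LB as root.
# Switch an existing virtual service to weighted least-connection scheduling.
ipvsadm -E -t 203.0.113.10:80 -s wlc
# Set connection timeouts in seconds: established TCP, TCP after FIN, UDP.
ipvsadm --set 600 60 120
# Show the timeouts now in effect.
ipvsadm -L --timeout
# Expire a connection as soon as a packet arrives for a real server
# that is no longer available.
sysctl -w net.ipv4.vs.expire_nodest_conn=1
```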
7. Troubleshooting Guide
7.1 Common Issues
NAT mode problems:
Gateway configuration errors.
IP forwarding not enabled.
Firewall rule issues.
DR mode problems:
ARP configuration errors.
VIP configuration problems.
MAC address learning anomalies.
TUN mode problems:
Tunnel configuration errors.
MTU setting issues.
Encapsulation/decapsulation failures.
7.2 Troubleshooting Tools
ipvsadm: View LVS status.
tcpdump: Packet capture and analysis.
netstat: Inspect network connections.
arp: Check ARP tables.
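Typical invocations of these tools, together with two sysctl checks for the NAT and DR issues listed in section 7.1, might look like this. The VIP, port, and interface name are placeholders.

```shell
# Placeholder values; adjust to the environment.
VIP=203.0.113.10
IFACE=eth0

# Virtual services, real servers, and connection counts (on the LB).
ipvsadm -Ln
# Watch traffic for the VIP on port 80 without name resolution (Ctrl-C to stop).
tcpdump -nn -i "$IFACE" host "$VIP" and port 80
# TCP sockets in numeric form.
netstat -ant
# ARP table; in DR mode neighbors should map the VIP to the LB's MAC only.
arp -n
# NAT mode: confirm forwarding is enabled on the LB.
sysctl net.ipv4.ip_forward
# DR mode: confirm ARP suppression on the RS.
sysctl net.ipv4.conf.all.arp_ignore net.ipv4.conf.all.arp_announce
```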
7.3 Log Analysis
System logs: /var/log/messages
Kernel logs: dmesg
Application logs: Depends on the specific application.
Network logs: Captured via packet analysis.
Summary
The four LVS cluster modes each have distinct characteristics; selecting the appropriate scheme depends on specific requirements. NAT suits small‑scale deployments, DR fits high‑performance scenarios, TUN enables cross‑subnet deployments, and FULLNAT is ideal for cloud environments. Understanding their principles and trade‑offs, and applying proper architecture design, configuration management, and monitoring, equips operations engineers with essential skills to deliver stable, high‑efficiency load‑balancing services for large‑scale web applications.
