
Why Simple Load Balancing Fails and How to Build a Scalable Multi‑Layer Architecture

This article walks through the evolution from a single‑server Tomcat setup to a multi‑layer architecture using Nginx, a gateway, and LVS, explaining dynamic/static request separation, high‑availability strategies, and the performance trade‑offs that guide scalable backend design.

Su San Talks Tech

Everyone has heard the classic interview question: "Describe the entire flow from entering a keyword on Taobao to the final web page display, in as much detail as possible." Answering it well touches on HTTP, TCP, gateways, LVS, and many other protocols and components.

In the beginning, such a site can run on a single Tomcat server, which works fine at low traffic. As the business grows, one machine becomes both a performance bottleneck and a single point of failure, prompting the addition of multiple application servers fronted by a load balancer (LB) — typically Nginx — to distribute client requests.
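A minimal Nginx configuration for this first evolution might look like the following sketch; the upstream name, addresses, and ports are illustrative, not from the original setup:

```nginx
http {
    # Pool of Tomcat application servers; addresses are examples
    upstream tomcat_pool {
        server 10.0.0.11:8080;
        server 10.0.0.12:8080;
        server 10.0.0.13:8080 backup;  # only receives traffic if the others fail
    }

    server {
        listen 80;

        location / {
            proxy_pass http://tomcat_pool;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
```

Clients only ever see the Nginx address; the backend server list stays hidden, and a failed server is simply skipped by the proxy.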

To avoid exposing the server list to clients and to handle server failures gracefully, the LB decides which backend server to forward each request to. However, routing all traffic directly to the servers raised security concerns, leading to the introduction of a gateway layer for authentication, risk control, protocol conversion, and traffic shaping.

Static resources (JS, CSS) placed heavy load on Tomcat, so the architecture was refined: dynamic requests go through the gateway to Tomcat, while static requests are served directly by Nginx, leveraging its proxy cache for high‑performance static content delivery.

Dynamic‑static separation allows Tomcat to focus on dynamic processing while Nginx handles caching and serving static files.
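In Nginx terms, the separation described above can be sketched like this; the cache path, zone name, upstream names, and file extensions are assumptions for illustration:

```nginx
http {
    # Cache zone for static responses; path and sizes are illustrative
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static_cache:10m max_size=1g;

    upstream gateway_pool  { server 10.0.0.21:8080; }
    upstream static_origin { server 10.0.0.31:8080; }

    server {
        listen 80;

        # Static assets: cached by Nginx so repeat requests never hit the origin
        location ~* \.(js|css|png|jpg|gif|ico)$ {
            proxy_pass http://static_origin;
            proxy_cache static_cache;
            proxy_cache_valid 200 7d;
        }

        # Dynamic requests: forwarded to the gateway, then on to Tomcat
        location / {
            proxy_pass http://gateway_pool;
        }
    }
}
```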

For internal services with different authentication requirements, the gateway can be bypassed, sending requests directly to the backend servers via Nginx.

To avoid making Nginx itself a single point of failure, two Nginx instances are deployed in active‑standby mode with keepalived: the pair shares a virtual IP via VRRP, and when the active node fails its health check, the standby takes over the virtual IP.
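A keepalived configuration for the active node might look like this sketch (the interface name, virtual IP, and priority values are assumptions; the standby node would use `state BACKUP` and a lower priority):

```conf
vrrp_script check_nginx {
    script "pidof nginx"   # health check: is the Nginx process still running?
    interval 2
    weight -20             # drop VRRP priority if the check fails
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        10.0.0.100         # the VIP clients actually connect to
    }
    track_script {
        check_nginx
    }
}
```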

Because Nginx operates at layer 7, it must terminate the client's TCP connection and open a second connection to the upstream server, consuming memory per connection and limiting scalability under massive connection loads.

Four‑layer load balancers like LVS work at layer 4, forwarding packets without creating TCP connections, offering higher throughput and lower resource consumption.

How does a four‑layer load balancer work?

When the device receives the first SYN from a client, it selects the best backend server, rewrites the destination IP, and forwards the packet. The TCP three‑way handshake then occurs directly between client and server, with the load balancer acting only as a router.
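With LVS, this behavior is configured through `ipvsadm`; the sketch below uses NAT mode, which matches the destination‑IP rewriting described above (the VIP, real‑server addresses, and round‑robin scheduler are assumptions):

```shell
# Create a virtual service on the VIP, using round-robin (rr) scheduling
ipvsadm -A -t 10.0.0.100:80 -s rr

# Add the two Nginx instances as real servers in NAT mode (-m);
# direct-routing mode (-g) would rewrite the MAC instead of the IP
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.11:80 -m
ipvsadm -a -t 10.0.0.100:80 -r 192.168.1.12:80 -m
```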

Combining Nginx (layer 7) with LVS (layer 4) provides both flexible request routing and high‑performance packet forwarding. High availability is achieved by deploying LVS in active‑standby mode, and horizontal scaling is possible by adding more Nginx instances or using DNS load balancing.

When traffic grows beyond a single LVS capacity, multiple LVS instances can be used with DNS‑based load balancing to distribute client requests across them.
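DNS‑based distribution is simply multiple A records for the same name, which resolvers rotate through; a zone‑file sketch with assumed names and addresses:

```
; Two LVS virtual IPs behind the same hostname; answers are rotated
www.example.com.   300  IN  A  203.0.113.10
www.example.com.   300  IN  A  203.0.113.20
```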

Summary

Architecture must be tailored to business needs; a simple Nginx load balancer suffices for modest traffic, while rapid growth may require an LVS + Nginx combo. For massive traffic (tens of Gbps, millions of concurrent connections), even LVS may fall short, prompting custom layer‑4 solutions.

Layered design ensures each component focuses on its responsibilities, making the system extensible and maintainable—just as the TCP/IP model demonstrates.

Hope you found this useful; the next article will dive deeper into the request round‑trip chain, exploring LVS, switches, and routers.

Tags: backend architecture, high availability, load balancing, Nginx, LVS
Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.
