Evolution Stages of a Large‑Scale Web Application Architecture
The article outlines a step‑by‑step evolution of a web application from a single‑machine deployment to a fully distributed architecture, covering server separation, clustering, load‑balancing algorithms, session handling, read‑write splitting, search engine integration, caching, database sharding, service‑oriented decomposition, and message‑queue middleware.
Stage 1 – Single‑Machine Site
Initially all components (Tomcat/Jetty, JSP/Servlet, Maven+Spring+Hibernate or MyBatis, and a DBMS such as MySQL, SQL Server, or Oracle) run on one server, forming a small self-contained system.
Stage 2 – Separate Application Server and Database
When traffic grows, the application server and database are split onto different machines to improve load capacity and fault tolerance.
Stage 3 – Application Server Cluster
Multiple application servers are added; requests are distributed by a load balancer, typically LVS (managed via ipvsadm) with keepalived for failover. The diagram shows a two‑node cluster.
The system then faces four problems: request forwarding, forwarding algorithm, response routing, and session consistency.
Load‑Balancing Solutions (Problem 1)
1. HTTP redirect – simple but low performance.
2. DNS load balancing – no need for a dedicated balancer but slow to react to failures.
3. Reverse proxy (Apache, Nginx) – easy deployment, possible bottleneck.
4. IP‑layer load balancing – better performance, bandwidth can become a bottleneck.
5. Data‑link‑layer load balancing (e.g., LVS DR mode) – the balancer rewrites only the destination MAC address, so real servers respond to the client directly and response traffic bypasses the balancer.
Scheduling Algorithms (Problem 2)
rr (round‑robin), wrr (weighted round‑robin), sh (source hash), dh (destination hash), lc (least connections), wlc (weighted least connections), lblc (locality‑based least connections), lblcr (lblc with replication), etc.
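To make the wrr idea concrete, here is a minimal sketch of weighted round‑robin: each server is placed in the rotation as many times as its weight, so a weight‑2 server receives twice the traffic of a weight‑1 server. Server names and weights are hypothetical; production balancers such as Nginx use a smoother interleaving, but the distribution over a full cycle is the same.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Weighted round-robin sketch: expand each server into the rotation
// according to its weight, then cycle through the expanded list.
public class WeightedRoundRobin {
    private final List<String> rotation = new ArrayList<>();
    private int index = 0;

    public WeightedRoundRobin(Map<String, Integer> weights) {
        weights.forEach((server, weight) -> {
            for (int i = 0; i < weight; i++) rotation.add(server);
        });
    }

    // Returns the next server in the weighted rotation.
    public synchronized String next() {
        String server = rotation.get(index);
        index = (index + 1) % rotation.size();
        return server;
    }
}
```

With weights {A: 2, B: 1}, the rotation yields A, A, B, A, A, B, …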
Cluster Modes (Problem 3)
NAT (network address translation), DR (direct routing), TUN (IP tunneling) – three ways the balancer forwards traffic to the real servers.
Session Solutions (Problem 4)
Session sticky (e.g., IP hash), session replication across nodes, a centralized session store (a database or cache such as Redis), and cookie‑based sessions (state kept on the client).
Notes: Nginx supports wrr, ip_hash (sh), and fair (the latter via a third‑party module); keepalived + ipvsadm (LVS) support all of the algorithms and cluster modes above.
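The session-sticky option can be sketched in a few lines: hash the client IP onto the server list so the same client always lands on the same node. The server names are hypothetical; the known weakness of this scheme, noted in the comment, is that changing the server count remaps most clients (which is why consistent hashing is often preferred).

```java
import java.util.List;

// Session-sticky (ip_hash) sketch: the same client IP always maps to the
// same application server, so its session stays on one node.
public class IpHashBalancer {
    private final List<String> servers;

    public IpHashBalancer(List<String> servers) {
        this.servers = servers;
    }

    public String pick(String clientIp) {
        // floorMod keeps the index non-negative for any hash code.
        // Caveat: adding or removing a server remaps most clients.
        int idx = Math.floorMod(clientIp.hashCode(), servers.size());
        return servers.get(idx);
    }
}
```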
Stage 4 – Database Read/Write Splitting
As the database scales, master‑slave replication with read‑write separation (often handled by middleware such as Mycat) keeps writes on the master and routes reads to the slaves while avoiding data inconsistency.
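The routing decision itself is simple, as this sketch shows: writes go to the master, reads are round‑robined over the slaves. The data-source names are hypothetical, and the SELECT-prefix check is deliberately crude; real deployments delegate this (plus replication-lag handling) to middleware.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Read/write-splitting sketch: writes hit the master, reads are
// round-robined across the slave replicas.
public class ReadWriteRouter {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger counter = new AtomicInteger();

    public ReadWriteRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = slaves;
    }

    public String route(String sql) {
        // Rough classification: only plain SELECTs are sent to a
        // (possibly lagging) slave; everything else goes to the master.
        boolean isRead = sql.trim().toUpperCase().startsWith("SELECT");
        if (!isRead || slaves.isEmpty()) return master;
        return slaves.get(counter.getAndIncrement() % slaves.size());
    }
}
```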
Stage 5 – Search Engine for Read‑Heavy Queries
Introduce an inverted‑index search engine (e.g., a Lucene‑based engine such as Solr or Elasticsearch) to accelerate fuzzy searches such as product‑title lookup, which are costly with SQL LIKE scans.
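The core idea behind such engines is the inverted index: map each term to the set of documents containing it, so a term lookup is a hash probe instead of a full‑table LIKE scan. A minimal sketch (whitespace tokenization only; real engines add analyzers, ranking, and persistence):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Inverted-index sketch: term -> set of document ids containing it.
public class InvertedIndex {
    private final Map<String, Set<Integer>> index = new HashMap<>();

    public void add(int docId, String title) {
        // Naive tokenization: lowercase and split on whitespace.
        for (String term : title.toLowerCase().split("\\s+")) {
            index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    public Set<Integer> search(String term) {
        return index.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }
}
```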
Stage 6 – Caching Layer
In‑process caches (e.g., Guava) and distributed caches (e.g., Memcached, Redis) reduce database read pressure; page‑level caches (HTML5 localStorage, cookies) improve response speed on the client side.
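For the in-process tier, a bounded LRU cache is the usual starting point (this is roughly the role Guava Cache plays). Java's `LinkedHashMap` supports this directly via access-order mode; the capacity below is hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// In-process LRU cache sketch built on LinkedHashMap's access-order mode:
// the least-recently-used entry is evicted once capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // true = order entries by access, not insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict when over capacity
    }
}
```

Unlike Guava Cache, this sketch has no expiry or thread safety; wrap it with `Collections.synchronizedMap` or use a real cache library under concurrency.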
Stage 7 – Database Vertical & Horizontal Sharding
Vertical sharding separates business domains (transactions, products, users) into different databases; horizontal sharding splits a single table across multiple databases to handle volume.
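Horizontal sharding needs a routing rule from row key to physical database. The simplest is modulo sharding, sketched below with hypothetical shard names; its known drawback, noted in the comment, is that changing the shard count forces mass data migration, which is why range sharding or consistent hashing is often used instead.

```java
// Horizontal-sharding sketch: route a user row to a database by user id.
public class ShardRouter {
    private final int shardCount;

    public ShardRouter(int shardCount) {
        this.shardCount = shardCount;
    }

    public String shardFor(long userId) {
        // Modulo sharding: simple and uniform, but changing shardCount
        // remaps nearly every row (a full resharding migration).
        return "user_db_" + (userId % shardCount);
    }
}
```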
Stage 8 – Application Decomposition
Split monolithic applications into smaller services (e.g., separate user, product, transaction services) to avoid code bloat.
Adopt Service‑Oriented Architecture (SOA) to share common functionality via dedicated services.
Stage 9 – Introducing Message Middleware
Introduce middleware between services – a service framework with a registry (e.g., Dubbo with ZooKeeper) for RPC, load balancing, and monitoring, plus message queues for asynchronous, language‑agnostic communication across heterogeneous modules.
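The decoupling that a message queue buys can be shown with an in‑process sketch using the JDK's `BlockingQueue`: the producer returns as soon as the message is enqueued, and a consumer processes it whenever it is ready. This is only an illustration of the pattern; real broker middleware adds the properties that matter in production (persistence across restarts, delivery across machines, acknowledgements).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Message-queue sketch: producer and consumer share only the queue,
// so neither needs to know about, or wait for, the other.
public class MiniQueue {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    public void publish(String message) {
        queue.offer(message); // returns immediately for an unbounded queue
    }

    public String consume() throws InterruptedException {
        return queue.take(); // blocks until a message is available
    }
}
```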
Conclusion
The presented evolution is illustrative; real‑world sites must analyze their own traffic patterns, business needs, and constraints to design an appropriate architecture.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.