Designing Scalable Large Websites: Session Management, Load Balancing, and Database Strategies
The article examines how to build and scale large websites by combining technical and business considerations, covering static site deployment, multi‑server availability, session synchronization methods, load‑balancing techniques, database read/write separation, caching, and search indexing to handle high concurrency and data growth.
First, the article asks what qualifies a website as "large" and argues that size is not only about traffic volume but also about the difficulty of either the technical implementation or the business requirements, which drives higher investment in talent and resources.
Even sites with massive traffic and concurrency, such as hao123, can be built with very simple web technology: by serving fully static pages and deploying many servers, such a site can absorb very high traffic.
The author defines a large website as the combination of challenging technology and demanding business; when either side is hard, enterprises must invest more skilled personnel to deliver the product.
Newly launched sites typically have a small user base and can be served by a simple architecture: at least two machines for redundancy and a single database server, as illustrated below.
Although the technology stack is basic and the cost low, this setup still requires three servers and colocation space, which can be expensive for small teams. Cloud platforms now allow cheap deployment without even separating the application and the database, but they introduce a single point of failure: if the cloud provider goes down, the site goes down with it.
Deploying an application across multiple servers serves two main purposes: ensuring availability (if one server fails, others keep the site running) and increasing concurrency (more servers can handle more simultaneous requests).
To achieve these goals, session state must be shared across servers. Web containers store sessions in memory and associate them with a session‑id placed in a cookie; each request carries this cookie so the server can retrieve the correct session data.
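As a minimal sketch of this cookie-to-session mapping (hypothetical names; a plain dict stands in for the web container's in-memory session table):

```python
import uuid

sessions = {}  # session_id -> session data, held in server memory

def get_or_create_session(cookies):
    """Return (session_id, session_data), creating a new session if needed."""
    sid = cookies.get("JSESSIONID")
    if sid is None or sid not in sessions:
        sid = uuid.uuid4().hex   # issue a fresh session id
        sessions[sid] = {}       # empty session stored server-side
    return sid, sessions[sid]

# First request: no cookie, so the server creates a session and sets the cookie.
sid, data = get_or_create_session({})
data["user"] = "alice"

# A later request carries the cookie, so the same session is retrieved.
sid2, data2 = get_or_create_session({"JSESSIONID": sid})
assert sid2 == sid and data2["user"] == "alice"
```

Because `sessions` lives in one server's memory, a second server cannot see it, which is exactly the multi-server problem the following paragraphs address.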
Synchronizing sessions between servers by copying memory (e.g., Tomcat's built‑in session replication) quickly consumes resources and does not scale linearly; beyond a certain number of servers, overall concurrency can even drop.
A more efficient approach is to store session data on an external cache server, such as a dedicated memcached instance or a distributed cache, which improves stability and concurrency.
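A sketch of that externalized session store follows; a shared dict stands in for memcached or a distributed cache, and the class name is an assumption for illustration:

```python
import json
import uuid

class SessionStore:
    """Sessions kept in an external cache shared by all web servers."""
    def __init__(self, cache, ttl=1800):
        self.cache = cache   # in practice: a memcached/redis client
        self.ttl = ttl       # sessions would expire after this many seconds

    def save(self, sid, data):
        # A real client would be e.g. cache.set(sid, payload, self.ttl)
        self.cache[sid] = json.dumps(data)

    def load(self, sid):
        raw = self.cache.get(sid)
        return json.loads(raw) if raw is not None else None

shared_cache = {}                 # one cache reachable from every web server
store = SessionStore(shared_cache)

# Server A writes the session; server B, using the same cache, reads it back.
sid = uuid.uuid4().hex
store.save(sid, {"cart": ["book"]})
assert store.load(sid) == {"cart": ["book"]}
```

Any web server can now handle any request, so no memory-to-memory replication is needed.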
Early Taobao stored session information directly in the browser cookie, eliminating server‑side synchronization; while this raises security concerns, it reduces server load and was used for a long time.
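The usual way to make browser-held session data tamper-resistant is to sign it with a server-side secret; a minimal sketch, assuming an HMAC scheme (the key and field names are illustrative, not Taobao's actual design):

```python
import base64
import hashlib
import hmac
import json

SECRET = b"server-side-secret"  # hypothetical key, never sent to the browser

def encode_session(data):
    """Serialize session data into a signed cookie value."""
    payload = base64.urlsafe_b64encode(json.dumps(data).encode()).decode()
    sig = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return payload + "." + sig

def decode_session(cookie):
    """Verify the signature; return the data, or None if tampered with."""
    payload, _, sig = cookie.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))

cookie = encode_session({"uid": 42})
assert decode_session(cookie) == {"uid": 42}
assert decode_session(cookie + "x") is None   # modified cookie is rejected
```

The signature prevents forgery but not inspection, which is why sensitive data should still stay server-side.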
Static sites like hao123 avoid session storage altogether, freeing resources for request processing and enabling high concurrency without complex session handling.
In the author's company, incoming requests first pass through an F5 hardware load balancer (or a software solution like LVS) that distributes traffic and can implement "session stickiness" by routing requests with the same session‑id to the same backend server, though this method still has drawbacks if a server fails.
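Session stickiness can be sketched as hashing the session id onto a fixed backend pool (server names are hypothetical):

```python
import hashlib

servers = ["app1", "app2", "app3"]   # hypothetical backend pool

def pick_backend(session_id):
    """Route every request carrying the same session id to the same server."""
    h = int(hashlib.md5(session_id.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

# Repeated requests with one session id always land on one backend.
assert pick_backend("abc123") == pick_backend("abc123")
```

The drawback the article notes is visible here: if the chosen backend dies, every session hashed to it is lost, since its session data lived only in that server's memory.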
The core issue is session storage, which ties into overall data storage. Cloud platforms simplify deployment but raise concerns about data backup and recovery in case of provider outages.
When a site grows rapidly, the first bottleneck is often storage; adding more servers is cheap, but adopting SOA or distributed architectures incurs high cost and complexity.
Financial websites typically have balanced read/write workloads, whereas consumer sites (e.g., review platforms) are read‑heavy, influencing scaling strategies.
The 12306 ticketing site historically suffered from 503 errors during peak traffic, illustrating the need for designs that prevent total outage by gracefully rejecting excess requests while keeping the rest of the site functional.
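One common way to reject excess load gracefully rather than stall the whole site is a fixed pool of request slots; a minimal sketch with a semaphore (the capacity number is an illustrative assumption):

```python
import threading

MAX_CONCURRENT = 100                  # hypothetical capacity limit
slots = threading.Semaphore(MAX_CONCURRENT)

def handle_request(process):
    """Serve within capacity; shed excess load with an immediate 503."""
    if not slots.acquire(blocking=False):
        return "503 Service Unavailable"   # fail fast instead of queueing
    try:
        return process()
    finally:
        slots.release()

assert handle_request(lambda: "200 OK") == "200 OK"
```

Requests beyond capacity get a fast rejection, so the requests that are admitted still complete normally.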
In most cases, database overload is the primary cause of failure under high concurrency; separating read and write operations into a master‑slave setup improves capacity, as shown in the diagram below.
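Read/write separation amounts to routing each statement by type; a sketch where strings stand in for real database connections:

```python
import random

class RoutingConnection:
    """Send writes to the master and spread reads across replicas."""
    def __init__(self, master, replicas):
        self.master = master
        self.replicas = replicas

    def route(self, sql):
        # SELECTs can go to any replica; everything else must hit the master
        # so that writes are applied in one place and replicated outward.
        if sql.lstrip().upper().startswith("SELECT"):
            return random.choice(self.replicas)
        return self.master

db = RoutingConnection("master", ["replica1", "replica2"])
assert db.route("SELECT * FROM users") in ("replica1", "replica2")
assert db.route("UPDATE users SET name = 'x'") == "master"
```

A real router would also pin reads inside a transaction to the master to avoid reading stale replica data.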
Read‑heavy workloads benefit from an external distributed cache that absorbs frequent queries while the master database handles writes; this reduces latency and keeps the database from becoming the bottleneck, and distributing the cache itself across nodes prevents it from becoming a new one.
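The standard pattern here is cache-aside; a sketch where dicts stand in for the cache cluster and the database:

```python
cache = {}                                 # stands in for memcached/redis
database = {"user:1": {"name": "alice"}}   # stands in for the master DB

def get_user(key):
    """Cache-aside read: try the cache first, fall back to the database."""
    if key in cache:
        return cache[key]        # cache hit: no database load at all
    value = database.get(key)    # cache miss: one database read...
    if value is not None:
        cache[key] = value       # ...then populate the cache for next time
    return value

assert get_user("user:1") == {"name": "alice"}   # first read fills the cache
assert "user:1" in cache                         # later reads never touch the DB
```

Writes would update the database and invalidate (or overwrite) the cached entry so readers never see stale data for long.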
For massive data sets where users perform fuzzy searches, search engines use inverted indexes stored in files to provide fast retrieval, a technique more suitable than database LIKE queries for handling large‑scale, imprecise queries.
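An inverted index maps each term to the set of documents containing it, so a query becomes a set intersection instead of a full scan; a minimal sketch with toy documents:

```python
from collections import defaultdict

docs = {
    1: "cheap hotel in beijing",
    2: "beijing travel guide",
    3: "cheap flights",
}

# Build the inverted index: term -> ids of documents containing that term.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(*terms):
    """Return ids of documents containing every query term."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()

assert search("cheap") == {1, 3}
assert search("cheap", "beijing") == {1}
```

A search engine persists these postings lists in files and merges them at query time, which is why lookup cost depends on the matching documents rather than on total data size, unlike a database `LIKE '%term%'` scan.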
(This article is reposted from the internet; please contact the original author for any copyright concerns.)