
Designing Scalable Systems for Billions of Users: From a Single Server to Distributed Architecture

This article explains how to evolve a simple single‑server web application into a highly available, scalable system for billions of users. It covers DNS, vertical and horizontal scaling, load balancing, database replication, sharding, denormalization, SQL and NoSQL choices, stateless design, caching, CDNs, and global deployment.

IT Xianyu

Designing a system that can support billions of users is challenging, but the article breaks the problem into manageable steps.

It starts with the most basic architecture: a web server (Apache/Tomcat) and a single relational database (Oracle/MySQL) on one physical machine. Both components are single points of failure: if either one crashes, the whole system goes down.

DNS is introduced as the first layer of indirection, allowing clients to resolve a hostname to an IP address before contacting the system.
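The lookup step can be illustrated with a minimal sketch that asks the local resolver for the addresses behind a hostname. The helper name `resolve` is hypothetical, and `localhost` is used only so the example runs without network access; a real client would pass the site's public hostname.

```python
import socket


def resolve(hostname: str) -> list:
    """Return the distinct IP addresses the local resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]  # first element of the socket address is the IP string
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # The client would then open a connection to one of these addresses.
    print(resolve("localhost"))
```

Because clients only ever see the hostname, the addresses behind it can change later (for example, to point at a load balancer) without any client‑side change.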

Scalability: Vertical and Horizontal

Vertical scaling (scale up) adds resources such as CPU, memory, storage, or network bandwidth to an existing server, but it is limited by hardware costs, OS constraints, and required downtime.

Horizontal scaling (scale out) adds more servers to a pool, enabling independent scaling of the web and data layers, but requires redesign to distribute load and maintain consistency.
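As a rough illustration of sizing a scaled‑out pool, the sketch below estimates how many servers are needed for a given peak load, plus spare capacity so that the pool survives a failure. The function name and all the numbers are hypothetical; real sizing would come from load tests.

```python
import math


def servers_needed(peak_rps: float, per_server_rps: float, spare: int = 1) -> int:
    """Servers required to absorb peak_rps, plus `spare` extra for failures."""
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    return math.ceil(peak_rps / per_server_rps) + spare


if __name__ == "__main__":
    # Hypothetical figures: 10,000 requests/s at peak, 1,200 requests/s per server.
    print(servers_needed(10_000, 1_200))
```

With vertical scaling the only lever is `per_server_rps`, which eventually hits a hardware ceiling; with horizontal scaling the server count itself becomes the lever.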

Load Balancing

A load balancer (hardware, or software such as HAProxy or Nginx) distributes incoming traffic across multiple backend servers using algorithms such as round‑robin, least connections, fastest response, weighted, and IP hash.

Round‑robin: sequential request distribution.

Least connections: directs traffic to the server with the fewest active connections.

Fastest response: prefers the server with the lowest latency.

Weighted: gives stronger servers a larger share of traffic.

IP hash: maps a client’s IP to a specific server.
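Three of the algorithms listed above can be sketched in a few lines each. This is a minimal illustration, not how HAProxy or Nginx implement them; the class names and server labels are hypothetical, and health checks, weights, and thread safety are left out.

```python
import hashlib
import itertools


class RoundRobin:
    """Hand out servers in a fixed repeating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest requests currently in flight."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to the server that was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request finishes.
        self.active[server] -= 1


class IPHash:
    """Map each client IP to the same server on every request."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self, client_ip: str):
        digest = hashlib.sha256(client_ip.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]


if __name__ == "__main__":
    pool = ["web-1", "web-2", "web-3"]
    rr = RoundRobin(pool)
    print([rr.pick() for _ in range(4)])
    print(IPHash(pool).pick("203.0.113.7"))
```

Note that the modulo in `IPHash` remaps most clients whenever the pool size changes; that trade‑off is acceptable for a sketch but matters if servers hold per‑client state.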

Database Scaling Techniques

To handle growth, relational databases can be extended using:

Replication (master‑slave, master‑master) for redundancy and read scalability.

Sharding (horizontal partitioning) to split data across multiple nodes.

Functional partitioning (vertical partitioning) to separate tables by business domain.

Denormalization to improve read performance at the cost of write complexity.

SQL tuning and materialized views for query optimization.
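Replication and sharding are often combined: a key is hashed to choose a shard, writes go to that shard's primary, and reads are spread across its replicas. The sketch below shows only this routing decision; the class names and node labels are hypothetical, and the strings stand in for real database connections.

```python
import hashlib
import itertools


def shard_for(key: str, shard_count: int) -> int:
    """Stable hash of the key, reduced to a shard index."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ReplicatedShard:
    """One shard: a primary that takes writes and replicas that serve reads."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def node_for(self, is_write: bool):
        if is_write or self._replicas is None:
            return self.primary
        return next(self._replicas)  # rotate reads across the replicas


class Router:
    def __init__(self, shards):
        self.shards = shards

    def route(self, key: str, is_write: bool):
        shard = self.shards[shard_for(key, len(self.shards))]
        return shard.node_for(is_write)


if __name__ == "__main__":
    router = Router([
        ReplicatedShard("db0-primary", ["db0-r1", "db0-r2"]),
        ReplicatedShard("db1-primary", ["db1-r1", "db1-r2"]),
    ])
    print(router.route("user:42", is_write=True))
    print(router.route("user:42", is_write=False))
```

Two caveats apply to this sketch: replicas can lag behind the primary, so a read issued right after a write may return stale data, and modulo‑based sharding moves most keys when the shard count changes.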

The article also compares SQL databases (MySQL, PostgreSQL, Oracle, etc.) with NoSQL options (key‑value, document, column, graph, blob) and advises using both where appropriate.

Stateless Architecture

To scale the web layer, session state should be moved out of the web servers into a shared store (relational or NoSQL), enabling a stateless design that works seamlessly with load balancers.
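A minimal sketch of the idea: two web servers share one session store, so a session created through one server can be read through the other. `SessionStore` here is a hypothetical in‑memory stand‑in for the shared relational or NoSQL store; in a real deployment it would be a separate service reached over the network.

```python
import secrets
import time


class SessionStore:
    """In-memory stand-in for a shared session store with expiry."""

    def __init__(self, ttl_seconds: float = 1800.0):
        self._ttl = ttl_seconds
        self._data = {}  # session_id -> (expires_at, payload)

    def create(self, payload: dict) -> str:
        session_id = secrets.token_urlsafe(32)  # unguessable identifier
        self._data[session_id] = (time.monotonic() + self._ttl, dict(payload))
        return session_id

    def load(self, session_id: str):
        entry = self._data.get(session_id)
        if entry is None:
            return None
        expires_at, payload = entry
        if time.monotonic() >= expires_at:
            del self._data[session_id]  # drop expired sessions lazily
            return None
        return payload


class WebServer:
    """Keeps no session state of its own; everything lives in the store."""

    def __init__(self, name: str, sessions: SessionStore):
        self.name = name
        self.sessions = sessions

    def login(self, user: str) -> str:
        return self.sessions.create({"user": user})

    def whoami(self, session_id: str):
        session = self.sessions.load(session_id)
        return session["user"] if session else None


if __name__ == "__main__":
    store = SessionStore()
    web1, web2 = WebServer("web-1", store), WebServer("web-2", store)
    sid = web1.login("alice")
    print(web2.whoami(sid))  # served by a different server, same session
```

Because any server can handle any request, the load balancer no longer needs sticky routing, and servers can be added or removed without losing user sessions.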

Caching and CDN

Caching at the database, web, or network layer reduces latency and load, while a Content Delivery Network (CDN) stores static assets close to users, improving page load times and availability.
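One common way to apply caching in front of a slow data source is the cache‑aside pattern: check the cache first, fall back to the source on a miss, and store the result with a time‑to‑live. The sketch below assumes a hypothetical `loader` callable standing in for a database query; the class name and counters are illustrative only.

```python
import time


class CacheAside:
    """Read-through helper: serve from cache, load and store on a miss."""

    def __init__(self, loader, ttl_seconds: float = 60.0, clock=time.monotonic):
        self._loader = loader      # called only on a miss
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        self.misses += 1
        value = self._loader(key)  # slow path: go to the data source
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call after a write so the next read reloads fresh data.
        self._entries.pop(key, None)


if __name__ == "__main__":
    def load_profile(user_id):
        time.sleep(0.1)  # simulate a slow query
        return {"id": user_id}

    cache = CacheAside(load_profile, ttl_seconds=30)
    cache.get("u1")  # miss, hits the loader
    cache.get("u1")  # hit, served from memory
    print(cache.hits, cache.misses)
```

The TTL bounds how stale a cached value can get, and explicit invalidation on writes narrows that window further.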

Global Deployment

GeoDNS can route users to the nearest data center, allowing the system to operate worldwide with multiple active regions.
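The routing decision can be sketched as "pick the closest healthy region". This is only an illustration of the idea: the region names and coordinates are hypothetical, and the client's location is assumed to be already known, whereas a real GeoDNS service would infer it from the resolver's or client's IP address.

```python
import math

# Hypothetical regions with approximate (latitude, longitude) in degrees.
DATA_CENTERS = {
    "us-east": (39.0, -77.5),
    "eu-west": (53.3, -6.3),
    "ap-southeast": (1.35, 103.8),
}


def haversine_km(a, b) -> float:
    """Great-circle distance between two (lat, lon) points in kilometres."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_region(client, healthy=None) -> str:
    """Closest region to the client, optionally restricted to healthy ones."""
    candidates = {
        name: location
        for name, location in DATA_CENTERS.items()
        if healthy is None or name in healthy
    }
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=lambda name: haversine_km(client, candidates[name]))


if __name__ == "__main__":
    paris = (48.85, 2.35)
    print(nearest_region(paris))
    # If eu-west is down, traffic falls back to the next closest region.
    print(nearest_region(paris, healthy={"us-east", "ap-southeast"}))
```

The `healthy` filter is what lets multiple active regions double as failover targets for each other.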

Iterative Expansion

By iteratively applying these techniques (stateless services, load balancing, extensive caching, multi‑region data centers, a CDN, and data‑layer sharding), a system can be grown to support over 100 million users and beyond.
