Designing a System That Scales to 100 Million Users: Key Strategies

This guide explains how to build a highly available, scalable architecture for supporting hundreds of millions of users, covering decoupling, redundancy, vertical and horizontal scaling, load balancing, database replication, sharding, caching, CDN, GeoDNS, and best practices for progressive system expansion.

21CTO
21CTO
21CTO
Designing a System That Scales to 100 Million Users: Key Strategies
Core of high‑availability architecture is decoupling and redundancy. Decoupling includes stateless design and sharding; redundancy includes cache, CDN, master‑slave or master‑master replication, and GeoDNS. Choose appropriate technologies at each product‑iteration stage to meet current needs while preparing for future growth.

Starting from Zero

Deploy a simple application on a single server (e.g., Apache or Tomcat with Oracle or MySQL). This monolithic setup has two major flaws: a database failure or a web‑server failure brings the whole system down, and there is no redundancy or fail‑over.

DNS Resolution

Clients query DNS to obtain the IP address of the system. DNS is usually provided as a paid service and is not run on the application servers.

The Art of Scalability

As data volume, transaction count, and user base grow, the system must scale. Two scaling approaches exist:

Scaling up (vertical): add more CPU, memory, storage, or network capacity to existing servers.

Scaling out (horizontal): add more servers or instances to a pool.

Scaling Up

Increase resources of a single machine (more RAM, faster CPU, SSDs, additional NICs, RAID expansion). Suitable for small systems but limited by hardware, OS, and requires downtime for upgrades.

Scaling Out

Add arbitrary numbers of servers and application instances. Requires load balancing, code changes for parallel processing, and careful data partitioning.

Load Balancer

A load balancer (hardware or software such as HAProxy or Nginx) distributes traffic across multiple backend servers, improving responsiveness and availability. Common algorithms include round‑robin, least connections, fastest response, weighted, and IP hash.

Relational Database Scalability

Techniques include master‑slave replication, master‑master replication, federation, sharding, denormalization, and SQL tuning.

Master‑Slave Replication

Writes go to the master; changes are replicated to slaves. If the master fails, reads can continue from slaves but writes are unavailable until a new master is promoted.

Master‑Master Replication

All nodes act as masters, synchronizing data periodically. This provides higher availability and geographic distribution but is limited by write capacity.

Federation

Split databases by functional domains (e.g., forum, user, product) to reduce load on each database.

Sharding

Divide a large table into smaller pieces based on a key (e.g., user‑id). Each shard handles a subset of rows, improving performance and load distribution.

Denormalization

Duplicate data across tables to reduce expensive joins, improving read performance at the cost of write complexity. Materialized views in PostgreSQL or Oracle can help keep redundant data consistent.

Database Selection

Two major families:

SQL (relational): MySQL, PostgreSQL, Oracle, SQL Server, SQLite, MariaDB.

NoSQL (non‑relational): key‑value (Redis, Dynamo), document (MongoDB, CouchDB), wide‑column (Cassandra, HBase), graph (Neo4j, InfiniteGraph), blob storage (S3, Azure Blob).

Most enterprises use a mix of SQL and NoSQL to satisfy different requirements.

Web‑Layer Horizontal Scaling

Move session state out of the web tier into a shared datastore (SQL or NoSQL) to achieve a stateless architecture, enabling easy horizontal scaling.

Advanced Concepts

Cache

Caching at database, web‑server, or network layers reduces latency and load, improving scalability.

Content Delivery Network (CDN)

CDN nodes cache static assets near users, decreasing page load time and increasing availability.

Globalization (GeoDNS)

GeoDNS resolves domain names to IPs of the nearest data center, routing users to the optimal location.

Conclusion

Applying stateless design, load balancing, caching, multi‑data‑center deployment, CDN, sharding, and other techniques at appropriate product‑iteration stages enables a system to scale beyond 100 million users.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Scalabilityshardinghigh availabilityload balancingcachingCDNDatabase Replication
21CTO
Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.