Designing a System That Can Scale to Hundreds of Millions of Users
This article explains how to architect a highly scalable backend by starting with a simple single‑server setup, then applying vertical and horizontal scaling, database replication, sharding, caching, load balancing, CDN, GeoDNS and stateless design to support over a hundred million concurrent users.
Designing a system that can support tens of billions of users is a daunting challenge for software architects, but the following steps break the problem into manageable pieces.
1. Start from a Single Server
Initially deploy a web application (e.g., on Apache or Tomcat) and a relational database (Oracle or MySQL) on the same physical machine. This simple setup quickly reveals two critical flaws: if the database fails, the whole system goes down; if the web server fails, the entire service becomes unavailable.
2. The Art of Scalability
Scalability means adding resources without degrading user experience. Two main approaches exist:
scale up (vertical scaling) – add more CPU, memory, storage, or faster network interfaces to an existing server.
scale out (horizontal scaling) – add more servers to a pool and distribute load among them.
Vertical scaling is easy for small systems but limited by hardware, OS, and downtime during upgrades. Horizontal scaling requires redesigning the architecture to be stateless and to handle distributed data.
3. Use Load Balancers
A load balancer (hardware or software such as HAProxy or Nginx) sits between clients and the server pool, routing traffic based on algorithms like round‑robin, least connections, fastest response, weighted, or IP hash. It improves availability and allows seamless addition or removal of servers.
4. Scale Relational Databases
When a single RDBMS cannot handle growth, employ techniques such as master‑slave replication, master‑master replication, federation, sharding, denormalization, and SQL tuning. Master‑slave replicates writes from the primary to one or more replicas; master‑master allows each node to accept writes and synchronizes them.
5. Choose the Right Database
SQL databases (MySQL, Oracle, PostgreSQL, etc.) store data in rows and columns, while NoSQL databases (key‑value, document, column‑family, graph, blob) offer flexible schemas and horizontal scalability. The choice depends on consistency, query patterns, and workload characteristics.
6. Horizontally Scale the Web Layer
Move session state out of the web tier into a shared datastore (SQL or NoSQL) to achieve a stateless architecture. Stateless services can be replicated freely, and the load balancer can route any request to any instance.
7. Advanced Concepts
Introduce caching layers (in‑memory caches like Redis) to reduce database load, and use a Content Delivery Network (CDN) to serve static assets from edge locations. Deploy multiple data centers worldwide and use GeoDNS to resolve users to the nearest center, further reducing latency.
8. What Comes Next?
Future topics include combining sharding with replication, long‑polling vs. WebSockets vs. Server‑Sent Events, indexing and proxying strategies, SQL performance tuning, and elastic compute resources.
By iteratively applying these techniques—stateless design, load balancing, caching, multi‑region deployment, CDN, and data‑layer sharding—developers can scale a system to serve over 100 million users.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.