Designing Scalable Systems for Billions of Users: Architecture, Scaling, and Database Strategies
This article explains how to evolve a simple single‑server web application into a highly available, horizontally scalable system by introducing DNS routing, vertical and horizontal scaling, load‑balancing, database replication, sharding, denormalization, SQL/NoSQL choices, caching, CDN, and GeoDNS techniques.
Designing a system that can support billions of users is challenging; the article starts with a basic single‑server setup (web server on Apache/Tomcat and an Oracle/MySQL database) and points out its single points of failure.
It introduces using a DNS server to resolve hostnames to IP addresses, allowing clients to locate the system’s entry point.
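The hostname-to-IP resolution step can be sketched with Python's standard library; `localhost` is used here only so the example runs without network access, while a real client would resolve the site's public hostname:

```python
import socket

# Minimal sketch: resolve a hostname to an IPv4 address, as a client does
# before connecting to the system's entry point (often a load balancer).
hostname = "localhost"
ip_address = socket.gethostbyname(hostname)  # IPv4 lookup via the OS resolver
print(f"{hostname} resolves to {ip_address}")
```

In a scaled-out system the name typically resolves to the load balancer's virtual IP rather than to any single web server.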
Vertical scaling (scale‑up) is described as adding more CPU, memory, storage, or network capacity to an existing server, with examples such as increasing RAM from 8 GB to 32 GB, adding disks to a RAID array, switching to SSDs, or upgrading network interfaces. Limitations include hardware cost, OS constraints, and required downtime.
Horizontal scaling (scale‑out) involves adding multiple servers to distribute load, separating web and database layers, and using load balancers to route traffic. The article explains that horizontal scaling is more complex but provides better fault tolerance.
Load‑balancing techniques are covered, including round‑robin, least connections, fastest response, weighted, and IP‑hash algorithms, with HAProxy and Nginx cited as popular open‑source solutions.
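Two of the algorithms above can be sketched in a few lines of Python; the backend names are placeholders, and a production balancer such as HAProxy or Nginx handles this at the connection level:

```python
import itertools

class RoundRobinBalancer:
    """Round-robin: cycle through backends in order, one request at a time."""
    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)

class LeastConnectionsBalancer:
    """Least connections: route each request to the backend with the
    fewest requests currently in flight."""
    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        self.active[backend] -= 1

rr = RoundRobinBalancer(["web1", "web2", "web3"])
print([rr.pick() for _ in range(4)])  # ['web1', 'web2', 'web3', 'web1']
```

Round-robin assumes roughly equal request cost; least connections adapts when some requests are much slower than others.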
The discussion then moves to vertical versus horizontal scaling of relational databases, covering replication (master‑slave, master‑master), sharding, partitioning, and denormalization. It explains how master‑slave replication copies data from a primary to one or more replicas, while master‑master allows all nodes to accept writes, which requires conflict resolution.
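The usual payoff of master‑slave replication is read/write splitting, which can be sketched as a simple router; a real proxy parses SQL properly, so the keyword check here is only illustrative, and the node names are placeholders:

```python
import itertools

class ReplicatedRouter:
    """Send writes to the primary; spread reads across the replicas."""
    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)

    def route(self, statement: str) -> str:
        # Illustrative only: treat SELECT as a read, everything else as a write.
        is_read = statement.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary

router = ReplicatedRouter("db-primary", ["db-replica-1", "db-replica-2"])
```

Because replication is typically asynchronous, a read routed to a replica may briefly see stale data; reads that must be fresh are often pinned to the primary.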
Sharding (horizontal partitioning) splits data across multiple servers, improving performance and manageability; vertical partitioning groups tables by functionality; directory‑based partitioning uses a lookup service to map entities to shards.
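Hash-based sharding and directory-based partitioning can both be sketched as follows; the shard names and the directory entry are hypothetical:

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]

def shard_for(user_id: str) -> str:
    """Hash-based sharding: a stable hash of the key maps each user
    to exactly one shard, with no lookup service required."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Directory-based partitioning keeps an explicit lookup table instead,
# which allows individual entities to be moved between shards.
directory = {"user-42": "shard-3"}

def shard_for_directory(user_id: str) -> str:
    return directory.get(user_id, shard_for(user_id))
```

The trade-off: hash sharding needs no extra infrastructure but makes resharding painful (most keys remap when the shard count changes), while a directory adds a lookup hop and a new point of failure in exchange for flexible placement.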
Denormalization is presented as a way to improve read performance by duplicating data across tables, at the cost of write complexity, with examples such as materialized views in PostgreSQL and Oracle.
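The read-versus-write trade-off can be illustrated with plain Python data in place of tables; the table contents are invented for the example:

```python
# Normalized source "tables", keyed by id.
authors = {1: {"name": "Ada"}, 2: {"name": "Grace"}}
posts = [
    {"id": 10, "author_id": 1, "title": "Scaling out"},
    {"id": 11, "author_id": 2, "title": "Sharding 101"},
]

def build_post_view(posts, authors):
    """Materialize a denormalized view: copy each author's name into the
    post row so reads need no join. Every author update must now also
    refresh this view, which is the write-side cost."""
    return [
        {**post, "author_name": authors[post["author_id"]]["name"]}
        for post in posts
    ]

post_view = build_post_view(posts, authors)
```

A materialized view in PostgreSQL or Oracle plays the same role: the database stores the precomputed join and the application refreshes it when the underlying tables change.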
The article compares SQL (relational databases such as MySQL, Oracle, and PostgreSQL) with NoSQL stores (key‑value, document, column‑family, graph, and blob), illustrating the key‑value model with short snippets that store and retrieve data by key.
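The key-value access pattern can be sketched as a toy in-memory store; real systems such as Redis or DynamoDB expose essentially this interface, with persistence and replication behind it:

```python
class KeyValueStore:
    """Toy in-memory key-value store: the core NoSQL access pattern is
    get/put/delete by key, with no joins and no fixed schema."""
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)

store = KeyValueStore()
# The key naming scheme here is just a common convention, not a requirement.
store.put("session:abc123", {"user_id": 42})
```

The absence of a schema and of cross-key operations is what makes this model easy to shard horizontally.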
Advanced concepts such as caching, content delivery networks (CDN), and GeoDNS are introduced to reduce latency and improve availability, explaining that caches can sit at the database, web, or network layer, while CDNs store static assets closer to users.
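The most common caching pattern at the web tier, cache-aside with a time-to-live, can be sketched as follows; the loader function stands in for a hypothetical database query:

```python
import time

class TTLCache:
    """Cache-aside sketch: check the cache first, fall back to the
    database on a miss, and store the result with an expiry time."""
    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._entries = {}  # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        entry = self._entries.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                      # cache hit
        value = loader(key)                      # cache miss: query the source
        self._entries[key] = (value, time.time() + self.ttl)
        return value

cache = TTLCache(ttl_seconds=30)
# Hypothetical loader standing in for a real database lookup.
load_user_from_db = lambda key: {"id": key, "name": "user-" + key}
user = cache.get_or_load("42", load_user_from_db)
```

The TTL bounds how stale a cached value can get; a CDN applies the same expiry idea to static assets, but at edge locations near the user.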
Finally, the article emphasizes that scaling is an iterative process that combines stateless architecture, load balancers, extensive caching, multi‑data‑center deployment, CDN usage, and database sharding to support over 100 million users.
Author: Anh T. Dang (InfoQ Architecture Headlines).