Designing a Scalable Backend Architecture for Millions of Daily Active Users
This article outlines a comprehensive backend architecture for handling millions of daily active users, covering DNS routing, layer‑4/7 load balancing, monolithic versus microservice deployment, caching, database sharding, hybrid‑cloud strategies, elastic scaling, and multi‑level degradation mechanisms.
Recent incidents such as the Xi'an One‑Code‑Pass outage highlight the importance of designing systems with sufficient scalability and automatic elasticity to handle traffic spikes that exceed normal levels by many times.
A typical request flow for a million‑level DAU (Daily Active Users) internet application passes through several layers: DNS resolution, layer‑4 load balancing, layer‑7 gateway (often Nginx), the application server, cache, and database.
DNS directs user requests to the appropriate IDC region based on IP, with caching to keep the mapping stable.
Layer‑4 Load Balancing (e.g., LVS) forwards traffic to the appropriate backend gateway cluster; software load balancers are typically chosen over hardware appliances for cost‑effectiveness.
Layer‑7 Load Balancing (Gateway) such as Nginx handles application‑level routing, authentication, logging, and monitoring, allowing fine‑grained traffic distribution.
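As a rough illustration of layer‑7 routing, the fragment below sketches an Nginx gateway that distributes requests to upstream service clusters by URL path. The upstream names, addresses, and paths are hypothetical, not taken from the article:

```nginx
# Illustrative gateway config: path-based routing to two upstream
# clusters, with access logging. All names/addresses are made up.
upstream user_service  { server 10.0.1.10:8080; server 10.0.1.11:8080; }
upstream order_service { server 10.0.2.10:8080; }

server {
    listen 80;
    access_log /var/log/nginx/gateway.log;

    location /api/user/  { proxy_pass http://user_service; }
    location /api/order/ { proxy_pass http://order_service; }
}
```

Because routing happens at the application layer, authentication, rate limiting, and monitoring hooks can be attached per location in the same place.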
The server layer can be deployed as a monolithic application for simple services or split into microservices when the codebase and team size grow, improving development efficiency and scalability.
Cache (e.g., Redis, Memcached) stores hot data to reduce database latency; a typical single node can handle hundreds of thousands of QPS with millisecond response times.
Database must support high availability through master‑slave replication, sharding, and partitioning. Large‑scale systems often split data by time and user ID to keep individual tables under tens of millions of rows.
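Splitting by time and user ID as described can be expressed as a simple routing function that maps a row to a physical table. The naming scheme and shard count below are illustrative assumptions, not the article's:

```python
def shard_for(user_id: int, month: str, num_shards: int = 16) -> str:
    """Route a row to a physical table by user ID and time period.

    Hash-style split: user_id modulo num_shards picks the shard,
    and the month gives a time-based partition. Table naming and
    the shard count of 16 are illustrative only.
    """
    shard = user_id % num_shards
    return f"orders_{month}_{shard:02d}"
```

For example, `shard_for(12345, "202406")` returns `"orders_202406_09"`. Keeping each such table under tens of millions of rows bounds index depth and keeps query latency predictable.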
For tens of millions of DAU, a Hybrid Cloud Architecture is recommended: primary traffic runs in private IDC, while overflow traffic is offloaded to public cloud resources via dedicated inter‑connects.
Elastic scaling across the full stack (layer‑4/7, servers, cache, database) is essential. Public‑cloud SLB can handle millions of concurrent connections, while Nginx instances can be added dynamically based on QPS.
Automatic scaling decisions should consider not only QPS but also latency distribution and slow‑request ratios; open‑source projects like CudgX provide multi‑dimensional metrics and machine‑learning‑based scaling.
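A scale‑out decision driven by several signals at once might look like the sketch below. The thresholds are invented for illustration; real values come from capacity planning, and tools like CudgX derive them from learned metrics rather than fixed constants:

```python
def should_scale_out(qps: float,
                     p99_latency_ms: float,
                     slow_ratio: float,
                     qps_limit: float = 5000.0,
                     p99_limit_ms: float = 200.0,
                     slow_ratio_limit: float = 0.01) -> bool:
    """Multi-dimensional scale-out check (all limits are illustrative).

    Triggers when ANY signal breaches its limit, so a latency
    degradation can force scaling even while raw QPS looks healthy.
    """
    return (qps > qps_limit
            or p99_latency_ms > p99_limit_ms
            or slow_ratio > slow_ratio_limit)
```

The point of the OR over multiple dimensions is that QPS alone can miss trouble: a node saturating on slow queries may show falling QPS and rising latency at the same time.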
When rapid traffic growth outpaces cache or database scaling, pre‑allocated redundancy and multi‑level degradation mechanisms are needed. A three‑level degradation strategy disables functionality progressively, with each level freeing roughly 30% up to 100% of capacity while minimizing the impact felt by users.
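Tiered degradation can be sketched as a mapping from current load to a degradation level. The thresholds and the features shed at each level below are assumptions for illustration; the article does not specify them:

```python
# Each tuple: (max load ratio for this level, level number, description).
# Higher levels shed more non-core functionality. Values are illustrative.
DEGRADATION_LEVELS = [
    (0.7, 0, "full service"),
    (0.9, 1, "disable recommendations and other non-core features"),
    (1.1, 2, "serve cached/static content only"),
]


def degradation_level(load_ratio: float) -> int:
    """Pick a degradation level from current load / planned capacity."""
    for threshold, level, _desc in DEGRADATION_LEVELS:
        if load_ratio <= threshold:
            return level
    return 3  # beyond all thresholds: reject non-essential traffic
```

Because each level is entered and exited by the same load metric, the system sheds functionality gradually as a spike builds and restores it automatically as load recedes, instead of failing all at once.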
In summary, a robust high‑traffic system combines DNS routing, layer‑4/7 load balancing, microservice architecture, cache and database sharding, hybrid‑cloud deployment, full‑stack elastic scaling, and tiered degradation to maintain availability at the ten‑million‑plus DAU scale.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise architecture, system architecture, website architecture, large‑scale distributed architecture, and high‑availability architecture, as well as re‑architecting businesses with internet technologies. Architects who value ideas and sharing are welcome to exchange and learn together.