Designing a Scalable High‑Concurrency Distributed Backend for Millions of PV Sites

This article outlines a design for a high‑concurrency distributed backend that serves websites with tens of millions of daily page views. It covers group‑based data partitioning, master‑slave roles, consistency models, heartbeat services, and global coordination, with the aim of scalability and fault tolerance.

21CTO

The purpose of this article is to consolidate the knowledge acquired this year and extend the previous piece "Overview of High‑Performance Distributed Computing and Storage System Design". The author acknowledges limited expertise and invites discussion and correction.

The author has been studying high‑concurrency, high‑performance servers and distributed systems since the end of 2010, roughly three years, and admits that many concepts remain only partially understood.

The target system is a backend solution for websites with tens of millions of daily page views, supporting scenarios such as micro‑blogging, social networks, ad delivery, and email services.

To handle massive traffic, the system employs vertical and horizontal partitioning: different business logic is split across separate servers (vertical), and the same business load is distributed across multiple servers (horizontal). The design also addresses backup, scaling, and failure handling.
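The two partitioning axes can be sketched as a two-step routing function. This is a minimal illustration, not the author's implementation: the `SERVERS` table, the server names, and the use of CRC32 modulo the shard count are all assumptions made for the example.

```python
import zlib

# Hypothetical layout: each business (vertical partition) owns its own list of
# servers (horizontal partition). All names here are placeholders.
SERVERS = {
    "microblog": ["mb-0", "mb-1", "mb-2"],
    "ads": ["ads-0", "ads-1"],
    "mail": ["mail-0"],
}


def route(business: str, key: str) -> str:
    """Pick a server: first by business (vertical), then by key hash (horizontal)."""
    shards = SERVERS[business]
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    index = zlib.crc32(key.encode("utf-8")) % len(shards)
    return shards[index]
```

Plain modulo hashing reshuffles most keys whenever the shard count changes, which is why the design later turns to consistent hashing for redistribution.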

Data storage follows two principles: (1) each business stores its own data in a distinct group; (2) each group stores multiple copies of the same data, with some replicas offering read‑write access and others read‑only.

Each group consists of a Group Master (primary node handling reads and writes) and one or more Group Slaves (providing read‑only access and synchronizing with the master). The master‑slave relationship can be configured for strong consistency or eventual consistency depending on business requirements. For a micro‑blogging service, eventual consistency is acceptable; for critical services like shopping carts, strong consistency is required.
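The difference between the two consistency settings can be shown with a toy in-memory group. This sketch rests on assumptions that are not in the original: replication is modeled as an explicit `replicate()` step, and a "strong" read is modeled simply as a read served by the Group Master. The `Node` and `Group` classes are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    data: dict = field(default_factory=dict)


@dataclass
class Group:
    master: Node
    slaves: list
    pending: list = field(default_factory=list)  # writes not yet shipped to slaves

    def write(self, key, value):
        # All writes go through the Group Master.
        self.master.data[key] = value
        self.pending.append((key, value))

    def replicate(self):
        # Stand-in for background synchronization from master to slaves.
        for key, value in self.pending:
            for slave in self.slaves:
                slave.data[key] = value
        self.pending.clear()

    def read(self, key, strong=False):
        # Strong reads are served by the master; other reads may hit any slave
        # and can therefore return stale data until replication catches up.
        if strong or not self.slaves:
            return self.master.data.get(key)
        return random.choice(self.slaves).data.get(key)
```

A micro‑blogging timeline could tolerate `read(key)` returning a slightly stale value, while a shopping cart would use `read(key, strong=True)`.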

The system adopts a "semi‑synchronous" write mode: a write is considered successful once at least one slave has synchronized the data, so that the data survives and service can continue if the master fails. Strong‑consistency systems would require full synchronization, in which every slave must confirm the write before it succeeds, and this incurs higher latency.
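The acknowledgement rule can be sketched as follows. The `Replica` class and the `replicated_write` function with its `required_acks` parameter are hypothetical names introduced for illustration; real replication would be asynchronous and networked, and would need a policy for writes that reach the master but gather too few acknowledgements, which this sketch omits.

```python
class Replica:
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def apply(self, key, value) -> bool:
        """Store the value and return True as an acknowledgement.

        An unhealthy replica stores nothing and never acknowledges.
        """
        if not self.healthy:
            return False
        self.data[key] = value
        return True


def replicated_write(master, slaves, key, value, required_acks=1):
    """Write to the master, then report success once enough slaves acknowledge.

    required_acks=1 models the semi-synchronous mode described above;
    required_acks=len(slaves) models full synchronization.
    """
    master.apply(key, value)
    acks = 0
    for slave in slaves:
        if slave.apply(key, value):
            acks += 1
        if acks >= required_acks:
            # In a real system the remaining slaves would catch up asynchronously.
            return True
    return False
```

With one unhealthy slave out of two, a semi‑synchronous write still succeeds, while the same write under full synchronization is reported as failed.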

Failure detection and master election rely on a heartbeat service and distributed election protocols (e.g., Paxos). When a Group Master fails, a slave acquires a lock from the heartbeat service and becomes the new master.
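One simple way to picture the lock handoff is a lease that the master must keep renewing. The sketch below is a single-process toy under stated assumptions: the `HeartbeatService` class, its `heartbeat` method, and the three-second lease are invented for illustration, time is passed in explicitly, and no consensus protocol such as Paxos is implemented. A real heartbeat service would itself need to be replicated.

```python
class HeartbeatService:
    """Toy in-memory lease lock; not a consensus implementation."""

    def __init__(self, lease_seconds=3.0):
        self.lease_seconds = lease_seconds
        self.holder = None      # name of the current master, if any
        self.expires_at = 0.0   # time at which the current lease lapses

    def heartbeat(self, node, now):
        """Called periodically by every node in the group.

        Returns True if `node` holds the master lock after this call.
        """
        if self.holder == node or now >= self.expires_at:
            # Either the current master renews its lease, or the lease has
            # lapsed and the first node to heartbeat takes over.
            self.holder = node
            self.expires_at = now + self.lease_seconds
            return True
        return False
```

As long as the master keeps calling `heartbeat`, slaves are refused the lock. Once the master stops, the first slave to call `heartbeat` after the lease expires becomes the new master.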

A Global Master node manages all groups, handling configuration, health monitoring, heartbeat services, and initial client requests to locate the appropriate group. Clients cache group information after the first request to reduce load on the Global Master, optionally using DNS load balancing.
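The client-side caching described above might look like this. `GlobalMaster`, `Client`, and their methods are hypothetical names; the lookup counter exists only to make the reduced load on the Global Master visible.

```python
class GlobalMaster:
    def __init__(self, directory):
        self.directory = directory  # maps business name -> group address
        self.lookups = 0            # counts how often clients had to ask

    def locate(self, business):
        self.lookups += 1
        return self.directory[business]


class Client:
    def __init__(self, global_master):
        self.global_master = global_master
        self.cache = {}

    def group_for(self, business):
        # Only the first request for a business reaches the Global Master.
        if business not in self.cache:
            self.cache[business] = self.global_master.locate(business)
        return self.cache[business]

    def invalidate(self, business):
        # Drop a cached entry, for example after a request to the group fails.
        self.cache.pop(business, None)
```

A cached entry can go stale when a group is moved or its master changes, so a client needs some way to invalidate and look up again; the trigger for doing so is left open here.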

The Global Master has a corresponding Global Slave that maintains strong consistency with the master, allowing immediate takeover in case of master failure.

Overall, the system's properties are: consistency (strong for global nodes, eventual for group nodes), availability (high availability for global nodes, partition tolerance for groups), replication (full sync for global nodes, semi‑sync for groups), and fault recovery (minimal write interruption during master failover).

Additional considerations cover backup slaves to prevent cascading failures, consistent hashing for data redistribution, log snapshotting and copy‑on‑write for recovery, and optional front‑end components such as reverse proxies.
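Consistent hashing, mentioned above for data redistribution, can be sketched with a sorted ring of virtual nodes. This is a generic illustration, not the author's implementation: the `HashRing` class, the choice of SHA‑256, and the default of 100 virtual nodes per server are assumptions.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value: str) -> int:
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        # Each server appears at many points on the ring to even out the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def lookup(self, key):
        """Return the node owning `key`: the first ring point at or after its hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        index = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[index % len(self._ring)][1]
```

When a server is added, the only keys that change owner are those that move to the new server; all other keys stay where they were, which keeps data migration bounded.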


