From Single Server to Scalable Architecture: Key Lessons from Large‑Scale Site Design

This comprehensive note distills the evolution of large‑website architecture—from single‑server setups to layered, distributed, and highly available systems—covering caching, clustering, read/write separation, CDN, NoSQL, business splitting, scalability, extensibility, and automation strategies.

21CTO

May 5, 2018

Website Evolution Overview

The author finished reading Large Site Architecture and created personal notes to retain the insights, focusing on the evolution of a website’s technical stack.

Initial Stage

All resources (application, database, files) reside on a single server due to low traffic.

Application and Data Service Separation

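As traffic grows, the application, file, and database roles move onto separate servers, each sized for its own workload: CPU for the application server, storage capacity for the file server, and memory (with fast disks) for the database server.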

Caching for Performance

Adding cache layers (local and distributed) accelerates access and reduces database load; the principle "if a problem isn’t solved by adding a cache, add another cache" is highlighted.

Application Server Clustering

When a single application server becomes the bottleneck, a cluster behind a load balancer allows horizontal scaling; this works best when the application servers are stateless, so any node can handle any request.

Database Read/Write Separation

Writes go to the master, reads to replicas, with an added data access layer to keep the application unaware of the split.
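The routing decision made by such a data access layer can be sketched as follows. This is a minimal illustration, not the book's implementation: the class name `ReadWriteRouter`, the server labels, and the rule "a statement starting with SELECT is a read" are all assumptions made for the example.

```python
import itertools


class ReadWriteRouter:
    """Hypothetical data access helper: picks a server for each SQL statement."""

    def __init__(self, master, replicas):
        self.master = master
        # Rotate through replicas in round-robin order; with no replicas,
        # every statement goes to the master.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def target_for(self, sql):
        # Simplifying assumption: only statements that start with SELECT are reads.
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.master


router = ReadWriteRouter("master", ["replica-1", "replica-2"])
```

Application code calls `router.target_for(sql)` and never names a server itself. A real layer would also have to account for replication lag, for example by sending a read that immediately follows a write to the master.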

Reverse Proxy and CDN Acceleration

Reverse proxies and CDNs both work as caches. CDN nodes sit in ISP data centers close to users, while reverse proxies are deployed in the site's central data center; together they reduce latency and backend load.

Distributed File and Database Systems

Physical disk limits and relational DB scaling constraints lead to distributed storage and database proxies (e.g., Cobar, Mycat).

NoSQL and Search Engines

NoSQL offers flexible, distributed storage for Web 2.0 workloads, while dedicated search engines offload query processing from databases.

Business Splitting

Large sites decompose into independent applications linked via hyperlinks, message queues, or shared data stores.

Distributed Services

Common business functions (e.g., user management) are extracted into shared services, enabling independent development and centralized monitoring.

Core Architectural Elements

The five key concerns are performance, availability, scalability, extensibility, and security.

Architecture Patterns

Layered: Separate application, service, and data layers for decoupling and scalability.

Partitioning: Vertical splitting of business modules for high cohesion and low coupling.

Distributed: Deploy layers across multiple machines, addressing network latency, consistency, and management complexity.

Clustering: Use multiple machines to handle load, especially when services are stateless.

Caching: Local, CDN, reverse proxy, and distributed caches to speed up data access.

Asynchronous: Producer‑consumer models improve decoupling, availability, and traffic shaping.

Redundancy: Replication ensures data reliability and service continuity.

Automation: Automate deployment, testing, monitoring, failover, and scaling to boost productivity.
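The asynchronous, producer‑consumer pattern in the list above can be sketched with a bounded in-process queue. The function name `run_pipeline`, the queue size, and the "processed:" result format are assumptions for illustration; a production site would normally use a distributed message queue instead of `queue.Queue`.

```python
import queue
import threading


def run_pipeline(orders, workers=2):
    """Producer puts orders on a queue; worker threads consume them."""
    tasks = queue.Queue(maxsize=100)  # bounded queue absorbs short traffic bursts
    results = []
    lock = threading.Lock()

    def consumer():
        while True:
            order = tasks.get()
            if order is None:  # sentinel: no more work for this worker
                return
            with lock:
                results.append(f"processed:{order}")

    threads = [threading.Thread(target=consumer) for _ in range(workers)]
    for t in threads:
        t.start()
    for order in orders:  # the producer only enqueues, it does not wait for results
        tasks.put(order)
    for _ in threads:     # one sentinel per worker
        tasks.put(None)
    for t in threads:
        t.join()
    return results
```

The producer is decoupled from the consumers: it finishes enqueuing regardless of how fast each order is processed, which is the traffic-shaping effect the pattern refers to. Result order is not guaranteed when more than one worker runs.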

Performance

Performance metrics differ for users (response time, browser optimizations, CDN), developers (latency, throughput, stability), and operations (resource utilization, network, virtualization).

System throughput, concurrency, and response time can be visualized as highway traffic: higher vehicle count increases revenue up to a point, after which congestion reduces speed and revenue.
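One way to make the relationship concrete is Little's law, which the source does not mention and which is brought in here as an assumption: in a steady state, throughput equals concurrency divided by average response time. The load figures below are invented purely to mirror the highway analogy.

```python
def throughput_rps(concurrency, avg_response_time_s):
    """Steady-state throughput implied by Little's law (L = lambda * W)."""
    if avg_response_time_s <= 0:
        raise ValueError("response time must be positive")
    return concurrency / avg_response_time_s


# Hypothetical measurements: (concurrent requests, average response time in seconds).
light = throughput_rps(50, 0.125)     # few cars, free-flowing road
peak = throughput_rps(200, 0.25)      # busier road, still moving well
congested = throughput_rps(400, 2.0)  # traffic jam: response time balloons
```

With these made-up numbers, throughput rises from `light` to `peak` and then drops at `congested`, even though concurrency keeps growing, because response time grows faster than concurrency once the system is saturated.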

Web Front‑End Optimization

Reduce HTTP requests, enable browser caching, compress assets, and use CDN/reverse proxy.

Application Server Optimization

Four main tactics: caching, clustering, asynchronous processing, and code optimization.

Caching

First rule of performance optimization: prioritize caching.

Consider cache eviction and consistency (lease or versioning). Not all data suits caching—frequently changing or cold data should bypass it. Ensure high cache availability and guard against cache penetration by caching null results.
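Caching null results can be sketched as a small cache-aside wrapper. The class `NullCachingCache`, the `loader` callback, and the TTL values are hypothetical; the point is only that a lookup for a key that does not exist is remembered briefly, so repeated requests for it stop reaching the database.

```python
import time

_MISSING = object()  # marks "the backing store has no such key"


class NullCachingCache:
    """Cache-aside lookup that also remembers misses for a short time."""

    def __init__(self, loader, ttl_s=60.0, null_ttl_s=5.0, clock=time.monotonic):
        self._loader = loader          # hypothetical function: key -> value or None
        self._ttl_s = ttl_s
        self._null_ttl_s = null_ttl_s  # keep null entries short-lived
        self._clock = clock
        self._entries = {}             # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            value = entry[0]
            return None if value is _MISSING else value
        value = self._loader(key)      # miss or expired entry: go to the store
        if value is None:
            self._entries[key] = (_MISSING, now + self._null_ttl_s)
        else:
            self._entries[key] = (value, now + self._ttl_s)
        return value
```

The shorter TTL for null entries is a design choice in this sketch: it limits how long a key that is created later stays invisible to readers.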

Code Optimization

Multithreading: use multiple threads so the CPU stays busy while some threads are blocked on I/O and so that all cores are used; the goal is to keep CPU utilization high.

Resource reuse: employ pools (thread pools, connection pools) to reduce overhead.
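Both points can be illustrated with a thread pool from Python's standard library. The `fetch` function is a stand-in for an I/O-bound call (the `time.sleep` simulates waiting on a network or disk), and the pool size of 4 is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(item):
    """Stand-in for an I/O-bound call such as a database or HTTP request."""
    time.sleep(0.01)  # simulated I/O wait; the thread is blocked, not the CPU
    return item * 2


def fetch_all(items, max_workers=4):
    # The pool creates at most max_workers threads and reuses them for every
    # item, instead of paying thread start-up cost once per request.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, items))
```

`pool.map` returns results in input order, so `fetch_all` behaves like a sequential loop from the caller's point of view while overlapping the waits.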

High Availability

Availability is measured as the proportion of time in a year that the site is up (e.g., 99.9% uptime is "three nines").
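The "nines" translate directly into a yearly downtime budget. The helper below is a small worked calculation that assumes a 365-day year; the function name is invented for the example.

```python
def allowed_downtime_hours(availability_percent, hours_per_year=365 * 24):
    """Hours per year a site may be down at the given availability level."""
    return (1 - availability_percent / 100) * hours_per_year


three_nines = allowed_downtime_hours(99.9)   # roughly 8.76 hours per year
four_nines = allowed_downtime_hours(99.99)   # roughly 0.876 hours, about 53 minutes
```

Each additional nine cuts the downtime budget by a factor of ten.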

Application Layer HA

Because application servers are stateless, the load balancer can use health checks to take a failed server out of rotation without affecting users. Session state is handled separately, via IP hash, cookies, or dedicated session stores (e.g., Redis).
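How a load balancer skips servers that fail health checks can be sketched as follows. The `LoadBalancer` class and its `mark` method are hypothetical; in practice the health status would come from periodic probes rather than explicit calls.

```python
class LoadBalancer:
    """Round-robin balancer that skips servers marked unhealthy."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0

    def mark(self, server, is_healthy):
        # Assumed to be called by a health-check loop (not shown here).
        if is_healthy:
            self.healthy.add(server)
        else:
            self.healthy.discard(server)

    def pick(self):
        # Try each configured server at most once per call.
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

Because the servers hold no session state, any healthy server can take over requests that would have gone to the removed one.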

Service Layer HA

Use distributed service frameworks with service registry, heartbeat detection, client‑side load balancing, timeout settings, asynchronous calls, degradation strategies, and idempotent design.
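Idempotent design matters because timeouts lead to retries, and a retry may repeat a call that actually succeeded. A minimal sketch, assuming the caller attaches a unique `request_id` to each logical operation; `PaymentService` and `call_with_retry` are invented names, and the `fallback` argument stands in for a degradation strategy.

```python
class PaymentService:
    """Hypothetical service whose charge operation is safe to retry."""

    def __init__(self):
        self._results = {}  # request_id -> result of the first execution
        self.charges = 0    # how many times money was actually taken

    def charge(self, request_id, amount):
        if request_id in self._results:
            # Duplicate delivery: return the stored result, do not charge again.
            return self._results[request_id]
        self.charges += 1
        result = {"request_id": request_id, "amount": amount, "status": "ok"}
        self._results[request_id] = result
        return result


def call_with_retry(func, retries=2, fallback=None):
    """Retry on timeout; return the fallback (degraded result) if all attempts fail."""
    for _ in range(retries + 1):
        try:
            return func()
        except TimeoutError:
            continue
    return fallback
```

If the response to the first `charge` call is lost and the client retries with the same `request_id`, the service returns the stored result and `charges` stays at 1.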

Data Layer HA

Redundant replicas and failover mechanisms (heartbeat detection, control‑center coordination) protect data; consistency remains a core challenge.

Scalability

Scaling means adjusting server count without redesigning software or hardware.

Application Layer Scalability

Design stateless services and use clustering with load balancing. The author refers readers to an earlier article on load balancing for details.

Cache Scalability

Distributed caches are stateful; consistent hashing minimizes cache misses when adding nodes.
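A consistent hash ring with virtual nodes can be sketched as below. The class name, the node labels, the choice of SHA-256, and the figure of 100 virtual nodes per server are assumptions for illustration, not details from the source.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps keys to cache nodes; adding a node only moves keys onto that node."""

    def __init__(self, nodes=(), vnodes=100):
        self._vnodes = vnodes  # virtual nodes per server smooth the distribution
        self._ring = []        # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key):
        if not self._ring:
            raise ValueError("ring has no nodes")
        # First ring position at or after the key's hash, wrapping around.
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

When a fourth node joins a three-node ring, only the keys that now fall on the new node's positions change owner, roughly a quarter of them on average. With plain `hash(key) % N`, going from 3 to 4 nodes would remap most keys and cause a burst of cache misses.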

Data Layer Scalability

Relational databases scale out by sharding data behind distributed database proxies; NoSQL stores give up SQL and strong consistency in exchange for high availability and scalability.

Extensibility

Modular design reduces coupling and enhances reuse; achieved via distributed message queues and services.

Distributed Services

Vertical splitting creates independent web apps; horizontal splitting extracts common business logic into services with stable interfaces.

Service Governance Framework

Provides registration, discovery, load balancing, failover, efficient RPC, heterogeneous integration, minimal intrusion, version management, and real‑time monitoring.

Others

Problem identification and communication tips: frame issues as collective concerns, ask open‑ended questions to subordinates, use constructive language, and suggest improvements rather than criticize.

Author: XYBABY Source: http://www.cnblogs.com/xybaby/p/8907880.html

