How to Build Truly High‑Availability Stateless Services: Strategies & Algorithms
This article explains how to design highly available stateless services by covering redundancy, vertical and horizontal scaling, load‑balancing algorithms, high‑concurrency identification, and the use of CDN/OSS, offering practical guidance for robust backend architecture.
Stateless Service High Availability
Outages are usually the result of load that accumulates over time: as the user base grows, a system that ignores high availability will eventually fail. Designing a highly available system means considering redundancy, monitoring, automated recovery, performance, error handling, and graceful degradation measures such as rate limiting and circuit breaking.
Redundant Deployment
Deploy multiple nodes to avoid single‑point failures, use vertical scaling to boost single‑machine performance, and horizontal scaling to quickly add capacity during traffic spikes.
Load Balancing Algorithms
Six basic algorithms are available: random, round‑robin, weighted round‑robin, weighted random, least‑connections, and source‑address hash. Weighted round‑robin assigns a higher weight to servers with greater capacity so that they receive proportionally more requests, while least‑connections selects the server with the fewest active connections.
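The selection rules above can be sketched in a few lines of Python. This is a minimal illustration, not a production balancer: the `SERVERS` table, its names, and its weights are hypothetical, and the weighted round‑robin shown here simply repeats each server `weight` times per cycle, which sends requests to the heaviest server in bursts.

```python
import hashlib
import itertools
import random

# Hypothetical backends: name -> weight (relative capacity).
SERVERS = {"app-1": 3, "app-2": 1, "app-3": 1}


def round_robin(servers):
    """Cycle through the servers in a fixed order."""
    return itertools.cycle(list(servers))


def weighted_round_robin(servers):
    """Cycle through the servers, repeating each one `weight` times per cycle."""
    expanded = [name for name, weight in servers.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def weighted_random(servers, rng=random):
    """Pick a server at random with probability proportional to its weight."""
    names = list(servers)
    return rng.choices(names, weights=[servers[n] for n in names], k=1)[0]


def least_connections(active):
    """Pick the server with the fewest active connections.

    `active` maps server name -> current number of open connections.
    """
    return min(active, key=active.get)


def source_hash(client_ip, servers):
    """Map a client address to a stable server by hashing the address."""
    names = sorted(servers)
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return names[int.from_bytes(digest[:4], "big") % len(names)]
```

Because `source_hash` depends only on the client address and the server list, the same client keeps landing on the same server until the list changes; adding or removing a server remaps many clients, which is the usual trade-off of plain modulo hashing.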
Choosing an Algorithm
Start with simple round‑robin when all servers have the same configuration; switch to weighted round‑robin or least‑connections when servers differ in available capacity, for example when multiple applications share a server. For short‑connection scenarios (e.g., HTTP), prefer weighted round‑robin with cookie‑based session persistence; for long‑connection services (e.g., FTP or raw socket connections), use weighted least‑connections.
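Weighted least‑connections is commonly described as choosing the server with the lowest ratio of active connections to weight. A minimal sketch under that assumption follows; the server names, weights, and connection counts are made up for illustration.

```python
def weighted_least_connections(active, weights):
    """Pick the server with the lowest active-connections-to-weight ratio.

    `active` maps server name -> open connections,
    `weights` maps server name -> relative capacity (must be > 0).
    """
    return min(active, key=lambda name: active[name] / weights[name])


# Hypothetical snapshot: "big" holds more connections but has three times
# the capacity, so it is still the less loaded server in relative terms.
active = {"big": 6, "small": 3}
weights = {"big": 3, "small": 1}
print(weighted_least_connections(active, weights))  # big
```

For long‑lived connections this matters more than for short HTTP requests, because a connection count taken now remains a good predictor of load for the lifetime of those connections.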
Identifying High Concurrency
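High concurrency is usually measured in queries per second (QPS). Two example estimates: if a service receives 100 000 requests per day and 80% of them arrive within 20% of the day, peak QPS = (100 000 × 80%) / (86 400 × 20%) ≈ 4.6; if 50 000 machines each send one request per minute, the load is 50 000 / 60 ≈ 833 QPS. As a rule of thumb, a few hundred QPS already counts as high concurrency.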
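The two estimates can be reproduced with a short calculation. The function and parameter names below are illustrative, and the 80%/20% split is the rule‑of‑thumb assumption used in the formula, not a measured value.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86 400


def peak_qps(daily_requests, traffic_share=0.8, time_share=0.2):
    """Estimate peak QPS, assuming `traffic_share` of the daily requests
    arrive within `time_share` of the day."""
    return (daily_requests * traffic_share) / (SECONDS_PER_DAY * time_share)


def steady_qps(clients, requests_per_client_per_minute):
    """Estimate QPS for clients that send requests at a fixed rate."""
    return clients * requests_per_client_per_minute / 60


print(round(peak_qps(100_000), 1))   # 4.6
print(round(steady_qps(50_000, 1)))  # 833
```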
Vertical Scaling
Increase a single machine's capacity by upgrading hardware (CPU, memory, SSD) or tuning the system, and improve the software with asynchronous processing, caching, and lock‑free data structures. Vertical scaling is quick to apply, but it has a hard ceiling and still leaves a single point of failure.
Horizontal Auto‑Scaling
When load rises, add nodes automatically. In a private cloud, implement a custom scheduler; in a public cloud, use the provider's elastic‑scaling service. For containers, configure auto‑scaling at the IaaS layer or within Kubernetes, and make stateless services the scaling target, since their instances can be added or removed without moving data.
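As a rough sketch of the scaling decision, the function below applies the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler, desired = ceil(current replicas × current metric / target metric), and clamps the result to a range. The function name, the default bounds, and the sample numbers are assumptions for illustration; a real autoscaler also applies tolerances and cool‑down windows that are not modelled here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=2, max_replicas=20):
    """Return how many replicas are needed to bring the per-replica metric
    (e.g. average CPU utilisation in percent) back to `target_metric`."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    wanted = math.ceil(current_replicas * current_metric / target_metric)
    # Keep at least `min_replicas` for redundancy and cap cost at `max_replicas`.
    return max(min_replicas, min(max_replicas, wanted))


print(desired_replicas(4, current_metric=90, target_metric=60))  # 6
print(desired_replicas(4, current_metric=10, target_metric=60))  # 2
```

Keeping `min_replicas` at two or more preserves the redundancy described earlier even when traffic is low.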
CDN and OSS
Static assets (images, videos, HTML/CSS/JS) should be served through a CDN so that they are cached close to users and load with lower latency. Combine the CDN with object storage (OSS) for virtually unlimited media storage and for archiving cold data. This improves user experience and offloads traffic from the backend.
ITFLY8 Architecture Home
