
Mastering High Concurrency & High Availability: Core Principles for Scalable Systems

This article outlines essential principles for designing high‑concurrency and high‑availability systems, covering stateless architecture, service decomposition, caching strategies, message queues, data heterogeneity, degradation, rate limiting, traffic switching, rollback, and comprehensive business design rules such as idempotency, anti‑duplication, and documentation.

21CTO

1. High Concurrency Principles

1.1 Stateless

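A stateless application is easier to scale horizontally, because any instance can serve any request. In practice, the application instances themselves hold no state, while configuration files remain stateful (for example, values that differ per environment or data center).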

1.2 Splitting

When traffic is large and resources are sufficient, consider splitting. Main splitting scenarios include:

System dimension: split by system function/business.

Function dimension: split a system by its functions.

Read-write dimension: split by read/write characteristics. Use caching for read-heavy workloads, sharding (splitting databases and tables) for write-heavy workloads, and heterogeneous copies of the data for reads that need aggregation.

AOP dimension: split out cross-cutting layers according to access characteristics, in the spirit of aspect-oriented programming (AOP).

Module dimension: split based on foundational or code‑maintenance characteristics.

1.3 Service Orientation

Services typically evolve through the following stages: in-process service → single-machine remote service → cluster with manual registration → automatic registration and discovery → service grouping/isolation/routing → service governance (rate limiting, black/white lists).

1.4 Message Queues

Message queues decouple services that do not require synchronous calls, enable one‑to‑many consumption, asynchronous processing, and traffic shaping/buffering.
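As a rough illustration of one-to-many consumption and buffering, the following sketch uses an in-memory broker. `InMemoryBroker`, the topic name, and the subscriber names are hypothetical and exist only for this example; a production system would use a real message queue product instead.

```python
from collections import defaultdict, deque


class InMemoryBroker:
    """Toy broker: every subscriber gets its own copy of each message."""

    def __init__(self):
        # topic -> {subscriber name -> pending messages}
        self._queues = defaultdict(dict)

    def subscribe(self, topic, subscriber):
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic, message):
        # The producer returns immediately; it never waits for consumers.
        for pending in self._queues[topic].values():
            pending.append(message)

    def poll(self, topic, subscriber, max_messages=10):
        # Each consumer drains at its own pace, which absorbs traffic bursts.
        pending = self._queues[topic][subscriber]
        batch = []
        while pending and len(batch) < max_messages:
            batch.append(pending.popleft())
        return batch
```

Because each subscriber has its own queue, a slow consumer only delays itself and does not block the producer or the other consumers.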

1.5 Data Heterogeneity

1.5.1 Data Heterogeneity

Order tables are often sharded by order ID; querying a user's orders requires aggregating multiple tables, leading to low read performance. To improve this, create a heterogeneous user‑order table sharded by user ID.

Additionally, archiving order data can enhance performance and stability.
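The difference between the two layouts can be sketched as follows. The shard count, the `OrderStore` class, and the modulo routing are assumptions made for illustration; a real system would likely fill the user-sharded copy asynchronously rather than in the same call.

```python
from collections import defaultdict

SHARDS = 4  # assumed shard count, for illustration only


class OrderStore:
    def __init__(self):
        # Primary layout: sharded by order ID.
        self.by_order = [dict() for _ in range(SHARDS)]
        # Heterogeneous copy: the same orders, sharded by user ID.
        self.by_user = [defaultdict(list) for _ in range(SHARDS)]

    def save(self, order_id, user_id, amount):
        order = {"order_id": order_id, "user_id": user_id, "amount": amount}
        self.by_order[order_id % SHARDS][order_id] = order
        self.by_user[user_id % SHARDS][user_id].append(order)

    def user_orders_scatter_gather(self, user_id):
        # Without the copy, every order shard has to be scanned and merged.
        found = []
        for shard in self.by_order:
            found.extend(o for o in shard.values() if o["user_id"] == user_id)
        return sorted(found, key=lambda o: o["order_id"])

    def user_orders_single_shard(self, user_id):
        # With the copy, one shard lookup is enough.
        return list(self.by_user[user_id % SHARDS].get(user_id, []))
```

Both methods return the same orders; the second touches one shard instead of all of them, at the cost of storing the data twice.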

1.5.2 Data Closed Loop

For pages like product details with many data sources, store used data heterogeneously to form a closed loop. Steps:

Data heterogeneity: receive data changes via MQ and store them in atomic (fine-grained) form in suitable storage such as Redis or a persistent KV store.

Data aggregation: aggregate data from multiple sources, typically stored in KV for front‑end single‑call retrieval.

Front‑end presentation: front‑end obtains required data with one or few calls.

This approach ensures that even if dependent systems fail, the front‑end can still display data, though updates may be delayed.
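The three steps can be sketched with plain dictionaries standing in for the KV stores. The message format, the source names (`basic`, `spec`, `price`), and the key layout are assumptions for this example only.

```python
import json

atomic_store = {}      # stands in for Redis or a persistent KV store
aggregate_store = {}   # KV store holding pre-aggregated page data


def on_change_message(message: str) -> None:
    """Step 1: consume a change event and store it as a fine-grained record."""
    event = json.loads(message)
    atomic_store[f"{event['source']}:{event['product_id']}"] = event["data"]
    rebuild_aggregate(event["product_id"])


def rebuild_aggregate(product_id: str) -> None:
    """Step 2: merge the pieces for one product into a single value."""
    merged = {}
    for source in ("basic", "spec", "price"):
        piece = atomic_store.get(f"{source}:{product_id}")
        if piece is not None:
            merged[source] = piece
    aggregate_store[f"page:{product_id}"] = merged


def get_product_page(product_id: str) -> dict:
    """Step 3: the front end reads everything it needs with one call."""
    return aggregate_store.get(f"page:{product_id}", {})
```

The read path touches only `aggregate_store`, so an outage in an upstream system delays updates without breaking the page.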

When several data items must be fetched together, use a hash tag mechanism (such as Redis Cluster hash tags) to co-locate related keys on the same instance, e.g., using productId as the hash tag for both the basic-info and specification keys.
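As one concrete example of such a mechanism, Redis Cluster hashes only the text between the first `{` and the following `}` when that text is non-empty, using CRC16 modulo 16384. The sketch below reproduces that rule; the key names are hypothetical, and other sharding middleware may use a different hash function.

```python
import binascii

SLOTS = 16384  # number of hash slots in Redis Cluster


def hash_tag(key: str) -> str:
    """Return the part of the key that is hashed."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # braces found with non-empty content between them
            return key[start + 1:end]
    return key  # no usable tag: hash the whole key


def key_slot(key: str) -> int:
    # crc_hqx with an initial value of 0 is the CRC16 (XMODEM) variant.
    return binascii.crc_hqx(hash_tag(key).encode("utf-8"), 0) % SLOTS
```

With this rule, keys such as `{1001}:basic` and `{1001}:spec` always map to the same slot, so both can be read from one instance in a single round trip.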

1.6 Cache “Silver Bullet”

Browser cache

App client cache

CDN cache

Edge layer cache

Application layer cache

Distributed cache

For fallback or abnormal data, caching should be avoided to prevent stale data from being shown to users for extended periods.
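A minimal sketch of a two-level read path that follows this rule is shown below. `TtlCache`, `read_through`, and the TTL values are hypothetical; the point is only that the fallback value is returned without being written to either cache.

```python
import time


class TtlCache:
    """Tiny in-memory cache with per-entry expiry."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]
        return None

    def set(self, key, value, ttl):
        self._data[key] = (value, time.monotonic() + ttl)


def read_through(key, local, distributed, load, fallback):
    value = local.get(key)
    if value is not None:
        return value
    value = distributed.get(key)
    if value is None:
        try:
            value = load(key)  # call the backing service
        except Exception:
            return fallback    # served for this request only, never cached
        distributed.set(key, value, ttl=300)
    local.set(key, value, ttl=5)
    return value
```

Once the backing service recovers, the next request loads fresh data immediately, because no degraded value is sitting in a cache waiting to expire.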

1.7 Concurrency

Turn serial calls that do not depend on each other into parallel calls.
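For I/O-bound calls, one way to do this is a thread pool, as in the sketch below. `fetch_all` and the fetcher names are hypothetical; the fetchers are assumed to be independent of each other.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(product_id, fetchers, max_workers=8):
    """Run independent fetchers in parallel and collect results by name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the calls overlap in time.
        futures = {name: pool.submit(fn, product_id) for name, fn in fetchers.items()}
        # result() re-raises any exception raised inside a fetcher.
        return {name: future.result() for name, future in futures.items()}
```

With three remote calls of similar latency, total latency approaches that of the slowest call rather than the sum of all three.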

2. High Availability Principles

2.1 Degradation

Design a degradation switch with the following ideas:

Centralized management of switches via push mechanisms.

Multi‑level read service degradation: local cache, distributed cache, default degraded data (e.g., assume inventory is in stock).

Place switches at the ingress layer (e.g., Nginx) to route traffic selectively.

Business degradation: during traffic spikes, prioritize order placement and payment while ensuring eventual data consistency, possibly converting synchronous calls to asynchronous.
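The switch and the multi-level read degradation described above might be combined as follows. `SwitchCenter`, the switch name, and `read_stock` are hypothetical; in a real deployment the switch values would be pushed from a configuration center to every instance.

```python
class SwitchCenter:
    """In-process stand-in for a centrally managed switch service."""

    def __init__(self):
        self._switches = {}

    def push(self, name, enabled):
        self._switches[name] = enabled

    def is_on(self, name):
        return self._switches.get(name, False)


DEFAULT_STOCK = {"in_stock": True}  # default degraded data: assume in stock


def read_stock(sku, switches, local_cache, distributed_cache, stock_service):
    if not switches.is_on("degrade_stock_read"):
        try:
            return stock_service(sku)
        except Exception:
            pass  # on failure, fall through to the degraded path
    # Degraded path: local cache, then distributed cache, then default data.
    for cache in (local_cache, distributed_cache):
        if sku in cache:
            return cache[sku]
    return DEFAULT_STOCK
```

Turning the switch on removes all load from the stock service without a redeploy, and the same degraded path also covers unplanned failures.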

2.2 Rate Limiting

Purpose: protect the system from malicious requests, attacks, and traffic beyond the peak it was sized for.

Let malicious requests reach only the cache layer.

For traffic that does reach the backend, throttle it with the Nginx limit modules (limit_req, limit_conn).

Block malicious IP addresses with the Nginx deny directive.

The principle is to limit traffic from reaching vulnerable application layers.
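Where limiting has to happen inside the application rather than at Nginx, a token bucket is one common approach. The sketch below is a generic single-threaded illustration, not a description of how the Nginx modules work internally; the class name and parameters are assumptions.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate` of requests per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket size.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller should reject, queue, or degrade the request
```

A multi-threaded service would need to guard `allow` with a lock, and a cluster-wide limit would need shared state such as a distributed cache.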

2.3 Traffic Switching

For large applications, traffic switching is vital when a data center, rack, or server fails. Methods include:

DNS

HttpDNS

LVS/HaProxy

Nginx

2.4 Rollback

Versioning enables auditability, traceability, and rollback. When a release goes wrong, rolling back code, deployments, data, or static resources to a known-good version can restore service quickly and preserve availability.

3. Business Design Principles

3.1 Idempotent Design

An idempotent operation yields the same effect regardless of how many times it is executed with the same parameters.

3.2 Anti‑Duplication Design

Prevent duplicate payments, duplicate deductions, etc.
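Idempotency and duplicate prevention are often implemented together with a request identifier supplied by the caller. The sketch below is a single-threaded illustration; `PaymentService` and `request_id` are hypothetical names, and a real system would need an atomic check, for example a unique constraint in the database.

```python
class PaymentService:
    def __init__(self):
        self.balance = {}
        self._results = {}  # request_id -> result of the first execution

    def deposit(self, account, amount):
        self.balance[account] = self.balance.get(account, 0) + amount

    def pay(self, request_id, account, amount):
        # A repeated request returns the stored result; no second deduction.
        if request_id in self._results:
            return self._results[request_id]
        current = self.balance.get(account, 0)
        if current < amount:
            result = {"status": "rejected", "balance": current}
        else:
            self.balance[account] = current - amount
            result = {"status": "paid", "balance": current - amount}
        self._results[request_id] = result
        return result
```

A client that times out can safely retry with the same `request_id`: the account is charged at most once, and every retry sees the same outcome.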

3.3 Process Definition

Reuse workflow systems to provide customizable process services.

3.4 State and State Machine

Transaction order systems have forward states (awaiting payment, awaiting shipment, shipped, completed) and reverse states (cancellation, refund). State design should include traceability for user tracking and logging, enabling issue backtracking.
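One way to make such states explicit is a transition table with a recorded history. The states below follow the ones named above, but the allowed transitions are assumptions chosen for illustration; the real rules depend on the business.

```python
from enum import Enum


class OrderState(Enum):
    AWAITING_PAYMENT = "awaiting_payment"
    AWAITING_SHIPMENT = "awaiting_shipment"
    SHIPPED = "shipped"
    COMPLETED = "completed"
    CANCELLED = "cancelled"
    REFUNDED = "refunded"


# Assumed transition rules: forward flow plus reverse (cancel/refund) flow.
TRANSITIONS = {
    OrderState.AWAITING_PAYMENT: {OrderState.AWAITING_SHIPMENT, OrderState.CANCELLED},
    OrderState.AWAITING_SHIPMENT: {OrderState.SHIPPED, OrderState.REFUNDED},
    OrderState.SHIPPED: {OrderState.COMPLETED, OrderState.REFUNDED},
    OrderState.COMPLETED: set(),
    OrderState.CANCELLED: set(),
    OrderState.REFUNDED: set(),
}


class Order:
    def __init__(self, order_id):
        self.order_id = order_id
        self.state = OrderState.AWAITING_PAYMENT
        self.history = []  # (from_state, to_state, operator) for backtracking

    def transition(self, target, operator):
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {target.name}")
        self.history.append((self.state, target, operator))
        self.state = target
```

Rejecting illegal transitions in one place keeps invalid states out of the data, and the history list gives support staff a trail to follow when investigating an issue.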

3.5 Backend Operation Feedback

Design backend systems with preview and feedback capabilities.

3.6 Backend Approval Flow

Important backend functions (e.g., price adjustments) should have approval workflows and log operations for traceability and audit.

3.7 Documentation and Comments

Maintain a documentation library from the system's early stages, covering architecture, design ideas, the data dictionary, business processes, and known issues. Code that implements special requirements should carry comments explaining them.

3.8 Backup

Backup both code and personnel. Code should be stored in repositories with versioning; at least two developers should understand each system.

4. Summary

System design must not only implement business functionality but also ensure high concurrency, high availability, and high reliability. It should consider capacity planning, SLA definition, monitoring and alerting, and emergency plans such as disaster recovery, degradation, rate limiting, isolation, traffic switching, and rollback.

Key high‑concurrency tactics include caching, asynchronous processing, connection pools, thread pools, scaling, message queues, and distributed tasks. High‑availability tactics include load balancing, reverse proxy traffic splitting, rate limiting, degradation, isolation, timeout/retry settings, and rollback mechanisms.
