Mastering High Availability: 9 Essential Design Techniques for Scalable Systems

The article walks through nine practical techniques (system splitting, decoupling, asynchronous processing, retry, compensation, backup, multi-active deployment, rate limiting, and circuit breaking with degradation), explaining why each is needed, how it is implemented in real-world microservice architectures, and what trade-offs to consider.

Architect

Apr 4, 2024

1. System Splitting

Large monolithic applications (e.g., an e-commerce platform where membership, product, order, logistics, and marketing modules coexist in one deployment) are hard to scale for peak events, and a failure in one module can bring down the whole system. To pursue the "four-piece set" of goals (high concurrency, high performance, high availability, and high scalability), architects split the monolith into independent services, using Domain-Driven Design (DDD) to draw the boundaries. Each subsystem owns a vertical business domain with an isolated boundary, which limits cascading failures.

2. Decoupling

The principle of "high cohesion, low coupling" applies at every level, from interface abstraction and MVC layering up to the SOLID principles and the 23 classic design patterns. The article uses the Open-Closed Principle as an example: code should be open for extension but closed for modification. Spring's AOP (Aspect-Oriented Programming) intercepts method calls via dynamic proxies, so additional logic can be added without modifying existing code. An event-driven publish/subscribe model isolates changes further: the existing flow publishes events, and new features are added as listeners that consume them without touching the original code.
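The publish/subscribe idea can be sketched with a tiny in-process event bus. This is only an illustration under stated assumptions: `SimpleEventBus` and `OrderCreated` are hypothetical names that do not come from the article, and a real system would more likely use Spring's application events or a message broker.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-process event bus (illustrative only, not the article's code).
class SimpleEventBus {
    private final Map<Class<?>, List<Consumer<Object>>> listeners = new ConcurrentHashMap<>();

    // Register a listener for one event type.
    <T> void subscribe(Class<T> type, Consumer<T> listener) {
        listeners.computeIfAbsent(type, k -> new CopyOnWriteArrayList<>())
                 .add(event -> listener.accept(type.cast(event)));
    }

    // Deliver the event synchronously to every listener registered for its exact class.
    void publish(Object event) {
        List<Consumer<Object>> targets =
                listeners.getOrDefault(event.getClass(), Collections.emptyList());
        for (Consumer<Object> target : targets) {
            target.accept(event);
        }
    }
}

// Hypothetical event published by the order flow.
class OrderCreated {
    final String orderId;

    OrderCreated(String orderId) {
        this.orderId = orderId;
    }
}
```

With this shape, the order flow only calls `bus.publish(new OrderCreated(id))`. Adding an SMS notification or a loyalty-points feature later means registering one more listener with `subscribe`, and the publishing code stays unchanged.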

3. Asynchronous Processing

Synchronous calls block a thread until a response arrives, wasting resources under high load. By moving non‑real‑time actions (e.g., sending SMS, generating order snapshots) to asynchronous mechanisms such as ThreadPoolExecutor or message queues, the main flow continues while background tasks handle the work.
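A minimal sketch of offloading a non-real-time step to a `ThreadPoolExecutor` follows. The class `AsyncOrderService`, its method names, and the pool sizes are assumptions chosen for illustration; they are not taken from the article.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical service (illustrative only): the main flow returns at once,
// while a side task such as sending an SMS runs on a background pool.
class AsyncOrderService {
    // Bounded queue plus CallerRunsPolicy: when the pool is saturated, the
    // calling thread runs the task itself instead of silently dropping it.
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 4, 60, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(100),
            new ThreadPoolExecutor.CallerRunsPolicy());

    String createOrder(String orderId, Runnable sideTask) {
        // The core order write would happen here, synchronously.
        pool.submit(sideTask); // non-real-time work is handed off
        return "CREATED:" + orderId;
    }

    // Stop accepting new work and wait briefly for queued tasks to finish.
    boolean shutdownAndWait() {
        pool.shutdown();
        try {
            return pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

An in-memory pool loses queued tasks if the process crashes. That is one reason to prefer a message queue when the side task must not be lost.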

4. Retry

RPC calls can fail transiently because of network jitter or blocked threads. A retry strategy resends failed requests, much like pressing F5 to refresh a browser page. Blind retries, however, can cause duplicate operations (e.g., a bank transfer executed twice), so the article stresses pairing retry with idempotency: check-then-insert, unique indexes, version tables, state machines, distributed locks, or token mechanisms make repeated calls safe.
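The pairing of retry and idempotency might look like the sketch below. `Retry` and `TransferService` are hypothetical names, the linear backoff is an arbitrary choice, and the in-memory set of request IDs merely stands in for a unique index or token table in a real database.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical retry helper (illustrative only).
class Retry {
    // Try the call up to maxAttempts times, sleeping backoffMillis * attempt between tries.
    static <T> T withRetry(int maxAttempts, long backoffMillis, Supplier<T> call) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) break;
                try {
                    Thread.sleep(backoffMillis * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
        throw last;
    }
}

// Hypothetical idempotent receiver: each requestId is applied at most once.
class TransferService {
    private final Set<String> processed = ConcurrentHashMap.newKeySet();
    private final AtomicLong balance = new AtomicLong();

    // Returns true if the transfer was applied now, false if this requestId was already handled.
    // In a real system the ID check and the balance update would share one database transaction.
    boolean transfer(String requestId, long amount) {
        if (!processed.add(requestId)) return false; // duplicate request: do nothing
        balance.addAndGet(amount);
        return true;
    }

    long balance() {
        return balance.get();
    }
}
```

If the first attempt applies the transfer but its response is lost, the retried call carries the same `requestId` and is recognised as a duplicate, so the balance changes only once.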

5. Compensation

When a request cannot be guaranteed to succeed, compensation achieves eventual consistency. The article distinguishes forward compensation (pushing partially successful tasks to a successful state) and reverse compensation (rolling back to the initial state). Implementation examples include a local table scanned by scheduled jobs, or a simple message‑queue consumer that retries failed steps.
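The "local table scanned by a scheduled job" approach can be sketched as follows, with an in-memory list standing in for the database table. All names (`CompensationTask`, `CompensationScanner`, `scanOnce`) and the rule "retry up to a limit, then roll back" are assumptions made for this illustration, not the article's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical row of a local task table (illustrative only).
class CompensationTask {
    enum Status { PENDING, SUCCEEDED, ROLLED_BACK }

    final String id;
    final BooleanSupplier step; // returns true when the step succeeds
    final Runnable rollback;    // undoes the work done so far
    Status status = Status.PENDING;
    int attempts = 0;

    CompensationTask(String id, BooleanSupplier step, Runnable rollback) {
        this.id = id;
        this.step = step;
        this.rollback = rollback;
    }
}

// Hypothetical scanner; scanOnce() is what a scheduled job would call periodically.
class CompensationScanner {
    private final List<CompensationTask> table = new ArrayList<>(); // stands in for a DB table
    private final int maxAttempts;

    CompensationScanner(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void register(CompensationTask task) {
        table.add(task);
    }

    void scanOnce() {
        for (CompensationTask task : table) {
            if (task.status != CompensationTask.Status.PENDING) continue;
            task.attempts++;
            boolean ok;
            try {
                ok = task.step.getAsBoolean(); // forward compensation: push toward success
            } catch (RuntimeException e) {
                ok = false;
            }
            if (ok) {
                task.status = CompensationTask.Status.SUCCEEDED;
            } else if (task.attempts >= maxAttempts) {
                task.rollback.run();           // reverse compensation: return to the initial state
                task.status = CompensationTask.Status.ROLLED_BACK;
            }
        }
    }
}
```

In a running service, `scanOnce` could be driven by `ScheduledExecutorService.scheduleWithFixedDelay`. Both the step and the rollback should themselves be idempotent, because a crash between running them and recording the new status would cause them to run again.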

6. Backup

Data loss from server crashes is unacceptable. Using Redis as an example, the article explains RDB (point-in-time snapshot) and AOF (append-only log) persistence and how data is synchronized from primary to replica. Redis Sentinel provides automatic failover through monitoring, election of a new master, and notification. Similar backup mechanisms exist for MySQL, Kafka, HBase, and Elasticsearch.

7. Multi‑Active Strategy

To survive catastrophic events (power outage, fire, earthquake), multi-active deployments replicate services across data centers and regions. Common patterns include same-city dual-active, two-region three-center, three-region five-center, cross-region dual-active, and cross-region multi-active. The patterns differ in technical requirements, construction cost, and O&M overhead.

8. Rate Limiting

When traffic spikes exceed system capacity, two approaches exist: accept all requests (risking overload) or discard excess traffic. Rate limiting takes the second approach: it caps the request rate or concurrency so the system stays responsive for the traffic it does accept. The article categorises limits by scope (global, per-API, per-IP/device/user) and lists algorithms: counter, sliding window, leaky bucket, and token bucket. Single-node limits use in-memory counters (e.g., AtomicLong#incrementAndGet()), while distributed limits coordinate across a cluster.
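Of the algorithms listed, the token bucket can be sketched in a few lines for the single-node case. The class name `TokenBucket`, its constructor parameters, and the injected clock are assumptions for illustration; a distributed limiter would need shared state outside the process.

```java
import java.util.function.LongSupplier;

// Hypothetical single-node token bucket (illustrative only).
class TokenBucket {
    private final long capacity;          // maximum burst size
    private final double refillPerSecond; // sustained rate
    private final LongSupplier nanoClock; // injected so tests can control time
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double refillPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity;           // start full
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // Returns true if the request may proceed, false if it should be rejected.
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double elapsedSeconds = (now - lastRefillNanos) / 1e9;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In production the clock would typically be `System::nanoTime`, for example `new TokenBucket(100, 50.0, System::nanoTime)` for a burst of 100 and a sustained 50 requests per second. Unlike a plain counter, the bucket tolerates short bursts while still enforcing the average rate.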

9. Circuit Breaking and Degradation

Circuit breaking monitors downstream resource health; when failures exceed a threshold, the circuit opens and calls are rejected immediately to prevent cascading failures. The breaker has three states: Closed (calls pass through while failures are counted), Open (calls are rejected at once), and Half-Open (after a cool-down, a limited number of trial calls are let through; success closes the circuit and failure reopens it). Alibaba's open-source Sentinel, which is unrelated to Redis Sentinel, provides a dashboard for rule configuration. Degradation temporarily disables non-core features (e.g., product reviews, transaction records) during peak load, preserving critical paths like order creation and payment. The article emphasizes that degradation strategies must be tailored to business needs and agreed upon with product owners.
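The Closed, Open, and Half-Open transitions can be sketched as a small state machine. This is a deliberately simplified illustration, not how Alibaba's Sentinel is implemented: the class `CircuitBreaker` is hypothetical, it counts consecutive failures rather than a failure ratio over a time window, and it serialises calls with `synchronized` for brevity.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical circuit breaker (illustrative only).
class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;     // consecutive failures before opening
    private final long openMillis;          // cool-down before a trial call
    private final LongSupplier millisClock; // injected so tests can control time
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    CircuitBreaker(int failureThreshold, long openMillis, LongSupplier millisClock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.millisClock = millisClock;
    }

    // Runs the action, or returns the fallback (the degraded result) when the
    // circuit is open or the action fails.
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (millisClock.getAsLong() - openedAt < openMillis) {
                return fallback.get();      // fail fast without touching the downstream
            }
            state = State.HALF_OPEN;        // cool-down over: allow a trial call
        }
        try {
            T result = action.get();
            state = State.CLOSED;           // success closes the circuit
            failures = 0;
            return result;
        } catch (RuntimeException e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;         // trip, or re-trip after a failed trial
                openedAt = millisClock.getAsLong();
            }
            return fallback.get();
        }
    }

    synchronized State state() {
        return state;
    }
}
```

The `fallback` supplier is where degradation plugs in: for a non-core feature such as product reviews it might return an empty list, so the page still renders while the downstream service recovers.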

Overall, mastering these nine techniques equips architects to design highly available, resilient, and scalable backend systems capable of handling massive traffic while maintaining data integrity and user experience.
