Mastering High Availability: 9 Essential Design Techniques for Scalable Systems
The article walks through nine practical techniques (system splitting, decoupling, asynchronous processing, retry, compensation, backup, multi‑active deployment, rate limiting, and circuit breaking with degradation), explaining why each is needed, how it is implemented in real‑world microservice architectures, and what trade‑offs to consider.
1. System Splitting
Large monolithic applications (e.g., an e‑commerce platform where membership, product, order, logistics, and marketing modules coexist) are hard to scale during peak events, and a failure in one module can bring down the whole system. Guided by the "four‑piece set" of goals (high concurrency, high performance, high availability, and high scalability), architects split the monolith into independent services using Domain‑Driven Design (DDD). Each sub‑system owns a vertical business domain with clear boundaries, which limits cascading failures.
2. Decoupling
The principle of "high cohesion, low coupling" applies at every level, from interface abstraction and MVC layering up to the SOLID principles and the 23 classic design patterns. The article uses the Open‑Closed Principle as an example: code should be open for extension but closed for modification. Spring provides AOP (Aspect‑Oriented Programming) to intercept method calls via dynamic proxies, allowing additional logic without modifying existing code. An event‑driven publish/subscribe model further isolates changes: the existing flow publishes events, and new features are added as listeners that consume them without touching the original code.
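The publish/subscribe idea can be sketched in plain Java. This is a minimal illustration, not Spring's event API: `SimpleEventBus` and `OrderCreated` are hypothetical names, delivery is synchronous, and listeners are matched on the exact event class only.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical event type published by the order flow.
final class OrderCreated {
    final String orderId;

    OrderCreated(String orderId) {
        this.orderId = orderId;
    }
}

// Hypothetical in-process event bus: synchronous delivery, exact-type matching.
final class SimpleEventBus {
    private final Map<Class<?>, List<Consumer<Object>>> listeners = new ConcurrentHashMap<>();

    // Register a listener for one event type.
    <T> void subscribe(Class<T> type, Consumer<? super T> listener) {
        listeners.computeIfAbsent(type, k -> new CopyOnWriteArrayList<>())
                 .add(event -> listener.accept(type.cast(event)));
    }

    // Deliver the event to every listener of its exact class, in registration order.
    void publish(Object event) {
        List<Consumer<Object>> subscribers = listeners.get(event.getClass());
        if (subscribers == null) {
            return; // no listeners registered; the publisher does not need to know
        }
        for (Consumer<Object> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}
```

With this shape, adding a feature such as loyalty points means registering one more listener with `subscribe(OrderCreated.class, ...)`; the code that calls `publish(new OrderCreated(id))` stays unchanged, which is the Open‑Closed Principle in practice.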
3. Asynchronous Processing
Synchronous calls block a thread until a response arrives, wasting resources under high load. By moving non‑real‑time actions (e.g., sending SMS, generating order snapshots) to asynchronous mechanisms such as ThreadPoolExecutor or message queues, the main flow continues while background tasks handle the work.
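A minimal sketch of handing a non‑real‑time step to a `ThreadPoolExecutor` follows. The class `AsyncOrderService`, the pool sizes, and the stubbed `sendSms` method are assumptions for illustration; a real system would tune these values and might use a message queue instead.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical order service that sends the SMS off the request thread.
final class AsyncOrderService {
    // Bounded queue so a backlog cannot grow without limit. When the pool and
    // queue are both full, CallerRunsPolicy runs the task on the submitting
    // thread, which slows producers down instead of dropping work.
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            2, 4, 60L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(100),
            new ThreadPoolExecutor.CallerRunsPolicy());

    // Returns immediately; the future completes when the SMS step finishes.
    CompletableFuture<String> placeOrder(String orderId) {
        // Core synchronous work (e.g., persisting the order) would happen here.
        return CompletableFuture.supplyAsync(() -> sendSms(orderId), pool);
    }

    // Stub standing in for a slow call to an SMS gateway.
    private String sendSms(String orderId) {
        return "sms-sent:" + orderId;
    }

    // Stop accepting new tasks; already queued tasks still run.
    void shutdown() {
        pool.shutdown();
    }
}
```

The bounded queue and explicit rejection policy are the design choices worth noting: an unbounded queue hides overload until memory runs out.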
4. Retry
RPC calls can fail or time out because of network jitter or blocked threads. A retry strategy resends failed requests, similar to pressing F5 to refresh a browser page. However, blind retries can cause duplicate operations (e.g., double bank transfers), because a request may have succeeded on the server even though the caller never received the response. The article therefore stresses pairing retry with idempotency: check‑then‑insert, unique indexes, version tables, state machines, distributed locks, or token mechanisms ensure that repeated calls are safe.
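The pairing of retry and idempotency can be sketched as follows. Everything here is illustrative: `RetrySketch` and `IdempotentTransferService` are hypothetical names, the in‑memory map stands in for a unique index or token store, and the "lost response" is simulated by a counter.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class RetrySketch {
    // Retries on RuntimeException with exponential backoff: base, 2x base, 4x base, ...
    static <T> T withRetry(Supplier<T> call, int maxAttempts, long baseDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(baseDelayMillis << (attempt - 1));
                }
            }
        }
        throw last; // every attempt failed
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during backoff", e);
        }
    }
}

// Hypothetical service: the request ID acts as the idempotency token.
final class IdempotentTransferService {
    private final Map<String, String> processed = new HashMap<>(); // stands in for a unique index
    private long balance;
    private int responsesToDrop; // simulation knob: how many responses get "lost"

    IdempotentTransferService(int responsesToDrop) {
        this.responsesToDrop = responsesToDrop;
    }

    synchronized String transfer(String requestId, long amount) {
        // Apply the transfer only the first time this request ID is seen.
        String result = processed.computeIfAbsent(requestId, id -> {
            balance += amount;
            return "OK:" + id;
        });
        if (responsesToDrop > 0) {
            responsesToDrop--;
            throw new IllegalStateException("simulated timeout: response lost");
        }
        return result;
    }

    synchronized long balance() {
        return balance;
    }
}
```

In this sketch the first call applies the transfer and then fails as if the response were lost; the retry reuses the same request ID, so the amount is credited once rather than twice.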
5. Compensation
When a request cannot be guaranteed to succeed, compensation achieves eventual consistency. The article distinguishes forward compensation (pushing partially successful tasks to a successful state) and reverse compensation (rolling back to the initial state). Implementation examples include a local table scanned by scheduled jobs, or a simple message‑queue consumer that retries failed steps.
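The scheduled‑scan approach might look like the sketch below. It is an assumption‑laden illustration: the in‑memory list stands in for the local task table, and `CompensationTask`, `CompensationJob`, the step predicate, and the rollback callback are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

// One row of the (hypothetical) local compensation table.
final class CompensationTask {
    enum Status { PENDING, SUCCEEDED, ROLLED_BACK }

    final String id;
    Status status = Status.PENDING;
    int attempts;

    CompensationTask(String id) {
        this.id = id;
    }
}

final class CompensationJob {
    private final List<CompensationTask> table = new ArrayList<>(); // stands in for a DB table
    private final Predicate<CompensationTask> step;     // returns true when the step succeeds
    private final Consumer<CompensationTask> rollback;  // undoes earlier work for the task
    private final int maxAttempts;

    CompensationJob(Predicate<CompensationTask> step,
                    Consumer<CompensationTask> rollback,
                    int maxAttempts) {
        this.step = step;
        this.rollback = rollback;
        this.maxAttempts = maxAttempts;
    }

    synchronized CompensationTask add(String id) {
        CompensationTask task = new CompensationTask(id);
        table.add(task);
        return task;
    }

    // One pass of the scheduled job over all pending tasks.
    synchronized void scanOnce() {
        for (CompensationTask task : table) {
            if (task.status != CompensationTask.Status.PENDING) {
                continue;
            }
            task.attempts++;
            boolean ok;
            try {
                ok = step.test(task); // forward compensation: push toward success
            } catch (RuntimeException e) {
                ok = false;
            }
            if (ok) {
                task.status = CompensationTask.Status.SUCCEEDED;
            } else if (task.attempts >= maxAttempts) {
                rollback.accept(task); // reverse compensation: return to the initial state
                task.status = CompensationTask.Status.ROLLED_BACK;
            }
        }
    }
}
```

A scheduler would call `scanOnce` periodically, for example with `Executors.newSingleThreadScheduledExecutor().scheduleWithFixedDelay(job::scanOnce, 0, 30, TimeUnit.SECONDS)`. Both the step and the rollback should themselves be idempotent, since a crash between the call and the status update would cause them to run again.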
6. Backup
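Data loss from server crashes is unacceptable. Using Redis as an example, the article explains the two persistence mechanisms, RDB (point‑in‑time full snapshot) and AOF (append‑only log of write commands), and how data is synchronized from primary to replica, where a full synchronization ships an RDB snapshot to the replica. Redis Sentinel provides automatic failover: it monitors instances, promotes a replica when the primary is judged to be down, and notifies clients of the new primary. Similar backup mechanisms exist for MySQL, Kafka, HBase, and Elasticsearch.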
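The Redis mechanisms named above map to a handful of configuration directives. The fragment below is only a sketch: the addresses, the master name `mymaster`, the quorum of 2, and all thresholds are assumed values, not settings taken from the article. Comments are placed on their own lines because Redis configuration files do not accept trailing comments.

```conf
# redis.conf on the primary
# RDB: write a snapshot if at least 1 key changed within 900 seconds
save 900 1
# AOF: log every write command and fsync the log once per second
appendonly yes
appendfsync everysec

# redis.conf on a replica (hypothetical primary address)
replicaof 10.0.0.1 6379

# sentinel.conf (run on several Sentinel processes)
# Monitor the primary; 2 Sentinels must agree before it is considered down
sentinel monitor mymaster 10.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
```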
7. Multi‑Active Strategy
To survive catastrophic events (power outage, fire, earthquake), multi‑active deployments replicate services across regions. Common patterns include same‑city dual‑active, two‑region three‑center, three‑region five‑center, cross‑region dual‑active, and cross‑region multi‑active. Each pattern varies in technical requirements, construction cost, and O&M overhead.
8. Rate Limiting
When traffic spikes exceed system capacity, two approaches exist: accept all requests (risking overload) or discard excess traffic. Rate limiting takes the second approach, capping the request rate or the number of concurrent requests to keep the system responsive. The article categorises limits by scope (global, per‑API, per‑IP/device/user) and lists algorithms: counter, sliding window, leaky bucket, and token bucket. Single‑node limits use in‑memory counters (e.g., AtomicLong#incrementAndGet()), while distributed limits must coordinate state across the cluster.
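Of the algorithms listed, the token bucket can be sketched in a few lines for the single‑node case. `TokenBucket` is a hypothetical class, and the injected nanosecond clock is an assumption made so the behaviour can be tested deterministically; production code would typically pass `System::nanoTime`.

```java
import java.util.function.LongSupplier;

// Single-node token bucket: allows bursts up to `capacity`,
// then a sustained rate of `tokensPerSecond`.
final class TokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private final LongSupplier nanoClock;
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double tokensPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // Returns true if the request may proceed, false if it should be rejected.
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        // Lazily add the tokens earned since the last call, capped at capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

Refilling lazily inside `tryAcquire` avoids a background timer thread. A distributed limiter would need to keep the token count in shared storage rather than in a field of this object.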
9. Circuit Breaking and Degradation
Circuit breaking monitors downstream resource health; when failures exceed a threshold, the circuit opens and calls are rejected immediately to prevent cascading failures. The article describes the three states and their transition logic: Closed lets calls through and tracks failures, Open rejects calls until a cool‑down period elapses, and Half‑Open admits trial requests, closing the circuit if they succeed and reopening it if they fail. Alibaba’s open‑source Sentinel (a flow‑control library, unrelated to Redis Sentinel) provides a dashboard for rule configuration. Degradation temporarily disables non‑core features (e.g., product reviews, transaction records) during peak load, preserving critical paths like order creation and payment. The article emphasizes that degradation strategies must be tailored to business needs and agreed upon with product owners.
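The three‑state logic can be sketched as a small state machine. This is not Alibaba Sentinel's API: `CircuitBreaker` is a hypothetical class, it trips on a count of consecutive failures rather than on windowed statistics, and the injected millisecond clock is an assumption made for testability. The fallback argument doubles as a simple degradation hook.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clockMillis;
    private State state = State.CLOSED;
    private int consecutiveFailures;
    private long openedAt;

    CircuitBreaker(int failureThreshold, long openMillis, LongSupplier clockMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clockMillis = clockMillis;
    }

    // Runs the action if the circuit allows it; otherwise returns the fallback.
    <T> T call(Supplier<T> action, Supplier<T> fallback) {
        if (!allowRequest()) {
            return fallback.get(); // fail fast without touching the downstream
        }
        try {
            T result = action.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    synchronized State state() {
        return state;
    }

    private synchronized boolean allowRequest() {
        if (state == State.OPEN) {
            if (clockMillis.getAsLong() - openedAt >= openMillis) {
                state = State.HALF_OPEN; // cool-down elapsed: admit a trial request
                return true;
            }
            return false;
        }
        // Simplification: every caller is admitted while HALF_OPEN.
        return true;
    }

    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    private synchronized void onFailure() {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clockMillis.getAsLong();
        }
    }
}
```

A caller might wrap a non‑core feature as `breaker.call(() -> reviewClient.fetch(id), () -> emptyReviews)`, where `reviewClient` and `emptyReviews` are placeholders: when the review service is unhealthy, the page still renders with a degraded result.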
Overall, mastering these nine techniques equips architects to design highly available, resilient, and scalable backend systems capable of handling massive traffic while maintaining data integrity and user experience.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.