Prevent Service Avalanche: Circuit Breaker vs Degradation Strategies Explained
This article explains service avalanche in micro‑service chains, outlines its three failure stages, compares circuit‑breaker and degradation techniques, shows when to apply each, and provides practical guidance on tools like Sentinel and Resilience4j, testing, monitoring, and best‑practice configurations.
1. Service Avalanche – Chain Reaction
A typical interview answer describes a service avalanche as the cascading failure of services that depend on one another, much like falling dominoes. When one service (e.g., inventory) fails, the services that call it (e.g., product) fail in turn, and the longer the micro‑service call chain, the further the failure spreads.
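The domino effect can be made concrete with a small sketch that walks a call graph from the failed service back to everything that depends on it, directly or indirectly. The service names beyond inventory and product, the `callersOf` map, and the class name are hypothetical and only serve the illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch: which services are affected when one dependency fails.
class AvalancheSketch {
    // callersOf.get(x) lists the services that call x directly (hypothetical topology).
    static Set<String> impactedBy(String failed, Map<String, List<String>> callersOf) {
        Set<String> impacted = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(failed);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String caller : callersOf.getOrDefault(current, List.of())) {
                // add() returns false for services already visited, so each is queued once
                if (impacted.add(caller)) {
                    queue.add(caller);
                }
            }
        }
        return impacted;
    }

    public static void main(String[] args) {
        Map<String, List<String>> callersOf = Map.of(
            "inventory", List.of("product"),
            "product", List.of("cart", "search"),
            "cart", List.of("order"));
        // prints [product, cart, search, order]
        System.out.println(impactedBy("inventory", callersOf));
    }
}
```

A single failing leaf service reaches four other services in this toy graph, which is why long chains are the most exposed.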
2. Circuit Breaker – Blocking Fault Propagation
The core goal of a circuit breaker is to stop a fault from spreading. It fits cases where a dependency is clearly unhealthy, for example when its error rate or latency is high. A typical implementation uses a three‑state state machine: green (closed), where normal traffic passes; red (open), where calls are blocked and a friendly fallback is returned; and yellow (half‑open), where a limited number of trial requests are allowed through to decide whether to restore normal flow.
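The three states can be sketched as a small state machine. This is a minimal illustration, not how Sentinel or Resilience4j implement it: the class name `SimpleCircuitBreaker`, the consecutive‑failure trip policy, and the injectable clock are assumptions, and the half‑open state here lets trial calls through without capping their number.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Illustrative three-state breaker; not the Sentinel or Resilience4j implementation.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures that open the breaker (assumed policy)
    private final long openMillis;      // time to stay open before allowing a trial call
    private final LongSupplier clock;   // injectable millisecond clock, e.g. System::currentTimeMillis
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    // Returns the current state, moving OPEN -> HALF_OPEN once the open time has elapsed.
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    // Runs the remote call unless the breaker is open; returns the fallback when blocked or on failure.
    <T> T call(Supplier<T> remote, Supplier<T> fallback) {
        if (state() == State.OPEN) {
            return fallback.get();
        }
        try {
            T result = remote.get();
            onSuccess();
            return result;
        } catch (RuntimeException e) {
            onFailure();
            return fallback.get();
        }
    }

    // A success closes the breaker and resets the failure count.
    private synchronized void onSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    // A failed trial call, or too many consecutive failures, (re)opens the breaker.
    private synchronized void onFailure() {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = clock.getAsLong();
        }
    }
}
```

A caller would wrap the dependency call, for instance `breaker.call(() -> inventoryClient.stock(id), () -> -1)`, where `inventoryClient` is a hypothetical client. Passing the clock in makes the open‑to‑half‑open transition easy to exercise in a unit test without sleeping.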
3. Degradation – Sacrificing Non‑Core Functions
Degradation aims to keep core functionality alive when the system itself is short of resources (CPU, memory) or is pushed past the QPS it can handle. During a traffic surge, non‑essential features such as product reviews or price history are disabled, freeing resources for critical paths like product details, add‑to‑cart, and order placement.
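A degradation switch of this kind can be sketched in a few lines. The class name `DegradationSwitch`, the feature names, and the threshold values are hypothetical, and how CPU usage and QPS are measured is left out of scope.

```java
import java.util.Set;

// Illustrative degradation switch; names and thresholds are assumptions, not a library API.
class DegradationSwitch {
    private final Set<String> nonCoreFeatures; // features that may be switched off under pressure
    private final double cpuThreshold;         // e.g. 0.80 means 80% CPU usage
    private final int qpsThreshold;            // request rate above which degradation starts
    private volatile boolean degraded = false;

    DegradationSwitch(Set<String> nonCoreFeatures, double cpuThreshold, int qpsThreshold) {
        this.nonCoreFeatures = Set.copyOf(nonCoreFeatures);
        this.cpuThreshold = cpuThreshold;
        this.qpsThreshold = qpsThreshold;
    }

    // Called periodically with the latest load figures.
    void update(double cpuUsage, int qps) {
        degraded = cpuUsage >= cpuThreshold || qps >= qpsThreshold;
    }

    // Core features stay enabled; non-core features are disabled while degraded.
    boolean isEnabled(String feature) {
        return !(degraded && nonCoreFeatures.contains(feature));
    }

    public static void main(String[] args) {
        DegradationSwitch sw = new DegradationSwitch(Set.of("reviews", "priceHistory"), 0.80, 5000);
        sw.update(0.92, 3000); // CPU above threshold, so non-core features are shed
        System.out.println(sw.isEnabled("reviews"));        // prints false
        System.out.println(sw.isEnabled("productDetails")); // prints true
    }
}
```

Request handlers would check `isEnabled("reviews")` before doing the optional work and return an empty or cached result when it is off.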
4. Practical Implementation
First, identify the root cause: dependency failure or resource shortage. Next, choose a tool: Sentinel (well integrated with Spring Cloud and Dubbo) or Resilience4j (lightweight, with few external dependencies). Then configure the rules. For circuit breaking, set the error‑rate threshold, the minimum request count, and the open time; for degradation, select the non‑core resources and define QPS or CPU usage thresholds. Test in a staging environment using fault injection (e.g., a mock endpoint, such as a Postman mock server, that returns 500 errors) or load testing (e.g., JMeter to simulate peak QPS). Finally, watch the Sentinel dashboard, or the metrics your tool exports, to verify state transitions and fallback counts.
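To show how the three circuit‑breaking parameters interact, here is a minimal sketch of a trip rule. The class and field names are hypothetical and are not Sentinel or Resilience4j configuration keys; the point is only that the minimum request count guards the error‑rate check against tiny samples.

```java
// Illustrative trip rule; field names are assumptions, not real library configuration keys.
final class BreakerRule {
    final double errorRateThreshold; // 0.5 means 50% of calls failed
    final int minRequests;           // below this sample size the rule never trips
    final long openMillis;           // how long the breaker stays open once tripped

    BreakerRule(double errorRateThreshold, int minRequests, long openMillis) {
        this.errorRateThreshold = errorRateThreshold;
        this.minRequests = minRequests;
        this.openMillis = openMillis;
    }

    // Decides whether the breaker should open for the current statistics window.
    boolean shouldOpen(int totalRequests, int failedRequests) {
        if (totalRequests < minRequests) {
            return false; // too few calls to trust the error rate
        }
        return (double) failedRequests / totalRequests >= errorRateThreshold;
    }
}
```

With `new BreakerRule(0.5, 10, 5000)`, one failure out of one request does not trip the breaker even though the error rate is 100%, while five failures out of ten requests does.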
5. Summary
Circuit breakers and degradation are complementary “double‑insurance” mechanisms: circuit breakers stop fault propagation, while degradation protects core services under resource pressure. Identifying the scenario correctly, selecting a suitable tool, configuring the rules, and validating them before deployment help keep the system stable while core business functions remain available.
NiuNiu MaTe
Joined Tencent (nicknamed "Goose Factory") through campus recruitment at a second‑tier university. Career path: Tencent → foreign firm → ByteDance → Tencent. Started as an interviewer at the foreign firm and hopes to help others.
