Understanding Rate Limiting, Degradation, and Circuit Breaking in Distributed Systems
This article explains the concepts of rate limiting, service degradation, and circuit breaking, illustrating passive and active throttling strategies, asynchronous processing, and practical examples such as Alibaba Sentinel, token‑based controls, and Hystrix, to help engineers design resilient, high‑availability systems.
Part 1 – Rate Limiting: Self‑Awareness and Insight
Systems must recognize their own capacity and the capacity of downstream services; when traffic exceeds these limits, protective mechanisms become essential.
1.1 Passive Rate Limiting (Self‑Awareness)
Define clear capacity limits and reject requests that exceed them. Two approaches are common: static thresholds or rules, and adaptive strategies that adjust limits based on real‑time system load, CPU usage, average response time (RT), concurrent threads, or QPS. Alibaba’s open‑source Sentinel implements adaptive limits of this kind, with the following triggers:
Load‑based: triggers protection when the system's load1 (the 1‑minute load average) exceeds a preset value and the number of concurrent threads surpasses the estimated system capacity.
CPU usage: triggers when CPU usage exceeds a configurable threshold (0.0‑1.0).
Average RT: triggers when average request latency reaches a defined millisecond threshold.
Concurrent threads: triggers when the number of concurrent threads on a machine reaches a limit.
Ingress QPS: triggers when inbound QPS exceeds a threshold.
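The five triggers above can be sketched as a single admission check. This is a minimal illustration, not Sentinel's implementation: `SystemLimits`, `SystemSnapshot`, and `rejection_reason` are hypothetical names, and the capacity estimate is assumed to be supplied by the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemLimits:
    """Hypothetical thresholds; None disables the corresponding check."""
    max_load1: Optional[float] = None
    max_cpu_usage: Optional[float] = None  # fraction in 0.0-1.0
    max_avg_rt_ms: Optional[float] = None
    max_threads: Optional[int] = None
    max_qps: Optional[float] = None


@dataclass
class SystemSnapshot:
    """Current measurements, assumed to come from a metrics collector."""
    load1: float
    cpu_usage: float
    avg_rt_ms: float
    threads: int
    qps: float
    est_capacity: float  # estimated number of requests the node can run concurrently


def rejection_reason(limits: SystemLimits, snap: SystemSnapshot) -> Optional[str]:
    """Return the name of the first tripped rule, or None to admit the request."""
    if limits.max_qps is not None and snap.qps > limits.max_qps:
        return "qps"
    if limits.max_threads is not None and snap.threads >= limits.max_threads:
        return "threads"
    if limits.max_avg_rt_ms is not None and snap.avg_rt_ms >= limits.max_avg_rt_ms:
        return "rt"
    if limits.max_cpu_usage is not None and snap.cpu_usage > limits.max_cpu_usage:
        return "cpu"
    # The load rule needs both conditions: high load1 AND threads above capacity.
    if (limits.max_load1 is not None and snap.load1 > limits.max_load1
            and snap.threads > snap.est_capacity):
        return "load"
    return None
```

Note that a high load1 alone does not reject traffic in this sketch; the request is refused only when concurrency also exceeds the estimated capacity, which avoids throttling a node that is busy but still keeping up.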
1.2 Active Rate Limiting (Insight)
When downstream services have limited capacity, callers should proportionally reduce requests. Combining cluster‑wide and single‑node throttling is advisable, especially when downstream instances differ significantly in capability.
One solution collects request logs from the service nodes, compares the aggregated traffic with the configured thresholds, and feeds the result back so that each node throttles proportionally (post‑throttling).
Another solution uses a central token‑issuing service: a node must acquire a token before it sends a request, which enforces the limit precisely and before the downstream is ever called (pre‑throttling).
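Both solutions can be sketched in a few lines. The names below (`proportional_limits`, `TokenIssuer`) are hypothetical, the token issuer is an in-process stand-in for what would be a remote service, and it uses a simple fixed-window counter, which is only one of several possible issuing policies.

```python
import threading
import time


def proportional_limits(observed_qps: dict, cluster_limit: float) -> dict:
    """Post-throttling: scale every node's allowance by the same ratio
    so that the cluster-wide total fits under the downstream limit."""
    total = sum(observed_qps.values())
    if total <= cluster_limit:
        return dict(observed_qps)  # under the limit, nothing to throttle
    ratio = cluster_limit / total
    return {node: qps * ratio for node, qps in observed_qps.items()}


class TokenIssuer:
    """Pre-throttling: issue at most `limit` tokens per time window."""

    def __init__(self, limit: int, window_s: float = 1.0, clock=time.monotonic):
        self._limit = limit
        self._window_s = window_s
        self._clock = clock  # injectable so the behavior can be tested
        self._lock = threading.Lock()
        self._window_start = clock()
        self._issued = 0

    def try_acquire(self) -> bool:
        """Return True if the caller may proceed, False if it must back off."""
        with self._lock:
            now = self._clock()
            if now - self._window_start >= self._window_s:
                # A new window has started: reset the counter.
                self._window_start = now
                self._issued = 0
            if self._issued < self._limit:
                self._issued += 1
                return True
            return False
```

The trade-off is visible in the shapes of the two functions: the proportional approach reacts after traffic has been observed, so it can briefly overshoot, while the token approach never exceeds the limit but adds a call to the issuer on every request.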
1.3 Synchronous‑to‑Asynchronous Conversion
When a downstream service processes requests more slowly than they arrive (e.g., third‑party payment settlement), the front end can complete the user's action immediately and defer the final confirmation to an asynchronous workflow. This flattens the peak load seen by the downstream service and improves overall availability.
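A minimal sketch of this conversion, assuming an in-process queue in place of a real message queue; `place_order`, `settle_with_provider`, and the status strings are hypothetical names chosen for illustration.

```python
import queue
import threading

settlement_queue = queue.Queue()  # stand-in for a real message queue
order_status = {}                 # stand-in for the order store


def place_order(order_id: str) -> str:
    """Acknowledge the user right away; settlement is deferred."""
    order_status[order_id] = "PENDING"
    settlement_queue.put(order_id)
    return "PENDING"


def settle_with_provider(order_id: str) -> bool:
    """Hypothetical stand-in for the slow third-party settlement call."""
    return True


def settlement_worker() -> None:
    """Drain the queue at whatever pace the downstream can sustain."""
    while True:
        order_id = settlement_queue.get()
        try:
            ok = settle_with_provider(order_id)
            order_status[order_id] = "CONFIRMED" if ok else "FAILED"
        finally:
            settlement_queue.task_done()


def start_worker() -> threading.Thread:
    worker = threading.Thread(target=settlement_worker, daemon=True)
    worker.start()
    return worker
```

The user-facing call returns as soon as the order is queued, so its latency no longer depends on the settlement provider; the cost is that the caller must tolerate a `PENDING` state and check the final result later.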
Part 2 – Degradation: Sacrificing Minor Features to Preserve Core Functionality
During traffic spikes, non‑essential services can be degraded to protect primary business flows.
1. Page Degradation: Hide or disable UI elements (e.g., points‑deduction entry) via a feature‑toggle platform.
2. Storage Degradation: Replace frequent DB writes with cache writes and asynchronous MQ messages, as commonly done in flash‑sale systems.
3. Read Degradation: Disable non‑critical read requests (e.g., avatar fetching in a red‑packet list) under high load.
4. Write Degradation: Block certain write operations entirely when the system is under pressure.
In short, degradation trades a small loss of user experience for overall system stability.
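Read and storage degradation can both be driven by the same kind of switch. The sketch below assumes an in-memory `DegradeSwitches` class standing in for a feature-toggle platform; the switch names, `load_avatar`, and the list and dict stand-ins for the database, cache, and MQ are all hypothetical.

```python
class DegradeSwitches:
    """Hypothetical in-memory stand-in for a feature-toggle platform."""

    def __init__(self):
        self._degraded = set()

    def degrade(self, feature: str) -> None:
        self._degraded.add(feature)

    def restore(self, feature: str) -> None:
        self._degraded.discard(feature)

    def is_degraded(self, feature: str) -> bool:
        return feature in self._degraded


switches = DegradeSwitches()
db_rows, cache, mq = [], {}, []  # in-memory stand-ins for DB, cache, and MQ
DEFAULT_AVATAR = "default.png"


def load_avatar(user_id: str) -> str:
    """Hypothetical non-critical read against an avatar service."""
    return f"avatars/{user_id}.png"


def red_packet_row(user_id: str, amount: int) -> dict:
    """Read degradation: skip the avatar lookup when the switch is on."""
    if switches.is_degraded("avatar_read"):
        avatar = DEFAULT_AVATAR
    else:
        avatar = load_avatar(user_id)
    return {"user": user_id, "amount": amount, "avatar": avatar}


def record_order(order_id: str, payload: dict) -> str:
    """Storage degradation: write to cache and MQ instead of the DB."""
    if switches.is_degraded("order_db_write"):
        cache[order_id] = payload
        mq.append((order_id, payload))  # a consumer persists it later
        return "queued"
    db_rows.append((order_id, payload))
    return "stored"
```

Keeping the degraded path inside the same function as the normal path means operators can flip a switch during a spike and restore it afterwards without a deployment.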
Part 3 – Circuit Breaking: Maintaining a Global View
Circuit breaking prevents cascading failures and service avalanches by temporarily halting calls to an unhealthy downstream service while monitoring its recovery.
The Hystrix circuit‑breaker flow has three key steps: deciding when to open the circuit (the tripping algorithm), running fallback logic while the circuit is open, and detecting recovery so that the circuit can close again.
In practice, monitoring downstream storage errors can trigger a switch that routes traffic to a fallback message queue, ensuring most requests continue processing while the primary storage recovers.
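The three steps can be sketched as a small state machine. This is not Hystrix's code: the `CircuitBreaker` class and its parameters are hypothetical, and it trips on a count of consecutive failures for brevity, whereas Hystrix evaluates an error percentage over a rolling window.

```python
import time


class CircuitBreaker:
    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"

    def __init__(self, failure_threshold=3, sleep_window_s=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.sleep_window_s = sleep_window_s
        self._clock = clock  # injectable so the behavior can be tested
        self.state = self.CLOSED
        self._failures = 0
        self._opened_at = 0.0

    def call(self, primary, fallback):
        """Run `primary` unless the circuit is open; use `fallback` otherwise."""
        if self.state == self.OPEN:
            if self._clock() - self._opened_at < self.sleep_window_s:
                return fallback()  # step 2: short-circuit to the fallback
            self.state = self.HALF_OPEN  # step 3: let one trial call through
        try:
            result = primary()
        except Exception:
            self._on_failure()
            return fallback()
        self._on_success()
        return result

    def _on_failure(self):
        self._failures += 1
        # Step 1: trip on repeated failures, or immediately if the trial call failed.
        if self.state == self.HALF_OPEN or self._failures >= self.failure_threshold:
            self.state = self.OPEN
            self._opened_at = self._clock()

    def _on_success(self):
        self._failures = 0
        self.state = self.CLOSED
```

For the storage scenario above, `primary` would be the write to the primary store and `fallback` would publish the request to the backup message queue, so requests keep flowing while the store recovers.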