Mastering Rate Limiting, Degradation, and Circuit Breaking for Resilient Microservices
This article explains the concepts of rate limiting, degradation, and circuit breaking in microservice architectures, illustrating passive and active throttling strategies, practical examples of async conversion, various degradation techniques, and circuit‑breaker mechanisms with real‑world tools like Sentinel and Hystrix.
Rate Limiting – Understanding Limits
Rate limiting protects a service from exceeding its capacity. Two major categories exist:
Static (passive) limits : thresholds such as maximum QPS, CPU usage, or memory are defined manually by operators based on experience. They are simple to configure but cannot adapt to traffic spikes.
Adaptive limits : thresholds are adjusted in real time according to system metrics. Alibaba’s open‑source Sentinel provides several adaptive rules:
Adaptive rules in Sentinel
Load‑based: when load1 (the 1‑minute load average) exceeds a configured value and the number of concurrent threads is higher than the estimated capacity (which Sentinel estimates as maxQps × minRt), protection is triggered.
CPU usage: protection activates when CPU usage crosses a configurable ratio (0.0‑1.0).
Average response time (RT): protection triggers when the average RT of inbound traffic exceeds a millisecond threshold.
Concurrent threads: protection triggers when the concurrent thread count on a node reaches a defined limit.
Inbound QPS: protection triggers when request QPS on a node exceeds a configured threshold.
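The five checks above can be sketched as one admission function. This is an illustrative sketch only, not Sentinel's API: the class names, field names, and the order of the checks are assumptions made for the example, and the capacity estimate (peak QPS multiplied by minimum RT) is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemSnapshot:
    """Point-in-time metrics for one node (all field names are hypothetical)."""
    load1: float        # 1-minute load average
    cpu_usage: float    # ratio in the range 0.0-1.0
    avg_rt_ms: float    # average response time of inbound traffic
    threads: int        # requests currently being processed
    qps: float          # inbound requests per second
    max_qps: float      # highest QPS observed recently
    min_rt_ms: float    # lowest response time observed recently


@dataclass
class SystemRule:
    """Thresholds for each check; None disables that check."""
    max_load1: Optional[float] = None
    max_cpu_usage: Optional[float] = None
    max_avg_rt_ms: Optional[float] = None
    max_threads: Optional[int] = None
    max_qps: Optional[float] = None


def blocked_reason(s: SystemSnapshot, r: SystemRule) -> Optional[str]:
    """Return the name of the rule that rejects new traffic, or None to admit it."""
    if r.max_qps is not None and s.qps > r.max_qps:
        return "qps"
    if r.max_threads is not None and s.threads > r.max_threads:
        return "threads"
    if r.max_avg_rt_ms is not None and s.avg_rt_ms > r.max_avg_rt_ms:
        return "rt"
    if r.max_load1 is not None and s.load1 > r.max_load1:
        # Load alone is not enough: also require that in-flight work
        # exceeds the estimated capacity (assumed: peak QPS * min RT in seconds).
        estimated_capacity = s.max_qps * s.min_rt_ms / 1000.0
        if s.threads > estimated_capacity:
            return "load"
    if r.max_cpu_usage is not None and s.cpu_usage > r.max_cpu_usage:
        return "cpu"
    return None
```

Note how the load rule needs both conditions: a node with high load1 but few in-flight requests is still allowed to accept traffic, which avoids throttling a system that is already draining its backlog.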
Active (proactive) limiting
When a downstream service has less capacity than its callers can generate, the upstream system should proactively reduce the number of calls it sends. Two common implementations are:
Post‑limit (feedback‑based) : each node reports its request count, a central controller compares the aggregate with thresholds, and nodes are instructed to apply proportional throttling.
Pre‑limit (token‑bucket) : a central token service issues tokens according to the global capacity. A node must acquire a token before processing a request, providing precise and elegant flow control.
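Both approaches can be sketched in a few lines. The example below is a single-process illustration under stated assumptions: `admit_ratio` stands in for the central controller of the post‑limit scheme, and `TokenBucket` stands in for the central token service of the pre‑limit scheme. All names and parameters are hypothetical, and a real deployment would put the bucket behind a network service shared by all nodes.

```python
import threading
import time
from typing import Callable, Mapping


def admit_ratio(node_counts: Mapping[str, int], global_limit: int) -> float:
    """Post-limit: fraction of traffic each node should keep admitting.

    node_counts holds the request count each node reported for the last window.
    """
    total = sum(node_counts.values())
    if total <= global_limit:
        return 1.0
    # Over the limit: every node throttles by the same proportion.
    return global_limit / total


class TokenBucket:
    """Pre-limit: a request is processed only if it obtains a token first."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self._clock = clock         # injectable so the bucket can be tested
        self._tokens = capacity
        self._last = clock()
        self._lock = threading.Lock()

    def try_acquire(self, n: float = 1.0) -> bool:
        """Take n tokens if available; never blocks."""
        with self._lock:
            now = self._clock()
            # Refill according to elapsed time, capped at the bucket capacity.
            self._tokens = min(self.capacity,
                               self._tokens + (now - self._last) * self.rate)
            self._last = now
            if self._tokens >= n:
                self._tokens -= n
                return True
            return False
```

A node would call `try_acquire()` before handling each request and reject or queue the request when it returns `False`. The feedback approach reacts one reporting window late, whereas the token approach refuses excess requests before any work is done.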
Degradation – Sacrificing Non‑Core Features to Preserve Core Functionality
During traffic spikes or high‑load periods, non‑essential features can be degraded to keep critical paths stable. Typical degradation strategies include:
Page degradation : hide or disable UI elements (e.g., points‑deduction) via a feature flag, preventing the front‑end from invoking the backend.
Storage degradation : replace frequent DB writes with cache writes and asynchronous message queues, reducing write pressure on the database (common in flash‑sale scenarios).
Read degradation : disable reads of non‑critical data (e.g., avatar images in a WeChat red‑packet list) when the system is under stress.
Write degradation : block certain write operations entirely during overload.
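A minimal sketch of how such switches might look in code, assuming an in-memory flag set and plain dictionaries standing in for the database, cache, and message queue. The class names, feature names, and the `load_avatar` callback are hypothetical; production systems would normally read flags from a configuration center.

```python
from collections import deque
from typing import Callable, Deque, Dict, Set, Tuple


class DegradationSwitches:
    """Feature flags: a degraded feature is skipped by callers."""

    def __init__(self) -> None:
        self._degraded: Set[str] = set()

    def degrade(self, feature: str) -> None:
        self._degraded.add(feature)

    def restore(self, feature: str) -> None:
        self._degraded.discard(feature)

    def enabled(self, feature: str) -> bool:
        return feature not in self._degraded


class OrderStore:
    """Storage degradation: write to cache and a queue instead of the DB."""

    def __init__(self, switches: DegradationSwitches) -> None:
        self.switches = switches
        self.db: Dict[str, int] = {}       # stand-in for the database
        self.cache: Dict[str, int] = {}    # stand-in for a cache
        self.pending: Deque[Tuple[str, int]] = deque()  # stand-in for an MQ

    def save(self, key: str, value: int) -> str:
        if self.switches.enabled("sync_db_write"):
            self.db[key] = value
            return "db"
        # Degraded path: acknowledge from cache, persist later.
        self.cache[key] = value
        self.pending.append((key, value))
        return "queued"

    def drain(self) -> int:
        """Consumer side: flush queued writes to the database."""
        flushed = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.db[key] = value
            flushed += 1
        return flushed


def render_row(user: str, switches: DegradationSwitches,
               load_avatar: Callable[[str], str]) -> Dict[str, str]:
    """Read degradation: skip the non-critical avatar lookup under stress."""
    row = {"user": user}
    if switches.enabled("avatar"):
        row["avatar"] = load_avatar(user)
    return row
```

Calling `switches.degrade("sync_db_write")` during a spike moves writes onto the queue, and `drain()` persists them once load drops. The trade-off is that the database briefly lags behind the cache, which is acceptable only for data that tolerates eventual consistency.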
Circuit Breaking – Preventing Cascading Failures
A circuit breaker isolates a failing downstream service to avoid cascading failures across the call chain. When the error rate or latency of a downstream service exceeds a configured threshold, the breaker opens: further calls fail fast or go to a fallback instead of reaching the failing service, while periodic health checks continue. Once the service recovers, the breaker closes and normal calls resume.
Key components
Algorithm : determines the conditions (error rate, latency, request volume) that trigger a break.
Fallback handling : provides alternative logic (e.g., default response, cached data) while the circuit is open.
Recovery : periodic health checks that close the circuit when the service stabilizes.
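The three components can be combined into a small state machine. The sketch below is not the Hystrix or Sentinel implementation. It assumes an error-rate trigger only, counts calls since the breaker last closed rather than over a sliding window, and uses a single probe request after a cool-down as the health check. The class name and parameters are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"

    def __init__(self, error_rate: float = 0.5, min_calls: int = 5,
                 open_seconds: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.error_rate = error_rate      # failure ratio that trips the breaker
        self.min_calls = min_calls        # minimum volume before judging
        self.open_seconds = open_seconds  # cool-down before a probe is allowed
        self._clock = clock
        self.state = self.CLOSED
        self._calls = 0
        self._failures = 0
        self._opened_at = 0.0

    def call(self, fn: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.state == self.OPEN:
            if self._clock() - self._opened_at < self.open_seconds:
                return fallback()          # short-circuit: do not call downstream
            self.state = self.HALF_OPEN    # cool-down over: let one probe through
        try:
            result = fn()
        except Exception:
            self._on_failure()
            return fallback()
        self._on_success()
        return result

    def _on_success(self) -> None:
        if self.state == self.HALF_OPEN:
            self._reset()                  # probe succeeded: close the circuit
        else:
            self._calls += 1

    def _on_failure(self) -> None:
        if self.state == self.HALF_OPEN:
            self._trip()                   # probe failed: open again
            return
        self._calls += 1
        self._failures += 1
        if (self._calls >= self.min_calls
                and self._failures / self._calls >= self.error_rate):
            self._trip()

    def _trip(self) -> None:
        self.state = self.OPEN
        self._opened_at = self._clock()
        self._calls = self._failures = 0

    def _reset(self) -> None:
        self.state = self.CLOSED
        self._calls = self._failures = 0
```

A caller wraps each downstream request, for example `breaker.call(fetch_profile, lambda: cached_profile)`, where both callables are placeholders. The `min_calls` guard keeps a single early failure from opening the circuit when traffic is low.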
Popular implementations include Netflix Hystrix and Alibaba Sentinel. Both expose configuration for error‑rate thresholds, request‑volume windows, and fallback methods.
References
Sentinel adaptive flow control: https://github.com/alibaba/Sentinel/wiki/系统自适应限流
Hystrix documentation: https://github.com/Netflix/Hystrix/wiki
dbaplus Community
