Mastering Rate Limiting and Circuit Breaking: Practical Guidelines with Sentinel
This article explains how to identify interfaces that need pre‑emptive rate limiting, allocate resources for isolation, apply circuit‑breaking wisely, calculate thresholds based on resource capacity, and leverage Sentinel's features to ensure stable backend performance.
1.1 Rate Limiting
Identify the interfaces that need limits in place before problems occur, rather than after an incident. Typical candidates:
Frequently problematic interfaces with large performance swings.
Slow interfaces that hold resources for extended periods.
Interfaces that consume many resources per request.
High‑volume interfaces that occupy a large share of total resources.
Resources that can become bottlenecks, e.g., serial processing or long transactions.
Interfaces with highly variable request volume, such as promotional APIs.
APIs exposed to external systems with unpredictable changes.
Business logic that can accumulate and affect other services.
Downstream‑dependent services with unpredictable performance, especially when relying on big‑data or public data services.
1.1.2 Resource Allocation Flow Control
For generic services or those with multiple upstreams, allocate resources to ensure isolation and prevent cascading failures.
Shared public components that expose interfaces to many callers, such as ES.
Services that serve multiple business units, where a surge from one caller must not set off a chain reaction for the others.
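One way to picture this allocation is to split a shared component's capacity into per-caller quotas, so that no single upstream can consume the whole budget. The sketch below is illustrative only: the caller names, the weights, and the 500 qps total are hypothetical, and proportional weighting is just one possible policy.

```python
def allocate_quotas(total_qps: int, weights: dict[str, int]) -> dict[str, int]:
    """Split total_qps across callers in proportion to their weights.

    Integer division rounds each share down, so the quotas never add up
    to more than total_qps.
    """
    if total_qps < 0 or any(w <= 0 for w in weights.values()):
        raise ValueError("total_qps must be >= 0 and every weight must be positive")
    weight_sum = sum(weights.values())
    return {caller: total_qps * w // weight_sum for caller, w in weights.items()}


# Hypothetical upstreams sharing one component.
quotas = allocate_quotas(500, {"orders": 5, "search-page": 3, "reports": 2})
print(quotas)
```

Each quota can then be enforced as a separate per-source limit, so a spike from one caller is rejected at its own ceiling instead of starving the others.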
1.2 Circuit Breaking
Circuit breaking is needed when there are logic errors, backlogs, or high load that can affect data accuracy or cause cascading failures.
1.2.1 Breaker Impacts
Potential data inaccuracy: each extra request may produce dirty data.
Logical errors that propagate incorrect data to other systems.
Slow interfaces causing chain reactions.
System already under high load with existing backlog.
1.2.2 Precautions
Avoid arbitrary breaking or throttling at any node without evaluating data impact and the possibility of compensation or recovery.
2.1 Threshold Setting
A threshold must satisfy two constraints:
Target performance expectations.
The resources the interface is allowed to use, not the system's total resources. Reserve headroom as well (e.g., core systems reserve three times capacity, keeping normal usage below 30%).
Threshold = Allocated resource capacity × Reserve multiplier
Set thresholds based on actual needs and allocatable resources, rather than simply adopting the peak figure a load test reports.
2.2 Basis for Setting
Load testing.
Historical monitoring observations.
Estimation: when load testing is unavailable, estimate average resource consumption per request and compute the request count that the allowed resource amount can sustain.
2.2.1 Capacity Estimation Calculation
Overall approach: estimate average consumption of each resource along the chain and divide total resources by this average.
Per‑second interface processing time = QPS × average response time: 175 qps × 32 ms = 5.6 s.
Per‑second processing time at 100% CPU utilization = current processing time / CPU usage: 5.6 s / 15% ≈ 37 s.
Threads needed at maximum throughput ≈ 37. The default thread pool size is 200, and 37 < 200, so the default pool is not the limit.
Ratio of CPU time to I/O (non‑CPU) time = CPU cores : processing time at 100% CPU = 2 : 37 ≈ 1:19.
Maximum achievable QPS on CPU = 37 s / 0.032 s ≈ 1,156 qps (about 1,100 qps, rounding down).
Assuming only 25 threads, throughput = 25 / 0.032 s ≈ 781 qps. Since 781 < 1,100, threads become the bottleneck before the CPU does.
Database connections (e.g., 10 connections, 20 ms per request): 10 / 0.020 s = 500 qps. Since 500 < 781, DB connections are an even tighter bottleneck.
Final maximum throughput = min(CPU, threads, connections) = 500 qps.
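The estimate above can be reproduced in a few lines. This is a minimal sketch using the example's figures (175 qps, 32 ms, 15% CPU, 25 threads, 10 connections at 20 ms); the function and parameter names are made up for illustration. Because it does not round intermediate values, its CPU ceiling comes out near 1,167 qps rather than the rounded 1,100 qps used in the text.

```python
def estimate_capacity(observed_qps, avg_rt_s, cpu_usage,
                      worker_threads, db_connections, db_time_s):
    """Return per-resource QPS ceilings and the name of the tightest one."""
    busy_seconds = observed_qps * avg_rt_s        # processing time per second today
    busy_at_full_cpu = busy_seconds / cpu_usage   # scaled up to 100% CPU
    limits = {
        "cpu": busy_at_full_cpu / avg_rt_s,       # equals observed_qps / cpu_usage
        "threads": worker_threads / avg_rt_s,
        "db_connections": db_connections / db_time_s,
    }
    bottleneck = min(limits, key=limits.get)
    return limits, bottleneck


limits, bottleneck = estimate_capacity(
    observed_qps=175, avg_rt_s=0.032, cpu_usage=0.15,
    worker_threads=25, db_connections=10, db_time_s=0.020,
)
for name, qps in limits.items():
    print(f"{name}: {qps:.0f} qps")
print(f"bottleneck: {bottleneck}")
```

The overall ceiling is whichever resource runs out first, which is why the result is the minimum of the three rather than an average.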
Reverse‑calculate thresholds from allowed resources.
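Working backwards means starting from the share of each resource the interface is allowed to hold and asking how many requests per second that share can sustain. The sketch below assumes a hypothetical allowance of 8 threads and 3 database connections, and reuses the 32 ms and 20 ms per-request times from the estimate above; the function name is also an assumption.

```python
def qps_threshold(allowed_units: int, ms_per_request: int) -> int:
    """QPS that allowed_units can sustain if each request holds one unit
    for ms_per_request milliseconds (rounded down)."""
    if allowed_units < 0 or ms_per_request <= 0:
        raise ValueError("allowed_units must be >= 0 and ms_per_request > 0")
    return allowed_units * 1000 // ms_per_request


# Hypothetical allowance: (units granted, milliseconds each request holds a unit).
budget = {"threads": (8, 32), "db_connections": (3, 20)}
per_resource = {name: qps_threshold(units, ms) for name, (units, ms) in budget.items()}
threshold = min(per_resource.values())   # the tightest resource sets the limit
print(per_resource, threshold)
```

Under these assumed numbers the connection allowance is the tighter constraint, so it determines the threshold configured for the interface.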
3. Sentinel Supported Features
Source‑based flow control: set different rules for different sources (IP, service name, etc.).
Hotspot parameter limiting: count and limit based on specific parameter values.
System performance metrics: CPU, RT, QPS (currently limited functionality).
Interface grading and batch downgrade: prioritize core functions during failures.
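Source‑based and hotspot‑parameter limiting share one idea: keep a separate counter per key (a caller, or a parameter value) instead of one global counter. The sketch below illustrates that idea with a fixed one‑second window in plain Python. It is not Sentinel's API or its algorithm; the class name, method names, keys, and limits are all hypothetical.

```python
import time


class KeyedWindowLimiter:
    """Fixed-window counter that limits each key independently."""

    def __init__(self, default_limit, per_key_limits=None, window_s=1.0,
                 clock=time.monotonic):
        self.default_limit = default_limit
        self.per_key_limits = dict(per_key_limits or {})
        self.window_s = window_s
        self.clock = clock          # injectable so the example is deterministic
        self._state = {}            # key -> (window index, requests admitted)

    def try_acquire(self, key):
        window = int(self.clock() // self.window_s)
        seen_window, count = self._state.get(key, (window, 0))
        if seen_window != window:
            count = 0               # a new window started: reset this key
        limit = self.per_key_limits.get(key, self.default_limit)
        if count >= limit:
            return False            # only this key is rejected
        self._state[key] = (window, count + 1)
        return True


fake_now = [0.0]                    # hypothetical clock under our control
limiter = KeyedWindowLimiter(default_limit=2,
                             per_key_limits={"sku-1001": 1},
                             clock=lambda: fake_now[0])
results = [limiter.try_acquire("sku-1001"),
           limiter.try_acquire("sku-1001"),
           limiter.try_acquire("sku-2002")]
print(results)                      # [True, False, True]

fake_now[0] = 1.0                   # move into the next window
after_reset = limiter.try_acquire("sku-1001")
print(after_reset)                  # True
```

The hot key exhausts its own limit without affecting other keys, which is the isolation property both features aim for.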
4. Recommendations
Perform timely traffic estimation, scaling, and optimization to maintain normal operation and avoid relying on circuit breaking, which should be a fallback for abnormal situations.
Architect
