Designing Effective Rate Limiting and Circuit Breaking for Microservice APIs
This article examines the motivations, resource granularity, rule definition, and calculation logic behind implementing rate limiting and circuit breaking in microservice architectures, using examples like Sentinel and Hystrix, and outlines a step-by-step design for integrating these controls with API gateways.
Problem and Background
When a single API service receives massive concurrent calls or long‑running requests, it can exhaust thread pools or memory, or crash the JVM with an out‑of‑memory error. The result is a cascading failure (the snowball effect), in which one faulty service degrades the whole microservice system. Rate limiting and circuit breaking are used to isolate such failures.
Basic Concepts
Rate limiting caps the number of requests or concurrent threads a service will handle; excess requests either wait in a queue until resources become available or are rejected outright. Circuit breaking disables an entire service when a rule is triggered, making the service unavailable to all callers until it recovers.
Rate limiting usually targets a consumer‑API pair, while circuit breaking targets the whole API provider.
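The contrast between the two mechanisms can be sketched in a few lines. This is a minimal illustration rather than how Sentinel or Hystrix implement them; the class names `ConcurrencyLimiter` and `CircuitBreaker` and the `wait_seconds` parameter are hypothetical.

```python
import threading


class ConcurrencyLimiter:
    """Caps in-flight calls; callers beyond the cap wait for a free slot."""

    def __init__(self, max_concurrent: int):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, fn, wait_seconds: float = 1.0):
        # Wait up to wait_seconds for a slot; give up if none frees in time.
        if not self._slots.acquire(timeout=wait_seconds):
            raise TimeoutError("rate limited: no free slot")
        try:
            return fn()
        finally:
            self._slots.release()


class CircuitBreaker:
    """While open, every caller is rejected without reaching the service."""

    def __init__(self):
        self.open = False

    def call(self, fn):
        if self.open:
            raise RuntimeError("circuit open: service unavailable")
        return fn()
```

The limiter only slows or sheds the excess load, so the service stays reachable; the breaker cuts off all callers at once.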
Resource Granularity
Fine‑grained: consumer + API + provider
Circuit‑break layer: API + provider
Circuit‑break scope: provider (all services offered by the provider)
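One way to represent these levels is as keys of decreasing specificity, so that a single request can be counted against every level it belongs to. The sketch below assumes a hypothetical `ResourceKey` tuple and `keys_for_request` helper; neither name comes from a particular framework.

```python
from typing import List, NamedTuple, Optional


class ResourceKey(NamedTuple):
    provider: str
    api: Optional[str] = None
    consumer: Optional[str] = None


def keys_for_request(consumer: str, api: str, provider: str) -> List[ResourceKey]:
    """Return the keys a request counts toward, from finest to coarsest."""
    return [
        ResourceKey(provider, api, consumer),  # consumer + API + provider
        ResourceKey(provider, api),            # API + provider
        ResourceKey(provider),                 # provider only
    ]


keys = keys_for_request("CRM", "getProductInfo", "ERP")
```

Because the keys are hashable tuples, they can be used directly as dictionary keys for per-resource counters.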
Rule Definition
Rules are expressed as thresholds on metrics such as call count, latency, data volume, failure count, or success rate. A rule is satisfied when a metric exceeds (or falls below) its configured threshold. Composite rules combine multiple conditions with logical AND/OR. Typical rules look like this:
Reject all traffic to a user‑query API if calls > 10 000 within 5 minutes.
Reject CRM access to a product‑query API if average latency > 30 s within 10 minutes.
Trigger full circuit break for an order‑update API if failure rate > 1 % within the configured window.
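Rules of this shape map naturally onto a small data structure. The following is a sketch under assumptions: the `Condition` and `Rule` classes and the metric keys (`call_count`, `avg_latency_s`, `failure_rate`) are hypothetical names chosen for illustration, and the composite rule at the end is an invented example of OR-combination.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Sequence


@dataclass(frozen=True)
class Condition:
    metric: str                               # hypothetical metric key
    compare: Callable[[float, float], bool]   # operator.gt or operator.lt
    threshold: float

    def holds(self, metrics: Dict[str, float]) -> bool:
        return self.compare(metrics[self.metric], self.threshold)


@dataclass(frozen=True)
class Rule:
    name: str
    window_seconds: int
    conditions: Sequence[Condition]
    combine: Callable[[Iterable[bool]], bool] = all  # all = AND, any = OR

    def is_satisfied(self, metrics: Dict[str, float]) -> bool:
        return self.combine(c.holds(metrics) for c in self.conditions)


# More than 10 000 calls within 5 minutes.
user_query = Rule("user-query call cap", 5 * 60,
                  [Condition("call_count", operator.gt, 10_000)])

# Average latency above 30 s within 10 minutes.
crm_product = Rule("CRM product-query latency", 10 * 60,
                   [Condition("avg_latency_s", operator.gt, 30.0)])

# Hypothetical composite rule: trips when either condition holds (OR).
composite = Rule("latency OR failures", 10 * 60,
                 [Condition("avg_latency_s", operator.gt, 30.0),
                  Condition("failure_rate", operator.gt, 0.01)],
                 combine=any)
```

A rule that should fire when a metric falls below a threshold, such as a minimum success rate, would pass `operator.lt` as the comparison.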
Computation Logic
The computation is performed in two stages to avoid storing raw request data for the whole window.
Match each incoming request to the configured resource granularity and place it in a temporary buffer.
Aggregate the buffered data over a minimal interval (e.g., 10 seconds) – this is the first aggregation.
Push the aggregated snapshot into a sliding‑window array that stores the last N intervals.
When a rule’s time window is reached, perform a second aggregation over the values in the sliding window.
Evaluate the rule; if satisfied, trigger rate limiting or circuit breaking.
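The steps above can be sketched as a buffer plus a fixed-length window of per-interval snapshots. This is an illustrative sketch, not a reference implementation: `Snapshot`, `TwoStageWindow`, and the metric names are hypothetical, and `roll()` is assumed to be called by an external timer once per interval.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict


@dataclass
class Snapshot:
    calls: int = 0
    failures: int = 0
    latency_sum: float = 0.0


class TwoStageWindow:
    def __init__(self, window_seconds: int, interval_seconds: int = 10):
        self._buffer = []  # raw (latency_s, failed) pairs, current interval only
        # The deque drops the oldest snapshot once N intervals are stored.
        self._window = deque(maxlen=window_seconds // interval_seconds)

    def record(self, latency_s: float, failed: bool) -> None:
        self._buffer.append((latency_s, failed))

    def roll(self) -> None:
        """First aggregation: collapse the buffer into one snapshot."""
        self._window.append(Snapshot(
            calls=len(self._buffer),
            failures=sum(1 for _, failed in self._buffer if failed),
            latency_sum=sum(latency for latency, _ in self._buffer),
        ))
        self._buffer = []

    def metrics(self) -> Dict[str, float]:
        """Second aggregation: combine the snapshots in the window."""
        calls = sum(s.calls for s in self._window)
        failures = sum(s.failures for s in self._window)
        latency_sum = sum(s.latency_sum for s in self._window)
        return {
            "call_count": calls,
            "failure_rate": failures / calls if calls else 0.0,
            "avg_latency_s": latency_sum / calls if calls else 0.0,
        }
```

Raw requests live only until the next `roll()`, so memory use is bounded by one interval of traffic plus N small snapshots, regardless of how long the rule's window is.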
Implementation Flow
Each independent rule (different granularity or metric) maintains its own buffer and sliding‑window storage.
Rule 1: Limit getCustomer calls – reject if > 10 k calls in 10 minutes.
Rule 2: Guard getProductInfo calls – circuit break if error rate > 1 % in 5 minutes.
Rule 3: ERP services – circuit break if average latency > 30 s in 1 minute.
Every 10 seconds each rule’s buffer is aggregated, the result is pushed to its sliding window, and the window is re‑evaluated to decide whether to enforce limiting or breaking.
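With a 10-second interval, the three rules need windows of 600 / 10 = 60, 300 / 10 = 30, and 60 / 10 = 6 slots respectively. The sketch below shows one way to keep per-rule state and run the 10-second tick. It is illustrative only: `RuleState`, `tick_all`, and the encoding of samples (1 per call for counts, 1.0 or 0.0 for errors, seconds for latency) are assumptions made for this example.

```python
from collections import deque

INTERVAL_SECONDS = 10


class RuleState:
    def __init__(self, name, window_seconds, mode, threshold):
        self.name = name
        self.mode = mode            # "sum" or "mean" of the recorded samples
        self.threshold = threshold
        self.buffer = []            # samples for the current interval
        self.window = deque(maxlen=window_seconds // INTERVAL_SECONDS)
        self.triggered = False

    def record(self, sample):
        self.buffer.append(sample)

    def tick(self):
        # First aggregation: reduce the buffer to (count, subtotal).
        self.window.append((len(self.buffer), sum(self.buffer)))
        self.buffer = []
        # Second aggregation: combine every interval still in the window.
        n = sum(count for count, _ in self.window)
        total = sum(subtotal for _, subtotal in self.window)
        value = total if self.mode == "sum" else (total / n if n else 0.0)
        self.triggered = value > self.threshold
        return self.triggered


rules = {
    "getCustomer": RuleState("getCustomer calls", 600, "sum", 10_000),
    "getProductInfo": RuleState("getProductInfo error rate", 300, "mean", 0.01),
    "ERP": RuleState("ERP average latency", 60, "mean", 30.0),
}


def tick_all():
    """Body of the task scheduled every INTERVAL_SECONDS."""
    return {key: state.tick() for key, state in rules.items()}
```

Each rule owns its buffer and window, so adding or removing a rule never affects the counters of another.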
Decoupling from the API Gateway
Rate limiting and circuit breaking are implemented as independent interceptors. The interceptor examines the incoming request, checks the current rule state, and either rejects the request or lets it pass. When a service is circuit‑broken, a recovery timer (e.g., 5–10 minutes) is configured; a scheduled task clears the blocked state once recovery conditions are met. This design keeps the API‑gateway lightweight while delegating protection decisions to the dedicated limiter/breaker component.
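A minimal sketch of such an interceptor follows. Several details are assumptions made for illustration: the `ProtectionInterceptor` class, the `forward` callable standing in for the gateway's downstream call, the 503 status returned on rejection, and the use of elapsed time as the only recovery condition. The clock is injectable so the recovery logic can be exercised without waiting.

```python
import time


class ProtectionInterceptor:
    def __init__(self, recovery_seconds=300, clock=time.monotonic):
        self._recovery_seconds = recovery_seconds
        self._clock = clock
        self._blocked_until = {}  # resource key -> earliest recovery time

    def block(self, resource):
        """Called by the rule evaluator when a rule is satisfied."""
        self._blocked_until[resource] = self._clock() + self._recovery_seconds

    def clear_expired(self):
        """Body of the scheduled recovery task."""
        now = self._clock()
        expired = [r for r, t in self._blocked_until.items() if t <= now]
        for resource in expired:
            del self._blocked_until[resource]

    def intercept(self, resource, forward):
        # Reject while the resource is blocked; otherwise pass the call on.
        if resource in self._blocked_until:
            return 503, "rejected by limiter/breaker"
        return forward()
```

The gateway only calls `intercept`; rule evaluation and recovery run on their own schedule and communicate with it solely through the blocked-state table.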
IT Architects Alliance
