Design and Implementation of Rate Limiting and Circuit Breaking in Microservice Architecture
This article explains the motivations, concepts, resource granularity, rule definitions, and two‑stage sliding‑window computation needed to design and implement effective rate limiting and circuit breaking mechanisms for microservice APIs and API gateways, ensuring isolated failures do not cascade across services.
Problem and Background
Describes why rate limiting and circuit breaking are essential in microservice architectures, illustrating scenarios where a single API service can exhaust threads or memory, or trigger cascading failures across dependent services.
Basic Concepts
Defines rate limiting as queuing requests with a fixed thread‑pool size and circuit breaking as rendering an entire service unavailable once configured thresholds are breached.
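The two definitions can be contrasted in a minimal sketch. Everything named here is hypothetical: GuardedService, pool_size, and failure_threshold are illustrative, and counting handler failures is only one possible threshold, chosen as an assumption to keep the example short. The fixed-size pool queues excess requests (rate limiting), and the open flag rejects every call once the threshold is breached (circuit breaking).

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ServiceUnavailable(Exception):
    """Raised while the breaker is open and the service rejects all calls."""


class GuardedService:
    """Hypothetical wrapper: a fixed pool limits concurrency, a flag breaks the circuit."""

    def __init__(self, handler, pool_size=2, failure_threshold=3):
        self._handler = handler
        # Rate limiting: at most pool_size requests run at once, the rest wait in the pool's queue.
        self._pool = ThreadPoolExecutor(max_workers=pool_size)
        # Assumption: the breaker threshold is a simple count of failed calls.
        self._failure_threshold = failure_threshold
        self._failures = 0
        self._lock = threading.Lock()
        self.open = False

    def submit(self, request):
        with self._lock:
            if self.open:
                # Circuit breaking: the whole service is unavailable.
                raise ServiceUnavailable("circuit open")
        return self._pool.submit(self._call, request)

    def _call(self, request):
        try:
            return self._handler(request)
        except Exception:
            with self._lock:
                self._failures += 1
                if self._failures >= self._failure_threshold:
                    self.open = True
            raise
```

The key difference is the blast radius: the pool only delays individual requests, while an open breaker turns away all of them.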
Resource Granularity
Introduces three granularity levels for control: API consumer + API service + API provider, API service + API provider, and API provider‑wide, each requiring distinct rule configuration.
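One way to picture the three levels is as lookup keys of decreasing specificity. The sketch below is an assumption about representation, not the article's data model: ResourceKey, candidate_keys, and match_rules are hypothetical names, and rules are stored in a plain dictionary.

```python
from typing import NamedTuple, Optional


class ResourceKey(NamedTuple):
    """Hypothetical key; unset fields widen the scope of the rule."""
    provider: str
    service: Optional[str] = None
    consumer: Optional[str] = None


def candidate_keys(consumer, service, provider):
    """Return the keys a request can match, from most to least specific."""
    return [
        ResourceKey(provider, service, consumer),  # consumer + service + provider
        ResourceKey(provider, service),            # service + provider
        ResourceKey(provider),                     # provider-wide
    ]


def match_rules(rules, consumer, service, provider):
    """Collect every configured rule that applies to this request."""
    return [
        (key, rules[key])
        for key in candidate_keys(consumer, service, provider)
        if key in rules
    ]
```

A request from consumer "mobile-app" to service "invoice" on provider "billing" would then be checked against up to three rules, one per level.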
Rule Definition
Outlines key dimensions—service latency, request count per time unit, and data volume—and shows how simple threshold rules (e.g., count > threshold) and composite rules (AND/OR) can be applied across the different granularity levels.
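Threshold and composite rules can be sketched as small predicate functions over a metrics dictionary. The metric names (avg_latency_ms, request_count, bytes) and the helpers threshold, all_of, and any_of are hypothetical, chosen only to mirror the three dimensions and the AND/OR composition described above.

```python
from typing import Callable, Dict

Metrics = Dict[str, float]
Rule = Callable[[Metrics], bool]


def threshold(metric: str, limit: float) -> Rule:
    """Simple rule: satisfied when the metric is strictly above the limit."""
    return lambda m: m.get(metric, 0.0) > limit


def all_of(*rules: Rule) -> Rule:
    """Composite AND: every sub-rule must be satisfied."""
    return lambda m: all(r(m) for r in rules)


def any_of(*rules: Rule) -> Rule:
    """Composite OR: at least one sub-rule must be satisfied."""
    return lambda m: any(r(m) for r in rules)


# Example: trip on high latency, OR on high request count AND high data volume.
rule = any_of(
    threshold("avg_latency_ms", 500),
    all_of(threshold("request_count", 1000), threshold("bytes", 10_000_000)),
)
```

Because each rule is just a function of the aggregated metrics, the same rule object can be attached to any of the granularity levels.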
Computation Logic
Explains a two‑stage aggregation process: first, collect raw instance data in a minimal 10‑second bucket; second, push aggregated results into a sliding‑window array and perform a secondary aggregation over the configured time window (e.g., 5 minutes) to evaluate rule satisfaction.
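The two stages can be sketched as follows, assuming raw samples arrive as (latency_ms, bytes) pairs. Bucket, aggregate_bucket, and SlidingWindow are hypothetical names; the 10-second bucket and 5-minute window come from the description above, while the choice of aggregated fields is an assumption.

```python
from collections import deque
from dataclasses import dataclass

BUCKET_SECONDS = 10  # minimal aggregation unit from the text


@dataclass
class Bucket:
    count: int = 0
    total_latency_ms: float = 0.0
    total_bytes: int = 0


def aggregate_bucket(samples):
    """Stage 1: fold the raw (latency_ms, bytes) samples of one 10 s bucket."""
    bucket = Bucket()
    for latency_ms, size in samples:
        bucket.count += 1
        bucket.total_latency_ms += latency_ms
        bucket.total_bytes += size
    return bucket


class SlidingWindow:
    """Fixed-length array of buckets; the oldest bucket drops out on push."""

    def __init__(self, window_seconds=300):
        self._buckets = deque(maxlen=window_seconds // BUCKET_SECONDS)

    def push(self, bucket):
        self._buckets.append(bucket)

    def summarize(self):
        """Stage 2: aggregate all buckets currently inside the window."""
        count = sum(b.count for b in self._buckets)
        latency = sum(b.total_latency_ms for b in self._buckets)
        size = sum(b.total_bytes for b in self._buckets)
        return {
            "request_count": count,
            "avg_latency_ms": latency / count if count else 0.0,
            "bytes": size,
        }
```

Storing sums rather than averages in each bucket keeps the second stage exact: the window average is recomputed from totals instead of averaging averages.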
Overall Implementation Flow
Match service instances to the configured resource granularity and store the matched data in a temporary buffer.
Perform first‑level aggregation.
Push aggregated data into the sliding‑window.
Execute second‑level aggregation based on rule definitions.
Determine whether to trigger rate limiting or circuit breaking and act accordingly.
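The five steps above can be read as one cycle that runs every 10 seconds. The sketch below is a compressed illustration under several assumptions: run_cycle is a hypothetical function, raw samples are (resource_key, latency_ms) pairs, each rule returns an action string, and only request count and average latency are tracked.

```python
from collections import defaultdict, deque

WINDOW_BUCKETS = 30  # a 5-minute window of 10-second buckets


def run_cycle(raw_samples, windows, rules):
    """Run one cycle and return {resource_key: action} for every configured rule."""
    # Step 1: keep only samples that match a configured resource, in a temporary buffer.
    buffer = defaultdict(list)
    for key, latency_ms in raw_samples:
        if key in rules:
            buffer[key].append(latency_ms)

    decisions = {}
    for key, rule in rules.items():
        latencies = buffer.get(key, [])
        # Step 2: first-level aggregation into a (count, total latency) bucket.
        bucket = (len(latencies), sum(latencies))
        # Step 3: push the bucket into this resource's sliding window.
        window = windows.setdefault(key, deque(maxlen=WINDOW_BUCKETS))
        window.append(bucket)
        # Step 4: second-level aggregation over the whole window.
        count = sum(c for c, _ in window)
        total = sum(t for _, t in window)
        metrics = {
            "request_count": count,
            "avg_latency_ms": total / count if count else 0.0,
        }
        # Step 5: let the rule decide which action to take.
        decisions[key] = rule(metrics)
    return decisions


def orders_rule(metrics):
    """Hypothetical rule: break on slow responses, limit on high volume."""
    if metrics["avg_latency_ms"] > 500:
        return "circuit_break"
    if metrics["request_count"] > 3:
        return "rate_limit"
    return "allow"
```

The windows dictionary is passed in and mutated so that state survives between cycles, while the buffer is rebuilt each time.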
Decoupling from API Gateway
Shows that the limiter can be an independent interceptor that checks the current rule state before allowing or rejecting a request, with optional recovery timers to automatically restore service availability after a cooldown period.
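A standalone interceptor with a recovery timer might look like the sketch below. LimiterInterceptor, trip, allow, and intercept are hypothetical names, the 30-second cooldown is an arbitrary default, and the 503 response is an assumption about how a rejection would be reported. The clock is injectable so the cooldown can be exercised without waiting.

```python
import time


class LimiterInterceptor:
    """Hypothetical interceptor: consults rule state before a request proceeds."""

    def __init__(self, cooldown_seconds=30.0, clock=time.monotonic):
        self._cooldown = cooldown_seconds
        self._clock = clock
        self._blocked_until = {}  # resource key -> time at which it recovers

    def trip(self, resource_key):
        """Called by the rule engine when a rule is satisfied for this resource."""
        self._blocked_until[resource_key] = self._clock() + self._cooldown

    def allow(self, resource_key):
        deadline = self._blocked_until.get(resource_key)
        if deadline is None:
            return True
        if self._clock() >= deadline:
            # Cooldown elapsed: restore availability automatically.
            del self._blocked_until[resource_key]
            return True
        return False

    def intercept(self, resource_key, handler, request):
        """Reject while blocked, otherwise pass the request to the handler."""
        if not self.allow(resource_key):
            return 503, "service temporarily unavailable"
        return 200, handler(request)
```

Because the interceptor only reads and writes its own state table, it can sit in front of any handler and does not need to live inside the gateway itself.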
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.