
Design and Implementation of Rate Limiting and Circuit Breaking in Microservice Architecture

This article explains the motivations, concepts, resource granularity, rule definitions, and two‑stage sliding‑window computation needed to design and implement effective rate limiting and circuit breaking mechanisms for microservice APIs and API gateways, ensuring isolated failures do not cascade across services.

Architecture Digest

Problem and Background

Describes why rate limiting and circuit breaking are essential in microservice architectures, illustrating scenarios where a single API service can exhaust threads or memory, or trigger cascading failures across dependent services.

Basic Concepts

Defines rate limiting as queuing requests behind a thread pool of fixed size, and circuit breaking as making an entire service unavailable once configured thresholds are breached.
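The rate-limiting half of that definition can be sketched as a fixed set of worker threads fed by a bounded queue. This is a minimal illustration under assumptions, not the article's implementation: `BoundedWorkerPool`, `workers`, and `max_queued` are hypothetical names, and rejecting a request when the queue is full is one possible policy among several.

```python
import queue
import threading


class BoundedWorkerPool:
    """Hypothetical helper: a fixed number of workers plus a bounded waiting queue."""

    def __init__(self, workers: int, max_queued: int):
        self._tasks = queue.Queue(maxsize=max_queued)
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            func, args = self._tasks.get()
            try:
                func(*args)
            except Exception:
                pass  # a real pool would log or report the failure
            finally:
                self._tasks.task_done()

    def submit(self, func, *args) -> bool:
        """Queue the call; return False (request rejected) when the queue is full."""
        try:
            self._tasks.put_nowait((func, args))
            return True
        except queue.Full:
            return False

    def join(self):
        """Block until every accepted call has finished."""
        self._tasks.join()
```

Because the number of workers never grows, a slow downstream service can occupy at most `workers` threads, and callers beyond `max_queued` fail fast instead of piling up.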

Resource Granularity

Introduces three granularity levels for control: API consumer + API service + API provider, API service + API provider, and API provider‑wide, each requiring distinct rule configuration.
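One way to represent the three levels is a single key type whose optional fields are left empty for the coarser levels. The sketch below is an assumption for illustration: `ResourceKey`, `candidate_keys`, and the sample names are hypothetical and do not come from the article.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ResourceKey:
    """Hypothetical key: unset fields mean 'any' at that level."""
    provider: str
    service: Optional[str] = None
    consumer: Optional[str] = None

    def __post_init__(self):
        if self.consumer is not None and self.service is None:
            raise ValueError("a consumer-level key also needs a service")


def candidate_keys(consumer: str, service: str, provider: str) -> List[ResourceKey]:
    """Keys that one call maps to, ordered from most to least specific."""
    return [
        ResourceKey(provider, service, consumer),  # consumer + service + provider
        ResourceKey(provider, service),            # service + provider
        ResourceKey(provider),                     # provider-wide
    ]
```

Since the dataclass is frozen, the keys are hashable and can index per-resource rule tables or metric buffers directly.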

Rule Definition

Outlines key dimensions—service latency, request count per time unit, and data volume—and shows how simple threshold rules (e.g., count > threshold) and composite rules (AND/OR) can be applied across the different granularity levels.
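Simple and composite rules of this kind can be modelled as small objects that all answer the same question: does this set of metrics match? The following is a sketch under assumptions; the class names (`Threshold`, `AllOf`, `AnyOf`) and the metric names are hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Mapping, Tuple

_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


@dataclass(frozen=True)
class Threshold:
    """Simple rule: compare one metric against a fixed limit."""
    metric: str
    op: str
    limit: float

    def matches(self, metrics: Mapping[str, float]) -> bool:
        return _OPS[self.op](metrics[self.metric], self.limit)


@dataclass(frozen=True)
class AllOf:
    """Composite rule: every child rule must match (AND)."""
    rules: Tuple

    def matches(self, metrics: Mapping[str, float]) -> bool:
        return all(rule.matches(metrics) for rule in self.rules)


@dataclass(frozen=True)
class AnyOf:
    """Composite rule: at least one child rule must match (OR)."""
    rules: Tuple

    def matches(self, metrics: Mapping[str, float]) -> bool:
        return any(rule.matches(metrics) for rule in self.rules)
```

A rule such as "data volume above 1 MB, or latency above 500 ms together with more than 1000 requests" then becomes `AnyOf((Threshold("bytes", ">", 1_000_000), AllOf((Threshold("avg_latency_ms", ">", 500), Threshold("request_count", ">", 1000)))))`, and the same rule object can be attached to a key at any granularity level.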

Computation Logic

Explains a two‑stage aggregation process: first, collect raw instance data in a minimal 10‑second bucket; second, push aggregated results into a sliding‑window array and perform a secondary aggregation over the configured time window (e.g., 5 minutes) to evaluate rule satisfaction.
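The two stages can be sketched for a single resource as follows. This is an illustrative reading of the description, not the article's code: `Bucket`, `aggregate_bucket`, and `SlidingWindow` are hypothetical names, and a bounded `deque` stands in for the sliding-window array.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

BUCKET_SECONDS = 10  # minimal bucket size mentioned in the text


@dataclass(frozen=True)
class Bucket:
    count: int
    total_latency_ms: float
    total_bytes: int


def aggregate_bucket(calls: Iterable[Tuple[float, int]]) -> Bucket:
    """Stage 1: reduce the raw (latency_ms, bytes) records of one 10-second bucket."""
    calls = list(calls)
    return Bucket(
        count=len(calls),
        total_latency_ms=sum(latency for latency, _ in calls),
        total_bytes=sum(size for _, size in calls),
    )


class SlidingWindow:
    """Keeps only the most recent buckets that fit the configured time window."""

    def __init__(self, window_seconds: int):
        if window_seconds <= 0 or window_seconds % BUCKET_SECONDS:
            raise ValueError("window must be a positive multiple of the bucket size")
        self._buckets = deque(maxlen=window_seconds // BUCKET_SECONDS)

    def push(self, bucket: Bucket) -> None:
        self._buckets.append(bucket)  # the oldest bucket drops out when full

    def summary(self) -> Dict[str, float]:
        """Stage 2: aggregate across all buckets currently in the window."""
        count = sum(b.count for b in self._buckets)
        latency = sum(b.total_latency_ms for b in self._buckets)
        return {
            "request_count": count,
            "avg_latency_ms": latency / count if count else 0.0,
            "bytes": sum(b.total_bytes for b in self._buckets),
        }
```

A 5-minute window would be `SlidingWindow(300)`, which holds 30 buckets. Storing totals rather than averages in each bucket keeps the second-level average correct when buckets carry different request counts.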

Overall Implementation Flow

Match service instances to configured resource granularity and store in a temporary buffer.

Perform first‑level aggregation.

Push aggregated data into the sliding‑window.

Execute second‑level aggregation based on rule definitions.

Determine whether to trigger rate limiting or circuit breaking and act accordingly.
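The five steps above can be tied together in one function that runs once per 10-second bucket across all resources. The sketch is deliberately reduced and rests on assumptions: `run_tick` is a hypothetical name, records are `(resource_key, latency_ms)` pairs, and the only rule checked is a request-count threshold.

```python
from collections import defaultdict, deque


def run_tick(records, windows, threshold, window_buckets=30):
    """Process one bucket of records; return the resources that breach the threshold.

    records: iterable of (resource_key, latency_ms) seen in the last bucket.
    windows: dict of resource_key -> deque of (count, total_latency); updated in place.
    window_buckets: buckets per window (30 buckets of 10 s is 5 minutes).
    """
    # Step 1: match records to resources and hold them in a temporary buffer.
    buffer = defaultdict(list)
    for key, latency in records:
        buffer[key].append(latency)

    # Steps 2 and 3: first-level aggregation, then push into each sliding window.
    # Known resources with no traffic still receive an empty bucket so old data ages out.
    for key in set(buffer) | set(windows):
        window = windows.setdefault(key, deque(maxlen=window_buckets))
        latencies = buffer.get(key, [])
        window.append((len(latencies), sum(latencies)))

    # Steps 4 and 5: second-level aggregation and the trigger decision.
    tripped = set()
    for key, window in windows.items():
        if sum(count for count, _ in window) > threshold:
            tripped.add(key)
    return tripped
```

The returned set is what a limiter would act on, for example by marking those resources as limited or broken until they recover.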

Decoupling from API Gateway

Shows that the limiter can be an independent interceptor that checks the current rule state before allowing or rejecting a request, with optional recovery timers to automatically restore service availability after a cooldown period.
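A gateway-independent interceptor of this kind needs only a shared state object and a check in front of the handler. The sketch below assumes hypothetical names (`RuleState`, `trip`, `intercept`) and models the recovery timer as a deadline compared against a clock, rather than a background timer thread.

```python
import time


class RuleState:
    """Hypothetical shared state: written by the rule evaluator, read by the interceptor."""

    def __init__(self, clock=time.monotonic):
        self._blocked_until = {}
        self._clock = clock  # injectable so the cooldown can be tested without sleeping

    def trip(self, resource, cooldown_seconds):
        """Mark a resource as unavailable until the cooldown has elapsed."""
        self._blocked_until[resource] = self._clock() + cooldown_seconds

    def is_blocked(self, resource):
        deadline = self._blocked_until.get(resource)
        if deadline is None:
            return False
        if self._clock() >= deadline:
            del self._blocked_until[resource]  # cooldown over: restore availability
            return False
        return True


def intercept(state, resource, handler, request):
    """Reject the request if the resource is blocked, otherwise call the handler."""
    if state.is_blocked(resource):
        return 503, "service temporarily unavailable"
    return 200, handler(request)
```

Because `intercept` depends only on `RuleState`, the same check could sit in an API gateway filter, a client-side wrapper, or a service-side middleware without changing the rule evaluation.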

Tags: microservices, API Gateway, Rate Limiting, Sliding Window, circuit breaking, resource granularity