Mastering System Resilience: Rate Limiting, Circuit Breaking, and Degradation
To keep systems highly available under sudden traffic spikes, developers rely on three core strategies: rate limiting, circuit breaking, and service degradation. Rate limiting controls how many requests are admitted, circuit breaking isolates a failing dependency, and degradation gracefully reduces functionality so that core services stay stable. This article explains each with a real-world analogy and the common algorithmic approaches.
Ensuring high availability requires many measures, such as using Redis caching to reduce database load while handling cache penetration, cache avalanche, and cache breakdown. When traffic spikes suddenly, three techniques from distributed architecture become essential: rate limiting, circuit breaking, and degradation.
Rate Limiting
Rate limiting is simple to understand: the Forbidden City sells only 80,000 tickets per day, and visitors beyond that number are turned away to avoid overcrowding and safety hazards. In software architecture, rate limiting works the same way: the system admits incoming requests up to a set threshold and rejects the rest.
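The ticket quota above maps most directly onto a fixed-window counter. The following is a minimal sketch under stated assumptions, not a production limiter: the class name `FixedWindowLimiter`, its parameters, and the injectable `clock` are all hypothetical names chosen for illustration, and the sketch is single-process and not thread-safe.

```python
import time


class FixedWindowLimiter:
    """Admit at most `limit` requests per window of `window_seconds`."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit              # e.g. 80,000 tickets
        self.window = window_seconds    # e.g. one day
        self.clock = clock              # injectable so tests can control time
        self.window_start = clock()
        self.count = 0

    def allow(self):
        """Return True if the request is admitted, False if it is rejected."""
        now = self.clock()
        if now - self.window_start >= self.window:
            # The previous window has ended: start a new one from now.
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

A known weakness of this approach is that a burst at the end of one window followed by a burst at the start of the next can briefly admit up to twice the limit, which is one reason sliding-window and bucket algorithms exist.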
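Common algorithms include the fixed-window counter, sliding window, leaky bucket, and token bucket. The token bucket is closest to the ticket analogy: each token is a permit to access the service, tokens are replenished at a fixed rate up to a maximum, and a request that cannot obtain a token is rejected.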
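To make the token bucket concrete, here is a small sketch. It assumes a single process, lazy refill on each call instead of a background timer, and hypothetical names (`TokenBucket`, `rate`, `capacity`, `clock`) that do not come from any particular library.

```python
import time


class TokenBucket:
    """Token-bucket limiter: tokens refill at a fixed rate up to a cap."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum tokens, i.e. the burst size
        self.clock = clock              # injectable so tests can control time
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self, cost=1):
        """Consume `cost` tokens and return True, or return False if short."""
        now = self.clock()
        # Refill for the elapsed time, never exceeding the capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `rate=2` and `capacity=3`, for example, a caller can burst three requests at once and is then held to roughly two requests per second. Unlike a leaky bucket, which smooths output to a constant rate, the token bucket tolerates short bursts up to its capacity.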
Rate limiting can also be applied per system or business flow; for example, when core system A is overloaded, calls from less critical system C can be limited while preserving access for important system B.
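The per-caller idea can be sketched as two quota tables, one for normal operation and one for overload. Everything here is an illustrative assumption: the class `CallerQuota`, the caller names "B" and "C" (borrowed from the example above), the concrete limits, and the choice to reject unknown callers.

```python
class CallerQuota:
    """Per-caller request quotas that tighten when the system is overloaded."""

    def __init__(self, normal_limits, overloaded_limits):
        self.normal = normal_limits          # caller -> requests per window
        self.overloaded = overloaded_limits  # stricter table used under load
        self.counts = {}                     # caller -> requests admitted so far
        self.is_overloaded = False

    def set_overloaded(self, flag):
        """Switch between the normal and the overloaded quota table."""
        self.is_overloaded = flag

    def reset_window(self):
        """Clear all counters; call this at the start of each window."""
        self.counts.clear()

    def allow(self, caller):
        limits = self.overloaded if self.is_overloaded else self.normal
        limit = limits.get(caller, 0)        # assumption: unknown callers get 0
        used = self.counts.get(caller, 0)
        if used < limit:
            self.counts[caller] = used + 1
            return True
        return False


# Under overload, system B keeps its quota while system C is cut back.
quota = CallerQuota(normal_limits={"B": 3, "C": 3},
                    overloaded_limits={"B": 3, "C": 1})
```

How overload is detected (CPU, queue depth, latency) is left out of the sketch; in practice `set_overloaded` would be driven by whatever health signal system A already exposes.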
Circuit Breaking
In real life, a fuse acts as a circuit breaker, automatically disconnecting power during a short circuit to protect appliances. In distributed systems, consider a call chain A→B→C→D. When downstream D reaches its capacity, its responses slow down, callers in C, B, and A pile up waiting for it, and the failure can cascade up the whole chain.
Circuit breaking treats a surge of timeouts as a sign of failure and stops sending further requests to the problematic service, which breaks the chain before the failure spreads. The breaker can also monitor the health of the downstream service and gradually restore traffic once it recovers.
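The behavior described above is commonly modeled as three states: closed (calls pass), open (calls are rejected immediately), and half-open (a probe call tests whether the downstream has recovered). The sketch below is a simplified illustration under assumptions: it counts consecutive failures instead of measuring a timeout rate, and it restores traffic after a single successful probe instead of gradually. The names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `reset_timeout` are hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the downstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures to open
        self.reset_timeout = reset_timeout          # seconds to wait before probing
        self.clock = clock                          # injectable so tests control time
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of waiting on a sick downstream.
                raise CircuitOpenError("downstream presumed unhealthy")
            # Half-open: the timeout elapsed, so let this call through as a probe.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            probing = self.opened_at is not None
            if probing or self.failures >= self.failure_threshold:
                # Open the circuit, or re-open it after a failed probe.
                self.opened_at = self.clock()
            raise
        # Success: close the circuit and clear the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In the A→B→C→D chain, C would wrap its calls to D as `breaker.call(call_d, request)` (with `call_d` standing in for the real client call) and catch `CircuitOpenError` to return a fast error or a fallback instead of blocking.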
Service Degradation
Service degradation means deliberately giving up non-essential functionality so that core functionality keeps working. It can be automated in code: when traffic surges, lower-priority services are limited or disabled, preserving critical functionality. It can also be switched on manually during peak events (e.g., Double 11 or 618) to disable non-essential features, ensuring core services like shopping and payment remain operational.
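A degradation switch can be as simple as a set of disabled feature names that request handlers consult. The sketch below is illustrative only: `DegradationSwitch`, `render_product_page`, the feature name "recommendations", and the `load_recommendations` callback are hypothetical, and a real deployment would usually read the switch state from a shared configuration service so that it can be flipped without a redeploy.

```python
class DegradationSwitch:
    """In-memory feature switches; a disabled feature is served degraded."""

    def __init__(self):
        self._disabled = set()

    def disable(self, feature):
        self._disabled.add(feature)

    def enable(self, feature):
        self._disabled.discard(feature)

    def is_enabled(self, feature):
        return feature not in self._disabled


def render_product_page(product_id, switches, load_recommendations):
    """Build a page; the core purchase path is never switched off."""
    page = {"product_id": product_id, "can_buy": True}
    if switches.is_enabled("recommendations"):
        page["recommendations"] = load_recommendations(product_id)
    else:
        # Degraded: skip the expensive call and return an empty fallback.
        page["recommendations"] = []
    return page
```

The same switch covers both modes described above: an operator can call `disable("recommendations")` ahead of a sales event, or monitoring code can call it automatically when load crosses a threshold.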
In summary, rate limiting, circuit breaking, and degradation are key mechanisms to maintain system stability under excessive traffic.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
