Mastering High Availability: When to Use Circuit Breaker, Rate Limiting, and Degradation
This article compares three core system‑level high‑availability strategies—circuit breaking, degradation, and rate limiting—explaining their definitions, typical scenarios, design principles, and technical value so leaders can assess ROI and choose the right protection mechanism.
Circuit Breaker
Definition: In a service‑call architecture, when the callee repeatedly fails, the caller stops invoking it (opens the circuit) to avoid cascading failures.
Key components:
State machine with three states: Closed (normal), Open (calls short‑circuited), Half‑Open (test window).
Failure counter and configurable threshold (e.g., 5 failures within 10 s).
Timeout for the open state and retry logic to transition to half‑open.
Typical usage: Protect a service when a downstream provider becomes unhealthy; return a fallback response or an error immediately instead of waiting on a failing call.
Technical benefit: Prevents fault propagation, reduces thread/connection exhaustion, and improves overall system stability.
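The three-state machine above can be sketched in Python. This is a minimal, single-threaded illustration; the `fallback` hook and the default threshold values are example choices, not prescribed by any particular library:

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, window=10.0, open_timeout=30.0):
        self.failure_threshold = failure_threshold  # e.g., 5 failures...
        self.window = window                        # ...within 10 seconds
        self.open_timeout = open_timeout  # how long to stay open before testing
        self.state = "closed"
        self.failures = []                # timestamps of recent failures
        self.opened_at = None

    def call(self, fn, fallback):
        now = time.monotonic()
        if self.state == "open":
            if now - self.opened_at >= self.open_timeout:
                self.state = "half_open"  # allow a single trial call
            else:
                return fallback()         # short-circuit while open
        try:
            result = fn()
        except Exception:
            # count only failures inside the sliding window
            self.failures = [t for t in self.failures if now - t < self.window]
            self.failures.append(now)
            if self.state == "half_open" or len(self.failures) >= self.failure_threshold:
                self.state = "open"       # trip (or re-trip) the breaker
                self.opened_at = now
            return fallback()
        else:
            self.state = "closed"         # success closes the circuit
            self.failures.clear()
            return result
```

Production implementations add thread safety and distinguish failure types; the sketch only shows the closed → open → half‑open transitions.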
Reference: Martin Fowler, “Circuit Breaker” (https://martinfowler.com/bliki/CircuitBreaker.html).
Degradation (Graceful Degradation)
Definition: During abnormal load or failures, non‑essential features are disabled via configuration switches, preserving core functionality.
Implementation pattern:
Feature‑toggle or flag stored in a centralized config service (e.g., Consul, etcd).
Code paths check the flag before executing low‑priority logic.
Toggle can be changed at runtime without redeploy.
Typical scenario: Large promotional events where auxiliary services (recommendations, analytics) are turned off to keep checkout stable.
Technical benefit: Frees CPU, memory, and I/O for critical paths, preventing system collapse under stress.
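The implementation pattern above, reduced to an in-process sketch: the `FLAGS` dict stands in for a centralized store such as Consul or etcd, and `checkout` and the flag names are hypothetical examples:

```python
# Hypothetical in-process flag store; a real system would read flags from
# a centralized config service (Consul, etcd) and refresh them at runtime.
FLAGS = {"recommendations": True, "analytics": True}

def set_flag(name, enabled):
    # Flipped by operators at runtime; no redeploy needed.
    FLAGS[name] = enabled

def checkout(order):
    # Core path always runs, regardless of flag state.
    result = {"order_id": order["id"], "status": "confirmed"}
    # Non-essential logic is guarded by the flag and can be skipped.
    if FLAGS.get("recommendations"):
        result["recommended"] = ["sku-123"]
    return result
```

The key property is that the core path has no dependency on the degraded feature: disabling the flag removes work without changing checkout behavior.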
Reference: Zhihu article on degradation (https://zhuanlan.zhihu.com/p/664175526).
Rate Limiting
Definition: A self‑protection mechanism that rejects or delays incoming requests when the request rate exceeds the service’s processing capacity.
Common algorithms:
Token Bucket: tokens accumulate at a steady rate up to a fixed capacity; each request consumes one token and is rejected when the bucket is empty. A runnable version of the pseudocode (the bucket starts full so initial bursts pass, and refilling happens lazily on each request instead of in a periodic task):

```python
import time

class TokenBucket:
    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # N: maximum tokens (burst size)
        self.refill_rate = refill_rate  # R: tokens added per second
        self.tokens = capacity          # start full to allow an initial burst
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # refill based on elapsed time, capped at capacity:
        # bucket = min(bucket + R*dt, bucket_capacity)
        self.tokens = min(self.tokens + (now - self.last) * self.refill_rate,
                          self.capacity)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # allow
        return False      # reject
```

Leaky Bucket:
Requests enter a bounded queue and are drained at a constant rate; arrivals beyond the queue capacity are rejected. A runnable version of the pseudocode (the drain step is shown as a method that a scheduler would invoke every 1/L seconds):

```python
import collections

class LeakyBucket:
    def __init__(self, queue_capacity, leak_rate):
        self.queue = collections.deque()
        self.capacity = queue_capacity   # Q: maximum queued requests
        self.interval = 1.0 / leak_rate  # drain one request every 1/L seconds

    def submit(self, request):
        if len(self.queue) < self.capacity:
            self.queue.append(request)
            return True
        return False  # reject: queue is full

    def leak(self):
        # called every self.interval seconds by a timer/scheduler
        if self.queue:
            return self.queue.popleft()  # hand one request to the worker
```

Configuration parameters:
Maximum requests per window (e.g., 1000 rps).
Window size or token refill interval.
Burst size (bucket capacity) to allow short spikes.
Penalty action: immediate reject (HTTP 429), queue, or downgrade.
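These parameters can be tied together in a minimal fixed-window sketch; the values and the HTTP-style return codes are illustrative, and production limiters often prefer sliding windows or token buckets to avoid bursts at window boundaries:

```python
import time

class FixedWindowLimiter:
    # Illustrative defaults: 1000 requests per 1-second window.
    def __init__(self, max_requests=1000, window=1.0):
        self.max_requests = max_requests  # maximum requests per window
        self.window = window              # window size in seconds
        self.window_start = time.monotonic()
        self.count = 0

    def check(self):
        now = time.monotonic()
        if now - self.window_start >= self.window:
            self.window_start = now       # start a new window
            self.count = 0
        if self.count < self.max_requests:
            self.count += 1
            return 200                    # allow
        return 429                        # penalty action: immediate reject
```

The caller decides what 429 means in practice: reject outright, enqueue the request, or fall back to a degraded response.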
Typical usage:
Protect a service from overload.
Implement commercial throttling (different quotas per customer).
Prioritize premium traffic by assigning larger buckets.
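Per-customer quotas can be sketched as a table of buckets keyed by customer. The tier names and capacities below are hypothetical, and refill plus shared storage (e.g., Redis counters keyed by API key) are omitted for brevity:

```python
# Hypothetical per-tier quota table: premium customers get a larger bucket.
TIER_CAPACITY = {"premium": 100, "standard": 10}

class CustomerBuckets:
    def __init__(self):
        self.tokens = {}  # customer id -> remaining tokens

    def allow(self, customer, tier):
        cap = TIER_CAPACITY[tier]
        remaining = self.tokens.setdefault(customer, cap)
        if remaining > 0:
            self.tokens[customer] = remaining - 1
            return True
        return False  # this customer's quota is exhausted
```

Each customer exhausts only their own bucket, so one tenant hitting their quota does not affect the others.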
Technical benefit: Guarantees that the service operates within its capacity, provides fair resource distribution, and reduces the risk of denial‑of‑service.
References:
CloudXLab rate‑limiter design guide (https://cloudxlab.com/blog/system-design-how-to-design-a-rate-limiter/).
Comparison of token bucket and leaky bucket (https://zhuanlan.zhihu.com/p/433041001).
Architecture Breakthrough
Focused on fintech, sharing experiences in financial services, architecture technology, and R&D management.