Rate Limiting, Circuit Breaking, and Service Degradation: Key Fault‑Tolerance Patterns for Distributed Systems
The article explains why distributed systems need fault‑tolerance mechanisms such as rate limiting, circuit breaking, and service degradation, describes common metrics (TPS, HPS, QPS), outlines several limiting algorithms (counter, sliding window, leaky bucket, token bucket, distributed and Hystrix‑based), and discusses circuit‑breaker states, considerations, and practical Hystrix usage.
1 Rate Limiting
In distributed systems a faulty or slow service can block callers, exhaust resources, and cause a cascade failure (system avalanche). Proper rate‑limiting improves overall fault tolerance.
1.1 Rate‑Limiting Metrics
1.1.1 TPS
Transactions per second is a natural metric, but in practice a single transaction may involve many services and take a long time, making TPS too coarse‑grained.
1.1.2 HPS
Hits per second (requests received per second) measures raw request volume.
❝If a request completes a transaction, TPS and HPS are equivalent, but in distributed scenarios they differ because a transaction may span multiple requests.❞
1.1.3 QPS
Queries per second counts how many client queries the server can answer per second.
❝With a single server, HPS and QPS are the same, but in distributed setups each request may involve many servers, so they are not interchangeable.❞
1.2 Rate‑Limiting Methods
1.2.1 Counter
The simplest method limits the number of requests per second, e.g., reject any request beyond 100 per second.
Problem 1: the window boundary is hard to control — requests clustered at the edge of two adjacent windows can briefly pass at up to twice the intended rate.
Problem 2: Short spikes may not require limiting, yet the counter would reject them.
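The counter approach can be sketched in a few lines. This is a minimal fixed-window counter, not from the article; the class and method names are illustrative:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Fixed-window counter: allow at most `limit` requests per window.
class CounterLimiter {
    private final int limit;               // max requests per window
    private final long windowMillis;       // window length, e.g. 1000 ms
    private final AtomicInteger count = new AtomicInteger();
    private long windowStart = System.currentTimeMillis();

    CounterLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {  // a new window begins: reset
            windowStart = now;
            count.set(0);
        }
        return count.incrementAndGet() <= limit;  // reject beyond the limit
    }
}
```

Both problems above are visible here: the reset at the window edge allows back-to-back bursts across the boundary, and any spike within a single window is rejected outright.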
1.2.2 Sliding Time Window
The sliding‑window algorithm divides time into small slices and advances the window one slice at a time. For example, with five 1 s slices and a limit of 50 requests per second, the sum of requests across slices t1~t5 must not exceed 250. When the window slides to t2~t6, the oldest slice (t1) is dropped and the newest (t6) is added.
Advantages: solves the counter’s granularity problem. Drawbacks: still needs to drop traffic or degrade when the limit is exceeded, and cannot smooth short‑term spikes.
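The slide-and-sum logic can be sketched as follows; this is an illustrative implementation with assumed names, not a production limiter:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sliding window of fixed-size slices; the counts of the slices still
// inside the window are summed before admitting a request.
class SlidingWindowLimiter {
    private final int limit;          // max requests across the whole window
    private final long sliceMillis;   // length of one slice
    private final int slices;         // number of slices per window
    private final Deque<long[]> window = new ArrayDeque<>(); // [sliceStart, count]

    SlidingWindowLimiter(int limit, long sliceMillis, int slices) {
        this.limit = limit;
        this.sliceMillis = sliceMillis;
        this.slices = slices;
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        long sliceStart = now - now % sliceMillis;
        long oldest = sliceStart - (long) (slices - 1) * sliceMillis;
        // Drop slices that have slid out of the window.
        while (!window.isEmpty() && window.peekFirst()[0] < oldest) {
            window.pollFirst();
        }
        long total = window.stream().mapToLong(s -> s[1]).sum();
        if (total >= limit) return false;
        if (window.isEmpty() || window.peekLast()[0] != sliceStart) {
            window.addLast(new long[]{sliceStart, 0});
        }
        window.peekLast()[1]++;
        return true;
    }
}
```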
1.2.3 Leaky Bucket
The leaky‑bucket algorithm buffers incoming requests in a fixed‑size queue and releases them at a steady rate, preventing burst traffic from overwhelming the service.
Issues to consider: bucket size, output rate, and increased response latency.
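A minimal sketch of the idea, assuming a scheduler drains the bucket at the fixed output rate (the class and method names are illustrative):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Leaky bucket: requests queue up to `capacity` and are drained at a
// constant rate by a scheduler; overflow is rejected immediately.
class LeakyBucket {
    private final BlockingQueue<Runnable> bucket;

    LeakyBucket(int capacity) {
        this.bucket = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns false (overflow) when the bucket is full. */
    boolean offer(Runnable request) {
        return bucket.offer(request);
    }

    /** Called at the fixed output rate, e.g. every 10 ms by a scheduler. */
    void leakOne() {
        Runnable r = bucket.poll();
        if (r != null) r.run();
    }
}
```

The three concerns above map directly to this sketch: `capacity` is the bucket size, the scheduler period is the output rate, and any time a request spends queued is added latency.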
1.2.4 Token Bucket
Clients must obtain a token before sending a request; tokens are replenished periodically. This algorithm combines burst tolerance with a steady rate and is widely used (e.g., Google Guava).
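The same idea can be implemented with lazy refill, which is roughly how Guava's RateLimiter works internally; this is a simplified sketch with illustrative names, not Guava's actual code:

```java
// Token bucket with lazy refill: tokens accrue continuously up to
// `capacity`, so an idle period earns a burst allowance.
class TokenBucket {
    private final long capacity;        // maximum burst size
    private final double refillPerMs;   // refill rate in tokens per millisecond
    private double tokens;
    private long lastRefill = System.currentTimeMillis();

    TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.refillPerMs = tokensPerSecond / 1000.0;
        this.tokens = capacity;         // start full to allow an initial burst
    }

    synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        // Credit tokens for the elapsed time, capped at the bucket capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerMs);
        lastRefill = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}
```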
1.2.5 Distributed Rate Limiting
When the token bucket is stored centrally (e.g., in Redis), every service in a distributed call chain must interact with it, adding a network round trip per call. A common optimization is to acquire a batch of tokens before invoking the composite service and share them among the downstream calls.
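The batch-acquisition idea can be sketched as below. This is a hedged illustration: the AtomicLong stands in for the shared store (with Redis, a Lua script would perform the decrement atomically), and all names are assumptions:

```java
import java.util.concurrent.atomic.AtomicLong;

// One round trip acquires enough tokens for the whole downstream chain,
// instead of each downstream call contacting the central bucket.
class BatchTokenClient {
    private final AtomicLong sharedTokens;   // stand-in for the central bucket

    BatchTokenClient(AtomicLong sharedTokens) {
        this.sharedTokens = sharedTokens;
    }

    boolean acquireBatch(int downstreamCalls) {
        long remaining = sharedTokens.addAndGet(-downstreamCalls);
        if (remaining < 0) {
            sharedTokens.addAndGet(downstreamCalls); // roll back on failure
            return false;
        }
        return true;
    }
}
```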
1.2.6 Hystrix Rate Limiting
1.2.6.1 Semaphore Limiting
@HystrixCommand(
    commandProperties = {
        @HystrixProperty(name = "execution.isolation.strategy", value = "SEMAPHORE"),
        @HystrixProperty(name = "execution.isolation.semaphore.maxConcurrentRequests", value = "20")
    },
    fallbackMethod = "errMethod"
)
1.2.6.2 Thread‑Pool Limiting
@HystrixCommand(
    commandProperties = {
        @HystrixProperty(name = "execution.isolation.strategy", value = "THREAD")
    },
    threadPoolKey = "createOrderThreadPool",
    threadPoolProperties = {
        @HystrixProperty(name = "coreSize", value = "20"),
        @HystrixProperty(name = "maxQueueSize", value = "100"),
        @HystrixProperty(name = "maximumSize", value = "30"),
        @HystrixProperty(name = "queueSizeRejectionThreshold", value = "120")
    },
    fallbackMethod = "errMethod"
)
❝In a Java thread pool, once active threads reach coreSize new tasks go to the queue; only when the queue is full are extra threads created, up to maximumSize. Hystrix adds queueSizeRejectionThreshold on top: if that threshold is lower than maxQueueSize, requests are rejected before the queue ever fills, so the pool never grows toward maximumSize.❞
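Both annotations above route failures to errMethod. A fallback is simply a method on the same class with a compatible signature; Hystrix (javanica) also allows an extra trailing Throwable to inspect the cause. A hedged sketch with assumed names:

```java
// Illustrative service: createOrder stands in for a guarded method
// (its @HystrixCommand annotation is omitted here for brevity).
class OrderService {

    String createOrder(String itemId) {
        throw new RuntimeException("downstream unavailable");
    }

    // Invoked when the command is rejected, times out, or fails.
    String errMethod(String itemId, Throwable cause) {
        return "order service degraded, please retry later";
    }
}
```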
2 Circuit Breaking
A circuit breaker acts like a fuse: when failures exceed a threshold, it opens to stop traffic, preventing further damage.
2.1 Circuit‑Breaker States
CLOSED : normal operation; failure rate below threshold.
OPEN : failures exceed threshold; requests are short‑circuited.
HALF OPEN : after a timeout, a limited number of requests are allowed to test recovery; success returns to CLOSED, failure goes back to OPEN.
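The three-state machine above can be sketched directly; thresholds and names are illustrative, not taken from any particular library:

```java
// Minimal circuit breaker mirroring the CLOSED / OPEN / HALF_OPEN states.
class CircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private State state = State.CLOSED;
    private int failures;
    private final int failureThreshold;  // failures that trip the breaker
    private final long openMillis;       // how long to stay OPEN
    private long openedAt;

    CircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized boolean allowRequest() {
        if (state == State.OPEN && System.currentTimeMillis() - openedAt >= openMillis) {
            state = State.HALF_OPEN;     // break time expired: allow a probe
        }
        return state != State.OPEN;      // OPEN short-circuits everything
    }

    synchronized void recordSuccess() {
        failures = 0;
        state = State.CLOSED;            // recovery confirmed
    }

    synchronized void recordFailure() {
        failures++;
        if (state == State.HALF_OPEN || failures >= failureThreshold) {
            state = State.OPEN;          // trip (or re-trip after a failed probe)
            openedAt = System.currentTimeMillis();
        }
    }
}
```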
2.2 Considerations
Define different fallback logic for different exception types.
Set a break‑time; after it expires the breaker moves to HALF OPEN for retry.
Log failures for monitoring.
Probe actively (e.g., network reachability checks via telnet) before letting traffic through again.
Provide a manual compensation interface for operators.
When retrying, ensure idempotency of the original request.
2.3 Use Cases
Service outage or upgrade – fast failure for callers.
Easy definition of failure handling logic.
Long read timeouts that could cause massive retries.
3 Service Degradation
Degradation is a strategy taken from a global view of the system: when a circuit opens or load spikes, non‑critical requests are routed to fallback paths so that core services keep working.
3.1 Use Cases
Return error directly for non‑critical services.
Cache the request and return an intermediate response, retry later.
Disable non‑core features during traffic spikes.
Serve cached data when DB pressure is high.
Convert heavy write operations to asynchronous processing.
Temporarily stop batch jobs to save resources.
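One of the cases above — serving cached data when the database is under pressure — can be sketched as follows. This is an illustrative example with assumed names, not a recommended production design:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Degradation path: under normal load, query the DB and keep the cache
// warm; when degraded, skip the DB and serve the last known value.
class ProductQuery {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    String findProduct(String id, Supplier<String> dbQuery, boolean degraded) {
        if (!degraded) {
            String fresh = dbQuery.get();
            cache.put(id, fresh);        // refresh the cache on every hit
            return fresh;
        }
        // Degraded: possibly stale data beats an error or a timeout.
        return cache.getOrDefault(id, "temporarily unavailable");
    }
}
```

The `degraded` flag would typically come from a configuration center or a circuit-breaker signal rather than being passed by the caller.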
3.2 Hystrix Degradation
3.2.1 Exception Degradation
Use @HystrixCommand with ignoreExceptions to let specific exceptions bypass fallback.
@HystrixCommand(
    fallbackMethod = "errMethod",
    ignoreExceptions = {ParamErrorException.class, BusinessTypeException.class}
)
3.2.2 Timeout Degradation
Define a timeout (e.g., 3000 ms) after which the call falls back.
@HystrixCommand(
    commandProperties = {
        @HystrixProperty(name = "execution.timeout.enabled", value = "true"),
        @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "3000")
    },
    fallbackMethod = "errMethod"
)
Conclusion
Rate limiting, circuit breaking, and service degradation are essential fault‑tolerance patterns. Rate limiting protects services from overload, while circuit breaking and degradation sacrifice non‑core functionality to keep core services available. Choosing the right algorithm (the token bucket is often preferred) and configuring thresholds from load‑test results are critical, and these settings should live in a configuration center so they can be updated dynamically.