
Understanding Rate Limiting in Distributed Systems: Algorithms and Best Practices

Rate limiting safeguards distributed systems by controlling request rates through algorithms such as leaky bucket, token bucket, fixed and sliding windows, and back pressure. It is complemented by client-side tactics such as exponential backoff, jitter, and careful retries, and it requires atomic distributed storage (e.g., Redis + Lua) to avoid race conditions.

Tencent Cloud Developer

Rate limiting is a crucial technique for ensuring the stable operation of modern distributed systems. This article provides a comprehensive overview of rate limiting concepts, algorithms, and implementation strategies.

What is Rate Limiting?

Rate limiting is a technique used to restrict the rate of requests, typically to protect services from being overwhelmed by too many requests (whether malicious or unintentional) and to maintain availability.

Why Rate Limiting is Necessary:

Preventing Resource Exhaustion: Protects against malicious attacks (DDoS, brute force) and non-malicious resource consumption due to misconfiguration or misuse.

Managing Quotas: In shared multi-tenant environments, rate limiting prevents the "noisy neighbor" effect where one user degrades service quality for others.

Cost Control: In pay-per-use models, rate limiting helps control operational costs by setting virtual upper bounds on resource scaling.

Rate Limiting Algorithms:

Leaky Bucket: Uses a FIFO queue to smooth bursty traffic into a steady flow. Good for peak shaving but has low resource utilization and starvation issues.
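As an illustration, here is a minimal in-process sketch of a leaky bucket (class and parameter names are ours, not from any particular library):

```python
import time
from collections import deque

class LeakyBucket:
    """Leaky bucket sketch: a FIFO queue drained at a constant rate."""

    def __init__(self, capacity, leak_rate):
        self.capacity = capacity      # max requests the bucket can hold
        self.leak_rate = leak_rate    # requests drained per second
        self.queue = deque()
        self.last_leak = time.monotonic()

    def _leak(self, now):
        # Drain however many requests the elapsed time allows.
        drained = int((now - self.last_leak) * self.leak_rate)
        if drained > 0:
            for _ in range(min(drained, len(self.queue))):
                self.queue.popleft()
            self.last_leak = now

    def try_add(self, request, now=None):
        now = time.monotonic() if now is None else now
        self._leak(now)
        if len(self.queue) >= self.capacity:
            return False              # overflow: request is dropped
        self.queue.append(request)
        return True
```

The starvation issue mentioned above is visible here: a request queued behind a full bucket waits for the drain rate regardless of how urgent it is.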

Token Bucket: Allows burst traffic while maintaining average rate limits. Uses a bucket that fills with tokens at a constant rate; requests consume tokens.
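A sketch of the token bucket, again with illustrative names; note how a full bucket admits a burst of up to `capacity` requests at once:

```python
import time

class TokenBucket:
    """Token bucket sketch: tokens refill at a constant rate, and a
    request passes only if it can consume a token."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # maximum stored tokens (burst size)
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill based on elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```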

Simple Counting: Simple counter-based approach for resource pool scenarios like thread pools or connection pools.
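For the resource-pool case this is essentially a counting semaphore; a hand-rolled sketch (in practice a language's built-in semaphore would be used):

```python
import threading

class SimpleCounter:
    """Simple counting sketch: cap concurrent users of a resource pool,
    as a thread pool or connection pool would."""

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.count = 0
        self.lock = threading.Lock()

    def acquire(self):
        with self.lock:
            if self.count < self.max_concurrent:
                self.count += 1
                return True
            return False          # pool exhausted: reject, don't queue

    def release(self):
        with self.lock:
            self.count = max(0, self.count - 1)
```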

Fixed Window Counter: Uses fixed time windows but suffers from the "boundary double hit" problem where requests can spike at window reset points.
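A sketch that also demonstrates the boundary problem: two bursts straddling a window reset can briefly pass up to twice the limit.

```python
class FixedWindowCounter:
    """Fixed window sketch: count requests per window, reset at boundaries."""

    def __init__(self, limit, window_secs):
        self.limit = limit
        self.window = window_secs
        self.current_window = None
        self.count = 0

    def allow(self, now):
        window = int(now // self.window)
        if window != self.current_window:
            self.current_window = window   # new window: reset the counter
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```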

Sliding Logs: Uses a rolling window to precisely track request timestamps, eliminating fixed window boundaries but consuming significant storage.
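The sliding log's precision and its storage cost are both visible in a sketch: one timestamp is kept per admitted request.

```python
from collections import deque

class SlidingLog:
    """Sliding log sketch: record a timestamp per request and count only
    those inside the rolling window. Exact, but O(limit) memory per key."""

    def __init__(self, limit, window_secs):
        self.limit = limit
        self.window = window_secs
        self.log = deque()

    def allow(self, now):
        # Evict timestamps that have fallen out of the rolling window.
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False
```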

Sliding Window Counter: Combines fixed window efficiency with sliding log accuracy by merging logs with the same timestamp.
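One common variant of the sliding window counter (an interpretation, not the article's exact construction) keeps only two counters and weights the previous window by how much of it still overlaps the rolling window:

```python
class SlidingWindowCounter:
    """Sliding window counter sketch: current + previous window counts,
    with the previous count weighted by its remaining overlap.
    Approximate, but O(1) memory per key."""

    def __init__(self, limit, window_secs):
        self.limit = limit
        self.window = window_secs
        self.curr_window = None
        self.curr_count = 0
        self.prev_count = 0

    def allow(self, now):
        window = int(now // self.window)
        if window != self.curr_window:
            # Slide forward; if whole windows were skipped, the old count is stale.
            if self.curr_window is not None and window == self.curr_window + 1:
                self.prev_count = self.curr_count
            else:
                self.prev_count = 0
            self.curr_count = 0
            self.curr_window = window
        # Fraction of the previous window still inside the rolling window.
        elapsed = (now % self.window) / self.window
        estimated = self.prev_count * (1 - elapsed) + self.curr_count
        if estimated < self.limit:
            self.curr_count += 1
            return True
        return False
```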

Back Pressure: A cooperative strategy where servers signal clients to slow down (e.g., HTTP 429), and clients must respond appropriately.
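On the client side, cooperating with back pressure means honoring the server's signal. A sketch, where `send` is any callable returning an object with `.status` and `.headers` (a hypothetical shape, not a specific HTTP library):

```python
import time

def call_with_backpressure(send, max_attempts=3, sleep=time.sleep):
    """On HTTP 429, wait for the server-suggested Retry-After
    (defaulting to 1s) before retrying."""
    for attempt in range(max_attempts):
        resp = send()
        if resp.status != 429:
            return resp
        # The server asked us to slow down; respect its hint.
        delay = float(resp.headers.get("Retry-After", 1))
        sleep(delay)
    return resp   # still throttled after max_attempts: surface the 429
```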

Client-Side Strategies:

Timeout and Retry: Handle the "third state" (unknown result) in distributed systems.

Backoff: Wait progressively longer between retries. Exponential backoff is recommended: timeout = min(base * pow(2, attempt), threshold)

Jitter: Add random delays to prevent thundering herd effects when multiple clients retry simultaneously.
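The backoff formula above combines with jitter in a few standard ways; a sketch of "full jitter" (one common scheme, chosen here for illustration), which picks a random delay up to the exponential ceiling:

```python
import random

def backoff_delay(attempt, base=0.1, threshold=10.0, rng=random.random):
    """Exponential backoff with full jitter: a random delay in
    [0, min(base * 2^attempt, threshold)]."""
    ceiling = min(base * (2 ** attempt), threshold)
    return rng() * ceiling
```

Because each client draws a different random delay, simultaneous retries spread out instead of arriving in synchronized waves.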

Careful Retry: Only retry when dependencies are healthy to avoid worsening overload situations.

Distributed Rate Limiting: In distributed systems, rate limiting across all service instances requires centralized storage (e.g., Redis) to track limiting counters. Key challenges include race conditions when using simple "get-then-set" operations. Solutions include: relaxing limits, session persistence, locking, and Redis+Lua for atomic operations:

-- KEYS[1]: counter key; ARGV[1]: request limit (assumed calling convention)
local key = KEYS[1]
local limit = tonumber(ARGV[1])
local curr_count = tonumber(redis.call('GET', key) or "0")
if curr_count + 1 > limit then
    return 0  -- over the limit: reject
else
    redis.call("INCRBY", key, 1)
    redis.call("EXPIRE", key, 2)  -- window expiry (2 seconds here)
    return 1  -- under the limit: allow
end
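The lost-update race that the script closes can be shown in miniature with an in-process sketch (plain Python standing in for Redis): two limiter instances each GET the shared counter, then check-and-SET against their stale snapshot.

```python
# Shared store, standing in for Redis; limit of 1 request.
store = {"count": 0}
limit = 1

def naive_try_acquire(snapshot):
    """Get-then-set: decide from a possibly stale snapshot, then write back."""
    if snapshot + 1 <= limit:
        store["count"] = snapshot + 1
        return True
    return False

# Both instances GET before either SETs -- the racy interleaving:
snap_a = store["count"]
snap_b = store["count"]
allowed_a = naive_try_acquire(snap_a)
allowed_b = naive_try_acquire(snap_b)
# Both requests were admitted even though the limit is 1.
```

Because Redis executes a Lua script as a single atomic unit, no second instance can read the counter between the script's GET and INCRBY, so this interleaving cannot occur.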
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
