Backend Development 13 min read

Understanding Rate Limiting: Counter, Leaky Bucket, and Token Bucket Algorithms

This article explains the fundamentals of rate limiting for high‑concurrency systems, covering why it is needed, the conditions that trigger it, and detailed introductions to the three main algorithms—counter (fixed and sliding windows), leaky bucket, and token bucket—along with their advantages, drawbacks, and sample pseudo‑code implementations.

JD Retail Technology

High‑concurrency systems rely on three key tools—caching, rate limiting, and degradation—to stay stable; this article focuses on rate limiting, introducing its basic concepts and why it is essential for protecting services from overload.

Rate limiting restricts incoming traffic to keep request volume within a system's capacity, preventing scenarios such as bandwidth saturation (e.g., a 1 Gbps link overwhelmed by traffic). It is the software analogue of everyday flow control: numbered queues at banks, restaurant waitlists, or appointment systems at overloaded hospitals.

Typical triggers for rate limiting include rapid user growth, sudden traffic spikes, aggressive crawlers, fraudulent order‑spamming, and attacks.

The three most common rate‑limiting algorithms are:

Counter algorithm (fixed‑window and sliding‑window variants)

Leaky bucket algorithm

Token bucket algorithm

Each algorithm suits different scenarios, and the following sections detail their principles, pros and cons, and provide pseudo‑code examples.

Counter Algorithm

The counter algorithm counts requests within a time unit; if the count exceeds a predefined threshold, requests are blocked. It has two implementations:

Fixed‑window: each time slot is independent; when a slot expires, counting restarts in the next slot.

Sliding‑window: the time window moves continuously, overlapping with previous windows, which mitigates the “burst” problem of fixed windows.


Fixed windows suffer from the boundary ("critical") problem: requests may cluster at the end of one window and the start of the next, producing a short burst that exceeds the intended QPS limit even though each window's counter stays within bounds.
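A quick back-of-the-envelope sketch (illustrative numbers, not from the original) shows how the boundary burst arises under a 1000-requests-per-second fixed window:

```python
# 1000 requests arrive at t = 0.9 s (end of window 1) and another
# 1000 at t = 1.1 s (start of window 2). Each window's counter stays
# within its limit, yet 2000 requests pass within a 0.2 s span.
limit = 1000
window1_count = 1000   # all arriving at t = 0.9 s
window2_count = 1000   # all arriving at t = 1.1 s
assert window1_count <= limit and window2_count <= limit  # both windows accept
burst_rate = (window1_count + window2_count) / 0.2        # observed QPS across the boundary
print(burst_rate)  # 10000.0 -> ten times the intended limit
```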

Fixed‑window pseudo‑code:

int unitTime = 1s;            // set unit time to 1 second
int limitCount = 1000;        // allow up to 1000 requests per unit time
string limitKey = 'limitkey'; // key for storing rate-limit state
if (!has(limitKey)) {
    // initialize state with an expiration equal to unitTime (e.g., Redis SET with EX)
    set(limitKey, 0, unitTime);
}
// atomically increment the request count
int counter = incr(limitKey, 1);
if (counter > limitCount) {
    return 503; // limit exceeded
} else {
    continue; // proceed with business logic
}
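For something executable, here is a minimal single-process sketch of the same fixed-window logic in Python, using an in-memory counter instead of Redis (the class and parameter names are ours, not from the original):

```python
import time

class FixedWindowLimiter:
    """Fixed-window counter: at most limit_count requests per unit_time seconds."""
    def __init__(self, limit_count=1000, unit_time=1.0):
        self.limit_count = limit_count
        self.unit_time = unit_time
        self.window_start = time.monotonic()
        self.counter = 0

    def allow(self):
        now = time.monotonic()
        # start a fresh window once the current one expires
        if now - self.window_start >= self.unit_time:
            self.window_start = now
            self.counter = 0
        self.counter += 1
        return self.counter <= self.limit_count

limiter = FixedWindowLimiter(limit_count=3, unit_time=1.0)
print([limiter.allow() for _ in range(5)])  # [True, True, True, False, False]
```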

Sliding-window refines the fixed window by dividing the unit time into smaller granules (e.g., 10 slices of 100 ms each). The window slides forward with each request, so the counted interval continuously overlaps adjacent intervals, reducing boundary bursts.

Sliding‑window incurs higher computational overhead because it must count requests over a moving interval.

Sliding‑window pseudo‑code:

int unitTime = 1000;          // window length in ms (1 second)
int limitCount = 1000;        // max requests per window
string listKey = 'limitkey';  // key for a sorted set of request timestamps
int curMilliseconds = nowMilliseconds();            // current timestamp in ms
int startMilliseconds = curMilliseconds - unitTime; // start of the sliding window
// count requests inside the window (e.g., Redis ZCOUNT)
int counter = ZCOUNT(listKey, startMilliseconds, curMilliseconds);
if (counter > limitCount) {
    return 503; // limit exceeded
} else {
    ZADD(listKey, curMilliseconds, uniqueId); // record this request
    continue; // proceed with business logic
}
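The sliding-window log above can likewise be sketched in plain Python, using a deque of timestamps in place of a Redis sorted set (an in-memory, single-process approximation; names are ours):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Sliding-window log: at most limit_count requests in any unit_time span."""
    def __init__(self, limit_count=1000, unit_time=1.0):
        self.limit_count = limit_count
        self.unit_time = unit_time
        self.timestamps = deque()  # arrival times of accepted requests

    def allow(self):
        now = time.monotonic()
        # drop timestamps that have slid out of the window
        while self.timestamps and now - self.timestamps[0] >= self.unit_time:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit_count:
            return False
        self.timestamps.append(now)
        return True

limiter = SlidingWindowLimiter(limit_count=3, unit_time=1.0)
print([limiter.allow() for _ in range(5)])  # [True, True, True, False, False]
```

Because the window is anchored to "now" on every call, there is no boundary at which bursts can hide, at the cost of storing one timestamp per accepted request.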

Leaky Bucket Algorithm

The leaky bucket treats incoming requests as water droplets poured into a fixed‑size FIFO queue (the bucket). The bucket drains at a constant rate; excess droplets are discarded, providing a smooth, average processing rate.

Leaky bucket pseudo‑code:

// pseudo-code implementation
local rate = 1;               // drain rate: process 1 request per ms
local bucketSize = 1000;      // max number of queued requests (droplets)
local bucketQueue = new BucketQueue(bucketSize);
local userRequest = new UserRequest();
local res = bucketQueue.push(userRequest);
if (res == false) {
    return 503; // bucket full, request discarded
} else {
    // request queued; it will be processed later at the fixed rate
}
// consume the queue at the fixed rate (runs repeatedly, once per `rate` ms)
setInterval(function () {
    local userRequest = bucketQueue.lpop();
    // execute business logic
}, rate);
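A minimal in-memory Python sketch of a leaky bucket, draining lazily on each push instead of via a background timer (an equivalent reformulation of the interval-based consumer above; class and parameter names are ours):

```python
import time
from collections import deque

class LeakyBucket:
    """Bounded FIFO queue drained at a constant rate (one pop per leak_interval seconds)."""
    def __init__(self, bucket_size=1000, leak_interval=0.001):
        self.queue = deque()
        self.bucket_size = bucket_size
        self.leak_interval = leak_interval
        self.last_leak = time.monotonic()

    def _leak(self):
        # process as many queued requests as the elapsed time allows
        now = time.monotonic()
        drained = int((now - self.last_leak) / self.leak_interval)
        for _ in range(min(drained, len(self.queue))):
            self.queue.popleft()  # in a real system: run business logic here
        self.last_leak += drained * self.leak_interval

    def push(self, request):
        self._leak()
        if len(self.queue) >= self.bucket_size:
            return False  # bucket full, request discarded
        self.queue.append(request)
        return True

bucket = LeakyBucket(bucket_size=2, leak_interval=60.0)  # drains too slowly to matter here
print([bucket.push(i) for i in range(3)])  # [True, True, False]
```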

Token Bucket Algorithm

The token bucket is similar to the leaky bucket but separates token production from token consumption. Tokens are generated at a steady rate and stored in a bucket; a request proceeds only if it can acquire a token, ensuring a controlled request rate while allowing bursts up to the bucket capacity.

Token bucket pseudo‑code:

// pseudo-code implementation
local tokenProducerRate = 1;  // produce one token per ms
local bucketSize = 1000;      // max tokens the bucket can hold
local bucketKey = "bucketKey";
bucketKey.setSize(bucketSize);
local token = bucketKey.getValidToken();
if (token ~= false) {
    token.lock(); // ensure exclusive use of this token
} else {
    return 503; // no token available
}
// ... business logic runs here ...
token.destroy(); // release the token after use
// token producer: adds a token every tokenProducerRate ms (runs repeatedly)
setInterval(function () {
    local token = new Token();
    local result = bucketKey.push(token);
    // if the bucket is already full, the new token is simply discarded
}, tokenProducerRate);
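The same separation of token production and consumption can be sketched in Python; here tokens accrue lazily from elapsed time rather than via a background producer, a common equivalent formulation (names and defaults are ours):

```python
import time

class TokenBucket:
    """Tokens accrue at a fixed rate up to bucket_size; each request spends one."""
    def __init__(self, rate=1000.0, bucket_size=1000):
        self.rate = rate                  # tokens produced per second
        self.bucket_size = bucket_size
        self.tokens = float(bucket_size)  # start full: permits an initial burst
        self.last_refill = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        # add tokens for the elapsed time, capped at bucket capacity
        self.tokens = min(self.bucket_size,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now

    def allow(self):
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=0.001, bucket_size=2)  # refills far too slowly to matter here
print([bucket.allow() for _ in range(3)])  # [True, True, False]
```

Starting the bucket full is what distinguishes this from a leaky bucket in practice: idle periods bank capacity, so a burst up to bucket_size can pass immediately while the long-run rate stays bounded by `rate`.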

Algorithm Comparison

Each algorithm has its own strengths and trade‑offs; selecting the appropriate one depends on the specific business scenario, and hybrid approaches can be employed to satisfy complex requirements.

Note: This article focuses on the theoretical foundations and implementation sketches; future posts will cover practical rate‑limiting configurations in Nginx and Google Guava.

Tags: backend, rate limiting, token bucket, leaky bucket, counter algorithm
Written by JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
