Choosing the Right Rate‑Limiting Algorithm: Simple Window, Sliding Window, Leaky Bucket, Token Bucket & Sliding Log

This article explains the purpose of flow control, compares various rate‑limiting algorithms—including simple window, sliding window, leaky bucket, token bucket, and sliding log—provides Java interface definitions and code examples, discusses their complexity, precision, smoothness, and suitability for single‑machine and distributed scenarios, and offers practical deployment tips using Sentinel, Nginx, Guava, Tair, and Redis.

Alibaba Cloud Developer
Alibaba Cloud Developer
Alibaba Cloud Developer
Choosing the Right Rate‑Limiting Algorithm: Simple Window, Sliding Window, Leaky Bucket, Token Bucket & Sliding Log

Flow‑Control Scenarios

Flow control protects downstream resources from being overwhelmed and can also enforce charging models; different scenarios require different algorithms, and while Sentinel often suffices, other solutions may be needed.

Interface Definition

All examples use a simple Throttler interface with a method to try acquiring a single permit:

public interface Throttler {
    /**
     * Try to acquire a permit
     * @param key the key for the request
     * @return true if acquisition succeeds, false otherwise
     */
    boolean tryAcquire(String key);
}

Single‑Machine Flow Control

1. Simple Window

A fixed time window counts requests; if the count is below the threshold the request is allowed, otherwise it is rejected. When the window expires the counter resets.

private final long windowInMs;
private final int threshold;
private long lastReqTime = System.currentTimeMillis();
private long counter;

public boolean tryAcquire(String key) {
    long now = System.currentTimeMillis();
    if (now - lastReqTime > windowInMs) {
        counter = 0;
        lastReqTime = now;
    }
    if (counter < threshold) {
        counter++;
        return true;
    } else {
        return false;
    }
}

In multithreaded environments the method can be made synchronized or use volatile and LongAdder, though race conditions may still occur.

2. Sliding Window

The large window is divided into sub‑windows; each sub‑window maintains its own counter. The total of the recent sub‑windows determines whether the request is allowed, eliminating the critical‑point burst of the simple window.

Sentinel’s implementation uses LeapArray to store sub‑window data (start time, length, and a MetricBucket with LongAdder counters).

3. Leaky Bucket

The bucket holds a limited amount of water (requests). Water leaks at a constant rate; excess water overflows and is rejected, providing smooth output even if input bursts.

private long left;
private long lastInjectTime = System.currentTimeMillis();
private long capacity;
private long duration;
private double velocity; // capacity / duration

public boolean tryAcquire(String key) {
    long now = System.currentTimeMillis();
    left = Math.max(0, left - (long)((now - lastInjectTime) * velocity));
    if (left + 1 <= capacity) {
        lastInjectTime = now;
        left++;
        return true;
    } else {
        return false;
    }
}

4. Token Bucket

Tokens are added at a fixed rate up to a maximum capacity; each request consumes a token. It is the inverse of the leaky bucket and can handle bursts better.

long now = System.currentTimeMillis();
left = Math.min(capacity, left + (long)((now - lastInjectTime) * velocity));
if (left - 1 > 0) {
    lastInjectTime = now;
    left--;
    return true;
} else {
    return false;
}

Guava’s RateLimiter offers a production‑ready, thread‑safe token bucket implementation with optional warm‑up.

5. Sliding Log

Every request timestamp is recorded (e.g., in a queue or Redis Sorted Set). To decide, the system counts timestamps within the last T interval. This provides exact control at the cost of O(N) space.

# Initialize
counter = 0
q = []
# Request handling
t = now
start = findWindowStart(q, t)
q = q[start:]
counter -= start
if counter < threshold:
    push(q, t)
    counter += 1
    # allow
else:
    # reject

Distributed Flow Control

When services are deployed across multiple nodes, a central or decentralized coordination mechanism is required.

Centralized Approach

A single TokenServer holds the global quota; each instance requests permits from it. Simplicity comes at the risk of a single point of failure.

Decentralized Approach

Each instance maintains local state and periodically syncs with peers, reducing single‑point risk but increasing complexity and consistency challenges.

Practical Deployments

Entry‑layer throttling : Nginx ngx_http_limit_req_module (leaky bucket) or custom Lua modules.

TokenServer : Sentinel’s cluster mode (requires custom high‑availability handling).

Storage‑based throttling : Use Tair or Redis to store counters.

Simple window with INCR and EXPIRE (or Lua script for atomicity).

Token/Leaky bucket using hash fields for tokens and timestamps.

Sliding log using Redis Sorted Set ( ZADD, ZREMRANGEBYSCORE, ZCOUNT).

Complexity and Trade‑offs

Simple window: O(1) time, O(1) space, but suffers from critical‑point bursts. Sliding window: O(1) time, O(k) space (k sub‑windows). Sliding log: O(log N) time for window start lookup, O(N) space. Token/Leaky bucket: O(1) time, O(1) space, but precision depends on bucket granularity.

Best Practices

Combine multiple layers (e.g., Nginx + application‑level) to filter traffic early.

Store dynamic throttling state in a centralized cache (Redis, Tair) consistent with company architecture.

Keep static configuration in a config center (e.g., Diamond).

Design fallback to single‑machine throttling when the distributed system is unavailable.

Pre‑allocate a portion of quota to each node to reduce cache hits; request additional quota on demand.

Conclusion

Distributed flow‑control algorithms are extensions of single‑machine techniques; choosing the right algorithm depends on required precision, smoothness, and system complexity. The article summarizes the core ideas, code sketches, and operational considerations for simple window, sliding window, leaky bucket, token bucket, and sliding log implementations.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Distributed Systemsjavaalgorithmredisrate limitingthrottling
Alibaba Cloud Developer
Written by

Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.