Mastering Rate Limiting: Token Bucket & Sliding Window Algorithms in Java
This article explains the principles and implementation details of common rate‑limiting algorithms—token bucket and sliding‑window counting—including their core concepts, key processes, Java code examples, and how to extend them to distributed scenarios with Redis Lua scripts.
Introduction
Rate limiting is a fundamental technique for protecting services from overload and ensuring fair resource usage. The article covers two widely used algorithms—token bucket and sliding‑window counting—detailing their concepts, characteristics, key processing steps, and practical Java implementations. It also shows how to adapt these algorithms for distributed environments using Redis Lua scripts.
Basic Interface
tryAcquire : non‑blocking attempt that returns true if a permit is granted, false otherwise.
acquire : blocking call that waits (optionally with a timeout) until a permit becomes available.
Token Bucket Algorithm
Concept
Token: a permit representing a unit of allowed QPS or resource consumption.
Bucket: holds a maximum number of tokens, defining the capacity.
Refill rate: how many tokens are added per second.
Available tokens never exceed the bucket size.
Characteristics
Works for both bursty and steady traffic.
Implementation is relatively simple compared to other algorithms.
Key Process
The bucket is refilled lazily: each call recomputes the number of tokens that should have been added since the last refill.
tryAcquire :
Recalculate available tokens.
If enough tokens exist, deduct the requested amount and return success.
Otherwise return failure.
acquire :
If permits ≤ available tokens, deduct and return.
Otherwise compute wait time as (permits - tokens) * refillIntervalNanos, then block using Condition.awaitNanos until tokens are replenished.
Critical Refill Logic
Tokens to add = (now - lastRefillTime) / refillIntervalNanos.
Update tokens = min(capacity, tokens + tokensToAdd) and set lastRefillTime = now.
If tokens become available, wake waiting threads with condition.signalAll().
Java Implementation
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
/**
* Token‑bucket rate limiter – supports non‑blocking and blocking token acquisition.
*/
public class TokenBucketRateLimiter {
private final long capacity; // bucket size
private final double refillRate; // tokens per second
private final long refillIntervalNanos; // nanoseconds per token
private double tokens; // current available tokens
private long lastRefillTime; // last refill timestamp
private final ReentrantLock lock;
private final Condition condition;
public TokenBucketRateLimiter(long capacity, double refillRate) {
if (capacity <= 0 || refillRate <= 0) {
throw new IllegalArgumentException("capacity and refillRate must be > 0");
}
this.capacity = capacity;
this.refillRate = refillRate;
this.refillIntervalNanos = (long) (1_000_000_000.0 / refillRate);
this.tokens = capacity;
this.lastRefillTime = System.nanoTime();
this.lock = new ReentrantLock();
this.condition = lock.newCondition();
}
public boolean tryAcquire() {
return tryAcquire(1);
}
public boolean tryAcquire(long permits) {
if (permits <= 0) {
throw new IllegalArgumentException("permits must be > 0");
}
if (permits > capacity) {
return false; // exceeds bucket capacity
}
lock.lock();
try {
refillTokens();
if (tokens >= permits) {
tokens -= permits;
return true;
}
return false;
} finally {
lock.unlock();
}
}
public void acquire() throws InterruptedException {
acquire(1);
}
public void acquire(long permits) throws InterruptedException {
if (permits <= 0) {
throw new IllegalArgumentException("permits must be > 0");
}
if (permits > capacity) {
throw new IllegalArgumentException("permits exceed bucket capacity");
}
lock.lock();
try {
while (tokens < permits) {
double needed = permits - tokens;
long waitNanos = (long) (needed * refillIntervalNanos);
condition.awaitNanos(waitNanos);
refillTokens();
}
tokens -= permits;
} finally {
lock.unlock();
}
}
private void refillTokens() {
long now = System.nanoTime();
double toAdd = (now - lastRefillTime) / (double) refillIntervalNanos;
if (toAdd > 0) {
tokens = Math.min(capacity, tokens + toAdd);
lastRefillTime = now;
if (tokens > 0) {
condition.signalAll();
}
}
}
}Sliding Window Counting Algorithm
Concept
Time window: a configurable duration (seconds, minutes, etc.) within which a maximum number of requests is allowed.
Slots (or cells): the window is divided into finer‑grained intervals (e.g., 10 slots of 100 ms each).
Each slot records the request count for its sub‑interval.
The total count is the sum of all slot counts.
Characteristics
Provides smoother throttling by preventing traffic spikes.
Implementation is more complex because slot counts must be updated and expired.
Key Process (Sliding Window)
tryAcquire :
Clear expired slots (reset count and update start time).
Check if totalRequests + permits ≤ maxRequests.
If allowed, increment the current slot and total count.
acquire (blocking):
Repeatedly attempt tryAcquireInternal.
If not enough tokens, compute wait time until the earliest non‑empty slot expires.
Block with Condition.await (or timed await) and retry.
Java Implementation
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.atomic.AtomicInteger;
/**
* Sliding‑window rate limiter – supports non‑blocking and blocking acquisition.
*/
public class SlidingWindowRateLimiter {
private final long windowSizeMs; // total window length
private final int maxRequests; // allowed requests per window
private final int slotCount; // number of slots
private final long slotSizeMs; // slot duration
private final AtomicInteger[] slots; // per‑slot counters
private final long[] slotStartTimes; // slot start timestamps
private final AtomicInteger totalRequests; // sum of all slots
private final ReentrantLock lock;
private final Condition condition;
public SlidingWindowRateLimiter(int maxRequests, long windowSizeMs, int slotCount) {
if (maxRequests <= 0 || windowSizeMs <= 0 || slotCount <= 0) {
throw new IllegalArgumentException("arguments must be > 0");
}
this.maxRequests = maxRequests;
this.windowSizeMs = windowSizeMs;
this.slotCount = slotCount;
this.slotSizeMs = windowSizeMs / slotCount;
this.slots = new AtomicInteger[slotCount];
this.slotStartTimes = new long[slotCount];
this.totalRequests = new AtomicInteger(0);
long now = System.currentTimeMillis();
for (int i = 0; i < slotCount; i++) {
slots[i] = new AtomicInteger(0);
slotStartTimes[i] = now - (slotCount - i) * slotSizeMs;
}
this.lock = new ReentrantLock();
this.condition = lock.newCondition();
}
public boolean tryAcquire() {
return tryAcquire(1);
}
public boolean tryAcquire(int permits) {
if (permits <= 0) {
throw new IllegalArgumentException("permits must be > 0");
}
if (permits > maxRequests) {
return false;
}
lock.lock();
try {
slideWindow();
if (totalRequests.get() + permits <= maxRequests) {
long now = System.currentTimeMillis();
int idx = getCurrentSlotIndex(now);
slots[idx].addAndGet(permits);
totalRequests.addAndGet(permits);
return true;
}
return false;
} finally {
lock.unlock();
}
}
public void acquire() throws InterruptedException {
acquire(1);
}
public void acquire(int permits) throws InterruptedException {
if (permits <= 0) {
throw new IllegalArgumentException("permits must be > 0");
}
if (permits > maxRequests) {
throw new IllegalArgumentException("permits exceed window capacity");
}
lock.lock();
try {
while (!tryAcquireInternal(permits)) {
long waitMs = calculateWaitTime(permits);
if (waitMs > 0) {
condition.await(waitMs, TimeUnit.MILLISECONDS);
} else {
condition.await();
}
}
} finally {
lock.unlock();
}
}
private boolean tryAcquireInternal(int permits) {
slideWindow();
if (totalRequests.get() + permits <= maxRequests) {
long now = System.currentTimeMillis();
int idx = getCurrentSlotIndex(now);
slots[idx].addAndGet(permits);
totalRequests.addAndGet(permits);
return true;
}
return false;
}
private void slideWindow() {
long now = System.currentTimeMillis();
long windowStart = now - windowSizeMs;
for (int i = 0; i < slotCount; i++) {
if (slotStartTimes[i] < windowStart) {
int expired = slots[i].getAndSet(0);
if (expired > 0) {
totalRequests.addAndGet(-expired);
slotStartTimes[i] = now - (slotCount - i - 1) * slotSizeMs;
}
}
}
if (totalRequests.get() < maxRequests) {
condition.signalAll();
}
}
private int getCurrentSlotIndex(long currentTime) {
return (int) ((currentTime / slotSizeMs) % slotCount);
}
private long calculateWaitTime(int permits) {
long now = System.currentTimeMillis();
long earliest = Long.MAX_VALUE;
for (int i = 0; i < slotCount; i++) {
if (slots[i].get() > 0 && slotStartTimes[i] < earliest) {
earliest = slotStartTimes[i];
}
}
if (earliest == Long.MAX_VALUE) {
return 0;
}
long earliestExpire = earliest + windowSizeMs;
long wait = earliestExpire - now;
int available = maxRequests - totalRequests.get();
if (available <= 0) {
wait = Math.max(wait, slotSizeMs);
}
return Math.max(0, wait);
}
public int getCurrentRequests() {
lock.lock();
try {
slideWindow();
return totalRequests.get();
} finally {
lock.unlock();
}
}
}Distributed Rate Limiting with Redis
Both algorithms can be adapted for a distributed environment by storing counters in Redis and using Lua scripts to guarantee atomicity.
Fixed‑Window Counter via Lua
-- Fixed‑window rate‑limiting script
-- KEYS[1]: limit key
-- ARGV[1]: window size (seconds)
-- ARGV[2]: max allowed requests
local key = KEYS[1]
local window = tonumber(ARGV[1])
local maxCount = tonumber(ARGV[2])
local current = redis.call('GET', key)
if current == false then
-- key does not exist, create with initial count 1 and set TTL
redis.call('SET', key, 1, 'EX', window)
return 1
else
local count = tonumber(current)
if count < maxCount then
redis.call('INCR', key)
return 1
else
return 0
end
endConclusion
Understanding the mechanics of token‑bucket and sliding‑window rate‑limiting algorithms enables developers to choose the most suitable strategy for their services. Token bucket excels at handling bursts and is widely adopted (e.g., Guava RateLimiter, Sentinel, Redisson). Sliding‑window provides stricter smoothing, while distributed implementations using Redis Lua scripts extend these concepts across multiple nodes.
Java Baker
Java architect and Raspberry Pi enthusiast, dedicated to writing high-quality technical articles; the same name is used across major platforms.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
