Rate Limiting Algorithms and Their Java Implementations
Rate limiting protects system stability by capping request rates, and this article explains five Java algorithms—Fixed Window, Sliding Window, Leaky Bucket, Token Bucket, and Guava's RateLimiter—showing their principles, pros and cons, and providing sample implementations and a Spring @Limit annotation for practical enforcement.
Rate limiting is a crucial technique to ensure system resilience by preventing overload caused by sudden traffic spikes. It may discard some user requests, but this is generally acceptable compared to system crashes.
1. Fixed Window Algorithm
Implementation principle: count the number of requests within a fixed time window and reset the counter when the window expires. For example, limit 150 requests in a 3‑second window.
public class FixedWindowRateLimiter {
Logger logger = LoggerFactory.getLogger(FixedWindowRateLimiter.class);
// time window size in ms
long windowSize;
// allowed request count
int maxRequestCount;
// current window request count
AtomicInteger counter = new AtomicInteger(0);
// window right border
long windowBorder;
public FixedWindowRateLimiter(long windowSize, int maxRequestCount) {
this.windowSize = windowSize;
this.maxRequestCount = maxRequestCount;
this.windowBorder = System.currentTimeMillis() + windowSize;
}
public synchronized boolean tryAcquire() {
long currentTime = System.currentTimeMillis();
if (windowBorder < currentTime) {
logger.info("window reset");
do {
windowBorder += windowSize;
} while (windowBorder < currentTime);
counter = new AtomicInteger(0);
}
if (counter.intValue() < maxRequestCount) {
counter.incrementAndGet();
logger.info("tryAcquire success");
return true;
} else {
logger.info("tryAcquire fail");
return false;
}
}
}Advantages: simple to implement and easy to understand.
Disadvantages: not smooth (e.g., if three requests are sent in the first millisecond, the rest of the window is blocked) and suffers from window‑boundary effects that can allow a burst of requests at the edge of two windows.
2. Sliding Window Algorithm
Improves the fixed window by dividing the window into multiple smaller sub‑windows that slide forward. This solves the boundary problem while keeping the total window size fixed.
public class SlidingWindowRateLimiter {
Logger logger = LoggerFactory.getLogger(FixedWindowRateLimiter.class);
// time window size in ms
long windowSize;
// number of sub‑windows
int shardNum;
// allowed request count
int maxRequestCount;
// request count per sub‑window
int[] shardRequestCount;
// total request count in the current window
int totalCount;
// current sub‑window index
int shardId;
// size of each sub‑window in ms
long tinyWindowSize;
// right border of the window
long windowBorder;
public SlidingWindowRateLimiter(long windowSize, int shardNum, int maxRequestCount) {
this.windowSize = windowSize;
this.shardNum = shardNum;
this.maxRequestCount = maxRequestCount;
this.shardRequestCount = new int[shardNum];
this.tinyWindowSize = windowSize / shardNum;
this.windowBorder = System.currentTimeMillis();
}
public synchronized boolean tryAcquire() {
long currentTime = System.currentTimeMillis();
if (windowBorder < currentTime) {
logger.info("window reset");
do {
shardId = (++shardId) % shardNum;
totalCount -= shardRequestCount[shardId];
shardRequestCount[shardId] = 0;
windowBorder += tinyWindowSize;
} while (windowBorder < currentTime);
}
if (totalCount < maxRequestCount) {
logger.info("tryAcquire success:{}", shardId);
shardRequestCount[shardId]++;
totalCount++;
return true;
} else {
logger.info("tryAcquire fail");
return false;
}
}
}Advantages: eliminates the window‑boundary issue and prevents burst overload.
Disadvantages: still suffers from non‑smooth throttling (e.g., a burst of three requests in the first millisecond can block the rest of the window).
3. Leaky Bucket Algorithm
The leaky bucket is a traffic‑shaping algorithm that controls the outflow rate of requests, smoothing bursts and preventing overload.
public class LeakyBucketRateLimiter {
Logger logger = LoggerFactory.getLogger(LeakyBucketRateLimiter.class);
// bucket capacity
int capacity;
// current water amount
AtomicInteger water = new AtomicInteger();
// timestamp of the last leak
long leakTimestamp;
// leak rate (requests per second)
int leakRate;
public LeakyBucketRateLimiter(int capacity, int leakRate) {
this.capacity = capacity;
this.leakRate = leakRate;
}
public synchronized boolean tryAcquire() {
if (water.get() == 0) {
logger.info("start leaking");
leakTimestamp = System.currentTimeMillis();
water.incrementAndGet();
return water.get() < capacity;
}
long currentTime = System.currentTimeMillis();
int leakedWater = (int) ((currentTime - leakTimestamp) / 1000 * leakRate);
logger.info("lastTime:{}, currentTime:{}, LeakedWater:{}", leakTimestamp, currentTime, leakedWater);
if (leakedWater != 0) {
int leftWater = water.get() - leakedWater;
water.set(Math.max(0, leftWater));
leakTimestamp = System.currentTimeMillis();
}
logger.info("remaining capacity:{}", capacity - water.get());
if (water.get() < capacity) {
logger.info("tryAcquire success");
water.incrementAndGet();
return true;
} else {
logger.info("tryAcquire fail");
return false;
}
}
}Advantages: smooths traffic, prevents overload, and discards excess requests when the bucket is full.
Disadvantages: cannot handle sudden bursts well, may lead to data loss, and can waste resources when the system is idle.
4. Token Bucket Algorithm
The token bucket improves the leaky bucket by allowing bursts while still limiting the average rate. Tokens are added to the bucket at a fixed rate; a request consumes a token, and if none are available the request is rejected.
Guava’s RateLimiter is an implementation of the token bucket and can be used directly.
5. Practical Application with Guava RateLimiter
In a Spring environment, a custom @Limit annotation together with an AOP aspect can enforce method‑level rate limiting.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD})
@Documented
public @interface Limit {
// resource key
String key() default "";
// permits per second
double permitsPerSeconds();
// timeout for acquiring a permit
long timeout();
// time unit of the timeout
TimeUnit timeUnit() default TimeUnit.MILLISECONDS;
// error message
String msg() default "系统繁忙,请稍后重试";
} @Aspect
@Component
public class LimitAspect {
Logger logger = LoggerFactory.getLogger(LimitAspect.class);
private final Map
limitMap = Maps.newConcurrentMap();
@Around("@annotation(com.example.annotation.Limit)")
public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
MethodSignature signature = (MethodSignature) joinPoint.getSignature();
Method method = signature.getMethod();
Limit limit = method.getAnnotation(Limit.class);
if (limit != null) {
String key = limit.key();
RateLimiter rateLimiter = limitMap.computeIfAbsent(key, k -> {
RateLimiter rl = RateLimiter.create(limit.permitsPerSeconds());
logger.info("Created token bucket={}, capacity={}", k, limit.permitsPerSeconds());
return rl;
});
boolean acquired = rateLimiter.tryAcquire(limit.timeout(), limit.timeUnit());
if (!acquired) {
logger.debug("Token bucket={}, acquire failed", key);
throw new RuntimeException(limit.msg());
}
}
return joinPoint.proceed();
}
}Example usage:
@Limit(key = "query", permitsPerSeconds = 1, timeout = 1, msg = "触发接口限流,请重试")
public Result query(...) { ... }Test results show that Guava’s RateLimiter generates tokens at a steady rate (e.g., 5 tokens per second) and spreads requests evenly, making it a powerful tool for single‑node rate limiting.
For multi‑node deployments, distributed rate limiting or gateway‑level limiting is required.
In summary, rate limiting is indispensable for system robustness. It can be combined with circuit breaking and degradation strategies to ensure high availability.
Amap Tech
Official Amap technology account showcasing all of Amap's technical innovations.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.