Mastering Rate Limiting: Algorithms, Java Implementations, and Guava Tips
This article explains why rate limiting is essential for high‑traffic systems, defines common rate‑limiting algorithms (fixed window, sliding window, leaky bucket, token bucket), provides complete Java code examples for each, and demonstrates practical usage with Guava's RateLimiter in real‑world applications.
1. Rate Limiting
Rate limiting restricts the number of requests processed within a given time window to protect system stability and prevent overload caused by traffic spikes or malicious users.
Why Apply Rate Limiting?
Instant traffic spikes can crash services.
Malicious high‑frequency requests may cause server failure.
Fast message consumption can overload databases.
What Is Rate Limiting?
It caps the request count in a specific time window, ensuring availability and preventing slowdowns or crashes in high‑concurrency scenarios.
2. Rate‑Limiting Algorithms
2.1 Fixed Window
The Fixed Window (or Counter) algorithm counts requests in a fixed period and blocks further requests once a threshold is reached. When the window expires, the counter resets.
public class FixedWindowRateLimiter {
Logger logger = LoggerFactory.getLogger(FixedWindowRateLimiter.class);
// window size in ms
long windowSize;
// max allowed requests per window
int maxRequestCount;
// current count
AtomicInteger counter = new AtomicInteger(0);
// window end timestamp
long windowBorder;
public FixedWindowRateLimiter(long windowSize, int maxRequestCount) {
this.windowSize = windowSize;
this.maxRequestCount = maxRequestCount;
this.windowBorder = System.currentTimeMillis() + windowSize;
}
public synchronized boolean tryAcquire() {
long currentTime = System.currentTimeMillis();
if (windowBorder < currentTime) {
logger.info("window reset");
do {
windowBorder += windowSize;
} while (windowBorder < currentTime);
counter = new AtomicInteger(0);
}
if (counter.intValue() < maxRequestCount) {
counter.incrementAndGet();
logger.info("tryAcquire success");
return true;
} else {
logger.info("tryAcquire fail");
return false;
}
}
}Advantages: Simple to implement and easy to understand.
Disadvantages: Not smooth; requests arriving early in the window can cause all later requests to be rejected, and window boundary effects may allow a burst that exceeds the intended limit.
2.2 Sliding Window
The Sliding Window refines the Fixed Window by dividing the window into smaller sub‑windows, moving forward as time progresses. This smooths traffic and mitigates boundary spikes, though it requires storing timestamps for each request.
public class SlidingWindowRateLimiter {
Logger logger = LoggerFactory.getLogger(FixedWindowRateLimiter.class);
// total window size in ms
long windowSize;
// number of sub‑windows
int shardNum;
// max allowed requests per total window
int maxRequestCount;
// request count per sub‑window
int[] shardRequestCount;
// total requests in current window
int totalCount;
// current sub‑window index
int shardId;
// size of each sub‑window in ms
long tinyWindowSize;
// window end timestamp
long windowBorder;
public SlidingWindowRateLimiter(long windowSize, int shardNum, int maxRequestCount) {
this.windowSize = windowSize;
this.shardNum = shardNum;
this.maxRequestCount = maxRequestCount;
this.shardRequestCount = new int[shardNum];
this.tinyWindowSize = windowSize / shardNum;
this.windowBorder = System.currentTimeMillis();
}
public synchronized boolean tryAcquire() {
long currentTime = System.currentTimeMillis();
if (windowBorder < currentTime) {
logger.info("window reset");
do {
shardId = (++shardId) % shardNum;
totalCount -= shardRequestCount[shardId];
shardRequestCount[shardId] = 0;
windowBorder += tinyWindowSize;
} while (windowBorder < currentTime);
}
if (totalCount < maxRequestCount) {
logger.info("tryAcquire success:{}", shardId);
shardRequestCount[shardId]++;
totalCount++;
return true;
} else {
logger.info("tryAcquire fail");
return false;
}
}
}Advantages: Solves the boundary problem of Fixed Window and handles burst traffic.
Disadvantages: Still not perfectly smooth; a burst can still cause temporary overload.
2.3 Leaky Bucket
The Leaky Bucket shapes traffic by allowing requests to “leak” at a constant rate. Excess requests overflow and are dropped, providing smooth output but potentially discarding data under heavy load.
public class LeakyBucketRateLimiter {
Logger logger = LoggerFactory.getLogger(LeakyBucketRateLimiter.class);
// bucket capacity
int capacity;
// current water level
AtomicInteger water = new AtomicInteger();
// timestamp of last leak
long leakTimestamp;
// leak rate (requests per second)
int leakRate;
public LeakyBucketRateLimiter(int capacity, int leakRate) {
this.capacity = capacity;
this.leakRate = leakRate;
}
public synchronized boolean tryAcquire() {
if (water.get() == 0) {
logger.info("start leaking");
leakTimestamp = System.currentTimeMillis();
water.incrementAndGet();
return water.get() < capacity;
}
long currentTime = System.currentTimeMillis();
int leakedWater = (int) ((currentTime - leakTimestamp) / 1000 * leakRate);
logger.info("lastTime:{}, currentTime:{}, leakedWater:{}", leakTimestamp, currentTime, leakedWater);
if (leakedWater != 0) {
int leftWater = water.get() - leakedWater;
water.set(Math.max(0, leftWater));
leakTimestamp = System.currentTimeMillis();
}
logger.info("remaining capacity:{}", capacity - water.get());
if (water.get() < capacity) {
logger.info("tryAcquire success");
water.incrementAndGet();
return true;
} else {
logger.info("tryAcquire fail");
return false;
}
}
}Advantages: Smooths traffic and prevents overload by discarding excess requests.
Disadvantages: Cannot handle sudden bursts efficiently and may waste resources when the system is idle.
2.4 Token Bucket
The Token Bucket improves on the Leaky Bucket by allowing tokens to accumulate at a fixed rate; requests consume tokens, enabling occasional bursts while still enforcing an average rate.
// Guava RateLimiter is a ready‑made token‑bucket implementation
RateLimiter limiter = RateLimiter.create(5); // 5 permits per second
for (int i = 0; i < 10; i++) {
double wait = limiter.acquire();
logger.info("wait time: {}s", wait);
}3. Practical Application with Guava
Guava's RateLimiter provides a token‑bucket based rate limiter that can be directly used in Java projects. Add the dependency:
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>32.1.3-jre</version>
</dependency>Example of fixed token generation:
@Test
public void acquireTest() {
// generate 5 tokens per second
RateLimiter rateLimiter = RateLimiter.create(5);
for (int i = 0; i < 10; i++) {
double time = rateLimiter.acquire();
logger.info("wait time: {}s", time);
}
}Example of handling bursts with warm‑up period:
@Test
public void acquireSmoothly() {
RateLimiter rateLimiter = RateLimiter.create(5, 3, TimeUnit.SECONDS);
long start = System.currentTimeMillis();
for (int i = 0; i < 15; i++) {
double time = rateLimiter.acquire();
logger.info("wait: {}s, total: {}ms", time, System.currentTimeMillis() - start);
}
}4. AOP‑Based Rate Limiting
Define a custom annotation @Limit to specify key, permits per second, timeout, and message. Implement an Aspect that creates a Guava RateLimiter per key and applies tryAcquire before method execution.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@Documented
public @interface Limit {
String key() default "";
double permitsPerSeconds();
long timeout();
TimeUnit timeUnit() default TimeUnit.MILLISECONDS;
String msg() default "System busy, please try again later";
} @Aspect
@Component
public class LimitAspect {
Logger logger = LoggerFactory.getLogger(LimitAspect.class);
private final Map<String, RateLimiter> limitMap = Maps.newConcurrentMap();
@Around("@annotation(com.example.annotation.Limit)")
public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
Method method = ((MethodSignature) joinPoint.getSignature()).getMethod();
Limit limit = method.getAnnotation(Limit.class);
if (limit != null) {
String key = limit.key();
RateLimiter rateLimiter = limitMap.computeIfAbsent(key, k -> {
RateLimiter rl = RateLimiter.create(limit.permitsPerSeconds());
logger.info("Created token bucket={}, capacity={}", k, limit.permitsPerSeconds());
return rl;
});
boolean acquire = rateLimiter.tryAcquire(limit.timeout(), limit.timeUnit());
if (!acquire) {
logger.debug("Token bucket={}, acquire failed", key);
throw new RuntimeException(limit.msg());
}
}
return joinPoint.proceed();
}
}Apply the annotation on a controller or service method:
@Limit(key = "query", permitsPerSeconds = 1, timeout = 1, msg = "Rate limit triggered, please retry")
public Result query(...) { ... }5. Summary
The discussed techniques are application‑level rate limiting suitable for a single instance. For multi‑node deployments, distributed or edge‑level rate limiting is required. Combining rate limiting with circuit breaking and degradation strategies ensures robust, high‑availability services.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
