Backend Development 17 min read

Rate Limiting Algorithms and Their Java Implementations

Rate limiting protects system stability by capping request rates, and this article explains five Java algorithms—Fixed Window, Sliding Window, Leaky Bucket, Token Bucket, and Guava's RateLimiter—showing their principles, pros and cons, and providing sample implementations and a Spring @Limit annotation for practical enforcement.

Amap Tech

Dec 28, 2023

Rate Limiting Algorithms and Their Java Implementations

Rate limiting is a crucial technique to ensure system resilience by preventing overload caused by sudden traffic spikes. It may discard some user requests, but this is generally acceptable compared to system crashes.

1. Fixed Window Algorithm

Implementation principle: count the number of requests within a fixed time window and reset the counter when the window expires. For example, limit 150 requests in a 3‑second window.

public class FixedWindowRateLimiter {
    Logger logger = LoggerFactory.getLogger(FixedWindowRateLimiter.class);
    // time window size in ms
    long windowSize;
    // allowed request count
    int maxRequestCount;
    // current window request count
    AtomicInteger counter = new AtomicInteger(0);
    // window right border
    long windowBorder;
    public FixedWindowRateLimiter(long windowSize, int maxRequestCount) {
        this.windowSize = windowSize;
        this.maxRequestCount = maxRequestCount;
        this.windowBorder = System.currentTimeMillis() + windowSize;
    }
    public synchronized boolean tryAcquire() {
        long currentTime = System.currentTimeMillis();
        if (windowBorder < currentTime) {
            logger.info("window reset");
            do {
                windowBorder += windowSize;
            } while (windowBorder < currentTime);
            counter = new AtomicInteger(0);
        }
        if (counter.intValue() < maxRequestCount) {
            counter.incrementAndGet();
            logger.info("tryAcquire success");
            return true;
        } else {
            logger.info("tryAcquire fail");
            return false;
        }
    }
}

Advantages: simple to implement and easy to understand.

Disadvantages: not smooth (e.g., if three requests are sent in the first millisecond, the rest of the window is blocked) and suffers from window‑boundary effects that can allow a burst of requests at the edge of two windows.

2. Sliding Window Algorithm

Improves the fixed window by dividing the window into multiple smaller sub‑windows that slide forward. This solves the boundary problem while keeping the total window size fixed.

public class SlidingWindowRateLimiter {
    Logger logger = LoggerFactory.getLogger(FixedWindowRateLimiter.class);
    // time window size in ms
    long windowSize;
    // number of sub‑windows
    int shardNum;
    // allowed request count
    int maxRequestCount;
    // request count per sub‑window
    int[] shardRequestCount;
    // total request count in the current window
    int totalCount;
    // current sub‑window index
    int shardId;
    // size of each sub‑window in ms
    long tinyWindowSize;
    // right border of the window
    long windowBorder;
    public SlidingWindowRateLimiter(long windowSize, int shardNum, int maxRequestCount) {
        this.windowSize = windowSize;
        this.shardNum = shardNum;
        this.maxRequestCount = maxRequestCount;
        this.shardRequestCount = new int[shardNum];
        this.tinyWindowSize = windowSize / shardNum;
        this.windowBorder = System.currentTimeMillis();
    }
    public synchronized boolean tryAcquire() {
        long currentTime = System.currentTimeMillis();
        if (windowBorder < currentTime) {
            logger.info("window reset");
            do {
                shardId = (++shardId) % shardNum;
                totalCount -= shardRequestCount[shardId];
                shardRequestCount[shardId] = 0;
                windowBorder += tinyWindowSize;
            } while (windowBorder < currentTime);
        }
        if (totalCount < maxRequestCount) {
            logger.info("tryAcquire success:{}", shardId);
            shardRequestCount[shardId]++;
            totalCount++;
            return true;
        } else {
            logger.info("tryAcquire fail");
            return false;
        }
    }
}

Advantages: eliminates the window‑boundary issue and prevents burst overload.

Disadvantages: still suffers from non‑smooth throttling (e.g., a burst of three requests in the first millisecond can block the rest of the window).

3. Leaky Bucket Algorithm

The leaky bucket is a traffic‑shaping algorithm that controls the outflow rate of requests, smoothing bursts and preventing overload.

public class LeakyBucketRateLimiter {
    Logger logger = LoggerFactory.getLogger(LeakyBucketRateLimiter.class);
    // bucket capacity
    int capacity;
    // current water amount
    AtomicInteger water = new AtomicInteger();
    // timestamp of the last leak
    long leakTimestamp;
    // leak rate (requests per second)
    int leakRate;
    public LeakyBucketRateLimiter(int capacity, int leakRate) {
        this.capacity = capacity;
        this.leakRate = leakRate;
    }
    public synchronized boolean tryAcquire() {
        if (water.get() == 0) {
            logger.info("start leaking");
            leakTimestamp = System.currentTimeMillis();
            water.incrementAndGet();
            return water.get() < capacity;
        }
        long currentTime = System.currentTimeMillis();
        int leakedWater = (int) ((currentTime - leakTimestamp) / 1000 * leakRate);
        logger.info("lastTime:{}, currentTime:{}, LeakedWater:{}", leakTimestamp, currentTime, leakedWater);
        if (leakedWater != 0) {
            int leftWater = water.get() - leakedWater;
            water.set(Math.max(0, leftWater));
            leakTimestamp = System.currentTimeMillis();
        }
        logger.info("remaining capacity:{}", capacity - water.get());
        if (water.get() < capacity) {
            logger.info("tryAcquire success");
            water.incrementAndGet();
            return true;
        } else {
            logger.info("tryAcquire fail");
            return false;
        }
    }
}

Advantages: smooths traffic, prevents overload, and discards excess requests when the bucket is full.

Disadvantages: cannot handle sudden bursts well, may lead to data loss, and can waste resources when the system is idle.

4. Token Bucket Algorithm

The token bucket improves the leaky bucket by allowing bursts while still limiting the average rate. Tokens are added to the bucket at a fixed rate; a request consumes a token, and if none are available the request is rejected.

Guava’s RateLimiter is an implementation of the token bucket and can be used directly.

5. Practical Application with Guava RateLimiter

In a Spring environment, a custom @Limit annotation together with an AOP aspect can enforce method‑level rate limiting.

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.METHOD})
@Documented
public @interface Limit {
    // resource key
    String key() default "";
    // permits per second
    double permitsPerSeconds();
    // timeout for acquiring a permit
    long timeout();
    // time unit of the timeout
    TimeUnit timeUnit() default TimeUnit.MILLISECONDS;
    // error message
    String msg() default "系统繁忙,请稍后重试";
}

@Aspect
@Component
public class LimitAspect {
    Logger logger = LoggerFactory.getLogger(LimitAspect.class);
    private final Map<String, RateLimiter> limitMap = Maps.newConcurrentMap();
    @Around("@annotation(com.example.annotation.Limit)")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
        MethodSignature signature = (MethodSignature) joinPoint.getSignature();
        Method method = signature.getMethod();
        Limit limit = method.getAnnotation(Limit.class);
        if (limit != null) {
            String key = limit.key();
            RateLimiter rateLimiter = limitMap.computeIfAbsent(key, k -> {
                RateLimiter rl = RateLimiter.create(limit.permitsPerSeconds());
                logger.info("Created token bucket={}, capacity={}", k, limit.permitsPerSeconds());
                return rl;
            });
            boolean acquired = rateLimiter.tryAcquire(limit.timeout(), limit.timeUnit());
            if (!acquired) {
                logger.debug("Token bucket={}, acquire failed", key);
                throw new RuntimeException(limit.msg());
            }
        }
        return joinPoint.proceed();
    }
}

Example usage:

@Limit(key = "query", permitsPerSeconds = 1, timeout = 1, msg = "触发接口限流，请重试")
public Result query(...) { ... }

Test results show that Guava’s RateLimiter generates tokens at a steady rate (e.g., 5 tokens per second) and spreads requests evenly, making it a powerful tool for single‑node rate limiting.

For multi‑node deployments, distributed rate limiting or gateway‑level limiting is required.

In summary, rate limiting is indispensable for system robustness. It can be combined with circuit breaking and degradation strategies to ensure high availability.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

distributed systems Java algorithm Guava rate limiting

Written by

Amap Tech

Official Amap technology account showcasing all of Amap's technical innovations.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.