Master Rate Limiting: Algorithms, Guava RateLimiter, and Java Implementation

This article introduces the background and necessity of rate limiting, explains the leaky bucket and token bucket algorithms with visual diagrams, and provides a comprehensive Java implementation using Guava's RateLimiter, custom annotations, AOP interception, and controller integration to protect high‑traffic applications.

ITFLY8 Architecture Home

This is the first article in a series on rate limiting. It covers the background of rate limiting, the common algorithms, and an implementation based on Guava's RateLimiter; the second part will analyze the RateLimiter source code.

1. Rate Limiting Background

Rate limiting is an important tool for protecting systems by restricting the number of concurrent accesses or the request rate within a time window, preventing large or burst traffic from causing service crashes. When the limit is reached, requests can be rejected or traffic can be shaped.

In practice, rate limiting can be applied at the network layer, the access layer (e.g., Nginx), or the application layer. This article focuses on application‑level rate limiting; the principles for other layers are similar, differing only in the technical means used.

Typical high-traffic scenarios such as flash sales (seckill) and large e-commerce platforms may limit thread pools, database connection pools, concurrency, API call rates, or MQ consumption rates. Limits can also be driven by the number of network connections, bandwidth, or CPU and memory load.

2. Rate Limiting Algorithms

The two most common basic algorithms are the leaky bucket and the token bucket.

Rate limiting can be pictured as a funnel: incoming water represents request traffic, and outgoing water represents the system processing requests. When traffic exceeds processing capacity, water accumulates and may eventually overflow.

2.1 Leaky Bucket Algorithm

The leaky bucket consists of a queue and a processor. An incoming request is placed in the queue if the queue is not full, and the processor removes requests from the queue at a fixed rate. If the queue is already full, new requests are discarded.

Leaky bucket diagram
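The queue-plus-processor description above can be sketched in plain Java. This is a minimal illustration, not code from the original article: the class name LeakyBucketSketch and its methods are hypothetical, and time is passed in explicitly (in nanoseconds) so the behavior is deterministic and easy to follow.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Illustrative leaky bucket (hypothetical class): a bounded queue drained at a fixed rate. */
class LeakyBucketSketch {
    private final int capacity;            // maximum number of queued requests
    private final long leakIntervalNanos;  // time between two processed requests
    private final Queue<String> queue = new ArrayDeque<>();
    private long lastLeakNanos;            // time of the most recent processing tick

    LeakyBucketSketch(int capacity, double leaksPerSecond, long startNanos) {
        this.capacity = capacity;
        this.leakIntervalNanos = (long) (1_000_000_000L / leaksPerSecond);
        this.lastLeakNanos = startNanos;
    }

    /** Processes every request that was due by nowNanos and returns how many were processed. */
    synchronized int drain(long nowNanos) {
        long due = (nowNanos - lastLeakNanos) / leakIntervalNanos;
        if (due <= 0) {
            return 0;
        }
        lastLeakNanos += due * leakIntervalNanos;
        int processed = (int) Math.min(due, queue.size());
        for (int i = 0; i < processed; i++) {
            queue.poll();
        }
        return processed;
    }

    /** Queues the request, or returns false (request discarded) when the bucket is full. */
    synchronized boolean offer(String request, long nowNanos) {
        drain(nowNanos);
        if (queue.size() >= capacity) {
            return false;
        }
        queue.add(request);
        return true;
    }

    public static void main(String[] args) {
        LeakyBucketSketch bucket = new LeakyBucketSketch(2, 1.0, 0L);
        System.out.println(bucket.offer("a", 0L));              // true
        System.out.println(bucket.offer("b", 0L));              // true
        System.out.println(bucket.offer("c", 0L));              // false: queue is full
        System.out.println(bucket.drain(1_000_000_000L));       // 1: one request leaves per second
        System.out.println(bucket.offer("d", 1_000_000_000L));  // true: a slot is free again
    }
}
```

However fast requests arrive, at most one request per interval leaves the queue, which is what gives the leaky bucket its smoothing effect.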

2.2 Token Bucket Algorithm

The token bucket is similar in structure but adds tokens at a fixed rate. A request is processed only if a token is available; otherwise it is rejected or buffered.

(1) Tokens are added to the bucket at a constant rate up to a maximum capacity; excess tokens are discarded. (2) When a request arrives, it consumes a token if available; if not, the request is dropped or placed in a buffer.

Token bucket diagram
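Steps (1) and (2) above can be sketched as follows. This is a simplified illustration under stated assumptions, not Guava's implementation: the class TokenBucketSketch and its tryAcquire method are hypothetical, the bucket starts full, and a request that finds too few tokens is rejected rather than buffered. Time is passed in explicitly so the example is deterministic.

```java
/** Illustrative token bucket (hypothetical class): tokens refill at a fixed rate up to a cap. */
class TokenBucketSketch {
    private final double capacity;         // maximum tokens the bucket can hold
    private final double tokensPerSecond;  // refill rate
    private double tokens;                 // tokens currently available
    private long lastRefillNanos;          // time of the most recent refill

    TokenBucketSketch(double capacity, double tokensPerSecond, long startNanos) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.tokens = capacity;            // assumption: start full, so an initial burst is allowed
        this.lastRefillNanos = startNanos;
    }

    /** Takes the requested permits if enough tokens exist at nowNanos; otherwise rejects. */
    synchronized boolean tryAcquire(int permits, long nowNanos) {
        // Step (1): add tokens for the elapsed time; tokens beyond the capacity are discarded.
        long elapsedNanos = Math.max(0L, nowNanos - lastRefillNanos);
        tokens = Math.min(capacity, tokens + elapsedNanos / 1e9 * tokensPerSecond);
        lastRefillNanos = Math.max(lastRefillNanos, nowNanos);
        // Step (2): consume tokens if available, otherwise drop the request.
        if (tokens >= permits) {
            tokens -= permits;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TokenBucketSketch bucket = new TokenBucketSketch(3, 1.0, 0L);
        System.out.println(bucket.tryAcquire(1, 0L));              // true
        System.out.println(bucket.tryAcquire(1, 0L));              // true
        System.out.println(bucket.tryAcquire(1, 0L));              // true: burst of 3 is allowed
        System.out.println(bucket.tryAcquire(1, 0L));              // false: bucket is empty
        System.out.println(bucket.tryAcquire(2, 2_000_000_000L));  // true: 2 tokens refilled in 2 s
    }
}
```

Because unused tokens accumulate up to the capacity, a quiet period lets the next burst through immediately, which a leaky bucket would not.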

2.3 Comparison of Token Bucket and Leaky Bucket

• The token bucket adds tokens at a fixed rate; processing depends on token availability. When tokens run out, new requests are rejected. The leaky bucket drains requests at a constant rate regardless of arrival rate; excess requests are rejected when the bucket is full.

• The token bucket limits the average inflow rate but permits bursts for as long as tokens are available, and a single request may consume more than one token. The leaky bucket enforces a constant outflow rate, so burst traffic is smoothed rather than passed through.
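The difference is easiest to see by feeding the same burst to both algorithms and recording when each request is served. The sketch below is an illustration only: BurstComparisonSketch and its methods are hypothetical, arrival times are assumed to be sorted in ascending order, times are in milliseconds, and -1 marks a rejected request.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Illustrative comparison (hypothetical class) of service times for one burst of requests. */
class BurstComparisonSketch {

    /** Token bucket: serve on arrival while tokens remain, otherwise reject (-1). */
    static long[] tokenBucket(long[] arrivalsMs, int capacity, long refillIntervalMs) {
        long[] served = new long[arrivalsMs.length];
        long tokens = capacity;   // assumption: the bucket starts full
        long lastRefillMs = 0;
        for (int i = 0; i < arrivalsMs.length; i++) {
            long newTokens = (arrivalsMs[i] - lastRefillMs) / refillIntervalMs;
            if (newTokens > 0) {
                tokens = Math.min(capacity, tokens + newTokens);
                lastRefillMs += newTokens * refillIntervalMs;
            }
            if (tokens > 0) {
                tokens--;
                served[i] = arrivalsMs[i];
            } else {
                served[i] = -1;
            }
        }
        return served;
    }

    /** Leaky bucket: queued requests leave one per interval; a full queue rejects (-1). */
    static long[] leakyBucket(long[] arrivalsMs, int capacity, long leakIntervalMs) {
        long[] served = new long[arrivalsMs.length];
        Deque<Long> pending = new ArrayDeque<>();  // departure times of requests still in the bucket
        long nextFreeSlotMs = 0;                   // earliest time the next request may leave
        for (int i = 0; i < arrivalsMs.length; i++) {
            long now = arrivalsMs[i];
            while (!pending.isEmpty() && pending.peekFirst() < now) {
                pending.pollFirst();               // this request has already left the bucket
            }
            if (pending.size() >= capacity) {
                served[i] = -1;
                continue;
            }
            long departure = Math.max(now, nextFreeSlotMs);
            nextFreeSlotMs = departure + leakIntervalMs;
            pending.addLast(departure);
            served[i] = departure;
        }
        return served;
    }

    public static void main(String[] args) {
        long[] burst = {0, 0, 0, 0, 0};  // five requests arriving at the same instant
        System.out.println(Arrays.toString(tokenBucket(burst, 3, 1000)));  // [0, 0, 0, -1, -1]
        System.out.println(Arrays.toString(leakyBucket(burst, 3, 1000)));  // [0, 1000, 2000, -1, -1]
    }
}
```

Both buckets admit three of the five requests, but the token bucket serves them all at once while the leaky bucket spaces them one second apart.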

3. Rate Limiting Implementation

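Guava's RateLimiter provides a token-bucket implementation with two strategies: SmoothBursty for bursty traffic and SmoothWarmingUp for warm-up behavior. RateLimiter.create(permitsPerSecond) returns the bursty variant, while the overload that also takes a warm-up period returns the warming-up variant. The following example demonstrates how to use RateLimiter in a Spring Boot application.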

3.1 Maven Dependency

<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>28.1-jre</version>
</dependency>

3.2 Custom Annotation

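package com.itfly8.test.annotation;

import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.TimeUnit;

@Target({ElementType.PARAMETER, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
@Documented
public @interface RequestRateLimitAnnotation {
    /** Permits issued per second (the rate passed to RateLimiter.create) */
    double limitNum();
    /** Maximum time to wait for a permit before the request is rejected */
    long timeout();
    /** Unit of the timeout, milliseconds by default */
    TimeUnit timeUnit() default TimeUnit.MILLISECONDS;
    /** Error message returned when no permit is obtained (the default means "Requests are too frequent!") */
    String errMsg() default "请求太频繁!";
}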
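Because the annotation is retained at runtime, its values can be read back through reflection, which is what the interceptor in the next section relies on. The sketch below shows the idea with the JDK alone; DemoRateLimit, DemoController, and describe are hypothetical stand-ins introduced for illustration, not part of the original project.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.concurrent.TimeUnit;

/** Illustrative lookup (hypothetical class) of a runtime annotation on a method. */
class AnnotationLookupSketch {

    /** Hypothetical stand-in for RequestRateLimitAnnotation. */
    @Target(ElementType.METHOD)
    @Retention(RetentionPolicy.RUNTIME)
    @interface DemoRateLimit {
        double limitNum();
        long timeout();
        TimeUnit timeUnit() default TimeUnit.MILLISECONDS;
    }

    /** Hypothetical controller with one limited and one unlimited method. */
    static class DemoController {
        @DemoRateLimit(limitNum = 2, timeout = 10)
        public String limited() { return "success"; }

        public String unlimited() { return "success"; }
    }

    /** Reads the annotation from the named no-argument method and summarizes its settings. */
    static String describe(String methodName) {
        Method method;
        try {
            method = DemoController.class.getMethod(methodName);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("Unknown method: " + methodName, e);
        }
        DemoRateLimit limit = method.getAnnotation(DemoRateLimit.class);
        if (limit == null) {
            return methodName + ": no limit";
        }
        return methodName + ": " + limit.limitNum() + " permits/s, wait up to "
                + limit.timeout() + " " + limit.timeUnit();
    }

    public static void main(String[] args) {
        System.out.println(describe("limited"));    // limited: 2.0 permits/s, wait up to 10 MILLISECONDS
        System.out.println(describe("unlimited"));  // unlimited: no limit
    }
}
```

With RetentionPolicy.SOURCE or CLASS the getAnnotation call would return null, so RUNTIME retention is required for this approach.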

3.3 AOP Interceptor

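import com.google.common.collect.Maps;
import com.google.common.util.concurrent.RateLimiter;
import com.itfly8.test.annotation.RequestRateLimitAnnotation;
import lombok.extern.slf4j.Slf4j;
import org.aspectj.lang.JoinPoint;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;
import org.springframework.stereotype.Component;
import org.springframework.util.StringUtils; // assumption: Spring's StringUtils; the source does not say which one
import org.springframework.web.context.request.RequestContextHolder;
import org.springframework.web.context.request.ServletRequestAttributes;

import javax.servlet.http.HttpServletRequest; // assumption: javax namespace (Spring Boot 2.x); newer versions use jakarta.servlet
import java.lang.reflect.Method;
import java.util.Map;
import java.util.Objects;
// Response is the project's own result wrapper; import it from wherever it lives in your code base.

@Slf4j
@Aspect
@Component
public class RequestRateLimitAspect {
    /** One RateLimiter per request URL, created once and then reused */
    private final Map<String, RateLimiter> limitMap = Maps.newConcurrentMap();

    @Pointcut("@annotation(com.itfly8.test.annotation.RequestRateLimitAnnotation)")
    public void pointCut() {}

    @Around("pointCut()")
    public Object doAround(ProceedingJoinPoint joinPoint) throws Throwable {
        // Outside a servlet request there are no request attributes; skip limiting instead of failing with an NPE.
        ServletRequestAttributes attributes = (ServletRequestAttributes) RequestContextHolder.getRequestAttributes();
        if (attributes == null) {
            return joinPoint.proceed();
        }
        HttpServletRequest request = attributes.getRequest();
        String reqUrl = request.getRequestURI();
        RequestRateLimitAnnotation rateLimiter = this.getRequestRateLimiter(joinPoint);
        if (Objects.nonNull(rateLimiter)) {
            RateLimiter limiter = getRateLimiter(reqUrl, rateLimiter);
            // Wait up to the configured timeout for a permit.
            boolean acquire = limiter.tryAcquire(rateLimiter.timeout(), rateLimiter.timeUnit());
            if (!acquire) {
                // No permit in time: return the error payload without invoking the target method.
                return new Response<>("200", reqUrl.concat(rateLimiter.errMsg()));
            }
        }
        return joinPoint.proceed();
    }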

    private RateLimiter getRateLimiter(String reqUrl, RequestRateLimitAnnotation rateLimiter) {
        // computeIfAbsent on the concurrent map creates the limiter atomically, at most once per URL.
        return limitMap.computeIfAbsent(reqUrl, url -> {
            log.info("RequestRateLimitAspect: created token bucket for request {}, rate {} permits/second",
                    url, rateLimiter.limitNum());
            return RateLimiter.create(rateLimiter.limitNum());
        });
    }

    private RequestRateLimitAnnotation getRequestRateLimiter(final JoinPoint joinPoint) {
        // Match on name and parameter types so overloaded methods cannot be mistaken for one another.
        org.aspectj.lang.reflect.MethodSignature signature =
                (org.aspectj.lang.reflect.MethodSignature) joinPoint.getSignature();
        try {
            Method method = joinPoint.getTarget().getClass()
                    .getDeclaredMethod(signature.getName(), signature.getParameterTypes());
            return method.getAnnotation(RequestRateLimitAnnotation.class);
        } catch (NoSuchMethodException e) {
            // The method is not declared on the target class itself; treat it as not rate limited.
            return null;
        }
    }
}

3.4 Controller Usage

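import com.itfly8.test.annotation.RequestRateLimitAnnotation;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/test/")
public class ExampleController {
    // Allow 2 requests per second; wait at most 10 ms for a permit.
    // The return type is Object because the aspect returns a Response, not a String,
    // when the limit is hit; a String return type would cause a ClassCastException.
    @RequestRateLimitAnnotation(limitNum = 2, timeout = 10)
    @PostMapping("/ratelimit")
    public Object testRateLimit() {
        // Test code
        return "success";
    }
}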

3.5 Execution Result

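When a request exceeds the limit, the aspect rejects it and the caller receives a response similar to the following (the message text means "Requests are too frequent!"). The aspect also writes a log entry the first time it creates the limiter for a URL.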

{"code":"200","msg":"/test/ratelimit请求太频繁!","data":{}}

4. Summary

This article covered rate‑limiting scenarios, algorithms, and the use of Guava's RateLimiter, providing a relatively generic implementation that can be extended to build a comprehensive rate‑limiting mechanism for high‑concurrency applications.
