Mastering Rate Limiting in Java: From Guava Token Buckets to Redis Interceptors
This article explains why rate limiting is essential for high‑traffic Java services, compares common strategies such as Hystrix, Sentinel, token‑bucket algorithms, and Redis‑based counters, and provides concrete Spring‑Boot code examples for implementing both local and distributed throttling.
1. Rate Limiting Operations
Why rate limit
Rate limiting blocks malicious rapid refreshes and keeps services from crashing under high concurrency, especially when the application runs on hosted servers whose hardware cannot easily be upgraded.
In a JMeter load test, throughput reached about 16,000 requests per second; the client machine froze while the server stayed up.
Common rate‑limit methods
Netflix Hystrix
Alibaba Sentinel (open‑source)
Queues, thread pools, and message‑queue middleware such as Kafka; Sentinel flow‑control strategies such as direct reject and warm‑up; and Guava's smooth‑burst and smooth‑warm‑up modes
Technical level solutions
Cache duplicate requests locally to filter repeats.
Use a load balancer like Nginx.
Cache hot data in Redis or Elasticsearch.
Leverage connection pools.
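The first item above, filtering repeated requests with a local cache, can be sketched roughly as follows. This is a minimal illustration, not code from the original project: the class name `DuplicateRequestFilter`, the method `shouldProcess`, and the idea of passing the current time as a parameter are all assumptions made for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical helper: remembers when each request key was last accepted
// and filters out repeats that arrive within a short window.
final class DuplicateRequestFilter {
    private final ConcurrentMap<String, Long> lastAccepted = new ConcurrentHashMap<>();
    private final long windowMillis;

    DuplicateRequestFilter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Returns true if the request should be processed, false if it repeats
    // a request accepted less than windowMillis ago.
    boolean shouldProcess(String requestKey, long nowMillis) {
        boolean[] accepted = {false};
        // compute() runs atomically per key, so two threads cannot both accept
        lastAccepted.compute(requestKey, (key, previous) -> {
            if (previous == null || nowMillis - previous >= windowMillis) {
                accepted[0] = true;
                return nowMillis;
            }
            return previous;
        });
        return accepted[0];
    }
}
```

The map in this sketch never evicts old keys, so a real service would more likely use a cache with expiry; the point here is only the accept-or-filter decision.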
Business level solution
Add user‑visible queuing to throttle traffic.
2. Application‑Level Rate Limiting Implementations
Method 1: Guava RateLimiter (token‑bucket algorithm) – supports smooth burst and smooth warm‑up limits.
<!-- Guava dependency (provides RateLimiter) -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>23.0</version>
</dependency>
package com.citydo.dialogue.controller;

import com.google.common.util.concurrent.RateLimiter;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HomeController {

    // Token bucket refilled at 10 permits per second
    private final RateLimiter limiter = RateLimiter.create(10.0);

    @GetMapping("/test/{name}")
    public String test(@PathVariable("name") String name) {
        // tryAcquire() returns immediately. acquire() would block until a permit
        // is free, so the request would be delayed rather than rejected.
        if (limiter.tryAcquire()) {
            return name;
        }
        return "Too many requests";
    }
}

This works similarly to QPS control: when QPS exceeds a threshold, the system takes action.
Direct reject – immediately refuse new requests (e.g., throw an exception or return HTTP 429 Too Many Requests) when the threshold is crossed.
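To make the token-bucket idea behind Method 1 concrete, here is a small standalone sketch that uses only the JDK. It is not Guava's implementation; `SimpleTokenBucket` and its injectable nanosecond clock are assumptions introduced for illustration. The bucket holds at most `capacity` tokens, refills at a steady rate, and rejects directly when it is empty.

```java
import java.util.function.LongSupplier;

// Hypothetical minimal token bucket (not Guava's RateLimiter internals)
final class SimpleTokenBucket {
    private final long capacity;          // maximum burst size
    private final double refillPerNano;   // tokens added per elapsed nanosecond
    private final LongSupplier nanoClock; // injectable clock, e.g. System::nanoTime
    private double tokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, double permitsPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.refillPerNano = permitsPerSecond / 1_000_000_000.0;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    // Takes one token if available; never blocks (direct reject when empty)
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        // Add the tokens earned since the last call, capped at capacity
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In production code the clock would typically be `System::nanoTime`; passing it in simply makes the refill behavior easy to test.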
Method 2: Redis counter per request – increment a per‑client key (here, client IP + request URI) that expires after the time window.
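Before the Redis version, it may help to see the same counting logic in memory. The sketch below is an assumption-laden stand-in, not the article's code: `FixedWindowCounter` and `allow` are hypothetical names, and a `ConcurrentHashMap` plays the role that a Redis key with an expiry would play across servers.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a Redis counter with a TTL
final class FixedWindowCounter {
    private static final class Window {
        final long startMillis;
        int count;
        Window(long startMillis) { this.startMillis = startMillis; }
    }

    private final ConcurrentMap<String, Window> windows = new ConcurrentHashMap<>();
    private final int limit;
    private final long windowMillis;

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // key would be something like client IP + request URI
    boolean allow(String key, long nowMillis) {
        boolean[] allowed = {false};
        windows.compute(key, (k, current) -> {
            // Start a fresh window when none exists or the old one has expired
            Window w = (current == null || nowMillis - current.startMillis >= windowMillis)
                    ? new Window(nowMillis)
                    : current;
            if (w.count < limit) {
                w.count++;
                allowed[0] = true;
            }
            return w;
        });
        return allowed[0];
    }
}
```

With a limit of 5 and a 5-second window, the sixth request from the same key inside the window is rejected, and counting restarts once the window has elapsed.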
Interceptor configuration:
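package com.citydo.dialogue.config;

import com.citydo.dialogue.service.AccessLimitInterceptor;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurationSupport;

@Configuration
public class InterceptorConfig extends WebMvcConfigurationSupport {

    // Declare the interceptor as a bean so Spring injects its @Autowired RedisTemplate;
    // an instance created with a bare `new` is not managed by Spring.
    @Bean
    public AccessLimitInterceptor accessLimitInterceptor() {
        return new AccessLimitInterceptor();
    }

    @Override
    public void addInterceptors(InterceptorRegistry registry) {
        registry.addInterceptor(accessLimitInterceptor())
                .addPathPatterns("/**");
    }
}

Interceptor logic (checks the @AccessLimit annotation, counts requests in Redis, and rejects once the limit is exceeded):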
package com.citydo.dialogue.service;

import com.citydo.dialogue.entity.AccessLimit;
import com.citydo.dialogue.utils.IpUtil;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.web.method.HandlerMethod;
import org.springframework.web.servlet.HandlerInterceptor;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

public class AccessLimitInterceptor implements HandlerInterceptor {

    @Autowired
    private RedisTemplate<String, Integer> redisTemplate;

    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
        if (!(handler instanceof HandlerMethod)) {
            return true;
        }
        AccessLimit al = ((HandlerMethod) handler).getMethodAnnotation(AccessLimit.class);
        if (al == null) {
            return true; // no limit declared on this handler
        }
        // One counter per client IP and request URI
        String key = IpUtil.getIpAddr(request) + request.getRequestURI();
        // INCR is atomic, so concurrent requests cannot lose updates the way a
        // separate get-then-set can.
        Long count = redisTemplate.opsForValue().increment(key, 1);
        if (count != null && count == 1) {
            // The first request in a window starts the expiry clock
            redisTemplate.expire(key, al.sec(), TimeUnit.SECONDS);
        }
        if (count != null && count > al.limit()) {
            response.setStatus(429); // Too Many Requests
            response.setContentType("text/plain;charset=UTF-8");
            response.getOutputStream().write("Too many requests".getBytes(StandardCharsets.UTF_8));
            return false;
        }
        return true;
    }
}

Custom annotation definition:
package com.citydo.dialogue.entity;

import java.lang.annotation.*;

// The interceptor only reads this annotation from handler methods,
// so METHOD is the only target that has any effect.
@Documented
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface AccessLimit {
    int limit() default 5; // max requests per window
    int sec() default 5;   // time window in seconds
}

Redis configuration bean to avoid encoding issues:
package com.citydo.dialogue.config;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.data.redis.serializer.GenericJackson2JsonRedisSerializer;
import org.springframework.data.redis.serializer.StringRedisSerializer;

@Configuration
public class RedisConfig {

    // The generic types match the interceptor's RedisTemplate<String, Integer> field,
    // because Spring takes type parameters into account when autowiring.
    @Bean
    public RedisTemplate<String, Integer> redisTemplate(RedisConnectionFactory factory) {
        RedisTemplate<String, Integer> template = new RedisTemplate<>();
        template.setConnectionFactory(factory);
        // String keys and JSON values stay readable in Redis,
        // unlike the default JDK serialization.
        template.setKeySerializer(new StringRedisSerializer());
        template.setValueSerializer(new GenericJackson2JsonRedisSerializer());
        template.setHashKeySerializer(new GenericJackson2JsonRedisSerializer());
        template.setHashValueSerializer(new GenericJackson2JsonRedisSerializer());
        template.afterPropertiesSet();
        return template;
    }
}

Method 3: Distributed rate limiting – use Redis + Lua scripts or Nginx + Lua for atomic throttling.
Method 4: Pooling techniques – limit total resources via connection pools or thread pools (e.g., max 100 DB connections).
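The pooling idea in Method 4 comes down to a fixed number of permits: a caller either gets one or is turned away. A rough JDK-only sketch, assuming a hypothetical `ConcurrencyLimiter` wrapper around `java.util.concurrent.Semaphore` (real connection pools and thread pools add queuing, timeouts, and resource reuse on top of this):

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical wrapper: at most maxConcurrent callers run work at the same time
final class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Runs work if a permit is free; otherwise returns rejectedValue immediately
    <T> T callOrReject(Supplier<T> work, T rejectedValue) {
        if (!permits.tryAcquire()) {
            return rejectedValue;
        }
        try {
            return work.get();
        } finally {
            permits.release(); // always hand the permit back
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Constructing the limiter with 100 permits would mirror the "max 100 DB connections" example: the 101st concurrent caller is rejected instead of piling onto the database.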
Method 5: Tomcat connector limits – configure maxThreads, maxConnections, and acceptCount to cap concurrency.
Method 6: AtomicLong counter – simple in‑process counter with try/finally to decrement after handling.
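try {
    // Count this request as in flight, then compare against the cap
    if (atomic.incrementAndGet() > limit) {
        // reject request
    } else {
        // process request
    }
} finally {
    // Every request was counted above, so every request is uncounted here
    atomic.decrementAndGet();
}

Method 7: Guava Cache with time window – use a LoadingCache that expires entries after a short period to count requests per second.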
LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder()
        // Keep each one-second bucket a little longer than the second it counts
        .expireAfterWrite(2, TimeUnit.SECONDS)
        .build(new CacheLoader<Long, AtomicLong>() {
            @Override
            public AtomicLong load(Long key) {
                return new AtomicLong(0);
            }
        });
long limit = 1000;
while (true) {
    // The current epoch second identifies the counting window
    long currentSec = System.currentTimeMillis() / 1000;
    // getUnchecked() avoids the checked ExecutionException declared by get()
    if (counter.getUnchecked(currentSec).incrementAndGet() > limit) {
        System.out.println("Rate limited: " + currentSec);
        continue;
    }
    // business logic
}
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.