Mastering Rate Limiting in Java Microservices: From Guava to Sentinel and Redis
Explore comprehensive strategies for implementing rate limiting in Java microservice architectures, covering Dubbo and Spring Cloud governance, token bucket, semaphore, Sentinel, Redis+Lua, and custom Spring Boot starter solutions, complete with code samples, configuration steps, and performance testing guidance.
Background
Rate limiting is crucial in microservice systems: without it, a single overloaded service can exhaust its JVM resources (threads, memory, connections) and trigger a cascading "avalanche" failure across the services that depend on it.
Rate Limiting Overview
When designing a microservice architecture, the choice of framework stack (Dubbo, Spring Cloud, or plain Spring Boot) determines which rate‑limiting solutions are available out of the box and which have to be built.
Dubbo Service Governance
Dubbo provides built‑in mechanisms for client‑side and server‑side limiting, such as semaphore and connection‑based limits.
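Semaphore-style limiting caps the number of calls that are in flight at the same time rather than the number of calls per second. The following is a minimal JDK-only sketch of that idea, not Dubbo's own implementation; the class name ConcurrencyLimiter and its methods are hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical helper: admits at most maxConcurrent calls at the same time. */
final class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the task if a permit is free; otherwise fails fast without waiting. */
    <T> T call(Supplier<T> task) {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("too many concurrent calls");
        }
        try {
            return task.get();
        } finally {
            permits.release(); // always hand the permit back, even if the task throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Failing fast with tryAcquire() keeps caller threads from piling up behind a slow service; a blocking acquire() with a timeout would be the alternative when short waits are acceptable.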
Spring Cloud Service Governance
Spring Cloud and Spring Cloud Alibaba ship ready‑made protection components: Hystrix (Spring Cloud Netflix) for isolation and circuit breaking, and Sentinel (Spring Cloud Alibaba) for flow control.
Gateway Layer Limiting
Applying rate limiting at the API gateway protects the entire system from malicious traffic and crawlers.
Common Rate‑Limiting Algorithms
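Token Bucket – tokens are added to a bucket at a fixed rate and each request consumes one, so bursts up to the bucket capacity are allowed; the most widely used algorithm.
Leaky Bucket – similar to token bucket, but requests are queued and drained at a constant rate, so output is smooth and bursts are not passed through.
Sliding Time Window – counts requests in a window that moves with the current time, which avoids the spikes that can occur at the boundaries of fixed windows.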
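To make the token bucket concrete, here is a minimal JDK-only sketch. It is an illustration under stated assumptions, not Guava's implementation: the class name SimpleTokenBucket is hypothetical, the bucket starts full, and the clock is injected so the behavior can be checked deterministically.

```java
import java.util.function.LongSupplier;

/** Hypothetical token bucket: refills continuously, allows bursts up to capacity. */
final class SimpleTokenBucket {
    private final long capacity;            // maximum number of stored tokens
    private final double tokensPerSecond;   // refill rate
    private final LongSupplier nanoClock;   // injectable clock, e.g. System::nanoTime
    private double tokens;
    private long lastRefillNanos;

    SimpleTokenBucket(long capacity, double tokensPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity;             // assumption: start with a full bucket
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes one token if available; never blocks. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        double refill = (now - lastRefillNanos) * tokensPerSecond / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

With a capacity of 2 and a rate of 1 token per second, two back-to-back requests pass, a third is rejected, and one more request passes after a second has elapsed.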
General Implementation Approaches
Using AOP and custom annotations, developers can embed rate limiting directly into service methods.
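The annotation-plus-interception pattern can be sketched without Spring by using a JDK dynamic proxy, which is also one of the proxy mechanisms Spring AOP relies on. Everything named here (CallQuota, GreetingService, QuotaProxy) is hypothetical, and the counter is a simple never-resetting quota chosen to keep the example deterministic; a real aspect would delegate to a rate limiter instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical annotation: maximum number of admitted calls for a method. */
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
@interface CallQuota {
    int value();
}

/** Hypothetical service interface used only for the demonstration. */
interface GreetingService {
    @CallQuota(2)
    String greet(String name);
}

/** Intercepts calls, reads the annotation, and rejects calls beyond the quota. */
final class QuotaProxy implements InvocationHandler {
    private final Object target;
    // One counter per method, keyed by the full method signature.
    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    private QuotaProxy(Object target) {
        this.target = target;
    }

    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, T target) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, new QuotaProxy(target));
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        CallQuota quota = method.getAnnotation(CallQuota.class);
        if (quota != null) {
            String key = method.toGenericString();
            int used = counters.computeIfAbsent(key, k -> new AtomicInteger()).incrementAndGet();
            if (used > quota.value()) {
                throw new IllegalStateException("limit exceeded: " + method.getName());
            }
        }
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow the target's own exception
        }
    }
}
```

A caller would wrap an implementation once, for example `GreetingService real = name -> "hi " + name;` followed by `GreetingService limited = QuotaProxy.wrap(GreetingService.class, real);`, and then use `limited` everywhere; the business method itself contains no limiting code.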
Guava‑Based Limiting
Guava’s RateLimiter implements a token‑bucket algorithm.
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>23.0</version>
</dependency>
Define a custom annotation:
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface TokenBucketLimiter {
    // Permits per second passed to RateLimiter.create(...)
    int value() default 50;
}
Implement an AOP class that creates a RateLimiter per method and returns a JSON error when the limit is exceeded.
@Aspect
@Component
public class GuavaLimiterAop {
    // One RateLimiter per annotated method, created lazily on the first call.
    private final Map<String, RateLimiter> rateLimiters = new ConcurrentHashMap<>();

    @Pointcut("@annotation(com.congge.annotation.TokenBucketLimiter)")
    public void pointcut() {}

    @Around("pointcut()")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
        Method method = // obtain method …
        TokenBucketLimiter limiter = method.getAnnotation(TokenBucketLimiter.class);
        String key = // unique key per method
        RateLimiter rl = rateLimiters.computeIfAbsent(key, k -> RateLimiter.create(limiter.value()));
        // tryAcquire() does not block: it returns false immediately when no permit is available.
        if (rl.tryAcquire()) {
            return joinPoint.proceed();
        }
        // write JSON response "limit exceeded"
        HttpServletResponse resp = ((ServletRequestAttributes) RequestContextHolder.getRequestAttributes()).getResponse();
        resp.setContentType("application/json;charset=UTF-8");
        resp.getWriter().write("{\"success\":false,\"msg\":\"limit exceeded\"}");
        return null;
    }
}
Sentinel‑Based Limiting
Sentinel provides flow control, circuit breaking, and hot‑spot protection.
<dependency>
    <groupId>com.alibaba.csp</groupId>
    <artifactId>sentinel-core</artifactId>
    <version>1.8.0</version>
</dependency>
Custom annotation:
@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface SentinelLimiter {
    String resourceName();
    int limitCount() default 50;
}
The AOP implementation loads a FlowRule for the resource and invokes SphU.entry. If a BlockException occurs, the request is rejected.
@Aspect
@Component
public class SentinelLimiterAop {
    // Resources whose flow rule has already been registered.
    private final Set<String> initialized = ConcurrentHashMap.newKeySet();

    private synchronized void initFlowRule(String resourceName, int limitCount) {
        if (!initialized.add(resourceName)) {
            return; // rule already loaded for this resource
        }
        FlowRule rule = new FlowRule();
        rule.setResource(resourceName);
        rule.setGrade(RuleConstant.FLOW_GRADE_QPS);
        rule.setCount(limitCount);
        // loadRules() replaces the whole rule set, so carry over the rules of other resources.
        List<FlowRule> rules = new ArrayList<>(FlowRuleManager.getRules());
        rules.add(rule);
        FlowRuleManager.loadRules(rules);
    }

    @Pointcut("@annotation(com.congge.annotation.SentinelLimiter)")
    public void pointcut() {}

    @Around("pointcut()")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
        Method method = // obtain method …
        SentinelLimiter ann = method.getAnnotation(SentinelLimiter.class);
        initFlowRule(ann.resourceName(), ann.limitCount());
        Entry entry = null;
        try {
            entry = SphU.entry(ann.resourceName());
            return joinPoint.proceed();
        } catch (BlockException e) {
            return "blocked"; // or custom JSON response
        } finally {
            if (entry != null) entry.exit();
        }
    }
}
Redis + Lua Limiting
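Redis executes Lua scripts atomically, which enables a distributed limiter whose counter is shared by all service instances. The script used in this section implements a simple fixed‑window counter rather than a true token bucket.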
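The counting logic that such a script has to run atomically can be sketched in memory as follows. This is an illustration only: the class name FixedWindowCounter is hypothetical, `synchronized` stands in for the atomicity that Redis gives a Lua script, and the sketch assumes the window starts at the first request and is not extended by later ones.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical in-memory stand-in for a per-key counter with expiry. */
final class FixedWindowCounter {
    private static final class Window {
        long expiresAtMillis;
        int count;
    }

    private final int limit;
    private final long periodMillis;
    private final LongSupplier clock; // injectable clock, e.g. System::currentTimeMillis
    private final Map<String, Window> windows = new HashMap<>();

    FixedWindowCounter(int limit, long periodMillis, LongSupplier clock) {
        this.limit = limit;
        this.periodMillis = periodMillis;
        this.clock = clock;
    }

    /** Returns the new count if admitted, or 0 if the limit for the current window is reached. */
    synchronized int tryAcquire(String key) {
        long now = clock.getAsLong();
        Window w = windows.get(key);
        if (w == null || now >= w.expiresAtMillis) {
            // Equivalent of an expired (or missing) key: start a fresh window.
            w = new Window();
            w.expiresAtMillis = now + periodMillis;
            windows.put(key, w);
        }
        if (w.count + 1 > limit) {
            return 0;
        }
        return ++w.count;
    }
}
```

Returning 0 for a rejected request and the new count otherwise mirrors the convention of a counter script, so the caller only needs to test for a non-zero result.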
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>
Annotation definition:
@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
public @interface RedisLimitAnnotation {
    String key() default "";
    String prefix() default "";
    int count();
    int period();
    LimitType limitType() default LimitType.CUSTOMER;
}
Lua script (limit.lua):
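local key = "rate.limit:" .. KEYS[1]
local limit = tonumber(ARGV[1])
-- Window length in seconds, taken from the annotation's period instead of a hard-coded value
local period = tonumber(ARGV[2])
local current = tonumber(redis.call('get', key) or "0")
if current + 1 > limit then
    return 0
else
    redis.call("INCRBY", key, 1)
    -- Set the expiry only when the counter is created, so the window is not extended on every request
    if current == 0 then
        redis.call("EXPIRE", key, period)
    end
    return current + 1
end
The AOP class executes the script via RedisTemplate and throws an exception when the limit is reached.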
@Aspect
@Component
public class RedisLimiterAop {
    @Autowired
    private RedisTemplate<String, Object> redisTemplate;
    @Autowired
    private DefaultRedisScript<Number> redisScript;

    @Pointcut("@annotation(com.congge.annotation.RedisLimitAnnotation)")
    public void pointcut() {}

    @Around("pointcut()")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
        Method method = // obtain method …
        RedisLimitAnnotation ann = method.getAnnotation(RedisLimitAnnotation.class);
        String key = ann.prefix() + ann.key();
        Number count = redisTemplate.execute(redisScript, Collections.singletonList(key), ann.count(), ann.period());
        // The script returns 0 when the limit is reached, otherwise the new counter value.
        if (count != null && count.intValue() > 0) {
            return joinPoint.proceed();
        }
        throw new RuntimeException("Rate limit exceeded");
    }
}
Custom Spring Boot Starter
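To avoid duplicating rate‑limiting code across services, the tutorial packages the annotations and AOP classes into a starter JAR. The starter is auto‑configured via META-INF/spring.factories and can be added as a Maven dependency. This registration mechanism applies to Spring Boot 2.x; Spring Boot 3 reads auto‑configuration classes from META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports instead.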
# spring.factories
org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
com.congge.aop.SemaphoreLimiterAop,\
com.congge.aop.GuavaLimiterAop,\
com.congge.aop.SentinelLimiterAop
Consumers simply add the starter dependency and annotate methods with @TokenBucketLimiter, @ShLimiter, or @SentinelLimiter to obtain ready‑made rate limiting.
Testing and Results
Sample controllers demonstrate each approach. When the QPS exceeds the configured threshold (e.g., 1 request per second), the response switches to a JSON error or a plain text message indicating that the request was throttled.
Conclusion
Rate limiting is an essential part of service governance. By combining token‑bucket, semaphore, Sentinel, and Redis‑Lua techniques, and by encapsulating them in a Spring Boot starter, developers can protect microservices from overload while keeping the implementation reusable and maintainable.
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.