How to Master Rate Limiting in Microservices: From Dubbo to Redis and Sentinel
This article walks through the importance of rate limiting in microservice architectures, compares Dubbo, Spring Cloud, and gateway approaches, explains token‑bucket, leaky‑bucket and sliding‑window algorithms, and provides step‑by‑step AOP implementations using Guava, Sentinel, and Redis with full code samples.
Background
In a microservice system, uncontrolled traffic to a single low‑level service can become a hidden avalanche, exhausting JVM resources and causing request timeouts. For example, a SaaS platform with over 100 services may see a few core services invoked thousands of times per second during peak load, leading to blocked queues and eventual crashes.
Rate‑Limiting Overview
When choosing a service‑governance framework, architects must consider the current product landscape. The choice between Dubbo and Spring Cloud influences the available rate‑limiting mechanisms.
2.1 Dubbo Service‑Governance Mode
Dubbo uses Netty under the hood, offering advantages over HTTP in certain scenarios. Its built‑in governance supports two categories of limits:
Client‑side limits: semaphore‑based counting of concurrent calls, and connection‑count limits at the TCP socket level.
Server‑side limits: thread‑pool isolation, semaphore (non‑isolation) limits, and accepted‑connection limits at the TCP socket level.
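The semaphore‑based counting mentioned above can be illustrated with plain JDK classes. This is only a sketch of the general idea, not Dubbo's implementation; the class name `ConcurrencyLimiter` and the fallback parameter are assumptions made for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical helper: caps the number of calls that may run at the same time. */
final class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the task if a permit is free; otherwise returns the fallback immediately. */
    <T> T call(Supplier<T> task, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback; // over the concurrency limit: reject without waiting
        }
        try {
            return task.get();
        } finally {
            permits.release(); // always hand the permit back, even if the task throws
        }
    }
}
```

Unlike thread‑pool isolation, the task here runs on the caller's thread, so the semaphore only counts in‑flight calls and does not protect against slow callers holding threads.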
2.2 Spring Cloud Service‑Governance Mode
Spring Cloud and Spring Cloud Alibaba already bundle several rate‑limiting components:
Hystrix : defaults to thread isolation; configuring the thread count and queue size of the isolated pool caps how many calls can run or wait at once, which acts as a rate limit.
Sentinel : a traffic‑defense guard that supports flow control, circuit breaking, hot‑spot protection, and system load protection.
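The effect of limiting by thread count and queue size can be sketched with a bounded `ThreadPoolExecutor` from the JDK. This is not Hystrix code; the class `IsolatedPool` and its method names are hypothetical, and it only shows why a fixed pool plus a bounded queue rejects excess calls.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical isolated pool: at most `threads` running tasks plus `queueSize` waiting tasks. */
final class IsolatedPool {
    private final ThreadPoolExecutor executor;

    IsolatedPool(int threads, int queueSize) {
        // queueSize must be at least 1 for ArrayBlockingQueue
        this.executor = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),
                new ThreadPoolExecutor.AbortPolicy()); // reject when threads and queue are both full
    }

    /** Returns false instead of throwing when the pool has no capacity left. */
    boolean trySubmit(Runnable task) {
        try {
            executor.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

With one thread and a queue of one, the third concurrent submission is rejected, which is the behavior a caller would map to a fallback response.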
2.3 Gateway‑Level Limiting
When many services need protection, placing a rate limiter at the API gateway filters malicious traffic, crawlers, and attacks before they reach downstream services.
Common Rate‑Limiting Strategies
3.1 Token‑Bucket Algorithm
The token‑bucket algorithm uses two key elements:
Token : a request can proceed only if it acquires a token; otherwise it is queued or dropped.
Bucket : stores tokens; all requests draw from this bucket.
The process consists of token generation and token acquisition.
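A minimal sketch of the two phases, generation and acquisition, assuming tokens are refilled lazily on each call rather than by a background thread. The class name `TokenBucket` and the injected clock are illustrative choices, not part of any library discussed here.

```java
import java.util.function.LongSupplier;

/** Illustrative token bucket: tokens refill at a fixed rate, up to a fixed capacity. */
final class TokenBucket {
    private final long capacity;
    private final double tokensPerSecond;
    private final LongSupplier nanoClock; // injected so the bucket can be tested deterministically
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(long capacity, double tokensPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity; // start full, so an initial burst up to capacity is allowed
        this.lastRefillNanos = nanoClock.getAsLong();
    }

    /** Takes one token if available; returns false when the bucket is empty. */
    synchronized boolean tryAcquire() {
        long now = nanoClock.getAsLong();
        // Token generation: add the tokens earned since the last call, capped at capacity.
        double earned = (now - lastRefillNanos) * tokensPerSecond / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + earned);
        lastRefillNanos = now;
        // Token acquisition: a request proceeds only if a whole token is available.
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

In production code the clock would typically be `System::nanoTime`. This sketch drops requests that find no token; queueing them instead is the other option the text mentions.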
3.2 Leaky‑Bucket Algorithm
Similar to token‑bucket, but the bucket holds request packets instead of tokens. Requests drain from the bucket at a constant rate, and when the bucket is full, new packets are discarded.
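A rough sketch of a leaky bucket used as a meter, assuming the standard formulation in which the bucket drains at a constant rate. The class name `LeakyBucket` and the injected clock are hypothetical. A queue‑based variant would hold the actual requests and release them at the drain rate; this version only tracks the fill level.

```java
import java.util.function.LongSupplier;

/** Illustrative leaky bucket: each request adds one unit; the level drains at a constant rate. */
final class LeakyBucket {
    private final long capacity;
    private final double leakPerSecond;
    private final LongSupplier nanoClock; // injected for deterministic testing
    private double level;
    private long lastLeakNanos;

    LeakyBucket(long capacity, double leakPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.nanoClock = nanoClock;
        this.level = 0.0; // start empty
        this.lastLeakNanos = nanoClock.getAsLong();
    }

    /** Adds one request to the bucket; returns false (discard) if it would overflow. */
    synchronized boolean tryAdd() {
        long now = nanoClock.getAsLong();
        // Drain whatever leaked out since the last call.
        double leaked = (now - lastLeakNanos) * leakPerSecond / 1_000_000_000.0;
        level = Math.max(0.0, level - leaked);
        lastLeakNanos = now;
        if (level + 1.0 <= capacity) {
            level += 1.0;
            return true;
        }
        return false; // bucket full: the new packet is discarded
    }
}
```

The practical difference from a token bucket is the starting state: a leaky bucket starts empty and bounds the backlog, while a token bucket starts full and permits an initial burst.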
3.3 Sliding Time Window
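A sliding window of, for example, 5 seconds moves forward with time. The window is divided into one‑second slots, each counting the requests that arrived in it, and the total count is the sum of all slots currently inside the window. With a limit of 20 requests per window, a request is admitted only while that sum is below 20. As the oldest slot exits the window, its count drops out of the sum, which smooths traffic spikes at window boundaries.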
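The slot bookkeeping described above can be sketched as a small ring buffer. The class name `SlidingWindowLimiter`, its constructor parameters, and the injected millisecond clock are assumptions for illustration; the clock is assumed to return non‑negative values.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

/** Illustrative sliding window: `slots` buckets of `slotMillis` each, at most `limit` requests in total. */
final class SlidingWindowLimiter {
    private final int slots;
    private final long slotMillis;
    private final int limit;
    private final LongSupplier millisClock; // injected for deterministic testing
    private final long[] slotIds;           // absolute slot number each bucket currently represents
    private final int[] counts;

    SlidingWindowLimiter(int slots, long slotMillis, int limit, LongSupplier millisClock) {
        this.slots = slots;
        this.slotMillis = slotMillis;
        this.limit = limit;
        this.millisClock = millisClock;
        this.slotIds = new long[slots];
        this.counts = new int[slots];
        Arrays.fill(slotIds, -1L); // -1 marks a bucket that has never been used
    }

    synchronized boolean tryAcquire() {
        long currentSlot = millisClock.getAsLong() / slotMillis;
        int total = 0;
        for (int i = 0; i < slots; i++) {
            // Only buckets whose slot still lies inside the window contribute to the sum.
            if (slotIds[i] > currentSlot - slots && slotIds[i] <= currentSlot) {
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        int index = (int) (currentSlot % slots);
        if (slotIds[index] != currentSlot) {
            // The bucket holds an expired slot: reuse it for the current one.
            slotIds[index] = currentSlot;
            counts[index] = 0;
        }
        counts[index]++;
        return true;
    }
}
```

For the example in the text, `new SlidingWindowLimiter(5, 1000, 20, System::currentTimeMillis)` would model a 5‑second window of one‑second slots with a limit of 20.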
General Implementation Approaches
Beyond framework‑specific limits, a common technique is to use AOP with a custom annotation to intercept targeted methods and apply a chosen limiter.
4.1 Guava‑Based Limiting
Guava provides a token‑bucket implementation via RateLimiter. The steps are:
Add Guava dependency.
Create a custom annotation @RateConfigAnno with limitType and limitCount.
Implement an AOP class that extracts the annotation, obtains or creates a RateLimiter, and calls tryAcquire(). If acquisition fails, return a JSON response indicating rate limiting.
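<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>23.0</version>
</dependency>

import java.lang.annotation.*;

@Target(ElementType.METHOD)
@Retention(RetentionPolicy.RUNTIME)
public @interface RateConfigAnno {
    String limitType();              // name of the limiter shared by methods of this type
    double limitCount() default 5d;  // permits per second
}

import com.google.common.util.concurrent.RateLimiter;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;
import java.lang.reflect.Method;

@Aspect
@Component
public class GuavaLimitAop {
    private static final Logger logger = LoggerFactory.getLogger(GuavaLimitAop.class);

    // @Around instead of @Before: a @Before advice cannot stop the target method
    // from running unless it throws, so a rejected call would still execute.
    // The annotation name must be fully qualified unless it is in the aspect's package.
    @Around("execution(@RateConfigAnno * *(..))")
    public Object limit(ProceedingJoinPoint joinPoint) throws Throwable {
        Method currentMethod = getCurrentMethod(joinPoint);
        if (currentMethod == null) return joinPoint.proceed();
        RateConfigAnno anno = currentMethod.getAnnotation(RateConfigAnno.class);
        // RateLimitHelper (omitted) is expected to return one shared RateLimiter per limitType
        RateLimiter rateLimiter = RateLimitHelper.getRateLimiter(anno.limitType(), anno.limitCount());
        if (!rateLimiter.tryAcquire()) {
            logger.warn("rate limited: {}", currentMethod.getName());
            // Placeholder rate-limit JSON; assumes the annotated method returns a String body.
            // Otherwise write the response through the omitted output helper.
            return "{\"msg\":\"rate limited\"}";
        }
        return joinPoint.proceed();
    }
    // getCurrentMethod and output helper omitted for brevity
}

4.2 Sentinel‑Based Limiting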
Sentinel can be used either with Spring Cloud Alibaba or as a native SDK. The AOP flow mirrors the Guava approach but uses Sentinel's SphU.entry and catches BlockException when the limit is exceeded.
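import com.alibaba.csp.sentinel.Entry;
import com.alibaba.csp.sentinel.SphU;
import com.alibaba.csp.sentinel.slots.block.BlockException;
import com.congge.sentinel.SentinelLimitAnnotation;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;
import org.springframework.stereotype.Component;
import java.lang.reflect.Method;

@Aspect
@Component
public class SentinelMethodLimitAop {
    @Pointcut("@annotation(com.congge.sentinel.SentinelLimitAnnotation)")
    public void rateLimit() {}

    @Around("rateLimit()")
    public Object around(ProceedingJoinPoint joinPoint) throws Throwable {
        Method method = getCurrentMethod(joinPoint);
        SentinelLimitAnnotation anno = method.getAnnotation(SentinelLimitAnnotation.class);
        String resource = anno.resourceName();
        int limit = anno.limitCount();
        // Registers the flow rule for this resource (helper omitted)
        initFlowRule(resource, limit);
        Entry entry = null;
        try {
            // Throws BlockException when the resource is over its limit
            entry = SphU.entry(resource);
            return joinPoint.proceed();
        } catch (BlockException e) {
            System.out.println("blocked");
            return "Request was rate limited";
        } finally {
            // Every successful entry must be paired with an exit
            if (entry != null) entry.exit();
        }
    }
    // initFlowRule and getCurrentMethod omitted for brevity
}

4.3 Redis + Lua Limiting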
Redis offers atomic operations and can execute a Lua script atomically to enforce a counter over a time window, shared by every instance of a service. The workflow:
Write a Lua script that increments a key and sets an expiration.
Define a custom annotation @RedisLimitAnnotation with key, count, period, and limit type.
Configure RedisTemplate and DefaultRedisScript<Number> beans.
In an AOP interceptor, build a composite key (IP + class + method), invoke the script, and allow the request only if the returned count is within the limit.
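The last step can be sketched without any Redis dependency. The helper below is hypothetical: the class name `RedisLimitKeys`, the `:` separator, and the decision to reject on a null result are assumptions, while the convention that the script returns 0 on rejection comes from the Lua script in this section.

```java
/** Hypothetical helpers for the interceptor: key construction and result interpretation. */
final class RedisLimitKeys {
    private RedisLimitKeys() {}

    /** Builds the composite key from caller IP, target class, and method name. */
    static String compositeKey(String clientIp, Class<?> targetClass, String methodName) {
        return clientIp + ":" + targetClass.getName() + ":" + methodName;
    }

    /**
     * Interprets the script result: 0 means the limit was exceeded,
     * any value from 1 to limit is the current count of an admitted request.
     */
    static boolean isAllowed(Number scriptResult, int limit) {
        if (scriptResult == null) {
            return false; // assumption: treat a missing result as a rejection
        }
        long count = scriptResult.longValue();
        return count > 0 && count <= limit;
    }
}
```

Including the caller IP in the key limits each client separately. Dropping it would make the limit apply to the method as a whole.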
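-- KEYS[1]: caller-supplied key; ARGV[1]: limit; ARGV[2] (optional): window length in seconds
local key = "rate.limit:" .. KEYS[1]
local limit = tonumber(ARGV[1])
local period = tonumber(ARGV[2] or "2")
local current = tonumber(redis.call('get', key) or "0")
if current + 1 > limit then
    return 0
else
    redis.call('INCRBY', key, "1")
    -- Start the window only on the first hit. Re-arming the TTL on every
    -- admitted request would stretch the window for as long as traffic continues.
    if current == 0 then
        redis.call('expire', key, period)
    end
    return current + 1
end

Packaging as a Spring Boot Starter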
To avoid duplicating limiter code across services, the article shows how to bundle the annotations, AOP classes, and auto‑configuration into a reusable JAR:
Project structure: annotation (custom annotations), aop (implementations), spring.factories (auto‑configuration entries).
Dependencies include Spring Boot starter, Guava, Sentinel core, and Redis starter.
Publish the JAR and import it in other microservices via Maven.
After adding the starter, developers can simply annotate a controller method with @TokenBucketLimiter(1), @ShLimiter(1), or @SentinelLimiter(resourceName="myRes", limitCount=1) and the corresponding rate‑limit logic is applied automatically.
Conclusion
The article demonstrates that rate limiting is a critical guard for microservice stability and walks through three concrete implementations: a Guava token bucket, Sentinel flow control, and a Redis + Lua window counter. Each can be packaged into a reusable Spring Boot starter, so teams can adopt a consistent strategy across diverse services.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
