Implementing Rate Limiting in Spring Cloud Gateway Using Redis
This article explains how to configure distributed rate limiting in Spring Cloud Gateway by adding the reactive Redis dependency, setting up Redis connection properties, defining a RequestRateLimiter filter with token‑bucket parameters, implementing a custom KeyResolver, and demonstrating the behavior with HTTP 429 and 403 responses.
The article motivates moving request rate limiting out of the business services and into the gateway layer, which reduces load on those services and improves scalability. Spring Cloud Gateway (SCG) provides a built‑in filter for this purpose, RequestRateLimiterGatewayFilterFactory.
To enable distributed rate limiting, a reactive Redis client is required. The Maven dependency to add is:
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis-reactive</artifactId>
</dependency>
After adding the dependency, configure the Redis host, port, database, timeout, and password in application.yml (or application.properties). Note that on Spring Boot 3.x these properties sit under spring.data.redis rather than spring.redis.
spring:
  redis:
    host: 172.31.128.158
    port: 6379
    database: 0
    connect-timeout: 5000
    password: root
Next, define a route that applies the rate‑limiting filter. The example uses a /user/** path and configures the filter as follows:
@Bean
public RouteLocator routeLocator(RouteLocatorBuilder builder) {
    return builder.routes()
            .route(r -> r.path("/user/**")
                    .filters(f -> f.requestRateLimiter()
                            // Token-bucket settings for this route
                            .rateLimiter(RedisRateLimiter.class, c -> c.setReplenishRate(1).setBurstCapacity(10).setRequestedTokens(5))
                            // Key that identifies the bucket; reject requests that resolve to no key
                            .configure(c -> c.setKeyResolver(apiTokenKeyResolver()).setDenyEmptyKey(true)))
                    .uri("lb://user-api"))
            .build();
}
The three main parameters of the token‑bucket algorithm are:
Replenish rate: how many tokens are added to the bucket per second.
Burst capacity: the maximum number of tokens the bucket can hold, which bounds the size of a short burst.
Requested tokens: how many tokens a single request consumes.
In the example, one token is added per second, each request consumes five tokens, and the bucket holds at most ten tokens. The sustained rate is therefore one request every five seconds (1 token/s ÷ 5 tokens per request), and a full bucket can absorb a burst of two back-to-back requests (10 ÷ 5).
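The arithmetic above can be traced with a small plain-Java simulation. This is only a sketch of the token-bucket idea under the example's settings, not the gateway's Redis-backed implementation. The class name TokenBucketSketch, its method names, and the choice to pass time in as whole seconds are all assumptions made for illustration.

```java
/** Illustrative token bucket; time is passed in explicitly so the behaviour is easy to trace. */
final class TokenBucketSketch {
    private final int replenishRate;   // tokens added per second
    private final int burstCapacity;   // maximum tokens the bucket can hold
    private final int requestedTokens; // tokens consumed by one request
    private long tokens;               // tokens currently available
    private long lastRefillSecond;     // time of the last refill, in seconds

    TokenBucketSketch(int replenishRate, int burstCapacity, int requestedTokens, long startSecond) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.requestedTokens = requestedTokens;
        this.tokens = burstCapacity; // assumption: the bucket starts full
        this.lastRefillSecond = startSecond;
    }

    /** Returns true if a request arriving at nowSecond is allowed. */
    boolean tryAcquire(long nowSecond) {
        // Refill for the elapsed time, capped at the burst capacity.
        long elapsed = Math.max(0, nowSecond - lastRefillSecond);
        tokens = Math.min(burstCapacity, tokens + elapsed * replenishRate);
        lastRefillSecond = nowSecond;
        if (tokens < requestedTokens) {
            return false; // not enough tokens: the request would be rejected
        }
        tokens -= requestedTokens;
        return true;
    }

    public static void main(String[] args) {
        TokenBucketSketch bucket = new TokenBucketSketch(1, 10, 5, 0);
        System.out.println(bucket.tryAcquire(0)); // true  (10 -> 5 tokens)
        System.out.println(bucket.tryAcquire(0)); // true  (5 -> 0 tokens)
        System.out.println(bucket.tryAcquire(0)); // false (bucket is empty)
        System.out.println(bucket.tryAcquire(4)); // false (only 4 tokens refilled)
        System.out.println(bucket.tryAcquire(5)); // true  (5 tokens available)
    }
}
```

With these settings the first two requests drain the full bucket, and the next request only succeeds once five seconds' worth of tokens has accumulated.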
A custom KeyResolver determines the rate‑limiting key. The provided implementation extracts a token from request headers or query parameters and combines it with the request path:
// Interface definition
public interface KeyResolver {
    Mono<String> resolve(ServerWebExchange exchange);
}
// Example implementation
KeyResolver apiTokenKeyResolver() {
    return exchange -> {
        HttpHeaders headers = exchange.getRequest().getHeaders();
        MultiValueMap<String, String> queryParam = exchange.getRequest().getQueryParams();
        // Prefer the "token" header; fall back to the "token" query parameter
        List<String> token = headers.get("token") != null ? headers.get("token") : queryParam.get("token");
        if (token == null || token.isEmpty()) {
            // An empty Mono means no key could be resolved
            return Mono.empty();
        }
        return Mono.just(token.get(0) + "-" + exchange.getRequest().getURI().getPath());
    };
}
If setDenyEmptyKey(true) is set (this is also the filter's default), requests without a resolved key are rejected with HTTP 403; otherwise they are allowed through.
Running the application and repeatedly calling the configured endpoint shows successful requests until the rate limit is hit, after which the gateway returns HTTP 429 (Too Many Requests). Requests without a token receive HTTP 403, demonstrating the effect of the denyEmptyKey setting.
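The three outcomes described above (pass, 429, 403) can be reproduced in a plain-Java sketch that models only the decision logic. It is a simplification under stated assumptions: all requests arrive at the same instant, so no refill happens, and the class RateLimitOutcomeSketch and its handle method are hypothetical names, not part of Spring Cloud Gateway.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model of the gateway's decision: 200 (passed on), 429, or 403. */
final class RateLimitOutcomeSketch {
    private static final int BURST_CAPACITY = 10;  // matches the example route
    private static final int REQUESTED_TOKENS = 5; // matches the example route

    private final boolean denyEmptyKey;
    // Remaining tokens per rate-limiting key; a key not seen yet has a full bucket.
    private final Map<String, Integer> tokensByKey = new HashMap<>();

    RateLimitOutcomeSketch(boolean denyEmptyKey) {
        this.denyEmptyKey = denyEmptyKey;
    }

    /** Returns the HTTP status for a request; token may be null when the caller sent none. */
    int handle(String token, String path) {
        if (token == null || token.isEmpty()) {
            // No key resolved: reject with 403 only when denyEmptyKey is set.
            return denyEmptyKey ? 403 : 200;
        }
        String key = token + "-" + path; // same composition as the example KeyResolver
        int available = tokensByKey.getOrDefault(key, BURST_CAPACITY);
        if (available < REQUESTED_TOKENS) {
            return 429; // Too Many Requests
        }
        tokensByKey.put(key, available - REQUESTED_TOKENS);
        return 200;
    }

    public static void main(String[] args) {
        RateLimitOutcomeSketch gateway = new RateLimitOutcomeSketch(true);
        System.out.println(gateway.handle("abc", "/user/1")); // 200
        System.out.println(gateway.handle("abc", "/user/1")); // 200
        System.out.println(gateway.handle("abc", "/user/1")); // 429
        System.out.println(gateway.handle("abc", "/user/2")); // 200 (different path, different key)
        System.out.println(gateway.handle(null, "/user/1"));  // 403
    }
}
```

Because the key includes the request path, the same token gets a separate bucket per path, which is why the call to /user/2 still succeeds after /user/1 has been throttled.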
The article concludes by summarizing the configuration: a token bucket with a replenish rate of 1 token/s, each request consuming 5 tokens, a burst capacity of 10 tokens, and a key composed of the request token and path, with empty keys denied. The next tutorial will cover dynamic routing with Nacos.
Rare Earth Juejin Tech Community