How to Implement Effective Rate Limiting with Guava and Redis
This article explains why rate limiting is essential for high‑traffic services, describes the token‑bucket algorithm, shows how to use Guava's RateLimiter and Cache for single‑node limits, and presents a Redis‑based solution that works across distributed instances.
Premise
Take the recent Double Eleven "coupon grabbing" event as an example: the activity starts at a fixed time, attracts hundreds of millions of users, but the service interface can only handle a limited number of requests per second. When demand exceeds capacity, we must control user access through rate limiting.
Production Environment
The service interface can provide a maximum of 500 requests per second.
User request volume is unpredictable and may reach 800–1000 QPS or higher.
Requests beyond that limit are rejected by the interface, and the data they carry is lost.
The deployment consists of multiple nodes, but they all call the same service interface.
To ensure service availability, the call rate of the interface must be limited.
What Is Rate Limiting?
Rate limiting controls the inbound and outbound traffic of a system to prevent resource exhaustion and instability caused by excessive traffic.
A rate‑limiting system consists of two main functions: a limiting strategy and a circuit‑breaker strategy. This article focuses on the limiting strategy.
Rate Limiting Algorithms
1. Limit Instantaneous Concurrency
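Guava's RateLimiter provides a token‑bucket implementation with two modes: SmoothBursty, which lets unused permits accumulate so that short bursts can be served, and SmoothWarmingUp, which ramps the rate up gradually over a warm‑up period. Strictly speaking, RateLimiter limits the rate at which permits are handed out rather than the number of requests in flight at once.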
2. Limit Maximum Requests Within a Time Window
This approach restricts the number of calls to an interface per second, minute, or day. For example, a product‑detail service may be called by many other systems, and we need to cap its QPS to avoid overload. One simple implementation uses Guava's Cache to store counters with a 2‑second expiration, using the current second as the key.
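The per-second counter described above can be sketched as follows. This is a minimal illustration, not the original author's code: it swaps Guava's Cache for a plain ConcurrentHashMap with manual cleanup so that it runs on the JDK alone, and the class name WindowCounterLimiter and its method signatures are assumptions made for the example. With Guava, a cache built with CacheBuilder.newBuilder().expireAfterWrite(2, TimeUnit.SECONDS) would take over the cleanup step.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical name; a JDK-only stand-in for the Guava Cache approach in the text.
final class WindowCounterLimiter {
    private final long limitPerSecond;
    // One counter per epoch second; the key is the current second.
    private final ConcurrentHashMap<Long, AtomicLong> counters = new ConcurrentHashMap<>();

    WindowCounterLimiter(long limitPerSecond) {
        this.limitPerSecond = limitPerSecond;
    }

    /** Returns true if a call made at nowMillis is within the limit for its second. */
    boolean tryAcquire(long nowMillis) {
        long second = nowMillis / 1000;
        // Drop counters older than the previous second (mimics a 2-second expiration).
        counters.keySet().removeIf(key -> key < second - 1);
        AtomicLong counter = counters.computeIfAbsent(second, key -> new AtomicLong());
        return counter.incrementAndGet() <= limitPerSecond;
    }

    /** Convenience overload that uses the wall clock. */
    boolean tryAcquire() {
        return tryAcquire(System.currentTimeMillis());
    }

    public static void main(String[] args) {
        WindowCounterLimiter limiter = new WindowCounterLimiter(2);
        // Three calls inside the same second, then one in the next second.
        long[] timestamps = {1000L, 1500L, 1999L, 2000L};
        for (long t : timestamps) {
            System.out.println("t=" + t + "ms allowed=" + limiter.tryAcquire(t));
        }
    }
}
```

Note that a fixed window like this can admit up to twice the limit across a window boundary (a full quota at the end of one second and another at the start of the next), which is one reason the token bucket below is often preferred.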
3. Token Bucket
Algorithm description:
If the configured average token generation rate is r, a token is added to the bucket every 1/r seconds.
The bucket can hold up to b tokens; excess tokens are discarded.
Incoming traffic consumes tokens at rate v. Requests that obtain a token are allowed; those that cannot are rejected (circuit‑breaker logic).
Properties
Long‑term traffic rate stabilizes at the token generation rate r.
The bucket's capacity smooths burst traffic.
Advantages: Traffic becomes smoother and the system can absorb short bursts.
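The algorithm above can be sketched in a few lines of plain Java. This is an illustrative sketch under stated assumptions rather than Guava's implementation: the class name TokenBucket is hypothetical, tokens are refilled lazily on each call instead of by a timer firing every 1/r seconds (the two are equivalent for the caller), and the bucket is assumed to start full.

```java
// Hypothetical name; a minimal single-JVM token bucket, not Guava's RateLimiter.
final class TokenBucket {
    private final double ratePerSecond; // r: tokens added per second
    private final double capacity;      // b: maximum tokens the bucket can hold
    private double tokens;
    private long lastRefillNanos;

    TokenBucket(double ratePerSecond, double capacity, long nowNanos) {
        this.ratePerSecond = ratePerSecond;
        this.capacity = capacity;
        this.tokens = capacity;          // assumption: start full, so a burst of b is allowed
        this.lastRefillNanos = nowNanos;
    }

    /** Takes one token if available; returns false when the request should be rejected. */
    synchronized boolean tryAcquire(long nowNanos) {
        double elapsedSeconds = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        if (elapsedSeconds > 0) {
            // Add the tokens earned since the last call; anything above b is discarded.
            tokens = Math.min(capacity, tokens + elapsedSeconds * ratePerSecond);
            lastRefillNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        TokenBucket bucket = new TokenBucket(2.0, 3.0, start); // r = 2 tokens/s, b = 3
        int allowed = 0;
        for (int i = 0; i < 5; i++) {
            if (bucket.tryAcquire(start)) {
                allowed++;
            }
        }
        // All five requests arrive at the same instant, so only the b stored tokens are available.
        System.out.println("allowed in initial burst: " + allowed);
    }
}
```

Passing the timestamp in explicitly keeps the sketch deterministic; production code would more likely read System.nanoTime() inside tryAcquire.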
4. Google Guava RateLimiter
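The quickest way is to use Guava's RateLimiter, which also implements the token‑bucket algorithm. However, it only limits traffic within a single JVM. In a distributed system each node enforces its own limit independently, so the total QPS reaching the shared interface can be the configured rate multiplied by the number of nodes. For example, four nodes each limited to 500 QPS could together send up to 2,000 QPS. Used on its own, this approach is therefore unsuitable for enforcing a global limit in a distributed environment.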
5. Redis‑Based Implementation
Store two keys in Redis: one for timing the current window and one for counting requests. Each request increments the counter; the request is processed only if the counter is still within the threshold for the current timer window. Because every instance reads and updates the same timer and counter in Redis, the limit applies to the deployment as a whole rather than to each node separately.
The actual code is omitted for company confidentiality.
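Since the production code is not shown, the following is only a sketch of how the two-key scheme might look. Everything here is an assumption made for illustration: the CounterStore interface is a hypothetical stand-in for a Redis client, with methods modeled on the Redis commands SET with NX and PX, SET, and INCR; InMemoryStore replaces a real Redis server so the example runs on the JDK alone; and the key names and class names are invented.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the timer-key plus counter-key scheme; not the author's code.
final class TwoKeyRateLimiter {
    private final CounterStore store;
    private final long limit;
    private final long windowMillis;
    private final String timerKey = "rate:timer";     // assumed key names
    private final String counterKey = "rate:counter";

    TwoKeyRateLimiter(CounterStore store, long limit, long windowMillis) {
        this.store = store;
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    boolean tryAcquire() {
        // If the timer key is missing, a new window starts: recreate it and reset the counter.
        if (store.setIfAbsent(timerKey, "1", windowMillis)) {
            store.set(counterKey, "0");
        }
        // Count this request and allow it only while the window's total is within the limit.
        return store.incr(counterKey) <= limit;
    }

    public static void main(String[] args) {
        InMemoryStore redisStandIn = new InMemoryStore();
        TwoKeyRateLimiter limiter = new TwoKeyRateLimiter(redisStandIn, 3, 1000);
        for (int i = 1; i <= 4; i++) {
            System.out.println("request " + i + " allowed=" + limiter.tryAcquire());
        }
        redisStandIn.advance(1000); // the timer key expires, so a new window begins
        System.out.println("after window allowed=" + limiter.tryAcquire());
    }
}

/** Hypothetical minimal view of the Redis commands the sketch needs. */
interface CounterStore {
    /** Modeled on SET key value NX PX ttlMillis: true only if the key did not exist. */
    boolean setIfAbsent(String key, String value, long ttlMillis);
    /** Modeled on SET key value. */
    void set(String key, String value);
    /** Modeled on INCR key: returns the value after incrementing. */
    long incr(String key);
}

/** In-memory stand-in for Redis, for demonstration only; time is advanced manually. */
final class InMemoryStore implements CounterStore {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expiresAt = new HashMap<>();
    private long nowMillis = 0;

    synchronized void advance(long millis) {
        nowMillis += millis;
    }

    private void evictIfExpired(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline != null && deadline <= nowMillis) {
            values.remove(key);
            expiresAt.remove(key);
        }
    }

    @Override
    public synchronized boolean setIfAbsent(String key, String value, long ttlMillis) {
        evictIfExpired(key);
        if (values.containsKey(key)) {
            return false;
        }
        values.put(key, value);
        expiresAt.put(key, nowMillis + ttlMillis);
        return true;
    }

    @Override
    public synchronized void set(String key, String value) {
        values.put(key, value);
        expiresAt.remove(key); // a plain SET clears any existing TTL
    }

    @Override
    public synchronized long incr(String key) {
        evictIfExpired(key);
        long next = Long.parseLong(values.getOrDefault(key, "0")) + 1;
        values.put(key, Long.toString(next));
        return next;
    }
}
```

Against a real Redis server the three commands in tryAcquire are separate round trips, so two instances can interleave at a window boundary and a few requests may be miscounted. Running the whole check as a single Lua script or transaction would likely be needed if the limit has to be exact.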
References
Design of a Redis‑based rate‑limiting system: https://www.zybuluo.com/kay2/note/949160
Example implementations can be found at: https://github.com/wukq/rate-limiter
Source: http://www.54tianzhisheng.cn/2017/11/18/flow-control/
21CTO
