Rate Limiting Strategies and Guava RateLimiter for High Concurrency Traffic
This article explains the concept of high traffic, compares common mitigation techniques such as caching, degradation and rate limiting, and then details four classic rate‑limiting algorithms—counter, sliding window, leaky bucket and token bucket—followed by a practical Guava RateLimiter example and a brief note on distributed scenarios.
In real projects the author has seen peaks above 50,000 QPS, and more than 100,000 QPS under load testing; that experience motivates the discussion of high‑concurrency traffic control below.
What is considered "high traffic"? It is not a fixed number; any request volume that stresses the system and degrades performance can be regarded as high traffic.
Common ways to handle high traffic: caching (bringing data closer to the program to reduce DB hits), degradation (downgrading non‑core services), and rate limiting (restricting the number of requests in a given time window to protect the system).
When caching and degradation cannot solve the problem—e.g., during e‑commerce flash sales where write operations are core and cannot be downgraded—rate limiting becomes essential.
Common Rate‑Limiting Techniques
Typical algorithms include counters, sliding windows, leaky buckets, and token buckets.
Counter
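A simple algorithm that counts requests within a fixed time interval, rejects requests once the count reaches the limit, and resets the counter at the interval boundary. The author includes an illustration and notes the "time‑boundary" issue: requests clustered just before and just after a reset fall into different intervals, so up to twice the configured limit can pass in a short span and overload the system.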
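The counter can be sketched in a few lines of Java. This is a minimal illustration, not the article's code: the class name FixedWindowCounter, its parameters, and the choice to pass the current time in explicitly (so the behaviour is easy to reproduce) are all assumptions made here.

```java
/** Hypothetical fixed-window counter; names and parameters are illustrative. */
final class FixedWindowCounter {
    private final int limit;          // max requests allowed per window
    private final long windowMillis;  // window length in milliseconds
    private long currentWindow = -1;  // index of the window being counted
    private int count;                // requests admitted in that window

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** nowMillis is supplied by the caller, e.g. System.currentTimeMillis(). */
    synchronized boolean tryAcquire(long nowMillis) {
        long window = nowMillis / windowMillis;
        if (window != currentWindow) {
            // A new interval has started: reset the counter.
            currentWindow = window;
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

With a limit of 2 per 1,000 ms, calls at 900 ms and 950 ms are admitted, and so are calls at 1,000 ms and 1,010 ms, because the counter has just reset. Four requests pass within about 110 ms, which is the boundary problem in miniature.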
Sliding Window
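Improves on the counter by dividing the time window into smaller fixed slots, each with its own count. The window slides forward one slot at a time and the limit applies to the sum over the slots it currently covers, which mitigates the boundary problem. The article provides a diagram and explains that the number of slots determines the algorithm's precision: more slots give a smoother, more accurate window.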
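One possible way to sketch the sliding window in Java is a ring of per-slot counters. The class name SlidingWindowLimiter and its parameters are hypothetical, and the sketch assumes that windowMillis divides evenly by the slot count and that timestamps are non-negative and non-decreasing.

```java
import java.util.Arrays;

/** Hypothetical slot-based sliding window; names are illustrative. */
final class SlidingWindowLimiter {
    private final int limit;        // max requests across the whole window
    private final long slotMillis;  // length of one slot
    private final int[] counts;     // admitted requests per ring position
    private final long[] slotIds;   // absolute slot number held at each position

    SlidingWindowLimiter(int limit, long windowMillis, int slots) {
        this.limit = limit;
        this.slotMillis = windowMillis / slots;
        this.counts = new int[slots];
        this.slotIds = new long[slots];
        Arrays.fill(slotIds, -1L);
    }

    synchronized boolean tryAcquire(long nowMillis) {
        long currentSlot = nowMillis / slotMillis;
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            // Only slots inside the last `slots` slots still count.
            if (slotIds[i] > currentSlot - counts.length) {
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        int idx = (int) (currentSlot % counts.length);
        if (slotIds[idx] != currentSlot) {
            // Ring position holds an expired slot: reuse it.
            slotIds[idx] = currentSlot;
            counts[idx] = 0;
        }
        counts[idx]++;
        return true;
    }
}
```

With a limit of 2, a 1,000 ms window and 10 slots, requests admitted at 900 ms and 950 ms still count at 1,000 ms, so a burst straddling the old interval boundary is rejected. They stop counting only once their 100 ms slot has left the window, which is why precision depends on the number of slots.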
Leaky Bucket
Uses a fixed‑capacity bucket where incoming request rate is variable but the outflow rate is constant; excess requests overflow. An illustration and code example are shown.
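As a rough sketch of the idea, the bucket can be modelled as a water level that drains at a constant rate, with each request adding one unit. This is the "meter" variant of the leaky bucket, chosen here for brevity; it decides admit or overflow but does not itself pace execution, whereas a queue-based variant would release queued requests at the constant outflow rate. The class name LeakyBucket and its parameters are assumptions, not the article's code.

```java
/** Hypothetical leaky bucket (meter variant); names are illustrative. */
final class LeakyBucket {
    private final double capacity;       // how much water the bucket holds
    private final double leakPerSecond;  // constant outflow rate
    private double water;                // current water level
    private long lastLeakMillis;         // when the level was last updated

    LeakyBucket(double capacity, double leakPerSecond, long startMillis) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.lastLeakMillis = startMillis;
    }

    /** nowMillis is supplied by the caller and must not go backwards. */
    synchronized boolean tryAcquire(long nowMillis) {
        // Drain the water that leaked out since the last call.
        double leaked = (nowMillis - lastLeakMillis) * leakPerSecond / 1000.0;
        water = Math.max(0.0, water - leaked);
        lastLeakMillis = nowMillis;
        if (water + 1.0 > capacity) {
            return false; // the bucket would overflow: reject
        }
        water += 1.0;
        return true;
    }
}
```

With a capacity of 2 and a leak rate of 1 per second, two requests at time 0 fill the bucket and a third overflows; 1.5 seconds later enough water has drained for exactly one more request.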
Token Bucket
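Generates tokens at a constant rate and stores them in a bucket of fixed capacity; each request must take a token, and requests that find the bucket empty are rejected. Because saved tokens can be spent all at once, short bursts are allowed while the long‑run rate stays controlled. Diagrams and code examples are provided, and the author explains that both token‑bucket rejections and leaky‑bucket overflows sacrifice a small share of requests to protect the majority of traffic.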
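A token bucket can be sketched with lazy refill: rather than a background thread adding tokens, each call computes how many tokens have accrued since the previous call. The class name TokenBucket, its parameters, and the decision to start with a full bucket are assumptions for illustration only.

```java
/** Hypothetical token bucket with lazy refill; names are illustrative. */
final class TokenBucket {
    private final double capacity;         // max tokens the bucket can store
    private final double refillPerSecond;  // constant token generation rate
    private double tokens;                 // tokens currently available
    private long lastRefillMillis;         // when tokens were last topped up

    TokenBucket(double capacity, double refillPerSecond, long startMillis) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity; // assumption: start full so an initial burst is allowed
        this.lastRefillMillis = startMillis;
    }

    /** nowMillis is supplied by the caller and must not go backwards. */
    synchronized boolean tryAcquire(long nowMillis) {
        // Add the tokens generated since the last call, capped at capacity.
        double refill = (nowMillis - lastRefillMillis) * refillPerSecond / 1000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillMillis = nowMillis;
        if (tokens < 1.0) {
            return false; // no whole token available: reject
        }
        tokens -= 1.0;
        return true;
    }
}
```

With a capacity of 3 and a refill rate of 1 per second, three back-to-back requests succeed as a burst and the fourth is rejected; after that, requests are admitted at roughly one per second.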
Rate‑Limiting Tool: Guava RateLimiter
Guava offers a ready‑made API based on the token‑bucket algorithm. RateLimiter.create() takes the desired number of permits per second (the target QPS), and callers then obtain permits via the blocking acquire() or the non‑blocking tryAcquire(). A code snippet image demonstrates usage.
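Since the original snippet is only available as an image, the following is a stand-in sketch rather than the author's code. It assumes Guava is on the classpath; the wrapper class GuavaRateLimiterDemo, its method names, and the rate of 5 permits per second are all hypothetical choices made for illustration.

```java
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper around Guava's RateLimiter; names are illustrative. */
final class GuavaRateLimiterDemo {
    // Assumed rate of 5 permits per second, not a value from the article.
    private final RateLimiter limiter = RateLimiter.create(5.0);

    /** Non-blocking: rejects immediately when no permit is available. */
    boolean handleOrReject(Runnable work) {
        if (!limiter.tryAcquire()) {
            return false;
        }
        work.run();
        return true;
    }

    /** Waits up to timeoutMillis for a permit before giving up. */
    boolean handleWithWait(Runnable work, long timeoutMillis) {
        if (!limiter.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false;
        }
        work.run();
        return true;
    }
}
```

Choosing between the two styles is a design decision: rejecting at once suits user-facing endpoints that should fail fast, while a bounded wait (or the blocking acquire()) suits background work that can tolerate a short delay.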
Rate Limiting in Distributed Scenarios
The discussed methods are primarily for single‑machine environments; distributed rate limiting often combines technologies such as Nginx+Lua or Redis+Lua, but this article focuses on single‑node solutions.
In conclusion, the author wraps up the discussion on traffic control techniques.
Selected Java Interview Questions
