Rate Limiting Strategies and Guava RateLimiter for High Concurrency Traffic

This article explains the concept of high traffic, compares common mitigation techniques such as caching, degradation and rate limiting, and then details four classic rate‑limiting algorithms—counter, sliding window, leaky bucket and token bucket—followed by a practical Guava RateLimiter example and a brief note on distributed scenarios.

Selected Java Interview Questions

Jun 28, 2020

In production the author has seen peaks above 50,000 QPS, and more than 100,000 QPS under load testing. That experience motivates this discussion of how to control high‑concurrency traffic.

What is considered "high traffic"? It is not a fixed number; any request volume that stresses the system and degrades performance can be regarded as high traffic.

Common ways to handle high traffic: caching (bringing data closer to the program to reduce DB hits), degradation (downgrading non‑core services), and rate limiting (restricting the number of requests in a given time window to protect the system).

When caching and degradation cannot solve the problem—e.g., during e‑commerce flash sales where write operations are core and cannot be downgraded—rate limiting becomes essential.

Common Rate‑Limiting Techniques

Typical algorithms include counters, sliding windows, leaky buckets, and token buckets.

Counter

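A simple algorithm that counts requests within a fixed time interval, rejects requests once the count reaches the limit, and resets the counter at the interval boundary. The author includes an illustration and notes the "time‑boundary" issue: a burst just before the reset followed by another burst just after it can let through up to twice the limit within a very short span, which can overload the system.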
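The boundary issue is easier to see in code. The following is a minimal sketch of a fixed-window counter, not the original article's implementation: the class name `FixedWindowCounter` and the explicit `nowMillis` parameter (used instead of the system clock so that behaviour is deterministic) are assumptions made for illustration.

```java
// Illustrative fixed-window counter; names and the explicit clock parameter are assumptions.
final class FixedWindowCounter {
    private final int limit;          // max requests allowed per window
    private final long windowMillis;  // window length in milliseconds
    private long currentWindow = -1;  // index of the window being counted
    private int count;                // requests admitted in the current window

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Returns true if the request at time nowMillis (>= 0) is admitted.
    synchronized boolean tryAcquire(long nowMillis) {
        long window = nowMillis / windowMillis;
        if (window != currentWindow) {
            // A new interval has started: reset the counter.
            currentWindow = window;
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

With a limit of 2 per 1,000 ms, two requests at 900 ms and 990 ms are admitted, and so are two more at 1,000 ms and 1,010 ms, because the counter has just been reset. Four requests pass within roughly 100 ms, which is the boundary problem described above.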

Sliding Window

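Improves on the counter by dividing the interval into smaller fixed slots and counting requests over the most recent slots as the window slides forward, which mitigates the boundary problem. The article provides a diagram and explains that the number of slots determines the algorithm’s precision: more slots give a smoother limit at the cost of more bookkeeping.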
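A slotted sliding window can be sketched as follows. This is one possible implementation under stated assumptions, not the article's own code: the name `SlidingWindowLimiter` is hypothetical, `windowMillis` is assumed to be evenly divisible by the slot count, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.Arrays;

// Illustrative slotted sliding window; names and the explicit clock parameter are assumptions.
final class SlidingWindowLimiter {
    private final int limit;        // max requests across the whole window
    private final long slotMillis;  // duration of one slot
    private final int[] counts;     // requests admitted per slot (ring buffer)
    private final long[] slotIds;   // absolute slot index held by each ring cell

    SlidingWindowLimiter(int limit, long windowMillis, int slots) {
        this.limit = limit;
        this.slotMillis = windowMillis / slots; // assumes an even division
        this.counts = new int[slots];
        this.slotIds = new long[slots];
        Arrays.fill(slotIds, -1L);
    }

    // Returns true if the request at time nowMillis (>= 0) is admitted.
    synchronized boolean tryAcquire(long nowMillis) {
        long currentSlot = nowMillis / slotMillis;
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            // Only cells whose slot lies within the last counts.length slots are live.
            if (slotIds[i] > currentSlot - counts.length) {
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        int idx = (int) (currentSlot % counts.length);
        if (slotIds[idx] != currentSlot) {
            // The cell still holds an expired slot: recycle it.
            slotIds[idx] = currentSlot;
            counts[idx] = 0;
        }
        counts[idx]++;
        return true;
    }
}
```

With a limit of 2 per 1,000 ms split into 10 slots, two requests admitted at 900 ms and 990 ms still count against a request at 1,000 ms, so it is rejected. Capacity only frees up at 1,900 ms, when the slot that held them slides out of the window.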

Leaky Bucket

Uses a fixed‑capacity bucket: requests may arrive at any rate, but they leave the bucket at a constant rate; once the bucket is full, excess requests overflow and are discarded. An illustration and code example are shown.
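The article's own leaky-bucket code is not reproduced here, so the following is a minimal sketch under stated assumptions. It implements the "meter" variant, which tracks a water level that drains at a constant rate and rejects requests that would overflow the bucket; a queue-based variant that delays requests is also common. The class name `LeakyBucket` and the explicit time parameters are hypothetical.

```java
// Illustrative leaky bucket (meter variant); names and the explicit clock are assumptions.
final class LeakyBucket {
    private final double capacity;       // how much "water" the bucket can hold
    private final double leakPerSecond;  // constant outflow rate
    private double water;                // current level
    private long lastLeakMillis;         // last time the level was updated

    LeakyBucket(int capacity, double leakPerSecond, long startMillis) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.lastLeakMillis = startMillis;
    }

    // Returns true if the request at time nowMillis fits into the bucket.
    synchronized boolean tryAcquire(long nowMillis) {
        // Drain the bucket for the time elapsed since the last update.
        double leaked = (nowMillis - lastLeakMillis) * leakPerSecond / 1000.0;
        water = Math.max(0.0, water - leaked);
        lastLeakMillis = nowMillis;
        if (water + 1.0 <= capacity) {
            water += 1.0; // the request adds one unit of water
            return true;
        }
        return false;     // bucket is full: the request overflows
    }
}
```

With a capacity of 2 and a leak rate of 1 per second, two simultaneous requests fill the bucket and a third overflows; a further request is admitted only after enough water has drained, one second later in this configuration.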

Token Bucket

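Generates tokens at a constant rate into a bucket of fixed capacity; each request must take a token, and a request that finds the bucket empty is rejected. Because unused tokens accumulate up to the capacity, short bursts are allowed while the overall rate stays controlled. Diagrams and code examples are provided, and the author explains that both token‑bucket rejections and leaky‑bucket overflows give up a small share of requests in order to protect the majority of traffic.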
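A token bucket can be sketched in a few lines. This is an illustrative version, not the article's code: the class name `TokenBucket`, the choice to start with a full bucket, and the explicit time parameters are all assumptions. Tokens are refilled lazily on each call rather than by a background thread.

```java
// Illustrative token bucket with lazy refill; names and the explicit clock are assumptions.
final class TokenBucket {
    private final double capacity;         // max tokens the bucket can hold
    private final double refillPerSecond;  // constant token generation rate
    private double tokens;                 // tokens currently available
    private long lastRefillMillis;         // last time tokens were added

    TokenBucket(int capacity, double refillPerSecond, long startMillis) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;            // assumption: the bucket starts full
        this.lastRefillMillis = startMillis;
    }

    // Returns true if a token was available at time nowMillis.
    synchronized boolean tryAcquire(long nowMillis) {
        // Add the tokens generated since the last call, capped at capacity.
        double refill = (nowMillis - lastRefillMillis) * refillPerSecond / 1000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillMillis = nowMillis;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

With a capacity of 3 and a refill rate of 1 per second, a burst of three requests passes immediately, the fourth is rejected, and one more request is admitted after each further second. After a long idle period the bucket holds at most 3 tokens, so bursts stay bounded.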

Rate‑Limiting Tool: Guava RateLimiter

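Guava offers a ready‑made API based on the token‑bucket algorithm. RateLimiter.create() takes the desired number of permits per second (the target QPS), the limiter replenishes permits at that rate, and callers obtain them via tryAcquire(), which returns false immediately when no permit is available. A code snippet image demonstrates usage.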
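Since the original snippet is only available as an image, here is a minimal usage sketch. It assumes Guava is on the classpath; the class name `GuavaRateLimiterDemo`, the `handle` method, and the limit of 5 permits per second are hypothetical choices for illustration, not values from the article.

```java
import com.google.common.util.concurrent.RateLimiter;

// Illustrative use of Guava's RateLimiter; class, method, and rate are assumptions.
final class GuavaRateLimiterDemo {
    // Hypothetical limit: 5 permits per second.
    private static final RateLimiter LIMITER = RateLimiter.create(5.0);

    // Processes the request if a permit is available, otherwise rejects it.
    static String handle(String request) {
        if (!LIMITER.tryAcquire()) {
            return "rejected: " + request;   // no permit: fail fast
        }
        return "processed: " + request;      // permit obtained
    }

    public static void main(String[] args) {
        // Fire ten requests back to back; most are expected to be rejected.
        for (int i = 0; i < 10; i++) {
            System.out.println(handle("req-" + i));
        }
    }
}
```

tryAcquire() suits flash-sale style traffic where excess requests should fail fast. When callers would rather wait for a permit than be rejected, RateLimiter also provides the blocking acquire() method.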

Rate Limiting in Distributed Scenarios

The discussed methods are primarily for single‑machine environments; distributed rate limiting often combines technologies such as Nginx+Lua or Redis+Lua, but this article focuses on single‑node solutions.

In conclusion, the author wraps up the discussion on traffic control techniques.

