Mastering Rate Limiting: Algorithms, Strategies, and Practical Guava & Nginx Implementations

This article explains why rate limiting is essential for system stability, compares it with caching and degradation, details three core algorithms—counter, leaky bucket, and token bucket—and provides concrete Guava, Java, and Nginx + Lua code examples for implementing both local and distributed throttling.

IT Architects Alliance

Why Rate Limiting Is Needed

Sudden traffic spikes can overwhelm a service, causing degraded performance or complete outage. Limiting the number of concurrent requests protects system stability and ensures a predictable user experience.

Rate‑Limiting Concepts

Circuit Breaker

When a service detects unrecoverable errors, it opens a circuit breaker to reject incoming traffic. Once the backend recovers, the breaker closes and normal traffic resumes. Common implementations include Hystrix and Alibaba Sentinel.
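The open/close cycle described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name SimpleCircuitBreaker, the consecutive-failure threshold, and the cool-down period are hypothetical, and it does not reflect how Hystrix or Sentinel are implemented. Time is passed in as a parameter so the behavior is easy to trace.

```java
/** Hypothetical minimal circuit breaker; not the Hystrix or Sentinel implementation. */
final class SimpleCircuitBreaker {
    private final int failureThreshold;   // consecutive failures that open the breaker
    private final long coolDownMillis;    // how long the breaker rejects traffic once open
    private int consecutiveFailures = 0;
    private long openedAtMillis = -1;     // -1 means the breaker is closed

    SimpleCircuitBreaker(int failureThreshold, long coolDownMillis) {
        this.failureThreshold = failureThreshold;
        this.coolDownMillis = coolDownMillis;
    }

    /** Returns true if a call may be attempted at the given time. */
    synchronized boolean allowRequest(long nowMillis) {
        if (openedAtMillis < 0) {
            return true;                  // closed: traffic flows normally
        }
        // open: reject until the cool-down has elapsed, then allow trial calls again
        return nowMillis - openedAtMillis >= coolDownMillis;
    }

    /** Records a successful call: closes the breaker and resets the failure count. */
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        openedAtMillis = -1;
    }

    /** Records a failed call: opens (or re-opens) the breaker once the threshold is reached. */
    synchronized void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAtMillis = nowMillis;
        }
    }
}
```

A caller would check allowRequest before invoking the backend, then report the outcome with recordSuccess or recordFailure.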

Service Degradation

Non‑critical features (e.g., product reviews, loyalty points) are temporarily disabled during traffic surges, freeing resources for core functionality while returning graceful fallback data.

Delay Handling (Buffering)

Requests are placed into a buffer (e.g., a queue) and processed later, reducing immediate pressure on the backend. The leaky-bucket algorithm applies this principle directly, and a token-bucket limiter applies it whenever callers wait for a token instead of being rejected.

Privilege Handling

Users are classified into priority groups; high‑priority traffic receives preferential treatment while lower‑priority traffic may be delayed or rejected.

Cache vs. Degradation vs. Rate Limiting

Cache increases throughput by storing frequently accessed data. Degradation disables failing components and provides fallback responses. Rate limiting caps request rates when caching and degradation are insufficient, protecting services before they become unavailable.

Rate‑Limiting Algorithms

Counter (Fixed‑Window) Algorithm

A simple method that defines a maximum number of requests per time window (e.g., 100 requests per minute). A counter increments with each request; if the count exceeds the limit before the window expires, the request is rejected. The counter resets when the window ends.

Counter algorithm illustration
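A minimal Java sketch of the fixed-window counter described above. The class name FixedWindowLimiter and the choice to pass the current time as a parameter are assumptions made for illustration; a production limiter would read a clock itself.

```java
/** Fixed-window counter: admits at most `limit` requests per window. */
final class FixedWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private long currentWindow = -1;  // index of the window being counted
    private int count = 0;            // requests admitted in that window

    FixedWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request at nowMillis (assumed non-negative) is admitted. */
    synchronized boolean tryAcquire(long nowMillis) {
        long window = nowMillis / windowMillis;
        if (window != currentWindow) {   // a new window has started: reset the counter
            currentWindow = window;
            count = 0;
        }
        if (count >= limit) {
            return false;                // limit reached before the window expired
        }
        count++;
        return true;
    }
}
```

With a limit of 3 per 1000 ms, three requests at 900 to 990 ms and three more at 1000 to 1020 ms are all admitted, so six requests pass within roughly 120 ms. This boundary burst is the main weakness of fixed windows.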

Leaky Bucket Algorithm

Incoming requests enter a bucket that leaks at a constant rate. If the bucket is full, excess requests are dropped, smoothing burst traffic and enforcing a steady output rate.

Leaky bucket illustration
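One way to sketch the leaky bucket in Java is as a "meter": the bucket tracks a water level that drains at a constant rate, and each admitted request adds one unit. The class name LeakyBucketLimiter and its parameters are hypothetical. This variant only decides whether to admit or drop; a queue-based variant would instead hold requests and release them at the leak rate.

```java
/** Leaky bucket used as a meter: each admitted request adds one unit of water. */
final class LeakyBucketLimiter {
    private final double capacity;       // maximum water the bucket can hold
    private final double leakPerSecond;  // constant outflow rate
    private double water = 0.0;          // current level
    private long lastLeakMillis;         // last time the level was updated

    LeakyBucketLimiter(double capacity, double leakPerSecond, long startMillis) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.lastLeakMillis = startMillis;
    }

    /** Returns true if the request fits in the bucket, false if it is dropped. */
    synchronized boolean tryAcquire(long nowMillis) {
        // drain the water that leaked out since the previous call
        double leaked = (nowMillis - lastLeakMillis) * leakPerSecond / 1000.0;
        water = Math.max(0.0, water - leaked);
        lastLeakMillis = nowMillis;
        if (water + 1.0 > capacity) {
            return false;                // bucket full: drop the excess request
        }
        water += 1.0;
        return true;
    }
}
```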

Token Bucket Algorithm

A bucket holds tokens that are added at a fixed rate. A request proceeds only if a token is available; otherwise it is rejected or made to wait. Unused tokens accumulate up to the bucket's capacity, allowing short bursts while maintaining an average rate.

Token bucket illustration
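The token bucket can be sketched in Java as follows. The class name TokenBucketLimiter, the capacity parameter, and the decision to start with a full bucket are assumptions for illustration, not details from the article. Tokens are refilled lazily on each call rather than by a background thread.

```java
/** Token bucket: tokens refill at a fixed rate up to `capacity`; each request takes one. */
final class TokenBucketLimiter {
    private final double capacity;         // maximum tokens that can accumulate
    private final double refillPerSecond;  // token generation rate
    private double tokens;                 // tokens currently available
    private long lastRefillMillis;         // last time tokens were topped up

    TokenBucketLimiter(double capacity, double refillPerSecond, long startMillis) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;            // assumption: the bucket starts full
        this.lastRefillMillis = startMillis;
    }

    /** Returns true and consumes a token if one is available, otherwise false. */
    synchronized boolean tryAcquire(long nowMillis) {
        // add the tokens generated since the previous call, capped at capacity
        double refill = (nowMillis - lastRefillMillis) * refillPerSecond / 1000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillMillis = nowMillis;
        if (tokens < 1.0) {
            return false;                  // no token available: reject
        }
        tokens -= 1.0;
        return true;
    }
}
```

Compared with the leaky bucket, an idle period here builds up credit: after a long pause the caller can send up to capacity requests back to back before being held to the refill rate again.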

Concurrency Limiting

Limit total concurrency (e.g., database connection pool, thread pool).

Limit instantaneous connections (e.g., Nginx limit_conn).

Limit average request rate within a time window (e.g., Guava RateLimiter, Nginx limit_req).

Limit remote‑API call rates or message‑queue consumption rates.

Adjust limits dynamically based on CPU, memory, or network load.

Proper concurrency limiting prevents crashes during traffic spikes.
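The first item, capping total concurrency, can be sketched with the JDK's java.util.concurrent.Semaphore. The wrapper class ConcurrencyLimiter and its fallback parameter are hypothetical names introduced for this example; connection pools and thread pools enforce the same idea internally.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Caps the number of tasks running at the same time. */
final class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the task if a slot is free; otherwise returns the fallback immediately. */
    <T> T call(Supplier<T> task, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback;          // at capacity: reject instead of queueing
        }
        try {
            return task.get();
        } finally {
            permits.release();        // always free the slot, even if the task throws
        }
    }

    /** Number of slots currently free. */
    int availableSlots() {
        return permits.availablePermits();
    }
}
```

Returning a fallback rather than blocking keeps callers from piling up behind a saturated resource; swapping tryAcquire() for acquire() would queue them instead.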

Interface Limiting

Fixed‑window counting (counter algorithm) for total request count per interval.

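Sliding-window counting for finer-grained control: the interval is divided into smaller slots (down to milliseconds or nanoseconds), and the limit is checked against the sum of the slots that cover the most recent interval. This smooths bursts at window boundaries at the cost of higher memory usage.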

Sliding windows provide more accurate throttling by continuously updating counts across sub‑intervals.
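A slot-based sliding window might look like the Java sketch below. The class name SlidingWindowLimiter and the ring-buffer layout are assumptions for illustration; the sketch also assumes non-negative timestamps and a window length that divides evenly by the slot count.

```java
import java.util.Arrays;

/** Sliding-window counter: the window is split into slots kept in a ring buffer. */
final class SlidingWindowLimiter {
    private final int limit;
    private final int slotCount;
    private final long slotMillis;
    private final int[] counts;    // requests admitted in each ring entry
    private final long[] slotIds;  // absolute slot number stored in each ring entry

    SlidingWindowLimiter(int limit, long windowMillis, int slotCount) {
        this.limit = limit;
        this.slotCount = slotCount;
        this.slotMillis = windowMillis / slotCount;
        this.counts = new int[slotCount];
        this.slotIds = new long[slotCount];
        Arrays.fill(slotIds, -1L);     // -1 marks an entry that has never been used
    }

    /** Returns true if the request at nowMillis keeps the window total within the limit. */
    synchronized boolean tryAcquire(long nowMillis) {
        long currentSlot = nowMillis / slotMillis;
        int index = (int) (currentSlot % slotCount);
        if (slotIds[index] != currentSlot) {   // entry holds an expired slot: reuse it
            slotIds[index] = currentSlot;
            counts[index] = 0;
        }
        int total = 0;
        for (int i = 0; i < slotCount; i++) {
            if (slotIds[i] > currentSlot - slotCount) {  // slot is still inside the window
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        counts[index]++;
        return true;
    }
}
```

With a limit of 3, a 1000 ms window, and 10 slots, three requests admitted at 900 to 990 ms cause a request at 1000 ms to be rejected, whereas a fixed-window counter would have started a fresh count at that point.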

Implementation Examples

Guava RateLimiter (Java)

Dependency: com.google.guava:guava:28.1-jre

Basic token-bucket usage

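import com.google.common.util.concurrent.RateLimiter;

public class RateLimiterDemo { // wrapper class; the name is arbitrary
    public static void main(String[] args) throws InterruptedException {
        RateLimiter limiter = RateLimiter.create(2); // 2 permits (tokens) per second
        System.out.println(limiter.acquire()); // blocks until a permit is available; prints seconds waited
        Thread.sleep(2000); // idle time lets unused permits accumulate
        System.out.println(limiter.acquire());
        System.out.println(limiter.acquire());
        System.out.println(limiter.acquire());
        System.out.println(limiter.acquire());
    }
}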

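RateLimiter.create(2) sets the rate at which permits (tokens) are generated: two per second. acquire() blocks until a permit is available and returns the time spent waiting, in seconds. Unused permits accumulate while the limiter is idle (this bursty mode stores at most one second's worth, two permits here), enabling short bursts.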

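Smooth warm-up mode (the issue rate ramps up gradually from a cold start and reaches the configured rate at the end of the warm-up period)

RateLimiter limiter = RateLimiter.create(2, 1000L, TimeUnit.MILLISECONDS); // needs java.util.concurrent.TimeUnit
// acquire() is called as above, but waits are longer during the 1000 ms warm-up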

Timeout try‑acquire

boolean acquired = limiter.tryAcquire(Duration.ofMillis(11));

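Returns true if a permit can be obtained within the specified timeout (11 ms here). If the permit would not be granted before the timeout expires, the call returns false immediately without waiting.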

Distributed Limiting with Nginx + Lua

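Uses lua-resty-lock to serialize the check-and-increment step and an ngx.shared.DICT shared-memory zone to hold the counters. The dictionary is shared by all worker processes of one Nginx instance, so the limit applies to all traffic entering through that instance.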

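-- nginx.conf must declare both shared zones, e.g.:
--   lua_shared_dict locks 1m;
--   lua_shared_dict limit_counter 1m;
local lock = require "resty.lock"

local function acquire()
    local l = lock:new("locks")
    if not l then return 0 end               -- lock object could not be created
    local ok, err = l:lock("limit_key")      -- serialize the check-and-increment
    if not ok then return 0 end
    local dict = ngx.shared.limit_counter
    -- ngx.time() returns whole seconds, so every request in the same second shares one key.
    -- ngx.now() has millisecond precision and would create a new key almost every request.
    -- Appending ngx.var.remote_addr to the key would make the limit per client.
    local key = "ip:" .. ngx.time()
    local limit = 5                          -- maximum requests per second
    local cur = dict:get(key)
    if cur and cur + 1 > limit then
        l:unlock()
        return 0                             -- over the limit: reject
    end
    if not cur then
        dict:set(key, 1, 1)                  -- first request this second; key expires in 1 second
    else
        dict:incr(key, 1)
    end
    l:unlock()
    return 1                                 -- admitted
end

ngx.print(acquire())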

Repository for the lock library: https://github.com/openresty/lua-resty-lock
