Understanding Rate Limiting: Concepts, Algorithms, and Implementations

This article explains why rate limiting is needed in both physical venues and online systems, describes common limiting strategies such as circuit breaking, service degradation, delayed processing, and privileged handling, and details three major algorithms—counter, leaky bucket, and token bucket—along with practical Java and Nginx‑Lua code examples.

Architecture Digest

Why Rate Limiting Is Needed

Just as a tourist site limits entry to avoid overcrowding and accidents, online services must restrict traffic spikes (e.g., a celebrity’s news causing millions of visits) to keep the system usable and prevent crashes.

Rate‑Limiting Strategies

Circuit Breaking

When a service cannot recover quickly, a circuit‑breaker automatically rejects traffic, protecting downstream components. Tools like Hystrix and Alibaba Sentinel provide configurable circuit‑breaker mechanisms.
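To make the mechanism concrete, here is a minimal sketch of a circuit breaker. It is not how Hystrix or Sentinel are implemented; the class name SimpleCircuitBreaker, its parameters, and the "trip after N consecutive failures, retry after a cool-down" policy are assumptions chosen for illustration. The current time is passed in explicitly so the behavior is easy to follow.

```java
import java.util.function.Supplier;

/** Hypothetical breaker: opens after N consecutive failures, allows a trial call after a cool-down. */
final class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private int consecutiveFailures = 0;
    private long openedAt = -1; // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Runs the action unless the circuit is open, in which case the fallback answers immediately. */
    synchronized <T> T call(Supplier<T> action, Supplier<T> fallback, long nowMillis) {
        if (openedAt >= 0 && nowMillis - openedAt < openMillis) {
            return fallback.get(); // open: reject without touching the failing service
        }
        try {
            T result = action.get(); // closed, or cool-down elapsed (trial call)
            consecutiveFailures = 0;
            openedAt = -1;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                openedAt = nowMillis; // trip (or re-trip) the breaker
            }
            return fallback.get();
        }
    }
}
```

The synchronized method serializes all calls, which keeps the sketch short but would be too coarse for production use; real libraries track state without holding a lock during the call.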

Service Degradation

Non‑essential features (e.g., product reviews, points) can be temporarily disabled during traffic surges, freeing resources for core functions such as order processing.
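One simple way to picture degradation is a runtime switch that call sites consult before doing optional work. The sketch below assumes a hypothetical DegradationSwitch class and feature names such as "reviews"; in practice the switch state would usually come from a configuration center rather than in-process calls.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical feature switch: degraded features are skipped by call sites. */
final class DegradationSwitch {
    private final Set<String> degraded = ConcurrentHashMap.newKeySet();

    void degrade(String feature) { degraded.add(feature); }

    void restore(String feature) { degraded.remove(feature); }

    boolean isEnabled(String feature) { return !degraded.contains(feature); }

    /** Example call site: the order form is always rendered, reviews only when enabled. */
    String renderProductPage() {
        StringBuilder page = new StringBuilder("[order form]");
        if (isEnabled("reviews")) {
            page.append("[reviews]");
        }
        return page.toString();
    }
}
```

Calling degrade("reviews") during a surge removes the optional section, and restore("reviews") brings it back once traffic settles.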

Delayed Processing

Requests are placed in a buffer (e.g., a queue) and processed later, reducing the immediate load on the backend. The leaky‑bucket and token‑bucket algorithms build on the same idea of pacing requests instead of serving them all at once.
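The buffering idea can be sketched with a bounded queue. The DelayedProcessor class and its method names are hypothetical; the assumption is that some scheduler calls drain() at whatever pace the backend can sustain.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical buffer: requests are queued on arrival and executed later in batches. */
final class DelayedProcessor {
    private final BlockingQueue<Runnable> buffer;

    DelayedProcessor(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    /** Non-blocking: returns false when the buffer is full so the caller can reject the request. */
    boolean submit(Runnable task) {
        return buffer.offer(task);
    }

    /** Runs at most max buffered tasks and returns how many were run. */
    int drain(int max) {
        int done = 0;
        Runnable task;
        while (done < max && (task = buffer.poll()) != null) {
            task.run();
            done++;
        }
        return done;
    }
}
```

Bounding the queue matters: an unbounded buffer only moves the overload from the backend into memory.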

Privileged Processing

Users are classified, allowing high‑priority groups to receive service while others wait or are rejected.
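As an illustration, classification can be reduced to two load thresholds: above a soft limit only the privileged tier is admitted, and above a hard limit nobody is. The PriorityAdmission class, the two tiers, and the use of in-flight request count as the load signal are all assumptions made for this sketch.

```java
/** Hypothetical admission rule: high-priority users keep access longer as load rises. */
final class PriorityAdmission {
    enum Tier { VIP, REGULAR }

    private final int softLimit; // at or above this load, only VIP requests are admitted
    private final int hardLimit; // at or above this load, every request is rejected

    PriorityAdmission(int softLimit, int hardLimit) {
        this.softLimit = softLimit;
        this.hardLimit = hardLimit;
    }

    /** Decides whether a request of the given tier is served at the current load. */
    boolean admit(Tier tier, int inFlight) {
        if (inFlight >= hardLimit) {
            return false;
        }
        if (inFlight >= softLimit) {
            return tier == Tier.VIP;
        }
        return true;
    }
}
```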

Rate‑Limiting Algorithms

Counter Algorithm

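A simple approach that counts requests within a fixed window (e.g., no more than 100 calls per minute). Once the count exceeds the limit, further requests are rejected until the next window starts. The example uses one-second windows: a Guava cache keyed by the current second holds one counter per window, and expired windows are evicted automatically.

import com.google.common.cache.*;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder()
    .expireAfterWrite(2, TimeUnit.SECONDS) // old windows expire on their own
    .build(new CacheLoader<Long, AtomicLong>() {
        @Override
        public AtomicLong load(Long second) {
            return new AtomicLong(0); // each new window starts at zero
        }
    });
long limit = 100; // maximum requests per one-second window
long currentSecond = System.currentTimeMillis() / 1000; // key of the current window
if (counter.getUnchecked(currentSecond).incrementAndGet() > limit) {
    // over the limit: reject the request
}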

Leaky Bucket Algorithm

Requests enter a bucket that leaks at a constant rate; excess requests overflow and are dropped, smoothing traffic bursts.
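A minimal sketch of the leaky bucket used as a meter: the bucket's water level rises by one per accepted request and drains at a constant rate, and a request that would overflow the bucket is dropped. The LeakyBucket class and its parameters are hypothetical, and the clock is passed in as a nanosecond timestamp so the arithmetic is explicit.

```java
/** Hypothetical leaky bucket (meter variant): accepts a request only if it fits in the bucket. */
final class LeakyBucket {
    private final double capacity;      // maximum water the bucket can hold
    private final double leakPerSecond; // constant drain rate
    private double water = 0;
    private long lastNanos;

    LeakyBucket(double capacity, double leakPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.lastNanos = nowNanos;
    }

    synchronized boolean tryAcquire(long nowNanos) {
        // Drain the water that leaked out since the previous call.
        double leaked = (nowNanos - lastNanos) / 1e9 * leakPerSecond;
        water = Math.max(0, water - leaked);
        lastNanos = nowNanos;
        if (water + 1 > capacity) {
            return false; // overflow: drop the request
        }
        water += 1;
        return true;
    }
}
```

In a real service the caller would pass System.nanoTime(). A queue-based variant would hold overflowing requests and release them at the leak rate instead of dropping them.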

Token Bucket Algorithm

Tokens are added to a bucket at a steady rate; a request proceeds only if a token is available, allowing controlled bursts while enforcing an average rate.

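import com.google.common.util.concurrent.RateLimiter;

RateLimiter limiter = RateLimiter.create(2); // 2 permits (tokens) per second
System.out.println(limiter.acquire()); // blocks until a permit is available, then prints the seconds spent waiting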
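Guava's RateLimiter keeps the bucket mechanics out of sight, so a hand-written sketch may help show what is being tracked. This is not Guava's implementation; the TokenBucket class, its non-blocking tryAcquire, and the explicit nanosecond timestamp are assumptions made for illustration.

```java
/** Hypothetical token bucket: tokens refill at a steady rate up to a fixed capacity. */
final class TokenBucket {
    private final double capacity;        // maximum burst size
    private final double refillPerSecond; // long-run average rate
    private double tokens;
    private long lastNanos;

    TokenBucket(double capacity, double refillPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastNanos = nowNanos;
    }

    synchronized boolean tryAcquire(long nowNanos) {
        // Add the tokens earned since the previous call, capped at capacity.
        double refill = (nowNanos - lastNanos) / 1e9 * refillPerSecond;
        tokens = Math.min(capacity, tokens + refill);
        lastNanos = nowNanos;
        if (tokens < 1) {
            return false; // no token available: reject
        }
        tokens -= 1;
        return true;
    }
}
```

With a capacity of 3 and a refill rate of 2 per second, three back-to-back requests pass, the fourth is rejected, and one more passes half a second later. The capacity bounds the burst while the refill rate bounds the average.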

Concurrency Limiting

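System‑wide concurrency and QPS thresholds can be enforced via server settings (e.g., Tomcat’s acceptCount, maxConnections, and maxThreads, which bound queued connections, open connections, and worker threads respectively) or through utilities such as Guava’s RateLimiter and Nginx’s limit_conn (concurrent connections) and limit_req (request rate) modules.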
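Inside application code, a cap on concurrent work can be sketched with a semaphore from the JDK. The ConcurrencyLimiter class below is hypothetical and is not related to how Tomcat or Nginx enforce their limits; it only shows the "take a permit, do the work, give it back" pattern.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical limiter: at most maxConcurrent calls run at the same time. */
final class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the action if a permit is free; otherwise returns the rejection result without waiting. */
    <T> T call(Supplier<T> action, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {
            return rejected.get();
        }
        try {
            return action.get();
        } finally {
            permits.release(); // always return the permit, even if the action throws
        }
    }
}
```

Unlike a rate limiter, this bounds how many requests are in progress at once, not how many start per second.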

Interface Limiting

Limits can be applied per API endpoint using fixed windows, sliding windows, or token buckets to achieve finer‑grained control.
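A sliding window per endpoint can be sketched by keeping a log of recent request timestamps for each path. The SlidingWindowLimiter class, the endpoint strings, and the explicit millisecond timestamp are assumptions for illustration; the log costs memory proportional to the limit, which is why production systems often approximate it with sub-window counters.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sliding-window log: at most limit requests per endpoint in any window. */
final class SlidingWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final Map<String, Deque<Long>> hits = new ConcurrentHashMap<>();

    SlidingWindowLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    boolean tryAcquire(String endpoint, long nowMillis) {
        Deque<Long> log = hits.computeIfAbsent(endpoint, k -> new ArrayDeque<>());
        synchronized (log) {
            // Forget timestamps that have slid out of the window.
            while (!log.isEmpty() && nowMillis - log.peekFirst() >= windowMillis) {
                log.pollFirst();
            }
            if (log.size() >= limit) {
                return false; // this endpoint is at its limit
            }
            log.addLast(nowMillis);
            return true;
        }
    }
}
```

Because each endpoint has its own log, a burst against one API does not consume the budget of another.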

Implementation Examples

Guava RateLimiter

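// import com.google.common.util.concurrent.RateLimiter;
RateLimiter limiter = RateLimiter.create(2); // 2 permits per second
System.out.println(limiter.acquire()); // first call returns at once; prints the wait in seconds
Thread.sleep(2000); // idle time lets unused permits accumulate (may throw InterruptedException)
System.out.println(limiter.acquire()); // served from stored permits, so the wait is again about 0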

Nginx + Lua Distributed Limiting

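local resty_lock = require "resty.lock"

-- Fixed-window counter: at most `limit` requests per second.
-- Requires `lua_shared_dict` zones named "locks" and "limit_counter" in nginx.conf.
-- Shared dicts are visible to all workers of one Nginx instance, not across servers.
local function acquire()
    local lock, err = resty_lock:new("locks")
    if not lock then
        ngx.log(ngx.ERR, "failed to create lock: ", err)
        return 0  -- reject when the lock cannot be created
    end
    local elapsed, lock_err = lock:lock("limit_key")
    if not elapsed then
        ngx.log(ngx.ERR, "failed to acquire lock: ", lock_err)
        return 0  -- reject when the lock cannot be taken
    end
    local limit_counter = ngx.shared.limit_counter
    -- One key per second. The key holds no client address, so the limit is global.
    local key = "ip:" .. ngx.time()  -- ngx.time() uses Nginx's cached clock
    local limit = 5
    local current = limit_counter:get(key)
    if current ~= nil and current + 1 > limit then
        lock:unlock()
        return 0
    end
    if current == nil then
        limit_counter:set(key, 1, 1)  -- first hit in this window; expires after 1 second
    else
        limit_counter:incr(key, 1)
    end
    lock:unlock()
    return 1
end
ngx.print(acquire())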

Key Takeaways

Rate limiting protects services from overload, but it must be carefully tuned; overly aggressive limits can degrade user experience, while insufficient limits risk system failure.
