Mastering Rate Limiting: Strategies, Algorithms, and Real‑World Implementations

This article explains how rate limiting protects system availability by controlling traffic flow, introduces common patterns such as circuit breaking, service degradation, delay and privilege handling, compares cache, degradation, and rate limiting, and details popular algorithms and practical code implementations for both single‑node and distributed environments.

MaGe Linux Operations

Rate limiting controls the flow of traffic so that a service stays usable under load, much like crowd control at a tourist attraction, where visitor numbers are capped to prevent accidents and a poor experience.

Rate Limiting Approaches

Typical patterns include:

Circuit Breaker

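A breaker is configured in advance. When failures persist past a threshold, the breaker opens and requests are rejected immediately, so the failing backend is not overloaded further. Once the backend recovers, the breaker closes and traffic flows again. Common components are Hystrix and Alibaba Sentinel.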
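The open/close behaviour described above can be sketched in a few lines of plain Java. This is a minimal illustration, not how Hystrix or Sentinel are implemented: the class name `SimpleCircuitBreaker`, the consecutive-failure threshold, and the cool-down period are all assumptions made for the example, and the clock is passed in as a parameter so the behaviour is easy to follow.

```java
import java.util.function.Supplier;

/** Hypothetical breaker: opens after N consecutive failures, probes again after a cool-down. */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures before the breaker opens
    private final long openMillis;      // how long to reject calls before allowing a probe
    private State state = State.CLOSED;
    private int failures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Calls the backend, or returns the fallback without touching the backend while open. */
    synchronized <T> T call(Supplier<T> backend, Supplier<T> fallback, long nowMillis) {
        if (state == State.OPEN) {
            if (nowMillis - openedAt < openMillis) {
                return fallback.get();   // fail fast during the cool-down
            }
            state = State.HALF_OPEN;     // cool-down elapsed: let one probe through
        }
        try {
            T result = backend.get();
            failures = 0;
            state = State.CLOSED;        // a success closes the breaker
            return result;
        } catch (RuntimeException e) {
            failures++;
            if (state == State.HALF_OPEN || failures >= failureThreshold) {
                state = State.OPEN;      // trip, or re-trip after a failed probe
                openedAt = nowMillis;
            }
            return fallback.get();
        }
    }

    synchronized State state() {
        return state;
    }
}
```

For simplicity the sketch holds its lock while the backend call runs, which serializes callers; a production breaker would track state without blocking concurrent requests.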

Service Degradation

When a problem occurs, non‑essential functions are temporarily disabled, freeing resources for core services. For example, an e‑commerce site may suspend comments or points during a traffic surge.
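One common way to implement this is a set of runtime switches that the page-rendering code consults. The sketch below assumes hypothetical names (`DegradationSwitch`, `ProductPage`, and the feature keys `comments` and `points`); in practice the switches would usually be driven by a configuration center or an operator console rather than called directly.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical runtime switchboard for non-essential features. */
class DegradationSwitch {
    private final Set<String> disabled = ConcurrentHashMap.newKeySet();

    void degrade(String feature) { disabled.add(feature); }

    void restore(String feature) { disabled.remove(feature); }

    boolean isEnabled(String feature) { return !disabled.contains(feature); }
}

/** Hypothetical page that always renders core content and skips degraded extras. */
class ProductPage {
    private final DegradationSwitch switches;

    ProductPage(DegradationSwitch switches) {
        this.switches = switches;
    }

    String render() {
        StringBuilder page = new StringBuilder("product+price"); // core content, never degraded
        if (switches.isEnabled("comments")) {
            page.append("+comments");
        }
        if (switches.isEnabled("points")) {
            page.append("+points");
        }
        return page.toString();
    }
}
```

Calling `degrade("comments")` during a surge removes the comment section from every subsequent render, and `restore("comments")` brings it back once load subsides.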

Delay Processing

Requests are buffered in a front‑end pool (e.g., a queue) and processed later by the backend, reducing immediate load but introducing latency.
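A bounded in-memory queue is the simplest form of such a buffer. The following sketch is an assumption-laden illustration (the class `DelayedRequestBuffer` and its method names are invented here); a real deployment would more likely use a message queue, but the trade-off is the same: the front end returns quickly, and the backend drains the backlog at its own pace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical buffer between a fast front end and a slower backend. */
class DelayedRequestBuffer {
    private final BlockingQueue<String> queue;

    DelayedRequestBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Front end: enqueue without blocking; false means the buffer is full and the request is refused. */
    boolean submit(String request) {
        return queue.offer(request);
    }

    /** Backend: remove up to maxBatch buffered requests, in arrival order. */
    List<String> drain(int maxBatch) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        return batch;
    }

    int pending() {
        return queue.size();
    }
}
```

The capacity bounds how much latency can build up: once the buffer is full, new requests are refused rather than queued indefinitely.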

Privilege Processing

Users are classified, and high‑priority groups receive service first while others may be delayed or rejected.
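As a rough sketch, the classification can be expressed as a priority queue in which privileged requests are served first and ordinary requests are rejected once the backlog is full. The class `PriorityDispatcher`, the two priority levels (0 for ordinary, higher for privileged), and the backlog cap are assumptions for illustration only.

```java
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical dispatcher: higher priority is served first, FIFO within a priority. */
class PriorityDispatcher {
    static final class Request implements Comparable<Request> {
        final String user;
        final int priority;
        final long seq;

        Request(String user, int priority, long seq) {
            this.user = user;
            this.priority = priority;
            this.seq = seq;
        }

        @Override
        public int compareTo(Request other) {
            int byPriority = Integer.compare(other.priority, priority); // higher priority first
            return byPriority != 0 ? byPriority : Long.compare(seq, other.seq);
        }
    }

    private final PriorityBlockingQueue<Request> queue = new PriorityBlockingQueue<>();
    private final AtomicLong nextSeq = new AtomicLong();
    private final int maxPending;

    PriorityDispatcher(int maxPending) {
        this.maxPending = maxPending;
    }

    /** Ordinary users (priority 0) are rejected once the backlog is full; privileged users still queue. */
    boolean submit(String user, int priority) {
        if (priority == 0 && queue.size() >= maxPending) {
            return false;
        }
        queue.add(new Request(user, priority, nextSeq.getAndIncrement()));
        return true;
    }

    /** Returns the next user to serve, or null if nothing is waiting. */
    String next() {
        Request r = queue.poll();
        return r == null ? null : r.user;
    }
}
```

The size check and the insert are not atomic together, so under heavy concurrency the backlog cap is approximate; that is usually acceptable for load shedding.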

Cache, Degradation, and Rate Limiting Differences

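Caching raises throughput by serving repeated reads without hitting the backend. Degradation keeps the core service running by switching off or stubbing out components that are failing or non-essential. Rate limiting restricts how many requests are admitted, and is the fallback when caching and degradation alone cannot keep the load within capacity.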

Rate Limiting Algorithms

Common algorithms fall into three categories:

Counter Algorithm

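The simplest method: count requests in a fixed time window and reject any beyond a limit, e.g., no more than 100 calls per minute. Its weakness is the window boundary: a burst that straddles two adjacent windows can briefly admit up to twice the limit.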
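A fixed-window counter needs only a window index and a count. The sketch below uses a hypothetical class name, `FixedWindowCounter`, and takes the current time as a parameter instead of reading the system clock so that its behaviour at a window boundary is easy to trace.

```java
/** Hypothetical fixed-window counter: at most `limit` requests per window. */
class FixedWindowCounter {
    private final int limit;
    private final long windowMillis;
    private long currentWindow = -1; // index of the window being counted
    private int count = 0;

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        long window = nowMillis / windowMillis;
        if (window != currentWindow) { // a new window has started: reset the count
            currentWindow = window;
            count = 0;
        }
        if (count >= limit) {
            return false;
        }
        count++;
        return true;
    }
}
```

With a limit of 2 per 1000 ms, requests at 900 ms and 950 ms are admitted, a third at 990 ms is rejected, and two more at 1000 ms and 1010 ms are admitted again because the counter has reset. That is four requests inside roughly 110 ms, which is why fixed windows are considered coarse.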

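The Guava-based examples in the Implementation section need this Maven dependency:

<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>28.1-jre</version>
</dependency>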

Leaky Bucket Algorithm

Requests enter a bucket that drains at a constant rate. When the bucket is full, excess requests overflow and are dropped, which smooths traffic spikes into a steady outflow.
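A minimal sketch of the idea follows, in the "meter" form that only decides whether to admit a request; a queue-based variant would additionally hold admitted requests and release them at the leak rate. The class name `LeakyBucket` and its parameters are assumptions, and time is passed in explicitly.

```java
/** Hypothetical leaky bucket used as a meter: admits a request only if the bucket has room. */
class LeakyBucket {
    private final double capacity;      // maximum amount of "water" (pending requests)
    private final double leakPerSecond; // constant drain rate
    private double water = 0;
    private long lastMillis = -1;

    LeakyBucket(double capacity, double leakPerSecond) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        if (lastMillis < 0) {
            lastMillis = nowMillis;
        }
        // Drain whatever leaked out since the last call.
        double leaked = (nowMillis - lastMillis) * leakPerSecond / 1000.0;
        water = Math.max(0, water - leaked);
        lastMillis = nowMillis;
        if (water + 1 > capacity) {
            return false;               // bucket would overflow: reject
        }
        water += 1;
        return true;
    }
}
```

Because the bucket starts empty and drains at a fixed rate, the long-run admitted rate cannot exceed `leakPerSecond`, whatever the arrival pattern.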

Token Bucket Algorithm

Tokens are added to a bucket at a steady rate, up to the bucket's capacity. A request proceeds only if it can take a token, so bursts are allowed but bounded by the number of tokens saved up.
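The difference from the leaky bucket is easiest to see in code: here the bucket starts full, so an idle period buys a burst. This is a hand-rolled sketch with a hypothetical class name, `TokenBucket`, and an injected clock; it is not Guava's implementation.

```java
/** Hypothetical token bucket: refills continuously, admits a request per available token. */
class TokenBucket {
    private final double capacity;        // maximum tokens that can be saved up
    private final double refillPerSecond; // steady refill rate
    private double tokens;
    private long lastMillis = -1;

    TokenBucket(double capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;           // start full so an initial burst is allowed
    }

    synchronized boolean tryAcquire(long nowMillis) {
        if (lastMillis < 0) {
            lastMillis = nowMillis;
        }
        // Add the tokens earned since the last call, capped at capacity.
        double earned = (nowMillis - lastMillis) * refillPerSecond / 1000.0;
        tokens = Math.min(capacity, tokens + earned);
        lastMillis = nowMillis;
        if (tokens < 1) {
            return false;
        }
        tokens -= 1;
        return true;
    }
}
```

With a capacity of 3 and a refill rate of 1 per second, three back-to-back requests succeed, the fourth fails, and after a long idle period at most three tokens are waiting, however long the pause.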

Concurrent Rate Limiting

Set system-wide thresholds on concurrency, connections, or QPS. Examples include Tomcat's acceptCount, maxConnections, and maxThreads settings, size limits on database connection pools and thread pools, and Nginx's limit_conn and limit_req modules. Common approaches:

Limit total concurrency (e.g., DB connection pool)

Limit instantaneous connections (e.g., Nginx limit_conn)

Limit average rate in a time window (e.g., Guava RateLimiter, Nginx limit_req)

Limit remote API calls or MQ consumption

Limit based on network, CPU, or memory load
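The first item in the list above, capping total concurrency, can be sketched with `java.util.concurrent.Semaphore`. The wrapper class `ConcurrencyLimiter` and its fallback parameter are assumptions for illustration; connection pools and thread pools enforce the same idea internally.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical limiter: at most maxConcurrent tasks run at once, the rest get a fallback. */
class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    <T> T execute(Supplier<T> task, Supplier<T> rejected) {
        if (!permits.tryAcquire()) {   // non-blocking: reject instead of queuing
            return rejected.get();
        }
        try {
            return task.get();
        } finally {
            permits.release();         // always return the permit, even if the task throws
        }
    }

    int available() {
        return permits.availablePermits();
    }
}
```

Using `tryAcquire()` rejects excess callers immediately; switching to `acquire()` would make them wait instead, which trades rejections for latency.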

Interface Rate Limiting

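Interface-level limiting has two parts: cap the total number of calls to an interface within a period (the counter algorithm), and apply a sliding window so that calls are also limited over shorter, rolling intervals for finer granularity.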

Sliding Window

Divides a fixed window into smaller slots to achieve smoother, more precise throttling.
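A sketch of a slot-based sliding window follows. The class name `SlidingWindowCounter`, the slot layout (a ring of counters tagged with the slot index they belong to), and the injected clock are assumptions chosen for clarity rather than a reference implementation.

```java
/** Hypothetical sliding-window counter: the window is split into equal slots kept in a ring. */
class SlidingWindowCounter {
    private final int limit;
    private final long slotMillis;
    private final int[] counts;    // requests counted in each ring position
    private final long[] slotIds;  // which absolute slot each ring position currently holds

    SlidingWindowCounter(int limit, long windowMillis, int slots) {
        this.limit = limit;
        this.slotMillis = windowMillis / slots;
        this.counts = new int[slots];
        this.slotIds = new long[slots]; // zero-initialised; empty positions contribute 0
    }

    synchronized boolean tryAcquire(long nowMillis) {
        long currentSlot = nowMillis / slotMillis;
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            // Only slots within the last `slots` slot indexes are inside the window.
            if (slotIds[i] > currentSlot - counts.length && slotIds[i] <= currentSlot) {
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        int idx = (int) (currentSlot % counts.length);
        if (slotIds[idx] != currentSlot) { // ring position holds an expired slot: reuse it
            slotIds[idx] = currentSlot;
            counts[idx] = 0;
        }
        counts[idx]++;
        return true;
    }
}
```

With a limit of 2 per 1000 ms and ten 100 ms slots, two requests at 900 ms and 950 ms cause a request at 1000 ms to be rejected, because both still fall inside the trailing window. A fixed-window counter would have admitted it.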

Implementation

Guava

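import com.google.common.cache.*;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
public class SecondCounterLimiter {
    // One counter per second; entries expire after 2 s so old windows are evicted.
    private static final LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder()
        .expireAfterWrite(2, TimeUnit.SECONDS)
        .build(new CacheLoader<Long, AtomicLong>() {
            @Override
            public AtomicLong load(Long second) {
                return new AtomicLong(0);
            }
        });
    public static boolean tryAcquire(long limit) {
        long currentSecond = System.currentTimeMillis() / 1000; // key by the current second
        return counter.getUnchecked(currentSecond).incrementAndGet() <= limit; // no checked exception
    }
}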

Token Bucket with Guava RateLimiter

import com.google.common.util.concurrent.RateLimiter;

public class RateLimiterDemo {
    public static void main(String[] args) throws InterruptedException {
        RateLimiter limiter = RateLimiter.create(2); // 2 permits per second
        // acquire() blocks until a permit is available and returns the seconds spent waiting.
        System.out.println(limiter.acquire());
        Thread.sleep(2000); // idle time lets unused permits accumulate, allowing a short burst
        for (int i = 0; i < 5; i++) {
            System.out.println(limiter.acquire());
        }
    }
}

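RateLimiter also supports a warm-up mode, created with RateLimiter.create(permitsPerSecond, warmupPeriod, unit), and acquisition with a timeout: tryAcquire returns false instead of blocking if no permit becomes available within the given duration.

// Requires java.time.Duration; returns false if no permit is available within 11 ms.
boolean acquired = limiter.tryAcquire(Duration.ofMillis(11));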

Distributed Rate Limiting with Nginx + Lua

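local resty_lock = require "resty.lock"

-- Assumes lua_shared_dict zones named "locks" and "limit_counter" are declared in nginx.conf.
local function acquire()
    local lock, err = resty_lock:new("locks")
    if not lock then
        ngx.log(ngx.ERR, "failed to create lock: ", err)
        return 0  -- fail closed: reject when the limiter itself is broken
    end
    local elapsed, lock_err = lock:lock("limit_key")
    if not elapsed then
        ngx.log(ngx.ERR, "failed to acquire lock: ", lock_err)
        return 0
    end

    local limit_counter = ngx.shared.limit_counter
    -- One counter per second; append ngx.var.remote_addr to the key for a per-client limit.
    local key = "ip:" .. ngx.time()
    local limit = 5
    local current = limit_counter:get(key)
    if current ~= nil and current + 1 > limit then
        lock:unlock()
        return 0  -- over the limit for this second
    end
    if current == nil then
        limit_counter:set(key, 1, 1)  -- first request this second; expire after 1 s
    else
        limit_counter:incr(key, 1)
    end
    lock:unlock()
    return 1
end

ngx.print(acquire())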

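This Lua script uses resty.lock to make the read-check-update sequence atomic and a lua_shared_dict zone to store the counters. It prints 1 when the request is admitted and 0 when it is rejected. Note that a lua_shared_dict is shared only among the worker processes of a single Nginx instance; enforcing one limit across several Nginx nodes requires an external shared store such as Redis.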
