Backend Development 18 min read

Master Rate Limiting: Token & Leaky Buckets, Distributed Strategies

This article explains how caching, degradation, and especially rate limiting—using token bucket, leaky bucket, and counter‑based methods—protect high‑concurrency systems, covering algorithm basics, application‑level techniques, and distributed implementations with Redis+Lua and Nginx+Lua.

ITFLY8 Architecture Home

Jun 19, 2016

Master Rate Limiting: Token & Leaky Buckets, Distributed Strategies

Rate Limiting Algorithms

Common rate‑limiting algorithms include the token bucket and leaky bucket; a simple counter can also be used for coarse limiting.

Token Bucket Algorithm

The token bucket stores a fixed number of tokens that are added at a constant rate. When a request arrives, it consumes tokens; if insufficient tokens exist, the request is rejected or delayed.

Assume a limit of 2 r/s, adding tokens every 500 ms.

The bucket can hold up to b tokens; excess tokens are discarded.

When a packet of n bytes arrives, n tokens are removed and the packet is sent.

If the bucket has fewer than n tokens, the packet is throttled (dropped or queued).

Leaky Bucket Algorithm

The leaky bucket acts as a meter for traffic shaping and policing. It has a fixed capacity and leaks tokens at a constant rate; incoming bursts are either stored (if space permits) or discarded.

A fixed‑capacity bucket releases tokens at a constant rate.

If the bucket is empty, no tokens are released.

Tokens can be added at any rate.

Overflowing tokens are discarded.

Comparison:

Token bucket adds tokens at a fixed rate and allows bursts while tokens are available.

Leaky bucket releases tokens at a fixed rate, smoothing incoming bursts.

Token bucket limits average inbound rate and permits bursts; leaky bucket limits output rate to smooth traffic.

Simple counter‑based limiting can restrict total concurrency (e.g., database connections, thread pools) by checking a global request count against a threshold.

Application‑Level Rate Limiting

Typical limits include total concurrent connections, total resource count, and per‑API request limits. For example, Tomcat’s acceptCount, maxConnections, and maxThreads control connection queues, maximum simultaneous connections, and thread pool size. Similar settings exist for MySQL ( max_connections) and Redis ( tcp-backlog).

Java code using AtomicLong can implement simple total‑concurrency limiting:

try {
    if (atomic.incrementAndGet() > limit) {
        // reject request
    }
    // process request
} finally {
    atomic.decrementAndGet();
}

Guava’s RateLimiter provides token‑bucket implementations for smooth bursty limiting ( SmoothBursty) and warm‑up limiting ( SmoothWarmingUp).

RateLimiter limiter = RateLimiter.create(5);
System.out.println(limiter.acquire()); // repeat as needed

Using RateLimiter.acquire() consumes tokens; if none are available, the call blocks until a token is added, effectively smoothing request rates.

Distributed Rate Limiting

Distributed limiting requires atomic operations across nodes. Common solutions use Redis+Lua or Nginx+Lua.

Redis+Lua Example

local key = KEYS[1]
local limit = tonumber(ARGV[1])
local current = tonumber(redis.call('get', key) or '0')
if current + 1 > limit then
    return 0
else
    redis.call('INCRBY', key, '1')
    redis.call('expire', key, '2')
    return 1
end

Java integration evaluates the script via Jedis:

public static boolean acquire() throws Exception {
    String luaScript = Files.readString(Paths.get("limit.lua"));
    Jedis jedis = new Jedis("192.168.147.52", 6379);
    String key = "ip:" + System.currentTimeMillis() / 1000;
    String limit = "3";
    return (Long) jedis.eval(luaScript, Collections.singletonList(key), Collections.singletonList(limit)) == 1;
}

Nginx+Lua Example

local locks = require "resty.lock"
local function acquire()
    local lock = locks:new("locks")
    lock:lock("limit_key")
    local limit_counter = ngx.shared.limit_counter
    local key = "ip:" .. os.time()
    local limit = 5
    local current = limit_counter:get(key)
    if current and current + 1 > limit then
        lock:unlock()
        return 0
    end
    if not current then
        limit_counter:set(key, 1, 1)
    else
        limit_counter:incr(key, 1)
    end
    lock:unlock()
    return 1
end
ngx.print(acquire())

Both approaches ensure atomicity (Redis Lua runs single‑threaded; Nginx Lua uses resty.lock and shared dictionaries).

When traffic is extremely high, consider sharding the distributed limiter via consistent hashing or falling back to application‑level limiting. Large e‑commerce platforms (e.g., JD.com) successfully use Redis+Lua for flash‑sale throttling.

For entry‑point limiting, it is recommended to handle it at the access layer (e.g., Nginx) rather than within the application.

Reference Materials:

https://en.wikipedia.org/wiki/Token_bucket

https://en.wikipedia.org/wiki/Leaky_bucket

http://redis.io/commands/incr

http://nginx.org/en/docs/http/ngx_http_limit_req_module.html

http://nginx.org/en/docs/http/ngx_http_limit_conn_module.html

https://github.com/openresty/lua-resty-limit-traffic

http://nginx.org/en/docs/http/ngx_http_core_module.html#limit_rate

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

distributed systems rate limiting Token Bucket Leaky Bucket

Written by

ITFLY8 Architecture Home

ITFLY8 Architecture Home - focused on architecture knowledge sharing and exchange, covering project management and product design. Includes large-scale distributed website architecture (high performance, high availability, caching, message queues...), design patterns, architecture patterns, big data, project management (SCRUM, PMP, Prince2), product design, and more.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.