Mastering Rate Limiting: Algorithms, Application, Distributed and Edge Strategies

This article provides a comprehensive guide to rate limiting in high‑concurrency systems, covering core concepts, token‑bucket and leaky‑bucket algorithms, application‑level techniques with Guava, distributed implementations using Redis+Lua and Nginx+Lua, and edge‑layer controls via Nginx modules, complete with configuration examples and test results.

dbaplus Community

Why rate limiting matters

In high‑concurrency systems, caching, degradation, and rate limiting are the three primary mechanisms for protecting services. Caching raises throughput, degradation shields critical paths when failures occur, and rate limiting controls traffic in scenarios where neither caching nor degradation can help, such as flash‑sale spikes, comment posting, or complex queries.

Purpose of rate limiting

Rate limiting throttles either the number of concurrent requests or the number of requests within a fixed time window. When the configured limit is reached the system can reject the request, queue it, or fall back to default data, thereby preventing crashes and enabling graceful degradation.

Common rate‑limiting strategies

Limit total concurrency (e.g., database connection pools, thread pools).

Limit instantaneous concurrency (e.g., Nginx limit_conn module).

Limit average request rate within a time window (e.g., Guava RateLimiter, Nginx limit_req).

Limit remote‑API call rates, message‑queue consumption rates, or rates based on CPU/memory load.

Rate‑limiting algorithms

The two most widely used algorithms are the token bucket and the leaky bucket. A simple counter can also be used for coarse‑grained limits.

Token bucket

Tokens are added to a bucket at a fixed rate (e.g., 2 tokens/s).

The bucket has a maximum capacity b ; excess tokens are discarded.

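When a request that costs n tokens arrives (for network traffic, a packet of n bytes costs n tokens; for API calls, a request usually costs one token), n tokens are removed and the request proceeds. If fewer than n tokens are available, no tokens are removed and the request is throttled.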

Token bucket diagram
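The three rules above can be sketched in a few lines of Java. This is a minimal illustration, not a production limiter: the class name TokenBucket, its constructor parameters, and the injectable nanosecond clock are assumptions made for the example, and tokens are refilled lazily on each call rather than by a background timer.

```java
import java.util.function.LongSupplier;

/** Minimal token bucket sketch: tokens are refilled lazily on every call. */
final class TokenBucket {
    private final double capacity;       // maximum number of tokens the bucket holds (b)
    private final double refillPerNano;  // tokens added per nanosecond
    private final LongSupplier clock;    // nanosecond clock, injectable so tests are deterministic
    private double tokens;
    private long lastRefill;

    TokenBucket(double capacity, double tokensPerSecond, LongSupplier clock) {
        this.capacity = capacity;
        this.refillPerNano = tokensPerSecond / 1_000_000_000.0;
        this.clock = clock;
        this.tokens = capacity;          // start full, so an initial burst is allowed
        this.lastRefill = clock.getAsLong();
    }

    /** Removes n tokens and returns true, or returns false if fewer than n are available. */
    synchronized boolean tryAcquire(int n) {
        long now = clock.getAsLong();
        // Add the tokens earned since the last call; anything above capacity is discarded.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens < n) {
            return false;                // throttle: not enough tokens
        }
        tokens -= n;
        return true;
    }
}
```

In real use the clock would typically be System::nanoTime, for example new TokenBucket(5, 2, System::nanoTime) for a bucket of 5 tokens refilled at 2 tokens/s.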

Leaky bucket

A bucket with fixed capacity leaks water drops (requests) at a constant rate.

If the bucket is empty, nothing leaks out.

Incoming requests may pour in at any rate; once the bucket is full, the overflow is dropped.

Leaky bucket diagram
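A comparable sketch for the leaky bucket is shown below. It models the "bucket as a meter" variant, which only decides whether a request is accepted or dropped; a queue-based variant that actually delays requests until they leak out would need a worker thread and is not shown. The class name LeakyBucket and the injectable clock are assumptions made for the example.

```java
import java.util.function.LongSupplier;

/** Minimal leaky bucket sketch (meter variant): accepts or drops, does not queue. */
final class LeakyBucket {
    private final double capacity;     // how many drops the bucket can hold
    private final double leakPerNano;  // constant outflow, in drops per nanosecond
    private final LongSupplier clock;  // nanosecond clock, injectable so tests are deterministic
    private double water;              // current water level
    private long lastLeak;

    LeakyBucket(double capacity, double leaksPerSecond, LongSupplier clock) {
        this.capacity = capacity;
        this.leakPerNano = leaksPerSecond / 1_000_000_000.0;
        this.clock = clock;
        this.water = 0.0;              // start empty
        this.lastLeak = clock.getAsLong();
    }

    /** Adds one drop for the request; returns false if the bucket would overflow. */
    synchronized boolean tryAdd() {
        long now = clock.getAsLong();
        // Drain what has leaked since the last call; the level never goes below zero.
        water = Math.max(0.0, water - (now - lastLeak) * leakPerNano);
        lastLeak = now;
        if (water + 1.0 > capacity) {
            return false;              // overflow: the request is dropped
        }
        water += 1.0;
        return true;
    }
}
```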

Token bucket vs. leaky bucket

Token bucket controls the incoming rate and permits bursts while tokens are available.

Leaky bucket smooths the outgoing traffic, discarding excess when the bucket overflows.

Both can be implemented with the same underlying data structure but operate in opposite directions.

Application‑level rate limiting

Application‑level limits protect resources inside a single service instance.

Limiting total concurrency / resources

Configure server connectors (Tomcat acceptCount, maxConnections, maxThreads) or equivalent settings in MySQL, Redis, etc., to cap overall connections.

Per‑API concurrency

Use an AtomicLong in Java to count active requests for a specific endpoint and reject or queue when a threshold is exceeded.

Java AtomicLong example
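A minimal sketch of the AtomicLong approach might look like the following. The class name ConcurrencyLimiter, the call method, and the fallback parameter are hypothetical names chosen for illustration; this version rejects with a fallback value rather than queueing.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Caps the number of requests that may run concurrently for one endpoint. */
final class ConcurrencyLimiter {
    private final AtomicLong active = new AtomicLong();
    private final long limit;

    ConcurrencyLimiter(long limit) {
        this.limit = limit;
    }

    /** Runs the task if the limit is not exceeded; otherwise returns the fallback. */
    <T> T call(Supplier<T> task, T fallback) {
        try {
            if (active.incrementAndGet() > limit) {
                return fallback;       // over the limit: reject without running the task
            }
            return task.get();
        } finally {
            // Always undo the increment, whether the request ran, failed, or was rejected.
            active.decrementAndGet();
        }
    }

    /** Number of requests currently inside call(). */
    long active() {
        return active.get();
    }
}
```

Keeping the decrement in a finally block means the counter cannot leak when the task throws.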

Window‑based limits

Store counters in a Guava Cache with a short TTL (e.g., 2 seconds) and use the current second as the key to count requests per second.

Guava cache counter
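The per-second counter idea can be sketched without Guava as follows. Note the substitution: a ConcurrentHashMap with manual eviction stands in for the Guava Cache and its TTL, so the example has no external dependency. The class name PerSecondCounter and the injectable millisecond clock are assumptions.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Fixed-window limiter: one counter per wall-clock second. */
final class PerSecondCounter {
    private final ConcurrentHashMap<Long, AtomicLong> counters = new ConcurrentHashMap<>();
    private final long limit;
    private final LongSupplier clockMillis;  // millisecond clock, injectable for tests

    PerSecondCounter(long limit, LongSupplier clockMillis) {
        this.limit = limit;
        this.clockMillis = clockMillis;
    }

    /** Counts the request against the current second; false once the limit is exceeded. */
    boolean tryAcquire() {
        long second = clockMillis.getAsLong() / 1000;
        // Evict counters older than the previous second (plays the role of the cache TTL).
        counters.keySet().removeIf(k -> k < second - 1);
        AtomicLong counter = counters.computeIfAbsent(second, k -> new AtomicLong());
        return counter.incrementAndGet() <= limit;
    }

    /** Number of seconds currently tracked; exposed only to show that eviction works. */
    int trackedSeconds() {
        return counters.size();
    }
}
```

A fixed window like this can let through up to twice the limit across a window boundary (a burst at the end of one second followed by a burst at the start of the next), which is why it suits coarse-grained limits.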

Smooth rate limiting with Guava

Guava’s RateLimiter implements a token‑bucket with two modes:

SmoothBursty : Allows bursts up to the bucket capacity, then smooths traffic to the configured rate.

SmoothWarmingUp : Starts below the target rate after a cold start or an idle period and ramps up to the configured rate over a warm‑up period.

Example:

RateLimiter limiter = RateLimiter.create(5); // 5 tokens/s, bucket capacity 5

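Calling limiter.acquire() takes one permit, blocking until the permit can be granted, and returns the time spent waiting in seconds. limiter.tryAcquire() is the non‑blocking variant: it returns false immediately when no permit is available.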

SmoothBursty example

Distributed rate limiting

When traffic is served by multiple nodes, limits must be enforced atomically across the cluster. Two common approaches are Redis + Lua scripts and Nginx + Lua.

Redis + Lua

A Lua script increments a counter and checks the limit in a single EVAL call. The operation is atomic because Redis runs a script as one uninterrupted unit: no other command executes while the script is running.

Redis Lua script

Java can invoke the script via Jedis.eval() (or similar) and decide whether to allow the request.

Java rate‑limit check
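A sketch of how the script and the Java caller could fit together is shown below. Several parts are assumptions: the script sets a key expiry equal to the window length (the text above only says it increments and checks), the ScriptRunner interface is a hypothetical stand-in for a client's EVAL call such as Jedis.eval(), and the class and parameter names are illustrative.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Fixed-window limiter whose counter lives in Redis and is updated by a Lua script. */
final class RedisWindowLimiter {
    /** Hypothetical abstraction over a Redis client's EVAL call. */
    interface ScriptRunner {
        Object eval(String script, List<String> keys, List<String> args);
    }

    // KEYS[1] = counter key, ARGV[1] = limit, ARGV[2] = window length in seconds.
    static final String SCRIPT =
        "local current = redis.call('INCR', KEYS[1])\n" +
        "if current == 1 then\n" +
        "  redis.call('EXPIRE', KEYS[1], ARGV[2])\n" +   // first hit starts the window
        "end\n" +
        "if current > tonumber(ARGV[1]) then\n" +
        "  return 0\n" +                                 // over the limit
        "end\n" +
        "return 1\n";

    private final ScriptRunner redis;
    private final long limit;
    private final int windowSeconds;

    RedisWindowLimiter(ScriptRunner redis, long limit, int windowSeconds) {
        this.redis = redis;
        this.limit = limit;
        this.windowSeconds = windowSeconds;
    }

    /** Returns true if the request identified by key is within the limit. */
    boolean tryAcquire(String key) {
        Object reply = redis.eval(
            SCRIPT,
            Collections.singletonList(key),
            Arrays.asList(Long.toString(limit), Integer.toString(windowSeconds)));
        // An integer returned from Lua is expected to arrive as a Long.
        return reply instanceof Long && (Long) reply == 1L;
    }
}
```

With Jedis, the runner could be supplied as a method reference to an eval overload that takes the script, a key list, and an argument list; treat that wiring as an assumption to verify against the client version in use.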

Nginx + Lua

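Use lua‑resty‑lock for mutual exclusion and an ngx.shared.DICT shared dictionary as the counter. A shared dictionary is visible to all worker processes of one Nginx instance but not to other Nginx servers, so the limit is cluster‑wide only if requests for a given key always reach the same instance. Define shared dictionaries for locks and counters with lua_shared_dict, then run the script with the access_by_lua_block directive.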

Nginx Lua script

Edge (access‑layer) rate limiting with Nginx

Nginx provides two built‑in modules:

ngx_http_limit_conn_module – limits the number of concurrent connections per key (IP, domain, etc.).

ngx_http_limit_req_module – uses the leaky‑bucket method to limit the request rate per key. Excess requests within the burst allowance are delayed by default (smooth mode) or served immediately with nodelay (burst mode).

Connection‑limiting example

Define a shared memory zone and limit each IP to two concurrent connections:

limit_conn_zone $binary_remote_addr zone=addr:10m;
limit_conn addr 2;
limit_conn configuration

Request‑rate limiting example

Limit each IP to 10 requests/s with a burst capacity of 5 and enable burst processing (no delay):

limit_req_zone $binary_remote_addr zone=req:10m rate=10r/s;
limit_req zone=req burst=5 nodelay;
limit_req configuration

Advanced Lua‑based traffic control

The OpenResty library lua‑resty‑limit‑traffic provides programmable limits whose keys, rates, and bucket sizes can be changed at runtime, extending the capabilities of the native modules.

lua‑resty‑limit‑traffic repository

Key takeaways

Select the algorithm that matches the traffic pattern: token bucket for burst‑tolerant workloads, leaky bucket for strict smoothing.

Application‑level limits protect resources within a single instance; distributed limits are required when traffic is spread across many instances.

Nginx’s native modules cover most edge‑layer scenarios; for dynamic policies augment them with Lua scripts or lua‑resty‑limit‑traffic.

Validate configurations under realistic load and monitor logs to ensure throttling behaves as expected.
