Mastering Rate Limiting: Algorithms, Application, Distributed and Edge Strategies

This article provides a comprehensive guide to rate limiting in high‑concurrency systems, covering core concepts, token‑bucket and leaky‑bucket algorithms, application‑level techniques with Guava, distributed implementations using Redis+Lua and Nginx+Lua, and edge‑layer controls via Nginx modules, complete with configuration examples and test results.

dbaplus Community

Jun 23, 2016

Why rate limiting matters

In high‑concurrency systems, caching, degradation, and rate limiting are the three primary mechanisms for protecting services. Caching increases throughput, degradation shields critical paths when failures occur, and rate limiting controls traffic in scenarios where neither caching nor degradation helps, such as flash‑sale spikes, comment posting, or complex queries.

Purpose of rate limiting

Rate limiting throttles either the number of concurrent requests or the number of requests within a fixed time window. When the configured limit is reached the system can reject the request, queue it, or fall back to default data, thereby preventing crashes and enabling graceful degradation.

Common rate‑limiting strategies

Limit total concurrency (e.g., database connection pools, thread pools).

Limit instantaneous concurrency (e.g., Nginx limit_conn module).

Limit average request rate within a time window (e.g., Guava RateLimiter, Nginx limit_req).

Limit remote‑API call rates, message‑queue consumption rates, or rates based on CPU/memory load.

Rate‑limiting algorithms

The two most widely used algorithms are the token bucket and the leaky bucket. A simple counter can also be used for coarse‑grained limits.

Token bucket

Tokens are added to a bucket at a fixed rate (e.g., 2 tokens/s).

The bucket has a maximum capacity b ; excess tokens are discarded.

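When a request arrives, it removes n tokens: one per request in the simplest case, or one per byte when limiting bandwidth. If fewer than n tokens remain, none are removed and the request is throttled (rejected or queued).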
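
The three rules above can be sketched in Java as follows. This is a minimal illustration rather than a production implementation: the class name `TokenBucket`, the method `tryAcquire`, and the injectable nanosecond clock are assumptions made for this example, and the bucket is assumed to start full.

```java
import java.util.function.LongSupplier;

/** Minimal token bucket sketch: refills at a fixed rate, capped at a maximum capacity. */
final class TokenBucket {
    private final long capacity;            // maximum number of tokens (b)
    private final double tokensPerSecond;   // fixed refill rate
    private final LongSupplier nanoClock;   // time source in nanoseconds (hypothetical, injectable)
    private double tokens;                  // tokens currently in the bucket
    private long lastRefill;                // time of the last refill

    TokenBucket(long capacity, double tokensPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.nanoClock = nanoClock;
        this.tokens = capacity;             // assumption: start full so an initial burst is allowed
        this.lastRefill = nanoClock.getAsLong();
    }

    /** Takes n tokens if available; otherwise takes nothing and returns false. */
    synchronized boolean tryAcquire(int n) {
        long now = nanoClock.getAsLong();
        double refill = (now - lastRefill) * tokensPerSecond / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + refill);   // excess tokens are discarded
        lastRefill = now;
        if (tokens < n) {
            return false;                   // insufficient tokens: throttle the request
        }
        tokens -= n;
        return true;
    }
}
```

In a real service the clock would typically be `System::nanoTime`, for example `new TokenBucket(10, 2.0, System::nanoTime)`; a caller that receives `false` decides whether to reject, queue, or degrade.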

Leaky bucket

Requests (“water drops”) flow into a bucket of fixed capacity and leak out at a constant rate.

If the bucket is empty, nothing flows out.

Requests may arrive at any rate; once the bucket is full, further requests overflow and are dropped.
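
A minimal Java sketch of this behaviour is shown below. It models the "leaky bucket as a meter" variant, which tracks a water level and rejects on overflow; a queue-based variant that releases requests one by one at the drain rate is also common. The class name `LeakyBucket`, the method `tryAdd`, and the injectable clock are assumptions made for this example.

```java
import java.util.function.LongSupplier;

/** Minimal leaky bucket sketch (meter variant): each request adds one drop of water. */
final class LeakyBucket {
    private final double capacity;          // maximum water the bucket can hold
    private final double leakPerSecond;     // constant drain rate
    private final LongSupplier nanoClock;   // time source in nanoseconds (hypothetical, injectable)
    private double water;                   // current water level
    private long lastLeak;                  // time the level was last updated

    LeakyBucket(double capacity, double leakPerSecond, LongSupplier nanoClock) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.nanoClock = nanoClock;
        this.water = 0.0;                   // assumption: start empty
        this.lastLeak = nanoClock.getAsLong();
    }

    /** Adds one request to the bucket; returns false if it would overflow. */
    synchronized boolean tryAdd() {
        long now = nanoClock.getAsLong();
        double leaked = (now - lastLeak) * leakPerSecond / 1_000_000_000.0;
        water = Math.max(0.0, water - leaked);   // an empty bucket leaks nothing further
        lastLeak = now;
        if (water + 1.0 > capacity) {
            return false;                   // overflow: drop the request
        }
        water += 1.0;
        return true;
    }
}
```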

Token bucket vs. leaky bucket

Token bucket controls the incoming rate and permits bursts while tokens are available.

Leaky bucket smooths the outgoing traffic, discarding excess when the bucket overflows.

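Both can be implemented with the same state (a counter plus a timestamp) but operate in opposite directions: the token bucket counts what may still be admitted, while the leaky bucket counts what has been admitted and not yet drained.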

Application‑level rate limiting

Application‑level limits protect resources inside a single service instance.

Limiting total concurrency / resources

Configure server connectors (Tomcat acceptCount, maxConnections, maxThreads) or equivalent settings in MySQL, Redis, etc., to cap overall connections.

Per‑API concurrency

Use an AtomicLong in Java to count active requests for a specific endpoint and reject or queue when a threshold is exceeded.
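
As a rough sketch of this pattern, the class below wraps a handler with an `AtomicLong` counter. The names `ConcurrencyLimiter`, `call`, and `activeCount` are hypothetical, and returning a fallback value stands in for whatever rejection or queuing policy the service actually uses.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Sketch of a per-endpoint concurrency limit built on AtomicLong. */
final class ConcurrencyLimiter {
    private final long maxConcurrent;                    // threshold for in-flight requests
    private final AtomicLong active = new AtomicLong();  // requests currently being handled

    ConcurrencyLimiter(long maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
    }

    /** Runs the handler if the endpoint is under its limit, otherwise returns the fallback. */
    <T> T call(Supplier<T> handler, T fallback) {
        if (active.incrementAndGet() > maxConcurrent) {
            active.decrementAndGet();       // undo the increment before rejecting
            return fallback;
        }
        try {
            return handler.get();
        } finally {
            active.decrementAndGet();       // always release the slot, even on exceptions
        }
    }

    long activeCount() {
        return active.get();
    }
}
```

Incrementing first and checking the returned value keeps the check and the update in one atomic step, which avoids the race of a separate read followed by a write.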

Window‑based limits

Store counters in a Guava Cache with a short TTL (e.g., 2 seconds) and use the current second as the key to count requests per second.
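
The idea can be sketched with the JDK alone. In the example below a `ConcurrentHashMap` with manual cleanup is swapped in for the Guava Cache and its TTL, so that the code has no external dependency; the class name `PerSecondCounterLimiter` and the injectable millisecond clock are assumptions made for this illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Sketch of a per-second counter limit; a plain map stands in for a cache with a 2 s TTL. */
final class PerSecondCounterLimiter {
    private final long limitPerSecond;
    private final LongSupplier clockMillis;   // time source in milliseconds (hypothetical, injectable)
    private final ConcurrentHashMap<Long, AtomicLong> counters = new ConcurrentHashMap<>();

    PerSecondCounterLimiter(long limitPerSecond, LongSupplier clockMillis) {
        this.limitPerSecond = limitPerSecond;
        this.clockMillis = clockMillis;
    }

    /** Counts the request against the current second and reports whether it is within the limit. */
    boolean tryAcquire() {
        long second = clockMillis.getAsLong() / 1000;          // current second is the key
        counters.keySet().removeIf(k -> k < second - 1);       // drop counters older than about 2 s
        long count = counters.computeIfAbsent(second, k -> new AtomicLong()).incrementAndGet();
        return count <= limitPerSecond;
    }

    int trackedSeconds() {
        return counters.size();
    }
}
```

A fixed window like this is simple but coarse: a burst that straddles two adjacent seconds can briefly pass up to twice the configured limit.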

Smooth rate limiting with Guava

Guava’s RateLimiter implements a token‑bucket with two modes:

SmoothBursty : Allows bursts up to the bucket capacity, then smooths traffic to the configured rate.

SmoothWarmingUp : Starts at a lower rate during a configurable warm‑up period and gradually ramps up to the target rate, which suits resources that need time to reach full capacity.

Example:

RateLimiter limiter = RateLimiter.create(5); // 5 tokens/s, bucket capacity 5

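Calling limiter.acquire() consumes a token; if none are available, the call blocks until the request can be granted. Use limiter.tryAcquire() instead to get an immediate true/false answer without blocking.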

Distributed rate limiting

When traffic is served by multiple nodes, limits must be enforced atomically across the cluster. Two common approaches are Redis + Lua scripts and Nginx + Lua.

Redis + Lua

A Lua script increments a counter and checks the limit within a single EVAL call. Redis executes the whole script atomically, so no other command can interleave between the increment and the check.

Java can invoke the script via Jedis.eval() (or similar) and decide whether to allow the request.
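
The sketch below shows one way such a script and its Java wrapper might look. It is a fixed-window counter under stated assumptions: the class name `RedisWindowLimit`, the key format `rate:<api>:<second>`, and the convention that the script returns 1 for "allow" and 0 for "reject" are all hypothetical choices for this example. The Jedis call itself is shown only as a comment so that the block compiles without the Jedis dependency.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of a Redis + Lua fixed-window limit; all names here are illustrative. */
final class RedisWindowLimit {
    /** KEYS[1] = counter key, ARGV[1] = limit, ARGV[2] = key expiry in seconds. */
    static final String SCRIPT =
        "local current = redis.call('INCR', KEYS[1])\n" +
        "if current == 1 then\n" +
        "  redis.call('EXPIRE', KEYS[1], ARGV[2])\n" +   // first hit in the window sets the expiry
        "end\n" +
        "if current > tonumber(ARGV[1]) then\n" +
        "  return 0\n" +                                  // over the limit: reject
        "end\n" +
        "return 1\n";                                     // within the limit: allow

    /** Builds the single counter key for one API and one time window. */
    static List<String> keys(String api, long epochSecond) {
        return Arrays.asList("rate:" + api + ":" + epochSecond);
    }

    /** Builds the script arguments: the limit and the key expiry. */
    static List<String> args(long limit, int expirySeconds) {
        return Arrays.asList(Long.toString(limit), Integer.toString(expirySeconds));
    }

    /** Interprets the script reply; anything other than the integer 1 is treated as a rejection. */
    static boolean isAllowed(Object reply) {
        return reply instanceof Long && (Long) reply == 1L;
    }

    // Usage with Jedis (not compiled here):
    //   Object reply = jedis.eval(SCRIPT, keys("comment", now), args(100, 2));
    //   if (!isAllowed(reply)) { /* reject, queue, or degrade */ }
}
```

Treating an unexpected reply as a rejection is a conservative default; a service that prefers availability over strict limits might choose to allow the request instead.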

Nginx + Lua

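Use lua‑resty‑lock for mutual exclusion and ngx.shared.DICT as the counter store. Define shared dictionaries for locks and counters with lua_shared_dict, then run the script in the access_by_lua_block phase. Note that ngx.shared.DICT is shared only among the worker processes of a single Nginx instance, so this approach limits traffic per Nginx node; a limit that spans several Nginx nodes still needs an external store such as Redis.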

Edge (access‑layer) rate limiting with Nginx

Nginx provides two built‑in modules:

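ngx_http_limit_conn_module – limits the number of concurrent connections per key (IP, domain, etc.).

ngx_http_limit_req_module – uses the leaky‑bucket method to limit the request rate per key. With burst configured, excess requests are delayed to match the rate by default (smooth), or served immediately when nodelay is set (burst).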

Connection‑limiting example

Define a shared memory zone and limit each IP to two concurrent connections:

limit_conn_zone $binary_remote_addr zone=addr:10m;

limit_conn addr 2;

Request‑rate limiting example

Limit each IP to 10 requests/s with a burst capacity of 5 and enable burst processing (no delay):

limit_req_zone $binary_remote_addr zone=req:10m rate=10r/s;

limit_req zone=req burst=5 nodelay;

Advanced Lua‑based traffic control

The OpenResty module lua‑resty‑limit‑traffic provides programmable limits that can change keys, rates, and bucket sizes at runtime, extending the capabilities of the native modules.

Key takeaways

Select the algorithm that matches the traffic pattern: token bucket for burst‑tolerant workloads, leaky bucket for strict smoothing.

Application‑level limits protect resources within a single instance; distributed limits are required when traffic is spread across many instances.

Nginx’s native modules cover most edge‑layer scenarios; for dynamic policies augment them with Lua scripts or lua‑resty‑limit‑traffic.

Validate configurations under realistic load and monitor logs to ensure throttling behaves as expected.

References

https://en.wikipedia.org/wiki/Token_bucket

https://en.wikipedia.org/wiki/Leaky_bucket

http://redis.io/commands/incr

http://nginx.org/en/docs/http/ngx_http_limit_req_module.html

http://nginx.org/en/docs/http/ngx_http_limit_conn_module.html

https://github.com/openresty/lua-resty-limit-traffic

http://nginx.org/en/docs/http/ngx_http_core_module.html#limit_rate
