Understanding Rate Limiting: Concepts, Algorithms, and Practical Implementations
This article explains why rate limiting is essential for both physical venues and online services, describes common strategies such as circuit breaking, service degradation, delay handling, and privilege handling, compares caching, degradation, and limiting, and details counter, leaky‑bucket, and token‑bucket algorithms with concrete Guava and Nginx‑Lua implementations.
Why Rate Limiting
In everyday life, venues such as tourist attractions cap the number of visitors to avoid overcrowding, accidents, and a poor experience. The same principle applies to online systems: a sudden traffic spike can overwhelm servers, so limiting traffic preserves availability.
Rate Limiting Approaches
Circuit Breaker
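When the failure rate or latency of a dependency crosses a threshold, the circuit breaker opens and rejects calls immediately, so a struggling backend is not pushed further into overload. After a cool-down period it lets a few trial requests through (the half-open state) and closes again once they succeed. Common tools include Hystrix and Alibaba Sentinel.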
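The open, half-open, and closed states can be sketched in a few lines of Java. This is a minimal illustration, not how Hystrix or Sentinel are implemented: the class name, the consecutive-failure threshold, and the explicit nowMillis parameter (used so the behaviour is easy to test) are all assumptions made for the example.

```java
// Hypothetical, minimal circuit breaker; names and thresholds are illustrative only.
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long openMillis;      // how long to reject calls before probing again
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Returns true if a call may proceed at time nowMillis.
    synchronized boolean allowRequest(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= openMillis) {
            state = State.HALF_OPEN; // cool-down elapsed: let trial traffic through
        }
        return state != State.OPEN;
    }

    // A successful call closes the circuit and clears the failure count.
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    // A failed trial call, or too many failures in a row, opens the circuit.
    synchronized void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = nowMillis;
        }
    }

    synchronized State state() {
        return state;
    }
}
```

A caller would check allowRequest before each remote call and report the outcome with recordSuccess or recordFailure.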
Service Degradation
Non‑critical features are temporarily disabled during high load, freeing resources for core functions. For example, an e‑commerce site may disable comments or points during a traffic surge.
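As a rough sketch of the idea, a non-critical feature can sit behind a switch that returns a cheap fallback while degradation is active. The class name DegradableFeature and its methods are hypothetical, chosen only for this illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical wrapper: serves a fallback value while the feature is degraded.
class DegradableFeature<T> {
    private final AtomicBoolean degraded = new AtomicBoolean(false);
    private final Supplier<T> primary; // the normal, possibly expensive, code path
    private final T fallback;          // cheap value returned under degradation

    DegradableFeature(Supplier<T> primary, T fallback) {
        this.primary = primary;
        this.fallback = fallback;
    }

    // Flipped by an operator or a load monitor during a traffic surge.
    void setDegraded(boolean on) {
        degraded.set(on);
    }

    T get() {
        return degraded.get() ? fallback : primary.get();
    }
}
```

In the e-commerce example, the comments feature could be wrapped this way so that the product page keeps rendering while comment loading is switched off.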
Delay Handling
Requests are buffered in a queue (leaky-bucket style) and processed at a steady pace. This smooths spikes at the cost of added latency while requests wait, and once the buffer is full, new requests must be rejected.
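A bounded queue captures this behaviour: requests wait in the buffer until a worker picks them up, and a full buffer refuses new work. The sketch below uses the JDK's ArrayBlockingQueue; the class BufferedRequestQueue and the use of plain strings as requests are assumptions for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical bounded buffer for delayed processing.
class BufferedRequestQueue {
    private final BlockingQueue<String> buffer;

    BufferedRequestQueue(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    // Returns false instead of blocking when the buffer is full.
    boolean submit(String request) {
        return buffer.offer(request);
    }

    // Called by a worker at its own pace; returns null when nothing is waiting.
    String next() {
        return buffer.poll();
    }

    int pending() {
        return buffer.size();
    }
}
```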
Privilege Handling
Users are classified, allowing high‑priority groups to receive service while others are delayed or rejected.
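One way to picture this is an admission check with two thresholds: ordinary users are turned away once a soft limit is reached, while high-priority users are admitted up to a higher hard limit. The tiers, limits, and class name below are hypothetical.

```java
// Hypothetical two-tier admission control based on in-flight requests.
class PriorityAdmission {
    enum Tier { VIP, STANDARD }

    private final int softLimit; // ceiling for STANDARD users
    private final int hardLimit; // ceiling for VIP users (should be >= softLimit)
    private int inFlight = 0;

    PriorityAdmission(int softLimit, int hardLimit) {
        this.softLimit = softLimit;
        this.hardLimit = hardLimit;
    }

    // Admits the request if the tier's ceiling has not been reached.
    synchronized boolean tryEnter(Tier tier) {
        int limit = (tier == Tier.VIP) ? hardLimit : softLimit;
        if (inFlight >= limit) {
            return false;
        }
        inFlight++;
        return true;
    }

    // Must be called once for every admitted request when it finishes.
    synchronized void leave() {
        if (inFlight > 0) {
            inFlight--;
        }
    }
}
```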
Difference Between Cache, Degradation, and Rate Limiting
Cache increases throughput and speeds up access; Degradation temporarily shields failing components and returns fallback data; Rate Limiting restricts request frequency when caching and degradation are insufficient, protecting the service before it becomes unavailable.
Rate Limiting Algorithms
Counter Algorithm
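A simple method that counts requests within a fixed window (e.g., 100 requests per minute) and rejects traffic beyond the limit; the count resets when a new window starts. Because of that reset, a burst straddling a window boundary can briefly let through up to twice the limit. Hard caps on thread pools, database connections, or Nginx connections apply the same counting idea to concurrent resources rather than to a time window.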
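A fixed-window counter can be sketched as follows. The class name is hypothetical, and time is passed in as nowMillis (an assumption made to keep the example deterministic); production code would typically read the clock itself.

```java
// Hypothetical fixed-window counter: at most `limit` requests per window.
class FixedWindowCounter {
    private final int limit;
    private final long windowMillis;
    private long currentWindow = -1; // index of the window being counted
    private int count = 0;

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        long window = nowMillis / windowMillis;
        if (window != currentWindow) { // a new window started: reset the count
            currentWindow = window;
            count = 0;
        }
        if (count >= limit) {
            return false;
        }
        count++;
        return true;
    }
}
```

With a limit of 2 per second, requests at 900 ms and 950 ms are accepted, and so are requests at 1000 ms and 1050 ms, because the counter resets at the boundary. Four requests pass within 150 ms, which is the inaccuracy that sliding windows aim to reduce.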
Leaky Bucket Algorithm
Requests enter a bucket and are released at a constant rate; if the bucket overflows, excess requests are dropped, effectively smoothing bursts and protecting downstream services.
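The sketch below models the bucket as a water level that drains at a constant rate; a request is admitted only if adding it would not overflow the bucket. It is a simplified "meter" variant that decides admission rather than actually queueing requests, and the class name and explicit nowMillis parameter are assumptions for the example.

```java
// Hypothetical leaky bucket used as a meter: the level drains at a constant rate.
class LeakyBucket {
    private final double capacity;     // maximum water the bucket can hold
    private final double leakPerMilli; // drain rate in requests per millisecond
    private double water = 0.0;
    private long lastLeakMillis;

    LeakyBucket(int capacity, double leakPerSecond, long startMillis) {
        this.capacity = capacity;
        this.leakPerMilli = leakPerSecond / 1000.0;
        this.lastLeakMillis = startMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        // Drain whatever leaked out since the last call.
        double leaked = (nowMillis - lastLeakMillis) * leakPerMilli;
        water = Math.max(0.0, water - leaked);
        lastLeakMillis = nowMillis;
        if (water + 1.0 > capacity) {
            return false; // the bucket would overflow: drop the request
        }
        water += 1.0;
        return true;
    }
}
```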
Token Bucket Algorithm
Tokens are added to a bucket at a steady rate, up to the bucket's capacity; a request proceeds only if it can take a token. Saved-up tokens allow bursts no larger than the capacity, while the refill rate still enforces the average rate.
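A minimal token bucket can be written with lazy refill: instead of a background thread, the bucket computes how many tokens have accrued each time it is asked. The class name and the nowMillis parameter are assumptions for illustration; this is not Guava's implementation.

```java
// Hypothetical token bucket with lazy refill.
class TokenBucket {
    private final double capacity;       // maximum tokens that can be saved up
    private final double refillPerMilli; // refill rate in tokens per millisecond
    private double tokens;
    private long lastRefillMillis;

    TokenBucket(int capacity, double tokensPerSecond, long startMillis) {
        this.capacity = capacity;
        this.refillPerMilli = tokensPerSecond / 1000.0;
        this.tokens = capacity; // start full so an initial burst is allowed
        this.lastRefillMillis = startMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        // Add the tokens accrued since the last call, capped at capacity.
        double accrued = (nowMillis - lastRefillMillis) * refillPerMilli;
        tokens = Math.min(capacity, tokens + accrued);
        lastRefillMillis = nowMillis;
        if (tokens < 1.0) {
            return false; // no whole token available
        }
        tokens -= 1.0;
        return true;
    }
}
```

Compared with the leaky bucket, the difference is in what is bounded: the token bucket lets a burst through immediately as long as tokens are saved up, whereas the leaky bucket holds the outflow to a constant pace.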
Concurrent Rate Limiting
Beyond limiting the request rate, limits can be applied at several levels:
Limit total concurrency (database pool, thread pool)
Limit instantaneous connections (Nginx limit_conn)
Limit average QPS (Guava RateLimiter, Nginx limit_req)
Limit remote API or MQ consumption rates
Adjust limits based on CPU, memory, or network usage
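Limiting total concurrency, the first item above, can be sketched with a JDK Semaphore: a call runs only if a permit is free, otherwise it is rejected at once. The class ConcurrencyLimiter and its call method are hypothetical names for this example.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical concurrency limiter: at most `maxConcurrent` calls run at once.
class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Runs the task if a permit is free; otherwise returns `rejected` immediately.
    <T> T call(Supplier<T> task, T rejected) {
        if (!permits.tryAcquire()) {
            return rejected;
        }
        try {
            return task.get();
        } finally {
            permits.release(); // always hand the permit back
        }
    }
}
```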
Interface Rate Limiting
Total Calls
Restricts the number of times an API can be invoked within a given period, typically using the counter algorithm.
Sliding Window
Divides the time window into smaller slots to achieve finer‑grained counting, reducing the inaccuracy of fixed windows and handling bursty traffic more precisely.
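The slot-based approach can be sketched with a small ring of per-slot counters. This is an approximation with a precision of one slot, and the class name, the slot layout, and the nowMillis parameter (assumed to be non-negative and non-decreasing) are assumptions for the example.

```java
import java.util.Arrays;

// Hypothetical slotted sliding-window counter.
class SlidingWindowCounter {
    private final int limit;
    private final long slotMillis;
    private final int[] counts;   // requests counted in each ring position
    private final long[] slotIds; // which absolute slot each position currently holds

    SlidingWindowCounter(int limit, long windowMillis, int slots) {
        this.limit = limit;
        this.slotMillis = windowMillis / slots;
        this.counts = new int[slots];
        this.slotIds = new long[slots];
        Arrays.fill(slotIds, Long.MIN_VALUE); // mark every position as unused
    }

    synchronized boolean tryAcquire(long nowMillis) {
        long currentSlot = nowMillis / slotMillis;
        long oldestLiveSlot = currentSlot - counts.length + 1;
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            if (slotIds[i] >= oldestLiveSlot) { // ignore slots that slid out of the window
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        int idx = (int) (currentSlot % counts.length);
        if (slotIds[idx] != currentSlot) { // reuse a stale ring position
            slotIds[idx] = currentSlot;
            counts[idx] = 0;
        }
        counts[idx]++;
        return true;
    }
}
```

With a limit of 2 per second and four 250 ms slots, requests at 900 ms and 950 ms fill the window, so a request at 1000 ms is rejected, whereas a fixed one-second window would have reset and accepted it.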
Implementation
Guava Implementation
Dependency:
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>28.1-jre</version>
</dependency>
Core code:
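import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder()
        .expireAfterWrite(2, TimeUnit.SECONDS) // per-second counters expire shortly after use
        .build(new CacheLoader<Long, AtomicLong>() {
            @Override
            public AtomicLong load(Long second) {
                return new AtomicLong(0);
            }
        });
long limit = 1000; // illustrative per-second limit, not from the original
long currentSecond = System.currentTimeMillis() / 1000; // key the counter by the current second
if (counter.getUnchecked(currentSecond).incrementAndGet() > limit) {
    // over the limit for this second: reject the request
}
Token Bucket Implementation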
SmoothBursty (constant token generation):
// Requires com.google.common.util.concurrent.RateLimiter
public static void main(String[] args) throws InterruptedException {
    // RateLimiter.create(2) issues 2 permits per second
    RateLimiter limiter = RateLimiter.create(2);
    System.out.println(limiter.acquire()); // acquire() returns the seconds spent waiting
    Thread.sleep(2000); // unused permits accumulate while the limiter is idle
    for (int i = 0; i < 6; i++) {
        // the first calls after the sleep use stored permits; later ones are paced
        System.out.println(limiter.acquire());
    }
}
SmoothWarmingUp (gradual increase to stable rate):
// Warm-up period of 1000 ms: permits are issued more slowly at first, then at 2 per second
RateLimiter limiter = RateLimiter.create(2, 1000L, TimeUnit.MILLISECONDS);
System.out.println(limiter.acquire());
Thread.sleep(2000); // needs a context that handles InterruptedException
for (int i = 0; i < 6; i++) {
    System.out.println(limiter.acquire()); // prints the seconds spent waiting for each permit
}
Timeout check:
// Requires java.time.Duration. Waits at most 11 ms for a permit and returns false on timeout.
boolean acquired = limiter.tryAcquire(Duration.ofMillis(11));
Distributed Rate Limiting with Nginx + Lua
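Uses resty.lock for atomicity and lua_shared_dict for counters. A lua_shared_dict is shared among the worker processes of a single Nginx instance, so the counter below limits traffic per Nginx node rather than across a cluster. Example Lua code:
-- Assumes nginx.conf declares both dicts, e.g. "lua_shared_dict locks 1m;" and
-- "lua_shared_dict limit_counter 10m;" (sizes are illustrative).
local resty_lock = require "resty.lock"
local function acquire()
    local lock, err = resty_lock:new("locks")
    if not lock then
        return 0 -- lock dict missing or misconfigured: reject
    end
    local elapsed, lock_err = lock:lock("limit_key") -- mutex lock
    if not elapsed then
        return 0 -- could not take the lock: reject
    end
    local limit_counter = ngx.shared.limit_counter
    -- One counter per second. The key does not include the client address,
    -- so this limit applies to all clients together.
    local key = "ip:" .. ngx.time()
    local limit = 5
    local current = limit_counter:get(key)
    if current ~= nil and current + 1 > limit then
        lock:unlock()
        return 0 -- limit reached for this second
    end
    if current == nil then
        limit_counter:set(key, 1, 1) -- first hit, set 1-second TTL
    else
        limit_counter:incr(key, 1)
    end
    lock:unlock()
    return 1
end
ngx.print(acquire())
These snippets illustrate how to apply rate limiting both in single-node Java services and at the Nginx layer with Lua.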
