Mastering Rate Limiting: Strategies, Algorithms, and Real‑World Implementations
This article explains the concept of rate limiting through real‑world analogies, outlines common throttling strategies such as circuit breaking, service degradation, delayed and privileged processing, compares key algorithms like counter, leaky‑bucket and token‑bucket, and provides practical Guava, token‑bucket and Nginx‑Lua code examples for both single‑node and distributed systems.
Rate‑Limiting Concepts
In everyday life we encounter situations that require limiting flow, such as crowded tourist attractions during holidays. The same principle applies to online systems: when traffic spikes beyond capacity, a rate‑limiting rule ensures the service remains usable and prevents crashes.
Throttling Strategies
Circuit Breaker
Design systems with built‑in circuit‑breaker mechanisms that automatically reject traffic when a problem cannot be quickly resolved, protecting backend services from overload. Once the issue is fixed, the circuit can be closed to resume normal operation. Common components include Hystrix and Alibaba Sentinel.
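The open/closed behavior described above can be sketched as a small state machine. This is a minimal illustration, not how Hystrix or Sentinel are implemented; the class name, the `failureThreshold` and `openMillis` parameters, and the explicit `nowMillis` argument (used so the behavior is easy to test) are all assumptions made for the example.

```java
/** Minimal circuit breaker sketch: opens after N consecutive failures, retries after a cool-down. */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // hypothetical: consecutive failures before opening
    private final long openMillis;      // hypothetical: how long to reject before a trial
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Returns false while the circuit is open, i.e. the request should be rejected. */
    synchronized boolean allowRequest(long nowMillis) {
        if (state == State.OPEN && nowMillis - openedAt >= openMillis) {
            state = State.HALF_OPEN; // cool-down elapsed: let trial requests through
        }
        return state != State.OPEN;
    }

    /** A successful call closes the circuit and clears the failure count. */
    synchronized void recordSuccess() {
        consecutiveFailures = 0;
        state = State.CLOSED;
    }

    /** A failed trial, or too many consecutive failures, (re)opens the circuit. */
    synchronized void recordFailure(long nowMillis) {
        consecutiveFailures++;
        if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
            state = State.OPEN;
            openedAt = nowMillis;
        }
    }

    synchronized State state() {
        return state;
    }
}
```

A caller would check `allowRequest` before invoking the backend and report the outcome with `recordSuccess` or `recordFailure`.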
Service Degradation
Classify services by importance and temporarily disable non‑critical features during emergencies to free resources for core functions, such as suspending product reviews or points systems in an e‑commerce platform.
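One simple way to express this classification in code is a switch that knows each feature's tier. The sketch below is illustrative only; the `DegradationSwitch` class, its `Tier` enum, and the feature names in the usage note are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Sketch of a degradation switch: non-critical features are turned off while degraded. */
class DegradationSwitch {
    enum Tier { CORE, NON_CRITICAL }

    private final Map<String, Tier> features = new ConcurrentHashMap<>();
    private volatile boolean degraded = false;

    void register(String feature, Tier tier) {
        features.put(feature, tier);
    }

    void setDegraded(boolean degraded) {
        this.degraded = degraded;
    }

    /** Core features are always on; non-critical ones only when not degraded. */
    boolean isEnabled(String feature) {
        Tier tier = features.get(feature);
        if (tier == null) {
            return false; // unknown features are treated as disabled in this sketch
        }
        return tier == Tier.CORE || !degraded;
    }
}
```

In an e-commerce setting, "checkout" might be registered as `CORE` and "reviews" as `NON_CRITICAL`, so flipping `setDegraded(true)` disables reviews while checkout keeps working.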
Delay Processing
Introduce a front‑end buffer (e.g., a queue) that temporarily holds incoming requests, allowing the backend to process them at a controlled rate. This buffering idea underlies the leaky‑bucket algorithm and is closely related to the token‑bucket algorithm.
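As a rough sketch of such a buffer, the class below holds requests in a bounded queue and lets the backend pull a fixed number per tick. The `RequestBuffer` name, the string payloads, and the `maxPerTick` parameter are assumptions for illustration; a real system would drain the queue from a scheduled worker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Sketch of delayed processing: a bounded buffer drained at a controlled pace. */
class RequestBuffer {
    private final BlockingQueue<String> queue;

    RequestBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns false when the buffer is full, so the caller can reject the request. */
    boolean submit(String request) {
        return queue.offer(request);
    }

    /** Removes and returns at most maxPerTick buffered requests, oldest first. */
    List<String> drainBatch(int maxPerTick) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch, maxPerTick);
        return batch;
    }
}
```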
Privilege Processing
Prioritize certain user groups by granting them higher service guarantees while delaying or rejecting requests from other users.
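A priority queue is one way to sketch this idea: higher-priority users are served first, and others wait their turn. The `PriorityDispatcher` class and its numeric priority values are hypothetical and only meant to show the ordering.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

/** Sketch of privileged processing: higher priority first, FIFO within the same priority. */
class PriorityDispatcher {
    private static final class Entry {
        final String user;
        final int priority;
        final long seq; // arrival order, used to break ties

        Entry(String user, int priority, long seq) {
            this.user = user;
            this.priority = priority;
            this.seq = seq;
        }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>(
            Comparator.comparingInt((Entry e) -> e.priority).reversed()
                      .thenComparingLong(e -> e.seq));
    private long nextSeq = 0L;

    synchronized void submit(String user, int priority) {
        queue.add(new Entry(user, priority, nextSeq++));
    }

    /** Returns the next user to serve, or null when nothing is waiting. */
    synchronized String next() {
        Entry e = queue.poll();
        return e == null ? null : e.user;
    }
}
```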
Cache, Degradation, and Rate Limiting Differences
Cache increases throughput, degradation temporarily shields the system when components fail, and rate limiting restricts request rates when caching and degradation are insufficient.
Rate‑Limiting Algorithms
Counter Algorithm
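A straightforward method that counts requests within a fixed window (e.g., no more than 100 calls per minute). Once the count reaches the limit, further requests are rejected until the window resets. Its weakness is the window boundary: a burst at the end of one window followed by another at the start of the next can let through up to twice the limit in a short span.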
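A fixed-window counter can be sketched in a few lines. The class name and the explicit `nowMillis` argument are assumptions chosen to keep the example deterministic; production code would normally read the clock itself.

```java
/** Sketch of a fixed-window counter: at most `limit` requests per window. */
class FixedWindowCounter {
    private final int limit;
    private final long windowMillis;
    private long windowStart = 0L;
    private int count = 0;

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        // Align the timestamp to the start of its window.
        long currentWindow = nowMillis - (nowMillis % windowMillis);
        if (currentWindow != windowStart) {
            windowStart = currentWindow; // a new window begins: reset the count
            count = 0;
        }
        if (count >= limit) {
            return false; // limit reached for this window
        }
        count++;
        return true;
    }
}
```

Note that with a limit of 2 per second, two requests at 998 ms and 999 ms plus two more at 1000 ms and 1001 ms are all accepted, because the count resets at the window boundary.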
Leaky‑Bucket Algorithm
Requests enter a bucket that leaks at a constant rate; excess requests overflow and are dropped, smoothing traffic spikes.
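The sketch below models the bucket as a water level that drains at a constant rate; a request is dropped when adding it would overflow the bucket. It is a meter-style variant (it decides accept or drop, rather than queuing requests), and the class name, parameters, and explicit `nowMillis` argument are assumptions for illustration.

```java
/** Sketch of a leaky bucket used as a meter: constant leak rate, overflow is rejected. */
class LeakyBucket {
    private final double capacity;      // maximum amount of "water" (pending requests)
    private final double leakPerSecond; // constant outflow rate
    private double water = 0.0;
    private long lastLeak;

    LeakyBucket(double capacity, double leakPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.lastLeak = nowMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        // Drain whatever leaked out since the last call.
        double leaked = (nowMillis - lastLeak) * leakPerSecond / 1000.0;
        water = Math.max(0.0, water - leaked);
        lastLeak = nowMillis;
        if (water + 1.0 > capacity) {
            return false; // bucket would overflow: drop the request
        }
        water += 1.0;
        return true;
    }
}
```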
Token‑Bucket Algorithm
Tokens are added to a bucket at a steady rate, up to the bucket's capacity; each request consumes a token. If no token is available, the request is rejected or made to wait. Because unused tokens accumulate, this allows short bursts while enforcing an average rate.
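A hand-rolled token bucket might look like the following sketch. It rejects rather than waits when the bucket is empty; the class name, the choice to start with a full bucket, and the explicit `nowMillis` argument are assumptions made for the example, and this is not how Guava's RateLimiter is implemented internally.

```java
/** Sketch of a token bucket: steady refill, capped capacity, one token per request. */
class TokenBucket {
    private final double capacity;        // maximum burst size
    private final double refillPerSecond; // average permitted rate
    private double tokens;
    private long lastRefill;

    TokenBucket(double capacity, double refillPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity; // assumption: start with a full bucket
        this.lastRefill = nowMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        // Add the tokens generated since the last call, never exceeding capacity.
        double refill = (nowMillis - lastRefill) * refillPerSecond / 1000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefill = nowMillis;
        if (tokens < 1.0) {
            return false; // no token available: reject
        }
        tokens -= 1.0;
        return true;
    }
}
```

With a capacity of 3 and a refill rate of 1 token per second, three back-to-back requests succeed (the burst), after which requests are admitted at roughly one per second.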
Concurrent Rate Limiting
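Typical configurations cap concurrency (or QPS) globally. For example, Tomcat's connector exposes these parameters:
acceptCount – maximum queue length for incoming connection requests once maxConnections has been reached
maxConnections – maximum number of connections the server accepts and processes at any given time
maxThreads – maximum number of request‑processing threads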
Other concurrency limits include:
Limiting total concurrency (e.g., database or thread pools)
Limiting instantaneous connections (e.g., Nginx limit_conn)
Limiting average rate within a time window (e.g., Guava RateLimiter, Nginx limit_req)
Limiting remote API call rates or MQ consumption rates
Limiting based on network, CPU, or memory load
Properly applied, concurrency limiting protects services from sudden overloads while preserving user experience.
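Limiting total concurrency inside an application is commonly done with a semaphore. The sketch below rejects work immediately when all permits are taken; the `ConcurrencyLimiter` class and its fallback-value approach are assumptions for illustration.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Sketch of a concurrency limit: at most maxConcurrent tasks run at the same time. */
class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the task if a permit is free; otherwise returns the fallback without blocking. */
    <T> T call(Supplier<T> task, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback; // saturated: degrade instead of queuing
        }
        try {
            return task.get();
        } finally {
            permits.release(); // always return the permit, even if the task throws
        }
    }

    int available() {
        return permits.availablePermits();
    }
}
```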
Interface Rate Limiting
Total Call Count
Restrict the total number of calls to an interface within a given period using the counter algorithm.
Sliding Time Window
Divide the monitoring interval into finer sub‑windows (e.g., milliseconds) to achieve more precise rate control, avoiding the inaccuracies of fixed‑window counting.
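The sub-window idea can be sketched with a ring of per-slot counters. The total over the most recent slots approximates a true sliding window, with precision bounded by the slot size. The class name, constructor parameters, and explicit `nowMillis` argument are assumptions for the example.

```java
import java.util.Arrays;

/** Sketch of a sliding window built from sub-window (slot) counters. */
class SlidingWindowCounter {
    private final int limit;
    private final long slotMillis;
    private final int[] counts;   // requests counted in each slot
    private final long[] slotIds; // absolute slot number currently stored in each cell

    SlidingWindowCounter(int limit, long windowMillis, int slots) {
        this.limit = limit;
        this.slotMillis = windowMillis / slots;
        this.counts = new int[slots];
        this.slotIds = new long[slots];
        Arrays.fill(slotIds, -1L); // -1 marks a cell that has never been used
    }

    synchronized boolean tryAcquire(long nowMillis) {
        long slotId = nowMillis / slotMillis;
        int idx = (int) (slotId % counts.length);
        if (slotIds[idx] != slotId) {
            slotIds[idx] = slotId; // the cell held an expired slot: recycle it
            counts[idx] = 0;
        }
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            // Only slots inside the current window contribute to the total.
            if (slotId - slotIds[i] < counts.length) {
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        counts[idx]++;
        return true;
    }
}
```

With a limit of 2 per second and ten 100 ms slots, requests at 900 ms and 950 ms cause a request at 1050 ms to be rejected, whereas a fixed one-second window would have reset at 1000 ms and accepted it.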
Implementation Examples
Guava Implementation
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>28.1-jre</version>
</dependency>
Core code:
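// Requires: com.google.common.cache.{CacheBuilder, CacheLoader, LoadingCache},
// java.util.concurrent.TimeUnit, java.util.concurrent.atomic.AtomicLong
// One counter per wall-clock second; entries expire 2 seconds after being written.
LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder()
        .expireAfterWrite(2, TimeUnit.SECONDS)
        .build(new CacheLoader<Long, AtomicLong>() {
            @Override
            public AtomicLong load(Long second) {
                return new AtomicLong(0);
            }
        });
long limit = 1000; // illustrative threshold, not taken from the original
long currentSecond = System.currentTimeMillis() / 1000;
// getUnchecked avoids the checked ExecutionException declared by get()
if (counter.getUnchecked(currentSecond).incrementAndGet() > limit) {
    // over the limit for this second: reject the request
}
Token‑Bucket Implementation (Guava RateLimiter)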
Smooth burst mode (constant token generation):
import com.google.common.util.concurrent.RateLimiter;

public class SmoothBurstyDemo {
    public static void main(String[] args) throws InterruptedException {
        // 2 permits per second; acquire() blocks and returns the seconds spent waiting
        RateLimiter limiter = RateLimiter.create(2);
        System.out.println(limiter.acquire());
        // While the limiter is idle, unused permits accumulate up to a cap
        Thread.sleep(2000);
        // The first calls after the pause should return 0.0; later ones wait roughly 0.5 s
        for (int i = 0; i < 6; i++) {
            System.out.println(limiter.acquire());
        }
    }
}
Warm‑up mode (gradual token increase):
// 2 permits per second at steady state, reached after a 1000 ms warm-up period
RateLimiter limiter = RateLimiter.create(2, 1000L, TimeUnit.MILLISECONDS);
System.out.println(limiter.acquire());
try {
    Thread.sleep(2000);
} catch (InterruptedException e) {
    Thread.currentThread().interrupt(); // restore the interrupt flag
}
// After idling, the limiter starts cold again and ramps up to the steady rate
for (int i = 0; i < 6; i++) {
    System.out.println(limiter.acquire());
}
Timeout check:
// Wait at most 11 ms for a permit; returns false if none becomes available in time
boolean acquired = limiter.tryAcquire(Duration.ofMillis(11));
Distributed Rate Limiting with Nginx + Lua
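Use resty.lock for atomicity and lua_shared_dict for counters. A shared dictionary is visible to all worker processes of one Nginx instance, so the limit is enforced at the gateway in front of the backend nodes.
https://github.com/openresty/lua-resty-lock
-- nginx.conf must declare both dictionaries, for example:
--   lua_shared_dict locks 1m;
--   lua_shared_dict limit_counter 10m;
local resty_lock = require "resty.lock"

local function acquire()
    local lock, err = resty_lock:new("locks")
    if not lock then
        return 0  -- could not create the lock: reject
    end
    local elapsed, lock_err = lock:lock("limit_key")
    if not elapsed then
        return 0  -- could not obtain the lock (e.g. timeout): reject
    end

    local limit_counter = ngx.shared.limit_counter
    -- One counter per second. As written the key is global; append
    -- ngx.var.remote_addr to it to limit per client IP instead.
    local key = "ip:" .. os.time()
    local limit = 5
    local current = limit_counter:get(key)

    if current ~= nil and current + 1 > limit then
        lock:unlock()
        return 0  -- over the limit for this second
    end
    if current == nil then
        limit_counter:set(key, 1, 1)  -- first request this second; expires after 1 s
    else
        limit_counter:incr(key, 1)
    end
    lock:unlock()
    return 1
end

ngx.print(acquire())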
Java High-Performance Architecture
Sharing Java development articles and resources, including SSM architecture and the Spring ecosystem (Spring Boot, Spring Cloud, MyBatis, Dubbo, Docker), Zookeeper, Redis, architecture design, microservices, message queues, Git, etc.
