Rate Limiting Strategies, Algorithms, and Implementations in Backend Systems
This article explains the concepts, strategies, and algorithms of rate limiting—including circuit breaking, service degradation, leaky‑bucket and token‑bucket methods—and provides practical Java, Guava, and Nginx + Lua implementations for controlling concurrency and protecting backend services.
In daily life and online services, traffic spikes can overwhelm resources, so rate limiting is essential to maintain availability. The article discusses why limiting traffic is necessary and draws parallels between physical crowd control and network request throttling.
Rate Limiting Approaches
Circuit Breaking
When a system cannot recover quickly, a circuit breaker automatically rejects traffic to prevent overload. Tools like Hystrix and Alibaba Sentinel are mentioned.
Service Degradation
Non‑critical features (e.g., comments, points) can be temporarily disabled during traffic surges to free resources for core functions.
Delay Processing
Requests are buffered in a queue (e.g., a leaky‑bucket) and processed later, reducing immediate load at the cost of latency.
Privilege Processing
Requests are prioritized based on user classification, ensuring high‑priority users receive service first.
Rate Limiting Algorithms
Counter Algorithm
Simple fixed‑window counting, e.g., allowing at most 100 requests per minute for an API.
LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder()
.expireAfterWrite(2, TimeUnit.SECONDS)
.build(new CacheLoader<Long, AtomicLong>() {
@Override
public AtomicLong load(Long second) throws Exception {
return new AtomicLong(0);
}
});
counter.get(1L).incrementAndGet();Leaky Bucket Algorithm
Requests enter a bucket that leaks at a constant rate; excess requests overflow and are dropped, providing a smooth outflow.
Token Bucket Algorithm
Tokens are added to a bucket at a steady rate; a request proceeds only if a token is available, allowing bursts while enforcing an average rate.
public static void main(String[] args) {
// RateLimiter.create(2) generates 2 tokens per second
RateLimiter limiter = RateLimiter.create(2);
System.out.println(limiter.acquire());
try { Thread.sleep(2000); } catch (InterruptedException e) { e.printStackTrace(); }
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
}Concurrency Limiting
Typical configuration parameters in servers (e.g., Tomcat) include acceptCount, maxConnections, and maxThreads to cap total or instantaneous connections.
Limit total concurrency (e.g., database connection pool, thread pool)
Limit instantaneous connections (e.g., Nginx limit_conn)
Limit average rate within a time window (e.g., Guava RateLimiter, Nginx limit_req)
Limit remote API call rate, MQ consumption rate, etc.
Interface Limiting
Two aspects: total number of calls within a period (counter algorithm) and sliding window algorithms for finer granularity.
Implementation Examples
Guava Implementation
Include the Guava dependency and use RateLimiter for smooth token‑bucket rate limiting.
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>28.1-jre</version>
</dependency>Distributed Limiting with Nginx + Lua
Use resty.lock for atomic operations and lua_shared_dict to store counters.
local locks = require "resty.lock"
local function acquire()
local lock = locks:new("locks")
local elapsed, err = lock:lock("limit_key")
local limit_counter = ngx.shared.limit_counter
local key = "ip:" .. os.time()
local limit = 5
local current = limit_counter:get(key)
if current ~= nil and current + 1 > limit then
lock:unlock()
return 0
end
if current == nil then
limit_counter:set(key, 1, 1)
else
limit_counter:incr(key, 1)
end
lock:unlock()
return 1
end
ngx.print(acquire())These examples illustrate how to protect services from overload by applying appropriate rate‑limiting techniques.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
