Mastering Rate Limiting: Strategies, Algorithms, and Real-World Implementations
This article explains why rate limiting is essential for system stability, outlines common throttling patterns such as circuit breaking, degradation, delayed processing, and privilege handling, and dives into popular algorithms like counter, leaky bucket, and token bucket with concrete Java and Nginx examples.
Rate Limiting Concepts
In everyday life, places like tourist attractions limit visitor numbers during holidays to avoid overcrowding and accidents; the same principle applies to web services where sudden traffic spikes can overwhelm a system, so throttling ensures availability.
Throttling Approaches
Circuit Breaking
Design systems to automatically open a circuit when errors persist, rejecting traffic to protect downstream services. When the issue resolves, the circuit closes and normal service resumes. Common tools include Hystrix and Alibaba Sentinel.
Service Degradation
When traffic surges, non‑essential features (e.g., product reviews, points) can be temporarily disabled to free resources for core functions like order processing. After recovery, degraded services are re‑enabled and any missed actions are compensated.
Delayed Processing
Requests are placed into a buffer (e.g., a queue) and processed asynchronously, reducing immediate load on back‑end services. This is the basis for leaky‑bucket and token‑bucket algorithms.
Privilege Handling
Users are classified so that high‑priority groups receive service first, while others may be delayed or rejected.
Difference Between Cache, Degradation, and Throttling
Cache increases throughput and response speed.
Degradation temporarily disables failing components while providing fallback data.
Throttling limits request rates when caching and degradation are insufficient, protecting services before they become unavailable.
Throttling Algorithms
Counter Algorithm
Simple implementation: maintain a counter for a time window (e.g., 100 requests per minute). Increment on each request; reset when the window expires.
Leaky Bucket Algorithm
Requests flow into a bucket that leaks at a constant rate; excess requests overflow and are dropped, smoothing traffic bursts.
Token Bucket Algorithm
Tokens are added to a bucket at a fixed rate; a request proceeds only if a token is available, allowing controlled bursts while enforcing an average rate.
Concurrent Throttling
Set overall QPS limits, e.g., Tomcat's acceptCount, maxConnections, and maxThreads. Other techniques include limiting total concurrency (thread pools, DB connections), instantaneous connections (nginx limit_conn), and average rate within a time window (Guava RateLimiter, nginx limit_req).
API Throttling
Two aspects: total request count within a period (counter algorithm) and sliding window algorithms for finer‑grained control.
Sliding Window
Divides a fixed interval into smaller sub‑windows, providing smoother and more accurate rate limiting compared to a simple fixed window.
Implementation Examples
Guava RateLimiter
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>28.1-jre</version>
</dependency>Core code:
LoadingCache<Long, AtomicLong> counter = CacheBuilder.newBuilder()
.expireAfterWrite(2, TimeUnit.SECONDS)
.build(new CacheLoader<Long, AtomicLong>() {
@Override
public AtomicLong load(Long second) throws Exception {
return new AtomicLong(0);
}
});
counter.get(1L).incrementAndGet();Token Bucket with Guava
public static void main(String[] args) {
// RateLimiter.create(2) generates 2 tokens per second
RateLimiter limiter = RateLimiter.create(2);
System.out.println(limiter.acquire());
Thread.sleep(2000);
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
System.out.println(limiter.acquire());
}Use RateLimiter.create(2, 1000, TimeUnit.MILLISECONDS) for a warm‑up mode.
Distributed Throttling with Nginx + Lua
Lua script uses resty.lock for atomicity and ngx.shared.limit_counter to store counters.
local locks = require "resty.lock"
local function acquire()
local lock = locks:new("locks")
local elapsed, err = lock:lock("limit_key")
local limit_counter = ngx.shared.limit_counter
local key = "ip:" .. os.time()
local limit = 5
local current = limit_counter:get(key)
if current ~= nil and current + 1 > limit then
lock:unlock()
return 0
end
if current == nil then
limit_counter:set(key, 1, 1)
else
limit_counter:incr(key, 1)
end
lock:unlock()
return 1
end
ngx.print(acquire())https://github.com/openresty/lua-resty-lock
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
