Master Rate Limiting: Token & Leaky Buckets, Tomcat, Nginx & OpenResty
This article explains why high‑traffic scenarios like flash‑sale systems need rate limiting, compares token‑bucket and leaky‑bucket algorithms, and shows practical configurations for Tomcat, Nginx, and OpenResty to protect APIs and ensure system stability.
Introduction
In flash‑sale (秒杀) systems, millions of users may compete for a few items, causing massive request bursts that can overwhelm back‑ends. Beyond caching and distributed locks, limiting request rates is essential to preserve resources and maintain service availability.
Rate Limiting
When request volume exceeds a service’s capacity, excess traffic must be rejected or queued. Effective rate limiting prevents system collapse by controlling the flow of incoming requests.
Rate Limiting Algorithms
Common algorithms include the token‑bucket and leaky‑bucket methods.
Token Bucket
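The token‑bucket algorithm adds tokens to a bucket at a fixed rate, up to the bucket's capacity; each request consumes a token and is rejected (or made to wait) when none remain. Because unused tokens accumulate, short bursts up to the bucket size are permitted while the long‑run rate stays bounded.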
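To make the mechanism concrete, here is a minimal Python sketch of a token bucket. The class name, the `rate` and `capacity` parameters, and the injectable `clock` are assumptions made for illustration; they do not come from the original article or from any particular library.

```python
import time


class TokenBucket:
    """Illustrative token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum burst size
        self.clock = clock               # injectable so the logic can be tested
        self.tokens = float(capacity)    # start full, so an initial burst is allowed
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and consume `cost` tokens if enough are available."""
        now = self.clock()
        # Refill lazily, based on the time elapsed since the previous call.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A caller would typically check `allow()` once per request and reject the request when it returns False. Refilling lazily on each call avoids a background timer thread.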
Leaky Bucket
The leaky‑bucket algorithm smooths traffic by queuing incoming requests in a bucket of fixed size and releasing them at a constant rate; requests that arrive while the bucket is full are rejected. Unlike the token bucket, its output rate does not burst.
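The queue-and-drain behavior can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the names `LeakyBucket`, `offer`, and `drain` are hypothetical, and a real server would call `drain` from a timer or worker loop rather than by hand.

```python
import time
from collections import deque


class LeakyBucket:
    """Illustrative leaky bucket: queues requests and releases them at a constant rate."""

    def __init__(self, leak_rate, capacity, clock=time.monotonic):
        self.interval = 1.0 / leak_rate  # seconds between two released requests
        self.capacity = capacity         # maximum number of queued requests
        self.clock = clock
        self.queue = deque()
        self.next_leak = clock()         # earliest time the next request may leave

    def offer(self, request):
        """Queue a request; return False if the bucket is full."""
        if len(self.queue) >= self.capacity:
            return False
        if not self.queue:
            # Idle time earns no credit: nothing leaks earlier than now.
            self.next_leak = max(self.next_leak, self.clock())
        self.queue.append(request)
        return True

    def drain(self):
        """Return the queued requests that are due, spaced one per interval."""
        now = self.clock()
        released = []
        while self.queue and self.next_leak <= now:
            released.append(self.queue.popleft())
            self.next_leak += self.interval
        return released
```

If `drain` is called late, every request whose release time has already passed is returned in that one call, so calling it at least once per interval keeps the output evenly spaced.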
Applying Rate Limiting
Tomcat
Configure a shared thread pool (an Executor element) and connection limits in conf/server.xml under the Tomcat installation directory to throttle requests. Key Executor attributes include:
name – unique identifier for the shared thread pool.
namePrefix – prefix for thread names (default tomcat‑exec‑).
maxThreads – maximum number of threads (default 200).
maxIdleTime – idle time before a thread is closed (default 60000 ms).
minSpareThreads – minimum idle threads to keep (default 25).
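Connector attributes further control concurrency: executor points the connector at the shared thread pool by name, and acceptCount sets the maximum queue length for incoming connection requests once the connector is saturated. The attributes minProcessors and maxProcessors come from older Tomcat releases; newer versions use minSpareThreads and maxThreads for the same purpose.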
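The throttling effect of a bounded thread pool plus a bounded wait queue can be illustrated with a short Python sketch. This is only a loose analogy, not Tomcat code: `BoundedPool`, `max_threads`, and `accept_count` are hypothetical names chosen to echo the attributes above, and Tomcat's real behavior also depends on settings such as maxConnections and on the connector type.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedPool:
    """Runs at most `max_threads` tasks at once, queues up to `accept_count` more,
    and refuses anything beyond that."""

    def __init__(self, max_threads, accept_count):
        self._pool = ThreadPoolExecutor(max_workers=max_threads)
        # One slot per running or queued task.
        self._slots = threading.BoundedSemaphore(max_threads + accept_count)

    def submit(self, fn, *args):
        """Return a Future, or None when the pool and its queue are both full."""
        if not self._slots.acquire(blocking=False):
            return None
        try:
            return self._pool.submit(self._run, fn, *args)
        except RuntimeError:
            self._slots.release()  # pool already shut down: give the slot back
            raise

    def _run(self, fn, *args):
        try:
            return fn(*args)
        finally:
            self._slots.release()  # free the slot before the Future completes

    def shutdown(self):
        self._pool.shutdown(wait=True)
```

Refusing work outright once both the workers and the queue are full keeps latency bounded for the requests that are accepted, which is the point of capping threads and queue length.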
API Rate Limiting
For sudden spikes during flash sales, Guava's RateLimiter (based on the token‑bucket idea) can limit calls to an individual API within a single application instance.
Distributed Rate Limiting
Nginx
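Use Nginx's request‑limiting module (ngx_http_limit_req_module) to restrict each IP to a configurable number of requests per second (e.g., 50 r/s); requests over the limit are rejected with a 503 error by default. The relevant directives are limit_req_zone and limit_req, whose burst parameter lets a bounded number of excess requests through or into a queue. Related controls are limit_conn_zone with limit_conn, which cap concurrent connections per key, and limit_rate, which throttles response bandwidth per connection.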
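Per-client limiting with a burst allowance can be modeled roughly in Python as shown here. This is a sketch of the general idea (a leaky bucket used as a meter, keyed by client IP), not Nginx's implementation or configuration syntax; `PerKeyLimiter` and its parameters are hypothetical, and excess requests within the burst are passed immediately rather than delayed.

```python
import time


class PerKeyLimiter:
    """Illustrative per-key rate limit with a burst allowance."""

    def __init__(self, rate, burst=0, clock=time.monotonic):
        self.rate = float(rate)    # permitted requests per second, per key
        self.burst = float(burst)  # extra requests tolerated above the steady rate
        self.clock = clock
        self.state = {}            # key -> (excess, time of last accepted request)

    def allow(self, key):
        """Return True if the request for `key` (e.g. a client IP) may proceed."""
        now = self.clock()
        if key not in self.state:
            self.state[key] = (0.0, now)
            return True
        excess, last = self.state[key]
        # The meter drains at `rate`; this request adds one unit.
        excess = max(excess - (now - last) * self.rate + 1.0, 0.0)
        if excess > self.burst:
            return False  # a server would typically answer with HTTP 503 here
        self.state[key] = (excess, now)
        return True
```

The dictionary grows with the number of distinct keys; a production limiter would bound that memory and evict idle entries.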
OpenResty
OpenResty provides Lua modules (resty.limit.count, resty.limit.conn, resty.limit.req) to enforce request counts per time window, limits on concurrent connections, and smooth request rates. The resty.limit.req module follows the leaky‑bucket method and supports an optional burst allowance.
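As an illustration of count-based limiting, here is a minimal Python sketch of a fixed-window counter. It is not the resty.limit.count API and not Lua; the class name `FixedWindowCounter` and its parameters are assumptions made for the example.

```python
import time


class FixedWindowCounter:
    """Illustrative limiter: at most `limit` requests per key in each fixed window."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = float(window)  # window length in seconds
        self.clock = clock
        self.counts = {}             # key -> (window index, requests seen)

    def allow(self, key):
        """Return True if `key` is still under its quota for the current window."""
        index = int(self.clock() // self.window)
        seen_index, count = self.counts.get(key, (index, 0))
        if seen_index != index:
            count = 0  # a new window has started, so the counter resets
        if count >= self.limit:
            self.counts[key] = (index, count)
            return False
        self.counts[key] = (index, count + 1)
        return True
```

A fixed window is cheap to implement but allows up to twice the limit across a window boundary, which is one reason smoother bucket-based limiters are often layered on top of it.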
Load Testing
Use ApacheBench (ab) on Linux to benchmark the configured limits: send a fixed number of requests (-n) at a chosen concurrency (-c) and observe how many succeed and how many are rejected under load.
Conclusion
The presented rate‑limiting techniques (token bucket, leaky bucket, Tomcat thread pools, Nginx limits, and OpenResty Lua modules) offer a toolbox for protecting high‑traffic services; the best choice depends on the specific business scenario.
21CTO
21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.