Mastering Rate Limiting for High‑Traffic Flash‑Sale Systems
This article explains why rate limiting is essential for flash‑sale (seckill) systems, compares token‑bucket and leaky‑bucket algorithms, and provides concrete Tomcat, Nginx, OpenResty, and Guava configurations along with code snippets and load‑testing results to help engineers implement robust throttling.
Rate Limiting in Flash‑sale (Seckill) Systems
Flash‑sale services often receive millions of concurrent requests while the available inventory is far smaller. Without protection, the backend can be overwhelmed, leading to wasted resources or total service collapse. Rate limiting controls the request flow, rejecting or queuing excess traffic to keep the system stable.
Core Algorithms
Token Bucket
The token‑bucket algorithm adds tokens to a bucket at a fixed rate (e.g., 5 tokens/s), and unused tokens accumulate up to the bucket capacity (e.g., 20 tokens). Each incoming request consumes one token, so after an idle period a short burst of up to 20 requests can pass at once. When the bucket is empty, further requests are rejected (or made to wait) until new tokens arrive.
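As a rough illustration of the token‑bucket behaviour described above, the following Java sketch refills tokens lazily on each call. The class name TokenBucket, its constructor parameters, and the explicit nowNanos argument (passed in so the example is deterministic) are assumptions made for illustration; they do not come from any library mentioned in this article.

```java
/** Minimal token-bucket sketch (hypothetical class, for illustration only). */
final class TokenBucket {
    private final long capacity;        // maximum tokens the bucket can hold (burst size)
    private final double refillPerSec;  // tokens added per second
    private double tokens;              // tokens currently available
    private long lastRefillNanos;       // time of the last refill

    TokenBucket(long capacity, double refillPerSec, long nowNanos) {
        this.capacity = capacity;
        this.refillPerSec = refillPerSec;
        this.tokens = capacity;         // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    /** Consumes one token and returns true if a token is available at nowNanos. */
    synchronized boolean tryAcquire(long nowNanos) {
        double elapsedSec = (nowNanos - lastRefillNanos) / 1_000_000_000.0;
        if (elapsedSec > 0) {
            // Add the tokens earned since the last call, never exceeding capacity
            tokens = Math.min(capacity, tokens + elapsedSec * refillPerSec);
            lastRefillNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;                   // bucket empty: reject the request
    }

    /** Convenience overload; assumes the bucket was created with a System.nanoTime() value. */
    boolean tryAcquire() {
        return tryAcquire(System.nanoTime());
    }
}
```

With a capacity of 20 and a refill rate of 5 tokens/s, 20 calls at the same instant succeed and the 21st fails; one second later, five more calls succeed.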
Leaky Bucket
The leaky‑bucket algorithm places incoming requests into a FIFO queue that drains at a constant rate. If the queue reaches its maximum size, additional requests are dropped. This smooths bursty traffic into a steady flow.
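The queue‑and‑drain behaviour can be sketched in Java as follows. This is a simplified model under stated assumptions: the class name LeakyBucket, the offer/drain methods, and the explicit timestamps are hypothetical, and a real deployment would drain from a scheduler or worker thread rather than by manual calls.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Minimal leaky-bucket sketch (hypothetical class, for illustration only). */
final class LeakyBucket<T> {
    private final int capacity;          // maximum number of queued requests
    private final long intervalNanos;    // time between two leaked requests
    private final ArrayDeque<T> queue = new ArrayDeque<>();
    private long nextLeakNanos;          // earliest time the next request may leave

    LeakyBucket(int capacity, double leaksPerSec, long nowNanos) {
        this.capacity = capacity;
        this.intervalNanos = (long) (1_000_000_000L / leaksPerSec);
        this.nextLeakNanos = nowNanos;
    }

    /** Queues a request; returns false (request dropped) when the bucket is full. */
    synchronized boolean offer(T request, long nowNanos) {
        if (queue.size() >= capacity) {
            return false;
        }
        if (queue.isEmpty() && nextLeakNanos < nowNanos) {
            nextLeakNanos = nowNanos;    // idle time must not turn into a later burst
        }
        queue.addLast(request);
        return true;
    }

    /** Removes and returns, in FIFO order, the requests due to leak out by nowNanos. */
    synchronized List<T> drain(long nowNanos) {
        List<T> due = new ArrayList<>();
        while (!queue.isEmpty() && nextLeakNanos <= nowNanos) {
            due.add(queue.pollFirst());
            nextLeakNanos += intervalNanos;
        }
        return due;
    }
}
```

Unlike the token bucket, queued requests leave at most one per interval, so a burst on the input side never appears as a burst on the output side.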
Layer‑Specific Rate‑Limiting Implementations
Tomcat Thread‑Pool Throttling
Define a shared executor and bind it to the HTTP connector. Adjust the pool size, idle timeout, and request queue length to limit concurrent processing.
<Executor name="tomcatThreadPool" namePrefix="tomcatThreadPool-" maxThreads="1000" maxIdleTime="300000" minSpareThreads="200"/>
<Connector executor="tomcatThreadPool" port="8080" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8443" acceptCount="1000"/>
maxThreads : maximum number of threads in the pool (default 200).
maxIdleTime : time (ms) an idle thread is kept before it is shut down, provided more than minSpareThreads threads are alive (default 60000).
minSpareThreads : minimum number of threads always kept alive, idle or not.
acceptCount : maximum length of the queue for incoming connection requests when all threads are busy; connections that arrive while the queue is full are refused.
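The combined effect of a bounded thread pool and a bounded queue can be seen in a small Java analogue. This is only an analogy, not Tomcat's implementation: Tomcat's executor and its accept queue behave differently in detail, and the class BoundedPoolDemo and method countRejected are hypothetical names. The sketch submits tasks that block until released, so the number of rejections depends only on the pool and queue sizes.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Analogy only: a fixed pool plus a bounded queue rejects whatever does not fit. */
final class BoundedPoolDemo {

    /** Submits `requests` blocking tasks and returns how many were rejected. */
    static int countRejected(int maxThreads, int queueLength, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueLength),   // bounded waiting queue
                new ThreadPoolExecutor.AbortPolicy());   // throw when pool and queue are full
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                // Each task holds its thread until the latch opens
                pool.execute(() -> awaitQuietly(release));
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        release.countDown();   // let the accepted tasks finish
        pool.shutdown();
        return rejected;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

For example, with 2 threads and a queue of 3, ten simultaneous long‑running tasks leave 2 running, 3 queued, and 5 rejected.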
Nginx Rate and Connection Limiting
Use the limit_req_zone and limit_conn_zone directives (from ngx_http_limit_req_module and ngx_http_limit_conn_module) to restrict the request rate per IP and to limit concurrent connections.
# Define a rate‑limit zone (50 requests per second per IP)
limit_req_zone $binary_remote_addr zone=api_read:20m rate=50r/s;
# Connection‑limit zones
limit_conn_zone $binary_remote_addr zone=perip_conn:10m;
limit_conn_zone $server_name zone=perserver_conn:100m;
server {
    listen 80;
    server_name seckill.example.com;
    location / {
        limit_req zone=api_read burst=5; # allow short bursts
        limit_conn perip_conn 2; # max 2 concurrent connections per IP
        limit_conn perserver_conn 1000; # max 1000 concurrent connections for the server
        limit_rate 100k; # bandwidth limit per connection
        proxy_pass http://seckill_backend;
    }
}
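upstream seckill_backend {
    fair; # provided by the third-party nginx-upstream-fair module, not stock Nginx
    server 172.16.1.120:8080 weight=1 max_fails=2 fail_timeout=30s;
    server 172.16.1.130:8080 weight=1 max_fails=2 fail_timeout=30s;
}
burst : maximum number of requests allowed to exceed the configured rate; they are queued and released at that rate (unless nodelay is set), and requests beyond the burst are rejected with status 503 by default.
limit_conn : caps concurrent connections per key, here per client IP and per virtual server.
limit_rate : caps the response bandwidth of each connection (here 100 KB/s).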
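To make the idea behind limit_conn concrete, here is a minimal Java sketch of a per‑key concurrency cap. It is an application‑level analogue, not how Nginx implements the directive; the class PerKeyConnectionLimiter and its tryOpen/close methods are hypothetical, and the sketch never evicts idle keys, which a production version would need to do.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

/** Caps concurrent "connections" per key, e.g. per client IP (illustrative sketch). */
final class PerKeyConnectionLimiter {
    private final int maxPerKey;
    private final ConcurrentHashMap<String, Semaphore> slots = new ConcurrentHashMap<>();

    PerKeyConnectionLimiter(int maxPerKey) {
        this.maxPerKey = maxPerKey;
    }

    /** Returns true if the key still has a free slot; the caller must call close() later. */
    boolean tryOpen(String key) {
        return slots.computeIfAbsent(key, k -> new Semaphore(maxPerKey)).tryAcquire();
    }

    /** Frees one slot; call exactly once for every successful tryOpen(). */
    void close(String key) {
        Semaphore s = slots.get(key);
        if (s != null) {
            s.release();
        }
    }
}
```

With a limit of 2, a third simultaneous connection from the same IP is refused, while a different IP is unaffected.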
OpenResty (Lua) with lua‑resty‑limit‑traffic
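OpenResty bundles the lua‑resty‑limit‑traffic library, which provides request‑rate limiting based on the leaky‑bucket method (resty.limit.req), request‑count limiting over a fixed time window (resty.limit.count), and concurrent‑connection limiting (resty.limit.conn), all directly in Lua scripts. The typical flow is to declare a lua_shared_dict that holds the limiter state, create a limiter instance with new(), and call its incoming(key, true) method in the access phase (access_by_lua) to decide whether the request passes, is delayed, or is rejected.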
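Request‑count limiting is the simplest of these strategies to reason about. The Java sketch below illustrates the general fixed‑window idea only; it is not the library's Lua API, and the class FixedWindowCounter, its allow method, and the explicit nowMillis argument are assumptions introduced for this example.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Fixed-window request counter per key (illustrative sketch, not the Lua library). */
final class FixedWindowCounter {
    private final int limit;            // requests allowed per window
    private final long windowMillis;    // window length in milliseconds
    // key -> {windowStartMillis, requestCountInThatWindow}
    private final ConcurrentHashMap<String, long[]> state = new ConcurrentHashMap<>();

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    /** Counts the request and returns true while the key is within its limit. */
    boolean allow(String key, long nowMillis) {
        long windowStart = nowMillis - (nowMillis % windowMillis);
        // compute() runs atomically per key; a fresh array is returned on every call
        long[] s = state.compute(key, (k, v) ->
                (v == null || v[0] != windowStart)
                        ? new long[] {windowStart, 1}       // new window: restart the count
                        : new long[] {windowStart, v[1] + 1});
        return s[1] <= limit;
    }
}
```

A known weakness of fixed windows is that a client can send up to twice the limit across a window boundary, which is one reason to prefer the leaky‑bucket or token‑bucket approaches for strict smoothing.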
Java API Rate Limiting with Guava
Guava’s RateLimiter implements a token‑bucket. Wrap it in a custom annotation and an AOP aspect to throttle service‑layer methods.
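// ServiceLimit.java (package com.itstyle.seckill.common.aop, as referenced by the pointcut below)
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target({ElementType.PARAMETER, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
@Documented
public @interface ServiceLimit {
    String description() default "";
}

// LimitAspect.java
import com.google.common.util.concurrent.RateLimiter;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class LimitAspect {
    // One limiter shared by every annotated method: 100 permits per second in total
    private static final RateLimiter limiter = RateLimiter.create(100.0);

    @Pointcut("@annotation(com.itstyle.seckill.common.aop.ServiceLimit)")
    public void servicePoint() {}

    @Around("servicePoint()")
    public Object around(ProceedingJoinPoint pjp) throws Throwable {
        // tryAcquire() does not block: it returns false at once when no permit is available
        if (limiter.tryAcquire()) {
            return pjp.proceed();
        }
        // Request rejected when the limit is exceeded; callers must handle the null result
        return null;
    }
}

// Example usage
@ServiceLimit
@Transactional
public Result startSeckill(long seckillId, long userId) {
    // business logic …
}
Load Testing with ApacheBench (ab)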
Install the tool and generate a realistic load to verify the throttling configuration.
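yum -y install httpd-tools # installs ab
ab -n 1000 -c 100 http://127.0.0.1/ # 1000 requests in total, 100 concurrent
Typical output shows ~200 requests/s, an average response time of ~500 ms, and zero failed requests, indicating that the rate‑limiting settings can handle the simulated traffic. Note that ab reports throttled responses such as 503 under "Non-2xx responses" rather than "Failed requests", so check that line as well when judging whether requests were rejected.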
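The two headline figures are linked: ab computes requests per second as completed requests divided by total test time, and its mean time per request as concurrency divided by that throughput. The Java helper below is a sketch of this arithmetic; the class AbMetrics and its method names are hypothetical, and the 5‑second total test time is an assumption inferred from 1000 requests at ~200 requests/s, not a measured value.

```java
/** Arithmetic behind ab's summary lines (hypothetical helper, for illustration). */
final class AbMetrics {

    /** Requests per second = completed requests / total test time in seconds. */
    static double requestsPerSecond(int completed, double totalSeconds) {
        return completed / totalSeconds;
    }

    /** Mean time per request in ms, as seen by each of the concurrent clients. */
    static double meanTimePerRequestMs(int concurrency, double requestsPerSecond) {
        return concurrency / requestsPerSecond * 1000.0;
    }
}
```

For the run above, requestsPerSecond(1000, 5.0) gives 200 and meanTimePerRequestMs(100, 200.0) gives 500, which matches the reported ~200 requests/s and ~500 ms.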
References
https://github.com/openresty/lua-resty-limit-traffic
https://blog.52itstyle.com/archives/1764/
https://blog.52itstyle.com/archives/775/
dbaplus Community
