Mastering Rate Limiting for High‑Traffic Flash‑Sale Systems

This article explains why rate limiting is essential for flash‑sale (seckill) systems, compares token‑bucket and leaky‑bucket algorithms, and provides concrete Tomcat, Nginx, OpenResty, and Guava configurations along with code snippets and load‑testing results to help engineers implement robust throttling.

dbaplus Community

Jun 25, 2018

Rate Limiting in Flash‑sale (Seckill) Systems

Flash‑sale services often receive millions of concurrent requests while the available inventory is far smaller. Without protection, the backend can be overwhelmed, leading to wasted resources or total service collapse. Rate limiting controls the request flow, rejecting or queuing excess traffic to keep the system stable.

Core Algorithms

Token Bucket

The token‑bucket algorithm adds tokens to a bucket at a fixed rate (e.g., 5 tokens/s) up to a maximum capacity (e.g., 20 tokens). Each incoming request consumes one token; as long as tokens remain, requests are allowed, so an idle period lets the bucket fill and permits a short burst of up to the bucket capacity. When the bucket is empty, further requests are rejected or made to wait until new tokens arrive.
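As a rough illustration of the refill-and-consume logic described above, here is a minimal sketch in plain Java. The class name TokenBucket, its constructor parameters, and the injectable clock are assumptions made for this example; they are not taken from any library discussed in this article.

```java
import java.util.function.LongSupplier;

/** Minimal token bucket: refills at a fixed rate and allows bursts up to capacity. */
final class TokenBucket {
    private final long capacity;            // maximum number of tokens the bucket holds
    private final double tokensPerSecond;   // refill rate
    private final LongSupplier clock;       // nanosecond clock, injectable for testing
    private double tokens;                  // tokens currently available
    private long lastRefillNanos;           // time of the last refill calculation

    TokenBucket(long capacity, double tokensPerSecond, LongSupplier clock) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.clock = clock;
        this.tokens = capacity;             // start full so an initial burst is allowed
        this.lastRefillNanos = clock.getAsLong();
    }

    /** Consumes one token and returns true if one is available; otherwise returns false. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        double refill = (now - lastRefillNanos) * tokensPerSecond / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + refill);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}

class TokenBucketDemo {
    public static void main(String[] args) {
        // Capacity 20, refill 5 tokens/s, matching the figures used in the text.
        TokenBucket bucket = new TokenBucket(20, 5.0, System::nanoTime);
        int allowed = 0;
        for (int i = 0; i < 25; i++) {
            if (bucket.tryAcquire()) {
                allowed++;
            }
        }
        System.out.println("allowed in initial burst: " + allowed);
    }
}
```

Computing the refill lazily on each call avoids a background timer thread; the trade-off is that every caller passes through one synchronized method.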

Leaky Bucket

The leaky‑bucket algorithm places incoming requests into a FIFO queue that drains at a constant rate. If the queue reaches its maximum size, additional requests are dropped. This smooths bursty traffic into a steady flow.
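The queue-and-drain behaviour can be sketched in plain Java as follows. This is only an illustration under simple assumptions: LeakyBucket, offer, and leakOne are hypothetical names, and the drain rate is imposed from outside by a scheduler that calls leakOne at a fixed interval.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Minimal leaky bucket: a bounded FIFO queue that is drained by an external timer. */
final class LeakyBucket {
    private final int capacity;
    private final Queue<Runnable> queue = new ArrayDeque<>();

    LeakyBucket(int capacity) {
        this.capacity = capacity;
    }

    /** Enqueues the request, or returns false (request dropped) when the bucket is full. */
    synchronized boolean offer(Runnable request) {
        if (queue.size() >= capacity) {
            return false;
        }
        queue.add(request);
        return true;
    }

    /** Removes and runs the oldest queued request; returns false if the queue was empty. */
    boolean leakOne() {
        Runnable next;
        synchronized (this) {
            next = queue.poll();
        }
        if (next == null) {
            return false;
        }
        next.run();   // run outside the lock so producers are not blocked
        return true;
    }
}

class LeakyBucketDemo {
    public static void main(String[] args) throws InterruptedException {
        LeakyBucket bucket = new LeakyBucket(3);
        ScheduledExecutorService drainer = Executors.newSingleThreadScheduledExecutor();
        // Drain one request every 200 ms, i.e. a constant 5 requests/s.
        drainer.scheduleAtFixedRate(bucket::leakOne, 200, 200, TimeUnit.MILLISECONDS);

        for (int i = 1; i <= 5; i++) {
            int id = i;
            boolean accepted = bucket.offer(() -> System.out.println("processed request " + id));
            if (!accepted) {
                System.out.println("dropped request " + id);
            }
        }
        Thread.sleep(1000);   // give the drainer time to empty the queue
        drainer.shutdownNow();
    }
}
```

Unlike the token bucket, a burst is never passed through at full speed here: accepted requests leave the queue only at the drain rate, and anything beyond the queue capacity is dropped immediately.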

Layer‑Specific Rate‑Limiting Implementations

Tomcat Thread‑Pool Throttling

Define a shared executor and bind it to the HTTP connector. Adjust the pool size, idle timeout, and request queue length to limit concurrent processing.

<Executor name="tomcatThreadPool" namePrefix="tomcatThreadPool-" maxThreads="1000" maxIdleTime="300000" minSpareThreads="200"/>
<Connector executor="tomcatThreadPool" port="8080" protocol="HTTP/1.1" connectionTimeout="20000" redirectPort="8443" acceptCount="1000"/>

maxThreads : maximum number of threads (default 200).

maxIdleTime : time (ms) before an idle thread is terminated (default 60000).

minSpareThreads : minimum idle threads kept alive.

acceptCount : length of the queue for incoming connections when all processing threads are busy (default 100); connections that arrive once the queue is full are refused.
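The interplay between a fixed number of worker threads and a bounded waiting queue can be illustrated with java.util.concurrent. This is an analogy only, not Tomcat's actual implementation: the class BoundedPoolSketch and the parameters workers and queueLength are made up for the example and stand in loosely for maxThreads and acceptCount.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Analogy for thread-pool throttling: fixed workers plus a bounded queue, the rest rejected. */
final class BoundedPoolSketch {

    /** Submits the given number of long-running tasks and returns how many were rejected. */
    static int countRejected(int workers, int queueLength, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers,                        // fixed number of worker threads
                60, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueLength),   // bounded waiting queue
                new ThreadPoolExecutor.AbortPolicy());   // reject when threads and queue are full

        CountDownLatch release = new CountDownLatch(1);  // keeps every task busy until released
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();             // simulate a slow request
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                          // no free thread and no queue slot left
                }
            }
        } finally {
            release.countDown();                         // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 workers + 2 queue slots can hold 4 in-flight requests; the 5th is rejected.
        System.out.println("rejected: " + countRejected(2, 2, 5));
    }
}
```

The capacity available before rejection is the sum of the worker count and the queue length, which is why both values need tuning together: a long queue hides overload behind rising latency, while a short one sheds load early.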

Nginx Rate and Connection Limiting

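Use the limit_req_zone and limit_req directives (ngx_http_limit_req_module) to restrict the request rate per IP, and the limit_conn_zone and limit_conn directives (ngx_http_limit_conn_module) to cap concurrent connections.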

# Define a rate‑limit zone (50 requests per second per IP)
limit_req_zone $binary_remote_addr zone=api_read:20m rate=50r/s;
# Connection‑limit zones
limit_conn_zone $binary_remote_addr zone=perip_conn:10m;
limit_conn_zone $server_name zone=perserver_conn:100m;
server {
    listen 80;
    server_name seckill.example.com;
    location / {
        limit_req zone=api_read burst=5;      # allow short bursts
        limit_conn perip_conn 2;               # max 2 concurrent connections per IP
        limit_conn perserver_conn 1000;        # max 1000 concurrent connections for the server
        limit_rate 100k;                       # bandwidth limit per connection
        proxy_pass http://seckill_backend;
    }
}
upstream seckill_backend {
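    fair;   # requires the third-party nginx-upstream-fair module; remove this line to use the default round-robin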
    server 172.16.1.120:8080 weight=1 max_fails=2 fail_timeout=30s;
    server 172.16.1.130:8080 weight=1 max_fails=2 fail_timeout=30s;
}

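burst : number of requests above the configured rate that may be queued (limit_req follows a leaky‑bucket model). Queued requests are delayed so they are processed at the defined rate; requests beyond the burst size are rejected, with status 503 by default.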

limit_conn : caps concurrent connections per IP or per server.

limit_rate : caps bandwidth per TCP connection (e.g., 100 KB/s).

OpenResty (Lua) with lua‑resty‑limit‑traffic

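OpenResty bundles the lua‑resty‑limit‑traffic library, which provides request‑rate limiting based on the leaky‑bucket method (resty.limit.req), fixed‑window request counting (resty.limit.count), and concurrent‑connection limiting (resty.limit.conn) directly in Lua scripts. The typical flow is to declare a shared‑memory zone with lua_shared_dict, create a limiter object on top of it, and call its incoming() method in the access phase, rejecting or delaying the request according to the result.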

Java API Rate Limiting with Guava

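Guava’s RateLimiter is a token‑bucket‑style limiter that hands out permits at a configured rate. Wrap it in a custom annotation and an AOP aspect to throttle service‑layer methods. Note that the limiter lives inside a single JVM, so in a cluster the effective limit is the per‑instance rate multiplied by the number of instances.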

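// ServiceLimit.java
package com.itstyle.seckill.common.aop;

import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marker annotation: methods carrying it are throttled by LimitAspect
@Target({ElementType.PARAMETER, ElementType.METHOD})
@Retention(RetentionPolicy.RUNTIME)
@Documented
public @interface ServiceLimit {
    String description() default "";
}

// LimitAspect.java
import com.google.common.util.concurrent.RateLimiter;
import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Pointcut;
import org.springframework.stereotype.Component;

@Aspect
@Component
public class LimitAspect {
    // One limiter shared by all annotated methods in this JVM: 100 permits per second
    private static final RateLimiter limiter = RateLimiter.create(100.0);

    @Pointcut("@annotation(com.itstyle.seckill.common.aop.ServiceLimit)")
    public void servicePoint() {}

    @Around("servicePoint()")
    public Object around(ProceedingJoinPoint pjp) throws Throwable {
        // tryAcquire() does not block: it returns false at once when no permit is available
        if (limiter.tryAcquire()) {
            return pjp.proceed();
        }
        // Limit exceeded: the request is rejected and the caller receives null,
        // so callers must be prepared to handle a null result
        return null;
    }
}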

// Example usage
@ServiceLimit
@Transactional
public Result startSeckill(long seckillId, long userId) {
    // business logic …
}

Load Testing with ApacheBench (ab)

Install the tool and generate concurrent load to verify the throttling configuration. In the command below, -n sets the total number of requests and -c the number of concurrent clients.

yum -y install httpd-tools   # installs ab
ab -n 1000 -c 100 http://127.0.0.1/

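Typical output shows ~200 requests/s, an average response time of ~500 ms, and zero failed requests, indicating that the rate‑limiting settings can handle the simulated traffic. When reading the report, note that ab counts throttled responses such as Nginx's 503 under "Non-2xx responses" rather than "Failed requests", so check both figures.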

References

https://github.com/openresty/lua-resty-limit-traffic

https://blog.52itstyle.com/archives/1764/

https://blog.52itstyle.com/archives/775/
