Rate Limiting Strategies for High‑Concurrency Seckill Systems

This article explains why rate limiting is essential for large‑scale flash‑sale (seckill) services, introduces token‑bucket and leaky‑bucket algorithms, and demonstrates practical implementations using Tomcat thread pools, Nginx, OpenResty, and Guava RateLimiter together with stress‑testing commands.

Architecture Digest

Jun 21, 2018

In high‑traffic flash‑sale (seckill) scenarios, millions of users may compete for a limited number of items, causing request volumes far exceeding the system’s capacity. Simply queuing or caching every request would waste resources, so a dedicated rate‑limiting layer is required to protect backend services.

Rate‑limiting concepts – The article describes two classic algorithms: the token‑bucket, which allows bursts up to a configured bucket size while enforcing a steady token‑generation rate, and the leaky‑bucket, which smooths traffic by queuing excess requests and dropping them when the queue is full.
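As an illustration of the token‑bucket idea described above, here is a minimal sketch in plain Java. The class name TokenBucket, its constructor parameters, and the choice to pass the current time in explicitly (which keeps the example deterministic) are assumptions made for this sketch, not code from the original article.

```java
/** Minimal token bucket sketch; the caller supplies the current time in nanoseconds. */
final class TokenBucket {
    private final long capacity;          // bucket size: the largest burst that can pass at once
    private final double tokensPerSecond; // steady token-generation rate
    private double tokens;                // tokens currently available
    private long lastRefillNanos;         // time of the last refill

    TokenBucket(long capacity, double tokensPerSecond, long nowNanos) {
        this.capacity = capacity;
        this.tokensPerSecond = tokensPerSecond;
        this.tokens = capacity;           // start full so an initial burst is allowed
        this.lastRefillNanos = nowNanos;
    }

    /** Takes one token if available; returns false when the request should be rejected. */
    synchronized boolean tryAcquire(long nowNanos) {
        long elapsed = Math.max(0L, nowNanos - lastRefillNanos);
        lastRefillNanos = Math.max(lastRefillNanos, nowNanos);
        // Add the tokens generated since the last call, never exceeding the bucket size.
        tokens = Math.min(capacity, tokens + elapsed * tokensPerSecond / 1_000_000_000.0);
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}

final class TokenBucketDemo {
    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(5, 10.0, 0L); // burst of 5, refilled at 10 tokens/s
        int admitted = 0;
        for (int i = 0; i < 8; i++) {
            if (bucket.tryAcquire(0L)) {
                admitted++;
            }
        }
        System.out.println("admitted in initial burst: " + admitted);
        System.out.println("admitted after 100 ms: " + bucket.tryAcquire(100_000_000L));
    }
}
```

In a real service the caller would pass System.nanoTime() instead of fixed timestamps.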

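Tomcat implementation – Defining a custom Executor in conf/server.xml and attaching it to a <Connector executor="tomcatThreadPool" .../> caps the number of request‑processing threads (maxThreads), keeps a minimum number of threads alive (minSpareThreads), and reclaims idle threads after maxIdleTime milliseconds:

<Executor name="tomcatThreadPool" namePrefix="tomcatThreadPool-" maxThreads="1000" maxIdleTime="300000" minSpareThreads="200"/>

The connection backlog is not set on the Executor; it is controlled by the Connector's acceptCount and maxConnections attributes.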
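To make the effect of a bounded worker pool concrete, the sketch below uses the JDK's java.util.concurrent.ThreadPoolExecutor as a stand‑in. It is not Tomcat's own executor, whose queuing behaviour differs in detail, and the class name BoundedPoolDemo and the tiny pool sizes are assumptions chosen so the outcome is easy to follow.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPoolDemo {
    /** Submits `requests` tasks that all block, and returns how many the pool refused. */
    static int countRejected(int maxThreads, int queueSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads,               // fixed number of worker threads
                300_000L, TimeUnit.MILLISECONDS,      // idle timeout, mirroring maxIdleTime
                new ArrayBlockingQueue<>(queueSize),  // bounded backlog of waiting tasks
                new ThreadPoolExecutor.AbortPolicy()); // refuse work once threads and queue are full
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();          // simulate a slow request
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                       // the pool is saturated
                }
            }
        } finally {
            release.countDown();                      // let the blocked tasks finish
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 threads + 1 queue slot can hold 3 blocked requests; the rest are refused.
        System.out.println("rejected: " + countRejected(2, 1, 5));
    }
}
```

The same trade‑off applies at Tomcat scale: once all threads are busy and the backlog is full, further requests fail fast instead of piling up.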

Nginx implementation – Using the limit_req_zone and limit_conn_zone directives, the configuration limits each IP to 50 requests per second, allows a burst of 5, caps concurrent connections per IP at 2, sets a per‑server connection limit of 1000, and throttles response bandwidth to 100 KB/s per connection with limit_rate. Example snippet:

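# Request rate per client IP: 50 requests per second, state kept in a 20 MB zone.
# limit_req_zone takes a single key; use "$binary_remote_addr$uri" to limit per IP and URI.
limit_req_zone $binary_remote_addr zone=api_read:20m rate=50r/s;
# Connection counters keyed by client IP and by virtual server.
limit_conn_zone $binary_remote_addr zone=perip_conn:10m;
limit_conn_zone $server_name zone=perserver_conn:10m;
server {
    listen 80;
    location / {
        limit_req zone=api_read burst=5;   # delay up to 5 excess requests, reject the rest
        limit_conn perip_conn 2;           # at most 2 concurrent connections per IP
        limit_conn perserver_conn 1000;    # at most 1000 concurrent connections for this server
        limit_rate 100k;                   # cap response bandwidth per connection
        proxy_pass http://seckill;         # "seckill" is assumed to be an upstream group
    }
}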
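The rate and burst settings above can be read as a leaky bucket: one request passes immediately, up to burst further requests are delayed so that they leave at the configured rate, and anything beyond that is rejected. The following Java sketch models that accounting. The class name LeakyBucketMeter and its method names are hypothetical, and the code approximates the idea rather than reproducing Nginx's internal implementation.

```java
/** Leaky-bucket meter sketch: returns how long a request must wait, or REJECTED. */
final class LeakyBucketMeter {
    static final long REJECTED = -1L;

    private final double ratePerSecond; // drain rate of the bucket
    private final double burst;         // how many requests may wait in the bucket
    private double level;               // requests currently in the bucket
    private long lastNanos;             // time of the previous call

    LeakyBucketMeter(double ratePerSecond, double burst, long nowNanos) {
        this.ratePerSecond = ratePerSecond;
        this.burst = burst;
        this.level = 0.0;
        this.lastNanos = nowNanos;
    }

    /** Returns the delay in nanoseconds before the request may proceed, or REJECTED. */
    synchronized long incoming(long nowNanos) {
        long elapsed = Math.max(0L, nowNanos - lastNanos);
        lastNanos = Math.max(lastNanos, nowNanos);
        // Drain the bucket for the time that has passed since the previous call.
        level = Math.max(0.0, level - elapsed * ratePerSecond / 1_000_000_000.0);
        if (level > burst) {
            return REJECTED;            // bucket is full: drop the request
        }
        // The request waits behind everything already in the bucket.
        long delayNanos = Math.round(level * 1_000_000_000.0 / ratePerSecond);
        level += 1.0;
        return delayNanos;
    }
}

final class LeakyBucketDemo {
    public static void main(String[] args) {
        LeakyBucketMeter meter = new LeakyBucketMeter(50.0, 5.0, 0L); // 50 r/s, burst of 5
        for (int i = 1; i <= 7; i++) {
            long delay = meter.incoming(0L); // seven requests arriving at the same instant
            System.out.println("request " + i + ": "
                    + (delay == LeakyBucketMeter.REJECTED ? "rejected" : (delay / 1_000_000L) + " ms delay"));
        }
    }
}
```

Unlike a token bucket, this scheme never releases a burst all at once; admitted requests are spaced out at the drain rate.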

OpenResty (Lua) implementation – The article shows how to use the lua‑resty‑limit‑traffic library. The resty.limit.count module limits the number of requests within a fixed time window, resty.limit.conn limits request concurrency per key (for example, per client), and resty.limit.req limits the request rate with a leaky‑bucket method, delaying or rejecting excess requests.

API‑level limiting with Guava – By annotating service methods with a custom @ServiceLimit annotation and applying an AOP aspect that uses RateLimiter.create(100.0), the application enforces a per‑process limit of 100 requests per second.
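The annotation‑plus‑interceptor pattern can be sketched with the JDK alone. In the example below, a java.lang.reflect.Proxy stands in for the AOP aspect and a BooleanSupplier stands in for the limiter; with Guava on the classpath the supplier could be limiter::tryAcquire for a limiter built with RateLimiter.create(100.0). The SeckillService interface, the LimitProxy helper, and the exception used for rejection are assumptions made for illustration, not the article's code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.function.BooleanSupplier;

/** Marks service methods that must pass the rate limiter before running. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ServiceLimit {}

/** Hypothetical service interface used only for this sketch. */
interface SeckillService {
    @ServiceLimit
    String startSeckill(long itemId);
}

final class LimitProxy {
    /** Wraps target so that @ServiceLimit methods run only when permit returns true. */
    static <T> T wrap(Class<T> iface, T target, BooleanSupplier permit) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.isAnnotationPresent(ServiceLimit.class) && !permit.getAsBoolean()) {
                throw new IllegalStateException("rate limit exceeded: " + method.getName());
            }
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the service's own exception
            }
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler));
    }
}

final class ServiceLimitDemo {
    public static void main(String[] args) {
        int[] remaining = {2}; // toy limiter: two permits, then refuse
        SeckillService real = itemId -> "order placed for item " + itemId;
        SeckillService limited =
                LimitProxy.wrap(SeckillService.class, real, () -> remaining[0]-- > 0);
        for (int i = 0; i < 3; i++) {
            try {
                System.out.println(limited.startSeckill(42L));
            } catch (IllegalStateException e) {
                System.out.println(e.getMessage());
            }
        }
    }
}
```

Because the limiter lives inside one JVM, the limit applies per process; a cluster of N instances admits up to N times the configured rate.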

Stress testing – The article recommends ApacheBench (ab) for load testing, showing installation commands and an example test, ab -n 1000 -c 100 http://127.0.0.1/ (1000 requests in total, 100 at a time), followed by a sample result table that includes requests per second, average latency, and connection time statistics.

Conclusion – Various rate‑limiting techniques are presented; the choice depends on the specific business scenario rather than a universal “best” solution.

References – Links to the OpenResty limit‑traffic repository and related blog posts are provided.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
