Master Rate Limiting: Token & Leaky Buckets, Tomcat, Nginx & OpenResty

This article explains why high‑traffic scenarios like flash‑sale systems need rate limiting, compares token‑bucket and leaky‑bucket algorithms, and shows practical configurations for Tomcat, Nginx, and OpenResty to protect APIs and ensure system stability.

21CTO

Jun 16, 2018

Introduction

In flash‑sale (秒杀) systems, millions of users may compete for a few items, causing massive request bursts that can overwhelm back‑ends. Beyond caching and distributed locks, limiting request rates is essential to preserve resources and maintain service availability.

Rate Limiting

When request volume exceeds a service’s capacity, excess traffic must be rejected or queued. Effective rate limiting prevents system collapse by controlling the flow of incoming requests.

Rate Limiting Algorithms

Common algorithms include the token‑bucket and leaky‑bucket methods.

Token Bucket

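The token‑bucket algorithm adds tokens to a bucket at a fixed rate, up to a fixed capacity. Each request consumes one token; a request that finds the bucket empty is rejected or made to wait. Because unused tokens accumulate up to the capacity, short bursts above the average rate are permitted as long as tokens remain.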
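The refill-and-consume cycle can be sketched in a few lines of Java. This is a minimal illustration, not a production limiter: the class name TokenBucket, its constructor parameters, and the injectable clock are assumptions made for the example.

```java
import java.util.function.LongSupplier;

/** Minimal token bucket: refills at a fixed rate, up to a fixed capacity. */
final class TokenBucket {
    private final long capacity;         // maximum tokens the bucket can hold (burst size)
    private final double tokensPerNano;  // refill rate
    private final LongSupplier clock;    // nanosecond clock, injectable so tests can control time
    private double tokens;               // tokens currently available
    private long lastRefill;             // clock reading at the last refill

    TokenBucket(long capacity, double tokensPerSecond, LongSupplier clock) {
        this.capacity = capacity;
        this.tokensPerNano = tokensPerSecond / 1_000_000_000.0;
        this.clock = clock;
        this.tokens = capacity;          // start full so an initial burst is allowed
        this.lastRefill = clock.getAsLong();
    }

    /** Takes one token if available; returns false when the request should be rejected. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        // Add the tokens earned since the last call, never exceeding the capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * tokensPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(5, 2.0, System::nanoTime);
        for (int i = 1; i <= 7; i++) {
            System.out.println("request " + i + (bucket.tryAcquire() ? " allowed" : " rejected"));
        }
    }
}
```

In the demo the bucket holds 5 tokens and refills at 2 tokens per second, so a tight loop of 7 calls should see the initial burst of 5 allowed and the remaining calls rejected until tokens refill.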

Leaky Bucket

The leaky‑bucket algorithm smooths traffic: incoming requests are queued in a bucket of fixed size and released at a constant rate, regardless of how quickly they arrive. When the bucket is full, new requests are rejected.
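A compact way to sketch this in Java is to track the time at which the next queued request will drain, rather than keeping an actual queue. The class name LeakyBucket, the offer method, and the convention of returning -1 for a rejected request are assumptions made for illustration.

```java
/** Leaky bucket sketch: accepted requests drain one per fixed interval; overflow is rejected. */
final class LeakyBucket {
    private final long intervalNanos;   // time between two released requests (1 / rate)
    private final long maxDelayNanos;   // longest wait allowed, reached when the bucket is full
    private long nextReleaseNanos;      // when the next accepted request may leave the bucket

    LeakyBucket(int capacity, int requestsPerSecond, long nowNanos) {
        this.intervalNanos = 1_000_000_000L / requestsPerSecond;
        this.maxDelayNanos = (capacity - 1) * intervalNanos;
        this.nextReleaseNanos = nowNanos;
    }

    /**
     * Offers one request arriving at nowNanos.
     * Returns how long it must wait before being released, or -1 if the bucket is full.
     */
    synchronized long offer(long nowNanos) {
        long release = Math.max(nowNanos, nextReleaseNanos);
        long delay = release - nowNanos;
        if (delay > maxDelayNanos) {
            return -1;                  // bucket full: reject instead of queuing
        }
        nextReleaseNanos = release + intervalNanos;
        return delay;
    }

    public static void main(String[] args) {
        LeakyBucket bucket = new LeakyBucket(3, 10, 0L); // holds 3 requests, drains 10 per second
        for (int i = 1; i <= 5; i++) {
            long delay = bucket.offer(0L);               // all five arrive at the same instant
            System.out.println("request " + i + ": "
                    + (delay < 0 ? "rejected" : "released after " + delay / 1_000_000 + " ms"));
        }
    }
}
```

With five simultaneous arrivals, the first three are released after 0 ms, 100 ms, and 200 ms, and the last two are rejected because the bucket is full. Unlike the token bucket, a burst is never passed through at once; it is spread out at the drain rate.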

Applying Rate Limiting

Tomcat

Configure a shared thread pool and connection limits in conf/server.xml under the Tomcat installation directory to throttle requests. The pool is declared with an Executor element, whose key attributes include:

name – unique identifier for the shared thread pool.

namePrefix – prefix for thread names (default tomcat‑exec‑).

maxThreads – maximum number of threads (default 200).

maxIdleTime – idle time before a thread is closed (default 60000 ms).

minSpareThreads – minimum idle threads to keep (default 25).

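Connector settings further control concurrency. The executor attribute points a Connector at the shared thread pool by name, and acceptCount sets the length of the queue for incoming connections once the server is at capacity. The minProcessors and maxProcessors attributes come from older Tomcat releases; current versions use minSpareThreads and maxThreads instead.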
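The general idea behind these settings, a bounded set of worker threads plus a bounded waiting area, with anything beyond that turned away, can be illustrated in plain Java with ThreadPoolExecutor. This is only a loose analogy and not how Tomcat is implemented; the class BoundedPoolDemo and the parameter name queueSize are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedPoolDemo {
    /** Builds a pool with at most maxThreads workers and queueSize waiting slots. */
    static ThreadPoolExecutor create(int maxThreads, int queueSize) {
        return new ThreadPoolExecutor(
                maxThreads, maxThreads,
                60, TimeUnit.SECONDS,                     // idle time before a thread may be reclaimed
                new ArrayBlockingQueue<>(queueSize),      // bounded waiting area
                new ThreadPoolExecutor.AbortPolicy());    // reject when threads and queue are both full
    }

    /** Submits the given number of long-running tasks and returns how many were rejected. */
    static int countRejected(int maxThreads, int queueSize, int requests) {
        ThreadPoolExecutor pool = create(maxThreads, queueSize);
        CountDownLatch release = new CountDownLatch(1);   // keeps every task busy until we are done
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                           // no free thread and no free queue slot
                }
            }
        } finally {
            release.countDown();                          // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 threads + 1 queue slot can hold 3 of the 5 requests; the other 2 are rejected.
        System.out.println("rejected: " + countRejected(2, 1, 5));
    }
}
```

Sizing the pool and the queue together is the design choice that matters: a large queue hides overload behind growing latency, while a small one fails fast and keeps response times predictable.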

API Rate Limiting

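For sudden spikes during flash sales, individual API calls can be limited with Guava's RateLimiter, which is based on the token‑bucket idea. A limiter is created with RateLimiter.create(permitsPerSecond); acquire() blocks until a permit is available, while tryAcquire() returns false immediately so the caller can reject the request.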

Distributed Rate Limiting

Nginx

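Use Nginx's limit_req module to restrict each client IP to a configurable request rate (e.g., 50 r/s); requests over the limit are rejected, by default with a 503 error. The limit_req_zone directive defines the shared‑memory zone, its key (typically the client address), and the rate, and limit_req applies that zone to a server or location; its burst parameter lets a bounded number of excess requests queue instead of being rejected at once. Two related directives cover other dimensions: limit_conn_zone, together with limit_conn, caps the number of concurrent connections per key, and limit_rate caps the response bandwidth of a connection.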

OpenResty

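OpenResty provides Lua modules in the lua‑resty‑limit‑traffic library (resty.limit.count, resty.limit.conn, resty.limit.req). resty.limit.count caps the number of requests within a fixed time window, resty.limit.conn caps the number of requests or connections being handled concurrently, and resty.limit.req smooths the request rate using leaky‑bucket logic.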
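The concurrency limit that resty.limit.conn enforces can be pictured as a counter of in‑flight requests: a request enters only if a slot is free and gives the slot back when it finishes. The sketch below shows that idea in plain Java; it is not OpenResty code, and ConcurrencyLimiter and its methods are hypothetical names.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Caps how many requests may be handled at the same time. */
final class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the handler if a slot is free; otherwise returns the fallback without running it. */
    <T> T call(Supplier<T> handler, T rejected) {
        if (!permits.tryAcquire()) {
            return rejected;            // limit reached: fail fast instead of waiting
        }
        try {
            return handler.get();
        } finally {
            permits.release();          // always free the slot, even if the handler throws
        }
    }

    /** Number of slots currently free. */
    int available() {
        return permits.availablePermits();
    }
}
```

A caller might wrap each request as limiter.call(() -> handle(request), busyResponse), where handle and busyResponse stand in for the application's own handler and its "server busy" reply.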

Load Testing

Use ApacheBench (ab) on Linux to benchmark the configured limits and observe response behavior under load.

Conclusion

The rate‑limiting techniques presented here (token bucket, leaky bucket, Tomcat thread pools, Nginx limits, and OpenResty Lua modules) offer a toolbox for protecting high‑traffic services; the best choice depends on the specific business scenario.
