Mastering Rate Limiting: 6 Practical Techniques from Tomcat to Redis
This article explains the concept of rate limiting, classifies its types, and provides six concrete implementation methods—including Tomcat thread limits, Nginx rate and connection controls, and server‑side algorithms using Redis and Guava—complete with configuration examples and code snippets.
Rate Limiting Classification
Rate limiting can be divided into three categories:
Legality verification limiting: e.g., captcha, IP blacklist.
Container limiting: e.g., Tomcat, Nginx.
Server‑side limiting: algorithm‑based implementation, the focus of this article.
Container Limiting
Tomcat Limiting
Tomcat 8.5 configures the maximum thread count in conf/server.xml:
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           maxThreads="150"
           redirectPort="8443" />
The maxThreads parameter defines how many requests can be processed concurrently; excess requests are queued, which achieves rate limiting.
Tip: Increasing maxThreads consumes more JVM memory and may hit OS thread limits (Windows ~2000, Linux ~1000).
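The effect of a thread cap plus a waiting queue can be pictured with a plain Java thread pool. This is a minimal sketch and not Tomcat's connector code: the class name BoundedPoolSketch, the method countRejected, and the small pool sizes are assumptions chosen for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: a fixed number of workers plus a bounded queue.
class BoundedPoolSketch {
    /** Submits `requests` long-running tasks and returns how many were rejected. */
    static int countRejected(int maxThreads, int queueSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize));
        // Every task blocks on this latch, so no worker frees up while we submit.
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < requests; i++) {
            try {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++; // all workers busy and the queue is full
            }
        }
        release.countDown(); // let the accepted tasks finish
        pool.shutdown();
        return rejected;
    }
}
```

With 2 workers and a queue of 3, six simultaneous requests leave 2 running, 3 waiting, and 1 rejected.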
Nginx Limiting
Nginx offers two limiting methods: rate limiting and concurrent connection limiting.
Rate Limiting
Use limit_req_zone to define a request rate per IP:
limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server {
    location / {
        limit_req zone=mylimit;
    }
}
This configuration allows 2 requests per second per IP (equivalent to 1 request per 500 ms).
In an example test that sends 6 requests within 10 ms, only the first request succeeds; the other five are rejected (Nginx returns status 503 by default).
Rate Limiting with Burst
Adding burst allows a short burst of requests:
limit_req_zone $binary_remote_addr zone=mylimit:10m rate=2r/s;
server {
    location / {
        limit_req zone=mylimit burst=4;
    }
}
In a 10 ms window with 6 requests, 1 is processed immediately, 4 are queued and released at the configured rate (one every 500 ms), and 1 is rejected.
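Both test results, with and without burst, can be reproduced with a small simulation. The sketch below is a simplified model of leaky-bucket accounting, written under the assumption that it approximates how limit_req behaves; it is not Nginx source code, and the class LimitReqSketch and the method offer are hypothetical names. Time is passed in explicitly so the outcome is deterministic.

```java
// Hypothetical model of a request-rate limit with an optional burst allowance.
class LimitReqSketch {
    private final long rateMilli;   // rate in thousandths of a request per second
    private final long burstMilli;  // burst in thousandths of a request
    private long excessMilli;       // backlog above the configured rate
    private long lastMs;            // arrival time of the last accepted request
    private boolean first = true;

    LimitReqSketch(int requestsPerSecond, int burst) {
        this.rateMilli = requestsPerSecond * 1000L;
        this.burstMilli = burst * 1000L;
    }

    /** Returns the delay in ms before the request is served, or -1 if it is rejected. */
    long offer(long nowMs) {
        long excess = 0;
        if (first) {
            first = false;
        } else {
            // Drain the backlog for the elapsed time, then add this request.
            excess = excessMilli - rateMilli * (nowMs - lastMs) / 1000 + 1000;
            if (excess < 0) {
                excess = 0;
            }
            if (excess > burstMilli) {
                return -1; // rejected requests leave the state untouched
            }
        }
        excessMilli = excess;
        lastMs = nowMs;
        return excess * 1000 / rateMilli; // time needed to drain the backlog ahead
    }
}
```

Feeding six requests at 0 to 5 ms into `new LimitReqSketch(2, 0)` accepts only the first one, while `new LimitReqSketch(2, 4)` serves one at once, delays four, and rejects the sixth.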
Concurrent Connection Limiting
Use limit_conn_zone and limit_conn to restrict simultaneous connections:
limit_conn_zone $binary_remote_addr zone=perip:10m;
limit_conn_zone $server_name zone=perserver:10m;
server {
    ...
    limit_conn perip 10;
    limit_conn perserver 100;
}
This limits each IP to 10 concurrent connections and the virtual server to 100 concurrent connections in total.
Note: Not every connection is counted. Nginx counts a connection only while the server is processing a request on it and the whole request header has been read.
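The same two-level idea, a cap per client plus a cap for the whole server, can be sketched in Java with semaphores. This is an illustrative in-process analogue, not how Nginx implements it; ConnLimiterSketch, tryOpen, and close are hypothetical names, and the sketch never evicts idle map entries.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Hypothetical per-client and global concurrency cap.
class ConnLimiterSketch {
    private final int perClientLimit;
    private final Semaphore total;
    private final ConcurrentHashMap<String, Semaphore> perClient = new ConcurrentHashMap<>();

    ConnLimiterSketch(int perClientLimit, int totalLimit) {
        this.perClientLimit = perClientLimit;
        this.total = new Semaphore(totalLimit);
    }

    /** Returns true if a new connection for this client may be opened. */
    boolean tryOpen(String clientIp) {
        Semaphore client = perClient.computeIfAbsent(clientIp, k -> new Semaphore(perClientLimit));
        if (!client.tryAcquire()) {
            return false; // this client is already at its own limit
        }
        if (!total.tryAcquire()) {
            client.release(); // undo the per-client permit
            return false;     // the server as a whole is full
        }
        return true;
    }

    /** Must be called exactly once for every successful tryOpen. */
    void close(String clientIp) {
        perClient.get(clientIp).release();
        total.release();
    }
}
```

Non-blocking tryAcquire is used so that a connection over the limit is refused immediately rather than left waiting.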
Server‑Side Limiting
Server‑side limiting relies on algorithms that act as the “brain” of the limit.
Common algorithms:
Time Window Algorithm
Leaky Bucket Algorithm
Token Bucket Algorithm
1. Time Window Algorithm
The sliding window keeps a record of request timestamps (e.g., last 60 s, max 100 requests). Older records are removed, and the current count is compared to the limit.
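Before bringing Redis in, the bookkeeping can be shown in memory. A minimal single-process sketch, assuming a caller-supplied clock in milliseconds; SlidingWindowSketch and allow are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical in-memory sliding window: keeps one timestamp per accepted request.
class SlidingWindowSketch {
    private final long windowMs;
    private final int maxCount;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    SlidingWindowSketch(long windowMs, int maxCount) {
        this.windowMs = windowMs;
        this.maxCount = maxCount;
    }

    /** Returns true if a request arriving at nowMs fits in the current window. */
    synchronized boolean allow(long nowMs) {
        // Remove records that have slid out of the window.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMs - windowMs) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= maxCount) {
            return false;
        }
        timestamps.addLast(nowMs);
        return true;
    }
}
```

For the example in the text, `new SlidingWindowSketch(60_000, 100)` would allow at most 100 requests in any 60-second span.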
Implementation using Redis ZSet:
<!-- Redis dependency -->
<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>3.3.0</version>
</dependency>

import redis.clients.jedis.Jedis;
import java.util.UUID;

public class RedisLimit {
    // Assumes a Redis server listening on 127.0.0.1:6379
    static Jedis jedis = new Jedis("127.0.0.1", 6379);

    public static void main(String[] args) throws InterruptedException {
        // Allow at most 10 requests in any 3-second window
        for (int i = 0; i < 15; i++) {
            boolean res = isPeriodLimiting("java", 3, 10);
            System.out.println(res ? "Normal request: " + i : "Limited: " + i);
        }
        // After 4 seconds the earlier records have left the window
        Thread.sleep(4000);
        boolean res = isPeriodLimiting("java", 3, 10);
        System.out.println(res ? "After sleep, normal request" : "After sleep, limited");
    }

    private static boolean isPeriodLimiting(String key, int period, int maxCount) {
        long nowTs = System.currentTimeMillis();
        // Drop records older than the window
        jedis.zremrangeByScore(key, 0, nowTs - period * 1000L);
        long currCount = jedis.zcard(key);
        if (currCount >= maxCount) {
            return false;
        }
        // The member must be unique: a bare timestamp would merge requests
        // that arrive within the same millisecond into a single record
        jedis.zadd(key, nowTs, nowTs + "-" + UUID.randomUUID());
        return true;
    }
}
Drawbacks: memory usage grows with request volume because every request is stored, and the check‑then‑add sequence is not atomic, so concurrent callers can exceed the limit.
2. Leaky Bucket Algorithm
The leaky bucket smooths bursts by processing requests at a constant rate; excess requests are dropped when the bucket is full.
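The admission side of this can be sketched in a few lines of Java. This is an illustrative model under stated assumptions, not the Redis‑Cell algorithm: LeakyBucketSketch and tryAdd are hypothetical names, time is supplied by the caller, and the sketch only decides whether a request fits in the bucket. A full implementation would also hand the queued requests to a worker at the leak rate.

```java
// Hypothetical leaky bucket: the water level drains by one unit per interval.
class LeakyBucketSketch {
    private final long capacity;
    private final long leakIntervalMs; // one request leaks out every this many ms
    private long water;                // requests currently held in the bucket
    private long lastLeakMs;           // time up to which leaking has been applied

    LeakyBucketSketch(long capacity, long leakIntervalMs) {
        this.capacity = capacity;
        this.leakIntervalMs = leakIntervalMs;
    }

    /** Returns true if the request arriving at nowMs fits into the bucket. */
    synchronized boolean tryAdd(long nowMs) {
        long leaked = (nowMs - lastLeakMs) / leakIntervalMs;
        if (leaked > 0) {
            water = Math.max(0, water - leaked);
            lastLeakMs += leaked * leakIntervalMs;
        }
        if (water == 0) {
            lastLeakMs = nowMs; // an empty bucket starts leaking from now
        }
        if (water >= capacity) {
            return false; // bucket full: drop the request
        }
        water++;
        return true;
    }
}
```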
Redis‑Cell provides an atomic leaky‑bucket implementation via cl.throttle:
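cl.throttle mylimit 15 30 60
# Arguments: key, max burst (15), then a rate of 30 requests per 60 seconds
# Returns: 0 (allowed; 1 means limited), 16 (total limit, max burst + 1), 15 (remaining), -1 (seconds until retry; -1 when allowed), 2 (seconds until the limit fully resets)
3. Token Bucket Algorithm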
Tokens are generated at a fixed rate; each request consumes a token. If no token is available, the request can wait or be rejected.
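The mechanics fit in a short Java class. A minimal sketch that covers only the reject path, assuming a caller-supplied clock in milliseconds; TokenBucketSketch and tryAcquire are hypothetical names and this is not Guava's implementation.

```java
// Hypothetical token bucket: starts full and gains one token per interval.
class TokenBucketSketch {
    private final long capacity;
    private final long refillIntervalMs; // one token is added every this many ms
    private long tokens;
    private long lastRefillMs;           // time up to which refilling has been applied

    TokenBucketSketch(long capacity, long refillIntervalMs, long startMs) {
        this.capacity = capacity;
        this.refillIntervalMs = refillIntervalMs;
        this.tokens = capacity;
        this.lastRefillMs = startMs;
    }

    /** Takes one token if available; returns false when the bucket is empty. */
    synchronized boolean tryAcquire(long nowMs) {
        long added = (nowMs - lastRefillMs) / refillIntervalMs;
        if (added > 0) {
            tokens = Math.min(capacity, tokens + added);
            lastRefillMs += added * refillIntervalMs;
        }
        if (tokens == capacity) {
            lastRefillMs = nowMs; // a full bucket accumulates no refill credit
        }
        if (tokens == 0) {
            return false;
        }
        tokens--;
        return true;
    }
}
```

Because the bucket starts full, a burst of up to `capacity` requests passes at once, after which requests are admitted at the refill rate.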
Guava offers a simple token‑bucket implementation:
import com.google.common.util.concurrent.RateLimiter;
import java.time.Instant;

public class RateLimiterExample {
    public static void main(String[] args) {
        RateLimiter rt = RateLimiter.create(10); // 10 tokens per second
        for (int i = 0; i < 11; i++) {
            new Thread(() -> {
                rt.acquire(); // blocks until a token is available
                System.out.println("Executed at: " + Instant.now());
            }).start();
        }
    }
}
Guava’s token bucket works only within a single JVM, while Redis‑Cell provides a distributed solution.
Summary
The article presents six concrete rate‑limiting techniques:
Tomcat maxThreads limit.
Nginx rate limiting with limit_req_zone and burst.
Nginx concurrent connection limiting with limit_conn_zone and limit_conn.
Time‑window algorithm using Redis ZSet.
Leaky‑bucket algorithm via Redis‑Cell.
Token‑bucket algorithm using Google Guava.
Redis‑based solutions are suitable for distributed systems; Guava is limited to single‑node deployments. When possible, container‑level limiting (Tomcat or Nginx) offers a quick, code‑free alternative.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
