Thoughts on High‑Concurrency Traffic Control and Rate‑Limiting Techniques
This article shares practical insights on handling high-concurrency traffic: what counts as large traffic, common mitigation strategies (caching, downgrading, and rate limiting), and a closer look at rate-limiting techniques, namely counters, sliding windows, the leaky bucket, and the token bucket. It closes with Guava's RateLimiter for Java applications.
Introduction
In real projects the author has encountered peaks of over 50,000 QPS and stress‑test peaks of over 100,000 QPS. This post records personal thoughts on controlling high‑concurrency traffic.
Ideas for Handling Large Traffic
Large traffic is not defined by a fixed number; any request volume that stresses the system and degrades performance can be considered large. Common mitigation methods include:
Cache: bring data closer to the program to reduce frequent DB accesses.
Downgrade: degrade or switch off non-core services during spikes so that core services keep their resources.
Rate limiting: restrict the number of requests within a time window, similar to limiting passenger flow in a subway during rush hour.
When the core write path (e.g., e-commerce checkout) cannot be downgraded, rate limiting becomes crucial.
Common Rate‑Limiting Approaches
The usual techniques are counters, sliding windows, leaky bucket, and token bucket.
Counter
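The counter is the simplest algorithm: it counts requests in a fixed interval and compares the count with a threshold, and the counter resets at the interval boundary. Its notable weakness is the "time-boundary" problem. If one burst arrives just before the boundary and another just after it, neither window exceeds its limit, yet up to twice the threshold can pass within a very short span and overwhelm the system.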
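The counter can be sketched in a few lines of plain Java. This is a minimal illustration rather than a production implementation; the class name FixedWindowCounter and its methods are hypothetical, and the current time is passed in explicitly (an assumption made here) so the boundary behaviour is easy to reproduce.

```java
// Minimal sketch of a fixed-window counter. All names are hypothetical.
class FixedWindowCounter {
    private final int limit;          // max requests admitted per window
    private final long windowMillis;  // length of one window
    private long windowStart = 0;     // start of the current window
    private int count = 0;            // requests admitted in the current window

    FixedWindowCounter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    // Time is a parameter so that tests can control it.
    synchronized boolean tryAcquire(long nowMillis) {
        if (nowMillis - windowStart >= windowMillis) {
            // A new window has begun: align its start to the boundary and reset.
            windowStart = nowMillis - (nowMillis % windowMillis);
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }

    // Convenience overload that uses the wall clock.
    boolean tryAcquire() {
        return tryAcquire(System.currentTimeMillis());
    }
}
```

With a limit of 2 per 1000 ms, two requests at t=998 and t=999 and two more at t=1000 and t=1001 are all admitted, so four requests pass within 4 ms. That is the boundary problem in miniature.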
Sliding Window
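The sliding window improves on the counter by dividing the interval into small slices and moving the window forward one slice at a time, so the limit always applies to the most recent interval rather than to a fixed one. This smooths out the boundary effect. The number of slices determines the precision of the algorithm: more slices give a tighter limit at the cost of more bookkeeping.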
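One possible way to implement the slices is a circular array of per-slice counters, sketched below. The class SlidingWindowLimiter is hypothetical, and the sketch assumes the window length divides evenly by the slice count; as before, time is passed in as a parameter for testability.

```java
import java.util.Arrays;

// Minimal sketch of a sliding-window limiter built from fixed slices.
// All names are hypothetical; windowMillis is assumed to be a multiple of slices.
class SlidingWindowLimiter {
    private final int limit;        // max requests admitted per window
    private final long sliceMillis; // length of one slice
    private final int[] counts;     // admitted requests per slot (circular buffer)
    private final long[] sliceIds;  // absolute slice number stored in each slot

    SlidingWindowLimiter(int limit, long windowMillis, int slices) {
        this.limit = limit;
        this.sliceMillis = windowMillis / slices;
        this.counts = new int[slices];
        this.sliceIds = new long[slices];
        Arrays.fill(sliceIds, -1L); // -1 marks a slot that has never been used
    }

    synchronized boolean tryAcquire(long nowMillis) {
        long currentSlice = nowMillis / sliceMillis;
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            // Count only slots whose slice lies inside the current window.
            if (sliceIds[i] > currentSlice - counts.length) {
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        int slot = (int) (currentSlice % counts.length);
        if (sliceIds[slot] != currentSlice) {
            // The slot still holds an expired slice: reuse it for the current one.
            sliceIds[slot] = currentSlice;
            counts[slot] = 0;
        }
        counts[slot]++;
        return true;
    }
}
```

With a limit of 2 per 1000 ms and 10 slices, two requests at t=900 and t=950 cause a request at t=1000 to be rejected, whereas the plain counter would have admitted it.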
Leaky Bucket
Uses a fixed‑size bucket where incoming requests fill the bucket at an unpredictable rate, while the outflow rate is constant. When the bucket is full, excess requests overflow and are dropped.
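The bucket can be modelled as a simple meter, as in the sketch below. This is one interpretation under stated assumptions: the class LeakyBucket is hypothetical, the bucket only tracks its water level and rejects on overflow (a fuller version would queue the requests and have a worker drain them at the constant rate), and the outflow is expressed as the number of milliseconds needed to leak one request.

```java
// Minimal sketch of a leaky bucket used as a meter. All names are hypothetical.
class LeakyBucket {
    private final long capacity;           // bucket size, in requests
    private final long leakIntervalMillis; // time needed to leak one request
    private long water = 0;                // requests currently in the bucket
    private long lastLeakMillis;           // time up to which leaking is accounted

    LeakyBucket(long capacity, long leakIntervalMillis, long startMillis) {
        this.capacity = capacity;
        this.leakIntervalMillis = leakIntervalMillis;
        this.lastLeakMillis = startMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        // Drain whatever has leaked out at the constant rate since the last call.
        long leaked = (nowMillis - lastLeakMillis) / leakIntervalMillis;
        if (leaked > 0) {
            water = Math.max(0, water - leaked);
            lastLeakMillis += leaked * leakIntervalMillis;
        }
        if (water < capacity) {
            water++;      // the request fits into the bucket
            return true;
        }
        return false;     // bucket is full: the request overflows and is dropped
    }
}
```

Once the bucket is full, requests are admitted at no more than one per leak interval, however fast they arrive.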
Token Bucket
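The token bucket turns the leaky bucket around: tokens are added to a fixed-size bucket at a constant rate, and each request must take a token to proceed. Because unused tokens accumulate up to the bucket's capacity, requests can consume them as fast as they arrive. This allows short-term bursts to be handled while still protecting the system from sustained overload.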
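A minimal sketch of the idea follows. The class TokenBucket is hypothetical and is not how any particular library implements it; it assumes the bucket starts full and that tokens are refilled lazily on each call, one token per refill interval, instead of by a background thread.

```java
// Minimal sketch of a token bucket with lazy refill. All names are hypothetical.
class TokenBucket {
    private final long capacity;             // max tokens the bucket can hold
    private final long refillIntervalMillis; // one token is added per interval
    private long tokens;                     // tokens currently available
    private long lastRefillMillis;           // time up to which refills are accounted

    TokenBucket(long capacity, long refillIntervalMillis, long startMillis) {
        this.capacity = capacity;
        this.refillIntervalMillis = refillIntervalMillis;
        this.tokens = capacity;              // assumption: start with a full bucket
        this.lastRefillMillis = startMillis;
    }

    synchronized boolean tryAcquire(long nowMillis) {
        // Add the tokens generated since the last call, capped at capacity.
        long newTokens = (nowMillis - lastRefillMillis) / refillIntervalMillis;
        if (newTokens > 0) {
            tokens = Math.min(capacity, tokens + newTokens);
            lastRefillMillis += newTokens * refillIntervalMillis;
        }
        if (tokens > 0) {
            tokens--;     // consume one token for this request
            return true;
        }
        return false;     // no token available: reject the request
    }
}
```

With a capacity of 3 and one token every 100 ms, three requests arriving at the same instant are all admitted (the burst), after which throughput falls back to the refill rate of 10 requests per second.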
Rate‑Limiting Tool: Guava RateLimiter
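Guava provides a ready-made API based on the token-bucket algorithm. A limiter is created with RateLimiter.create(permitsPerSecond), where the argument is the desired QPS; permits then become available at that rate. Callers either block until a permit is available with acquire(), or call tryAcquire() to get an immediate yes-or-no answer and reject the request when no permit is free.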
Distributed Rate Limiting (Brief Note)
The techniques discussed so far apply to single-machine scenarios. In distributed environments, solutions often combine technologies such as Nginx+Lua or Redis+Lua, but this article focuses on the single-node case.
One‑line advice: Let traffic queue up and be rate‑limited before it reaches the core system.
Java Captain
Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.
