Handling High Traffic: Common Rate‑Limiting Techniques and Guava RateLimiter
This article discusses what counts as high traffic, common mitigation methods (caching, degradation, and rate limiting), the main rate‑limiting algorithms (counters, sliding windows, leaky bucket, and token bucket), and how to use Guava's RateLimiter for practical throttling.
Introduction
In real projects I have encountered peaks of over 50 K QPS and, under load testing, traffic exceeding 100 K QPS. This blog shares my thoughts on controlling high‑concurrency traffic.
Ideas for Dealing with Large Traffic
What constitutes "large traffic"? Metrics like TPS or QPS (e.g., 10 K+, 50 K+, 100 K+) are relative; any volume that stresses the system and degrades performance can be considered large.
Common measures include:
Cache: Move data closer to the application to reduce frequent DB accesses.
Degradation: Non‑core services can be downgraded; for example, personalized sorting can be omitted under heavy load.
Rate limiting: Similar to limiting entry to a subway during rush hour, it restricts the number of requests in a given time window to protect the system while maximizing throughput.
In scenarios like e‑commerce flash sales, where write operations are core and cannot be degraded, rate limiting becomes essential.
Common Rate‑Limiting Methods
Typical techniques are counters, sliding windows, leaky bucket, and token bucket.
Counter
A simple algorithm that counts requests within a time interval and compares the count to a threshold; the counter resets at the interval boundary.
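Be aware of the boundary issue: because the counter resets at the interval boundary, a burst at the very end of one interval followed by another burst at the start of the next can let up to twice the threshold through within a very short span.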
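The counter described above can be sketched in a few lines of Java. This is a minimal illustration rather than a production implementation: the class name FixedWindowCounter and its parameters are hypothetical, and the current time is passed in explicitly so the behavior is easy to reason about.

```java
// Hypothetical sketch of a fixed-window counter (names are illustrative only).
class FixedWindowCounter {
    private final int limit;          // max requests admitted per window
    private final long windowMillis;  // length of one window
    private long windowStart;         // start time of the current window
    private int count;                // requests admitted in the current window

    FixedWindowCounter(int limit, long windowMillis, long nowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.windowStart = nowMillis;
    }

    // Returns true if the request arriving at nowMillis is admitted.
    synchronized boolean tryAcquire(long nowMillis) {
        long elapsed = nowMillis - windowStart;
        if (elapsed >= windowMillis) {
            // Jump to the window that contains nowMillis and reset the counter.
            windowStart += (elapsed / windowMillis) * windowMillis;
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }
}
```

With a limit of 2 per 1000 ms and a start time of 0, calls at 900, 950, 1000, and 1050 ms are all admitted: four requests pass within 150 ms even though the nominal limit is 2 per second. That is the boundary issue in concrete form.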
Sliding Window
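To avoid the counter's boundary problem, the sliding‑window algorithm splits the interval into smaller fixed slots, each with its own counter. The window covers the most recent slots and slides forward as time passes, and the limit is checked against the sum of the slots currently inside the window.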
The number of slots determines the precision of the sliding‑window calculation.
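One possible way to sketch a slot‑based sliding window in Java is shown below. The class name SlidingWindowCounter is hypothetical, the time is passed in explicitly, and the sketch assumes non‑negative timestamps and a window length that divides evenly by the slot count.

```java
import java.util.Arrays;

// Hypothetical sketch of a slot-based sliding window (names are illustrative only).
class SlidingWindowCounter {
    private final int limit;        // max requests admitted per window
    private final long slotMillis;  // length of one slot
    private final int[] counts;     // per-slot counters, used as a ring buffer
    private final long[] slotIds;   // absolute slot number held by each ring entry

    SlidingWindowCounter(int limit, long windowMillis, int slots) {
        this.limit = limit;
        this.slotMillis = windowMillis / slots;
        this.counts = new int[slots];
        this.slotIds = new long[slots];
        Arrays.fill(slotIds, -1L);  // -1 marks a ring entry that was never used
    }

    // Returns true if the request arriving at nowMillis is admitted.
    synchronized boolean tryAcquire(long nowMillis) {
        long currentSlot = nowMillis / slotMillis;
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            // Only slots that still fall inside the window contribute to the sum.
            if (slotIds[i] > currentSlot - counts.length) {
                total += counts[i];
            }
        }
        if (total >= limit) {
            return false;
        }
        int idx = (int) (currentSlot % counts.length);
        if (slotIds[idx] != currentSlot) {
            // The ring entry holds an expired slot: reuse it for the current one.
            slotIds[idx] = currentSlot;
            counts[idx] = 0;
        }
        counts[idx]++;
        return true;
    }
}
```

With a limit of 2, a 1000 ms window, and 10 slots, requests at 900 and 950 ms are admitted, but requests at 1000 and 1050 ms are rejected because the earlier slot is still inside the window. Its count only drops out once the window has slid past it, at 1900 ms.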
Leaky Bucket
The leaky bucket improves on sliding windows by smoothing the outflow: requests (water) pour into a bucket at an unpredictable rate but leave it at a constant rate; once the bucket is full, excess water overflows and those requests are dropped.
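A rough Java sketch of the idea follows. It implements the metering variant of the leaky bucket, which only decides whether a request is admitted or overflows; a queue‑based variant that actually releases queued requests at the constant rate is not shown. The class name LeakyBucket and its parameters are hypothetical, and the time is passed in explicitly.

```java
// Hypothetical sketch of a leaky bucket used as a meter (names are illustrative only).
class LeakyBucket {
    private final double capacity;       // max amount of water the bucket can hold
    private final double leakPerSecond;  // constant outflow rate
    private double water;                // current water level
    private long lastLeakMillis;         // last time the level was updated

    LeakyBucket(double capacity, double leakPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.leakPerSecond = leakPerSecond;
        this.lastLeakMillis = nowMillis;
    }

    // Returns true if the request fits in the bucket, false if it overflows.
    synchronized boolean tryAcquire(long nowMillis) {
        // Drain the water that leaked out since the last call.
        double leaked = (nowMillis - lastLeakMillis) * leakPerSecond / 1000.0;
        water = Math.max(0.0, water - leaked);
        lastLeakMillis = nowMillis;
        if (water + 1.0 <= capacity) {
            water += 1.0;  // each request adds one unit of water
            return true;
        }
        return false;      // bucket is full: the request overflows
    }
}
```

With a capacity of 2 and a leak rate of 1 per second, two requests at time 0 fit and a third overflows; one second later exactly one more request fits.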
Token Bucket
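The token bucket addresses the leaky bucket's limitation of constant outflow. Tokens are added to a bucket at a steady rate, up to the bucket's capacity, and each request must take a token before it is processed. Because accumulated tokens can be taken as fast as requests arrive, short bursts are allowed while the long‑run rate stays bounded and the system remains protected.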
Both token‑bucket rejections and leaky‑bucket overflows sacrifice a small portion of traffic to ensure the majority can be served safely.
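The token bucket can be sketched in Java along the same lines. This is an illustrative sketch under stated assumptions, not Guava's implementation: the class name TokenBucket is hypothetical, the bucket is assumed to start full, and the time is passed in explicitly.

```java
// Hypothetical sketch of a token bucket (names are illustrative only).
class TokenBucket {
    private final double capacity;         // max tokens the bucket can hold
    private final double refillPerSecond;  // steady token generation rate
    private double tokens;                 // tokens currently available
    private long lastRefillMillis;         // last time tokens were added

    TokenBucket(double capacity, double refillPerSecond, long nowMillis) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
        this.tokens = capacity;  // assumption: start full so an initial burst is allowed
        this.lastRefillMillis = nowMillis;
    }

    // Returns true if a token was available for the request arriving at nowMillis.
    synchronized boolean tryAcquire(long nowMillis) {
        // Add the tokens generated since the last call, capped at capacity.
        double added = (nowMillis - lastRefillMillis) * refillPerSecond / 1000.0;
        tokens = Math.min(capacity, tokens + added);
        lastRefillMillis = nowMillis;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;  // no token left: the request is rejected
    }
}
```

With a capacity of 3 and a refill rate of 1 token per second, three requests at time 0 pass in a burst and the fourth is rejected. After a long idle period the bucket refills only up to its capacity, so the next burst is again limited to three requests.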
Rate‑Limiting Tool: Guava RateLimiter
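Guava provides a rate‑limiting API, RateLimiter, based on the token‑bucket algorithm. You create it with the desired QPS (permits per second); permits then become available at that rate, and callers obtain them via acquire(), which blocks until a permit is available, or tryAcquire(), which returns false immediately instead of waiting.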
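A minimal usage sketch, assuming Guava is on the classpath. The class GuavaRateLimiterDemo, the handle method, its return strings, and the rate of 5 permits per second are all hypothetical choices made for illustration; only RateLimiter.create and tryAcquire come from Guava.

```java
import com.google.common.util.concurrent.RateLimiter;

// Hypothetical demo class; only RateLimiter itself is part of Guava.
class GuavaRateLimiterDemo {
    // Assumed rate for illustration: 5 permits per second.
    private final RateLimiter limiter = RateLimiter.create(5.0);

    // Hypothetical request handler: rejects the request when no permit is available.
    String handle(String request) {
        if (!limiter.tryAcquire()) {
            // No permit right now: fail fast instead of queuing the caller.
            return "rejected";
        }
        return "handled " + request;
    }

    public static void main(String[] args) {
        GuavaRateLimiterDemo demo = new GuavaRateLimiterDemo();
        for (int i = 0; i < 10; i++) {
            System.out.println(demo.handle("req-" + i));
        }
    }
}
```

Using tryAcquire() suits flash‑sale style endpoints, where rejecting quickly is preferable to letting requests pile up. If callers should wait for their turn instead, limiter.acquire() blocks until a permit is available.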
Rate Limiting in Distributed Scenarios
The methods described above target single‑machine environments; most production systems require distributed solutions (e.g., Nginx+Lua, Redis+Lua). This article focuses on single‑node rate limiting.
Source: http://blog.51cto.com/zhangfengzhe/2066683
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
