Common Rate Limiting Algorithms: Fixed Window, Sliding Window, Sliding Log, Leaky Bucket, and Token Bucket
The article examines five common rate‑limiting algorithms—Fixed Window, Sliding Window, Sliding Log, Leaky Bucket, and Token Bucket—detailing their principles, pros and cons, and providing complete C++ implementations to help developers choose the best approach for controlling traffic bursts and ensuring system stability.
In modern high‑concurrency systems, traffic spikes and expanding business demands make rate limiting a crucial protection mechanism to prevent overload and ensure stability.
This article deeply analyzes five widely used rate‑limiting algorithms, explains their principles, lists advantages and disadvantages, and provides complete C++ code examples for each.
Table of Contents
1. Fixed Window Algorithm
2. Sliding Window Algorithm
3. Sliding Log Algorithm
4. Leaky Bucket Algorithm
5. Token Bucket Algorithm
6. Summary
1. Fixed Window Algorithm
The fixed window algorithm divides time into equal‑size windows (e.g., 1 minute) and allows a maximum number of requests per window. When a request arrives, the current window’s count is checked; if the limit is not exceeded, the request is allowed, otherwise it is rejected.
#include <chrono>
#include <mutex>

class FixedWindowRateLimiter {
public:
    FixedWindowRateLimiter(int max_requests_per_win, std::chrono::seconds window_size)
        : max_requests_per_win_(max_requests_per_win), window_size_(window_size), request_count_(0) {
        window_start_time_ = std::chrono::steady_clock::now();
    }

    bool TryAcquire(int request_size) {
        std::lock_guard<std::mutex> lock(mtx_);
        auto now = std::chrono::steady_clock::now();
        // If the current time is still within the window
        if (now - window_start_time_ < window_size_) {
            // Check whether the request would push the count over the limit
            if (request_count_ + request_size <= max_requests_per_win_) {
                request_count_ += request_size;  // increase count
                return true;                     // allow
            } else {
                return false;                    // limit exceeded
            }
        } else {
            // Start a new window
            window_start_time_ = now;
            request_count_ = request_size;  // start new count
            return true;                    // allow
        }
    }

private:
    int max_requests_per_win_;  // max requests per window
    std::chrono::seconds window_size_;  // window size
    int request_count_;  // current count
    std::chrono::steady_clock::time_point window_start_time_;  // window start
    std::mutex mtx_;  // mutex
};

Pros
Simple to implement and easy to understand.
Suitable for scenarios with relatively stable request rates.
Cons
During traffic bursts, many requests are rejected outright; the algorithm cannot smooth traffic into a steady stream.
Window boundary effect: a sudden surge at the edge of a window can cause overload.
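The boundary effect can be made concrete with a small simulation. The struct below is a hypothetical stand-in for the class above, driven by an explicit millisecond clock instead of steady_clock so the effect is reproducible without real sleeps (the reset here also restarts the window at the arrival time, as the class above does):

```cpp
#include <cassert>

// Hypothetical simplified fixed-window counter with an injected clock.
struct FixedWindowSim {
    long long window_ms;
    int limit;
    long long window_start = 0;
    int count = 0;
    bool Allow(long long now_ms) {
        if (now_ms - window_start >= window_ms) {  // window expired: reset
            window_start = now_ms;
            count = 0;
        }
        if (count < limit) { ++count; return true; }
        return false;
    }
};

// 100 requests arrive at t = 900 ms (end of one window) and 100 more at
// t = 1100 ms (start of the next): every single one is admitted, so 200
// requests pass within 200 ms despite the 100-per-second limit.
int BoundaryBurst() {
    FixedWindowSim fw{1000, 100};
    int passed = 0;
    for (int i = 0; i < 100; ++i) if (fw.Allow(900)) ++passed;
    for (int i = 0; i < 100; ++i) if (fw.Allow(1100)) ++passed;
    return passed;
}
```

In the worst case the instantaneous rate is thus twice the configured limit, which is exactly what the sliding window in the next section mitigates.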
2. Sliding Window Algorithm
The sliding window improves on the fixed window by dividing the window into multiple small buckets, each with its own counter. Requests are placed into the appropriate bucket based on timestamp, and the total across all buckets is checked against the limit. As time progresses, the window slides, discarding the oldest bucket.
#include <algorithm>
#include <chrono>
#include <mutex>
#include <vector>

class SlidingWindowRateLimiter {
public:
    SlidingWindowRateLimiter(int max_requests_per_win, std::chrono::seconds bucket_size, int num_buckets)
        : max_requests_per_win_(max_requests_per_win), bucket_size_(bucket_size), num_buckets_(num_buckets) {
        request_counts_.resize(num_buckets, 0);
        last_bucket_time_ = std::chrono::steady_clock::now();
    }

    bool TryAcquire(int request_size) {
        std::lock_guard<std::mutex> lock(mtx_);
        auto now = std::chrono::steady_clock::now();
        UpdateBuckets_(now);
        int total_requests = 0;
        for (int count : request_counts_) {
            total_requests += count;
        }
        if (total_requests + request_size <= max_requests_per_win_) {
            request_counts_[current_bucket_index_] += request_size;  // add to current bucket
            return true;   // allow
        } else {
            return false;  // limit exceeded
        }
    }

private:
    void UpdateBuckets_(std::chrono::steady_clock::time_point now) {
        auto elapsed = std::chrono::duration_cast<std::chrono::seconds>(now - last_bucket_time_);
        int buckets_to_update = static_cast<int>(elapsed / bucket_size_);
        if (buckets_to_update > 0) {
            for (int i = 0; i < std::min(num_buckets_, buckets_to_update); ++i) {
                int next_bucket_index = (current_bucket_index_ + 1) % num_buckets_;
                request_counts_[next_bucket_index] = 0;     // clear the oldest bucket
                current_bucket_index_ = next_bucket_index;  // move forward
            }
            last_bucket_time_ += buckets_to_update * bucket_size_;  // advance the timestamp
        }
    }

    int max_requests_per_win_;  // limit over the whole window
    std::chrono::seconds bucket_size_;  // size of each bucket
    int num_buckets_;  // number of buckets
    std::vector<int> request_counts_;  // per-bucket counters
    std::mutex mtx_;  // mutex
    std::chrono::steady_clock::time_point last_bucket_time_;  // time the current bucket started
    int current_bucket_index_ = 0;  // current bucket index
};

Pros
Finer‑grained control compared with fixed window.
Reduces the window‑boundary effect.
Cons
Still suffers from request bursts; many requests may be rejected.
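The bucket rotation can be traced deterministically. The struct below is a hypothetical single-threaded sketch of the same rotation logic, with 1-second buckets and an explicit integer clock in place of steady_clock:

```cpp
#include <numeric>
#include <vector>

// Hypothetical sliding-window counter with an injected clock (seconds).
struct SlidingWindowSim {
    int limit;
    int num_buckets;
    std::vector<int> buckets;
    long long last_s = 0;
    int cur = 0;
    SlidingWindowSim(int lim, int n) : limit(lim), num_buckets(n), buckets(n, 0) {}
    bool Allow(int n, long long now_s) {
        long long advance = now_s - last_s;  // elapsed seconds == buckets to rotate
        for (long long i = 0; i < advance && i < num_buckets; ++i) {
            cur = (cur + 1) % num_buckets;
            buckets[cur] = 0;  // discard the oldest bucket
        }
        last_s = now_s;
        int total = std::accumulate(buckets.begin(), buckets.end(), 0);
        if (total + n > limit) return false;
        buckets[cur] += n;
        return true;
    }
};
```

With 3 buckets and a limit of 3, traffic counted at t = 0 still blocks a request at t = 2, but by t = 3 its bucket has rotated out and requests are admitted again: the limit is enforced over the trailing window rather than resetting all at once.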
3. Sliding Log Algorithm
The sliding log records the timestamp of each request. When a new request arrives, the algorithm checks the log for the recent time window, counts the total requests, and decides whether to allow or reject. Expired entries are periodically removed.
#include <chrono>
#include <deque>
#include <mutex>
#include <utility>

class SlidingLogRateLimiter {
public:
    SlidingLogRateLimiter(int max_requests_per_win, std::chrono::seconds window_size)
        : max_requests_per_win_(max_requests_per_win), window_size_(window_size) {}

    bool TryAcquire(int request_size) {
        std::lock_guard<std::mutex> lock(mtx_);
        auto now = std::chrono::steady_clock::now();
        CleanUp_(now);
        int total_requests = 0;
        for (const auto& entry : request_log_) {
            total_requests += entry.second;
        }
        if (total_requests + request_size <= max_requests_per_win_) {
            request_log_.emplace_back(now, request_size);  // record the request
            return true;
        } else {
            return false;
        }
    }

private:
    void CleanUp_(std::chrono::steady_clock::time_point now) {
        auto expiration_time = now - window_size_;
        while (!request_log_.empty() && request_log_.front().first < expiration_time) {
            request_log_.pop_front();  // remove expired entry
        }
    }

    int max_requests_per_win_;  // limit
    std::chrono::seconds window_size_;  // window size
    std::deque<std::pair<std::chrono::steady_clock::time_point, int>> request_log_;  // (timestamp, size) per accepted request
    std::mutex mtx_;  // mutex
};

Pros
Provides precise control over request rate.
Cons
High memory overhead because every successful request is stored.
During bursts, many requests may still be rejected.
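The precision/memory trade-off follows directly from the bookkeeping. This hypothetical sketch mirrors the log logic above with an explicit integer clock: every accepted request is stored individually, which is what makes the count exact and the memory cost proportional to the allowed rate times the window length:

```cpp
#include <deque>
#include <utility>

// Hypothetical sliding-log counter with an injected clock (seconds).
struct SlidingLogSim {
    int limit;
    long long window_s;
    std::deque<std::pair<long long, int>> log;  // (timestamp, size)
    bool Allow(int n, long long now_s) {
        while (!log.empty() && log.front().first < now_s - window_s)
            log.pop_front();  // drop entries that have left the window
        int total = 0;
        for (const auto& e : log) total += e.second;
        if (total + n > limit) return false;
        log.emplace_back(now_s, n);
        return true;
    }
};
```

With a limit of 2 over a 10 s window, requests at t = 0 and t = 5 are admitted, a third at t = 9 is rejected exactly (both earlier entries are still inside the window), and at t = 11 the t = 0 entry has expired so a new request passes.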
4. Leaky Bucket Algorithm
The leaky bucket treats requests as water droplets flowing into a bucket that leaks at a constant rate. If the bucket is not full, the request is added; otherwise it is rejected. The bucket’s capacity and leak rate determine the maximum sustained throughput and burst tolerance.
#include <algorithm>
#include <chrono>
#include <memory>
#include <mutex>

class Task {
public:
    virtual ~Task() = default;
    virtual int GetLoad() = 0;
    virtual void Run() = 0;
};
using TaskPtr = std::shared_ptr<Task>;

class LeakyBucketRateLimiter {
public:
    LeakyBucketRateLimiter(int capacity, int leak_rate)
        : capacity_(capacity), leak_rate_(leak_rate), current_water_(0) {
        last_leak_time_ = std::chrono::steady_clock::now();
    }

    bool TryRun(const TaskPtr& task) {
        std::lock_guard<std::mutex> lock(mtx_);
        Leak();
        if (current_water_ + task->GetLoad() <= capacity_) {
            current_water_ += task->GetLoad();  // add the request to the bucket
            // Schedule the task to run after the water ahead of it has leaked out
            heap_timer_.Schedule(task, current_water_ / leak_rate_);
            return true;   // allow
        } else {
            return false;  // bucket full, reject
        }
    }

private:
    void Leak() {
        auto now = std::chrono::steady_clock::now();
        auto elapsed = std::chrono::duration_cast<std::chrono::seconds>(now - last_leak_time_);
        int leaked_water = static_cast<int>(elapsed.count()) * leak_rate_;
        if (leaked_water > 0) {
            current_water_ = std::max(0, current_water_ - leaked_water);  // drain water
            // Advance only by the whole seconds actually drained, so
            // fractional elapsed time is not silently discarded.
            last_leak_time_ += elapsed;
        }
    }

    int capacity_;   // bucket capacity
    int leak_rate_;  // leak rate per second
    int current_water_;  // current water level
    HeapTimer heap_timer_;  // delayed-execution timer (assumed to be provided elsewhere)
    std::mutex mtx_;  // mutex
    std::chrono::steady_clock::time_point last_leak_time_;  // last leak time
};

Pros
Provides very smooth traffic flow.
Can absorb short bursts depending on bucket size.
Cons
Control is relatively rigid; elasticity is limited.
Concurrent requests can incur extra queuing latency (e.g., under a 1 QPS limit, the second of two simultaneous requests must wait 1 s before running).
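The queuing latency follows directly from the schedule delay `current_water_ / leak_rate_` used above. This hypothetical helper isolates that arithmetic, assuming no time elapses between submissions and omitting locking and the timer:

```cpp
#include <vector>

// Compute the execution delay (in seconds) assigned to each submitted
// task, mirroring the leaky bucket's current_water / leak_rate rule.
std::vector<int> ScheduleDelays(int capacity, int leak_rate, const std::vector<int>& loads) {
    std::vector<int> delays;
    int water = 0;
    for (int load : loads) {
        if (water + load > capacity) continue;  // bucket full: reject
        water += load;
        delays.push_back(water / leak_rate);    // seconds until execution
    }
    return delays;
}
```

With capacity 10 and a leak rate of 2 per second, three back-to-back tasks of load 2 are delayed 1 s, 2 s, and 3 s respectively: the output stream drains at exactly the leak rate, which is the smoothness (and the rigidity) of the algorithm.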
5. Token Bucket Algorithm
The token bucket maintains a bucket that is refilled with tokens at a fixed rate. A request consumes a token; if no token is available, the request is rejected. This allows bursts up to the bucket’s capacity while enforcing an average rate.
#include <algorithm>
#include <chrono>
#include <mutex>

class TokenBucketRateLimiter {
public:
    TokenBucketRateLimiter(int capacity, int refill_rate)
        : capacity_(capacity), refill_rate_(refill_rate), current_tokens_(0) {
        last_refill_time_ = std::chrono::steady_clock::now();
    }

    bool TryAcquire(int request_size) {
        std::lock_guard<std::mutex> lock(mtx_);
        Refill_();
        if (current_tokens_ >= request_size) {
            current_tokens_ -= request_size;  // consume tokens
            return true;   // allow
        } else {
            return false;  // not enough tokens
        }
    }

private:
    void Refill_() {
        auto now = std::chrono::steady_clock::now();
        auto elapsed = std::chrono::duration_cast<std::chrono::seconds>(now - last_refill_time_);
        if (elapsed.count() > 0) {
            current_tokens_ = std::min(capacity_, current_tokens_ + static_cast<int>(elapsed.count()) * refill_rate_);
            // Advance only by the whole seconds credited; otherwise calls
            // more frequent than once per second would never accrue tokens.
            last_refill_time_ += elapsed;
        }
    }

    int capacity_;     // bucket capacity
    int refill_rate_;  // tokens added per second
    int current_tokens_;  // current token count
    std::mutex mtx_;  // mutex
    std::chrono::steady_clock::time_point last_refill_time_;  // last refill time
};

Pros
Handles burst traffic while preventing instantaneous overload.
Highly flexible; rate and capacity can be tuned to meet different needs.
Cons
Implementation is more complex than fixed window.
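The burst-then-average behavior can be checked with a deterministic version of the refill logic. The struct below is a hypothetical sketch driven by an explicit clock in seconds; unlike the class above, it starts with a full bucket so the burst behavior is visible immediately:

```cpp
#include <algorithm>

// Hypothetical token bucket with an injected clock (seconds), bucket full at start.
struct TokenBucketSim {
    int capacity;
    int refill_rate;
    int tokens;
    long long last_s = 0;
    TokenBucketSim(int cap, int rate) : capacity(cap), refill_rate(rate), tokens(cap) {}
    bool TryAcquire(int n, long long now_s) {
        long long refill = (now_s - last_s) * refill_rate;
        tokens = static_cast<int>(std::min<long long>(capacity, tokens + refill));
        last_s = now_s;
        if (tokens >= n) { tokens -= n; return true; }
        return false;
    }
};
```

With capacity 10 and a refill rate of 2 tokens/s, a burst of 10 passes instantly, the next request is rejected, and 3 seconds later exactly 6 tokens have accumulated: bursts are bounded by capacity while the long-run rate converges to the refill rate.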
6. Summary
Each rate‑limiting algorithm has its own suitable scenarios, strengths, and weaknesses. Selecting the right algorithm—or a combination of them—requires careful consideration of business requirements and system characteristics to effectively protect services from overload.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.