
Common Rate Limiting Algorithms: Fixed Window, Sliding Window, Sliding Log, Leaky Bucket, and Token Bucket

The article examines five common rate‑limiting algorithms—Fixed Window, Sliding Window, Sliding Log, Leaky Bucket, and Token Bucket—detailing their principles, pros and cons, and providing complete C++ implementations to help developers choose the best approach for controlling traffic bursts and ensuring system stability.

Tencent Cloud Developer

In modern high‑concurrency systems, traffic spikes and expanding business demands make rate limiting a crucial protection mechanism to prevent overload and ensure stability.

This article deeply analyzes five widely used rate‑limiting algorithms, explains their principles, lists advantages and disadvantages, and provides complete C++ code examples for each.

Table of Contents

1. Fixed Window Algorithm
2. Sliding Window Algorithm
3. Sliding Log Algorithm
4. Leaky Bucket Algorithm
5. Token Bucket Algorithm
6. Summary

1. Fixed Window Algorithm

The fixed window algorithm divides time into equal‑size windows (e.g., 1 minute) and allows a maximum number of requests per window. When a request arrives, the current window’s count is checked; if the limit is not exceeded, the request is allowed, otherwise it is rejected.

#include <chrono>
#include <mutex>

class FixedWindowRateLimiter {
public:
    FixedWindowRateLimiter(int max_requests_per_win, std::chrono::seconds window_size)
        : max_requests_per_win_(max_requests_per_win), window_size_(window_size), request_count_(0) {
        window_start_time_ = std::chrono::steady_clock::now();
    }

    bool TryAcquire(int request_size) {
        std::lock_guard<std::mutex> lock(mtx_);
        auto now = std::chrono::steady_clock::now();
        // If the current time is still inside the window
        if (now - window_start_time_ < window_size_) {
            // Check whether the count would exceed the limit
            if (request_count_ + request_size <= max_requests_per_win_) {
                request_count_ += request_size; // increase count
                return true;  // allow
            }
            return false;     // exceeds limit
        }
        // Start a new window
        window_start_time_ = now;
        request_count_ = request_size; // start new count
        return request_count_ <= max_requests_per_win_; // reject a single oversized request
    }

private:
    int max_requests_per_win_;         // max requests per window
    std::chrono::seconds window_size_; // window size
    int request_count_;                // current count
    std::chrono::steady_clock::time_point window_start_time_; // window start
    std::mutex mtx_;                   // protects the counter
};

Pros

Simple to implement and easy to understand.

Suitable for scenarios with relatively stable request rates.

Cons

During traffic bursts, many requests may be rejected; cannot smooth traffic.

Window boundary effect: a burst straddling the edge of two adjacent windows can admit up to twice the limit within a short span, overloading the system.
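To make the boundary effect concrete, here is a minimal single-threaded sketch (not the class above; the names `FixedWindow` and `allow` are illustrative) that takes the timestamp as a parameter so the scenario is reproducible: with a limit of 100 per 60 s window aligned to window boundaries, 100 requests at second 59 and 100 more at second 61 are all admitted, i.e. 200 requests within two seconds.

```cpp
#include <cstdint>

// Minimal fixed-window counter with an injected timestamp (in seconds),
// windows aligned to multiples of window_sec. Sketch only, not the
// thread-safe class above.
struct FixedWindow {
    int limit;             // max requests per window
    int64_t window_sec;    // window length in seconds
    int64_t window_start = 0;
    int count = 0;

    bool allow(int64_t now_sec) {
        if (now_sec - window_start >= window_sec) {
            window_start = now_sec - (now_sec % window_sec); // align to window edge
            count = 0;                                       // reset the counter
        }
        if (count < limit) { ++count; return true; }
        return false;
    }
};
```

Both bursts land in different windows even though they are only two seconds apart, which is exactly the overload the sliding-window variant below is designed to prevent.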

2. Sliding Window Algorithm

The sliding window improves on the fixed window by dividing the window into multiple small buckets, each with its own counter. Requests are placed into the appropriate bucket based on timestamp, and the total across all buckets is checked against the limit. As time progresses, the window slides, discarding the oldest bucket.

#include <algorithm>
#include <chrono>
#include <mutex>
#include <vector>

class SlidingWindowRateLimiter {
public:
    SlidingWindowRateLimiter(int max_requests_per_win, std::chrono::seconds bucket_size, int num_buckets)
        : max_requests_per_win_(max_requests_per_win), bucket_size_(bucket_size), num_buckets_(num_buckets) {
        request_counts_.resize(num_buckets, 0);
        last_bucket_time_ = std::chrono::steady_clock::now();
    }

    bool TryAcquire(int request_size) {
        std::lock_guard<std::mutex> lock(mtx_);
        auto now = std::chrono::steady_clock::now();
        UpdateBuckets_(now);
        int total_requests = 0;
        for (int count : request_counts_) {
            total_requests += count;
        }
        if (total_requests + request_size <= max_requests_per_win_) {
            request_counts_[current_bucket_index_] += request_size; // add to current bucket
            return true;  // allow
        }
        return false;     // exceeds limit
    }

private:
    void UpdateBuckets_(std::chrono::steady_clock::time_point now) {
        auto elapsed = std::chrono::duration_cast<std::chrono::seconds>(now - last_bucket_time_);
        int buckets_to_update = static_cast<int>(elapsed / bucket_size_);
        if (buckets_to_update > 0) {
            for (int i = 0; i < std::min(num_buckets_, buckets_to_update); ++i) {
                int next_bucket_index = (current_bucket_index_ + 1) % num_buckets_;
                request_counts_[next_bucket_index] = 0;    // clear the oldest bucket
                current_bucket_index_ = next_bucket_index; // move forward
            }
            last_bucket_time_ += buckets_to_update * bucket_size_; // advance the bucket clock
        }
    }

    int max_requests_per_win_;          // limit across the whole window
    std::chrono::seconds bucket_size_;  // size of each bucket
    int num_buckets_;                   // number of buckets
    std::vector<int> request_counts_;   // per-bucket counters
    std::mutex mtx_;                    // protects the counters
    std::chrono::steady_clock::time_point last_bucket_time_; // last bucket advance
    int current_bucket_index_ = 0;      // current bucket index
};

Pros

Finer‑grained control compared with fixed window.

Reduces the window‑boundary effect.

Cons

Still suffers from request bursts; many requests may be rejected.
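The smoothing is easy to verify with a minimal bucketed sketch (illustrative names, injected timestamp; not the class above): with the same 100-requests-per-60-s limit split into six 10 s buckets, the boundary burst that slipped past the fixed window is now rejected, because the window's total still counts the first burst.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Minimal bucketed sliding-window counter with an injected timestamp
// (seconds). Sketch only.
struct SlidingWindow {
    int limit;
    int64_t bucket_sec;
    std::vector<int> buckets;
    int64_t last_period = 0;  // most recent bucket period seen
    int cur = 0;              // index of the current bucket

    SlidingWindow(int limit, int64_t bucket_sec, int num_buckets)
        : limit(limit), bucket_sec(bucket_sec), buckets(num_buckets, 0) {}

    bool allow(int64_t now_sec) {
        int64_t period = now_sec / bucket_sec;
        while (last_period < period) {       // slide forward, clearing
            ++last_period;                   // buckets that fell out of
            cur = (cur + 1) % (int)buckets.size(); // the window
            buckets[cur] = 0;
        }
        int total = std::accumulate(buckets.begin(), buckets.end(), 0);
        if (total < limit) { ++buckets[cur]; return true; }
        return false;
    }
};
```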

3. Sliding Log Algorithm

The sliding log records the timestamp of each request. When a new request arrives, the algorithm checks the log for the recent time window, counts the total requests, and decides whether to allow or reject. Expired entries are periodically removed.

#include <chrono>
#include <deque>
#include <mutex>
#include <utility>

class SlidingLogRateLimiter {
public:
    SlidingLogRateLimiter(int max_requests_per_win, std::chrono::seconds window_size)
        : max_requests_per_win_(max_requests_per_win), window_size_(window_size) {}

    bool TryAcquire(int request_size) {
        std::lock_guard<std::mutex> lock(mtx_);
        auto now = std::chrono::steady_clock::now();
        CleanUp_(now);
        int total_requests = 0;
        for (const auto& entry : request_log_) {
            total_requests += entry.second;
        }
        if (total_requests + request_size <= max_requests_per_win_) {
            request_log_.emplace_back(now, request_size); // record the request
            return true;
        }
        return false;
    }

private:
    void CleanUp_(std::chrono::steady_clock::time_point now) {
        auto expiration_time = now - window_size_;
        while (!request_log_.empty() && request_log_.front().first < expiration_time) {
            request_log_.pop_front(); // remove expired entry
        }
    }

    int max_requests_per_win_;         // limit
    std::chrono::seconds window_size_; // window size
    std::deque<std::pair<std::chrono::steady_clock::time_point, int>> request_log_; // (timestamp, size) log
    std::mutex mtx_;                   // protects the log
};

Pros

Provides precise control over request rate.

Cons

High memory overhead because every successful request is stored.

During bursts, many requests may still be rejected.
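A stripped-down sketch of the same idea (illustrative names, injected timestamp, one log entry per request instead of (timestamp, size) pairs) shows both the precision and the cost: admission requires scanning and storing every recent request, but the limit is enforced over an exact trailing window rather than bucket approximations.

```cpp
#include <cstdint>
#include <deque>

// Minimal sliding-log limiter with an injected timestamp (seconds).
// Every accepted request is stored, which is precise but memory-heavy.
struct SlidingLog {
    int limit;
    int64_t window_sec;
    std::deque<int64_t> log;  // timestamps of accepted requests

    bool allow(int64_t now_sec) {
        while (!log.empty() && log.front() <= now_sec - window_sec)
            log.pop_front();  // drop entries older than the window
        if ((int)log.size() < limit) { log.push_back(now_sec); return true; }
        return false;
    }
};
```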

4. Leaky Bucket Algorithm

The leaky bucket treats requests as water droplets flowing into a bucket that leaks at a constant rate. If the bucket is not full, the request is added; otherwise it is rejected. The bucket’s capacity and leak rate determine the maximum sustained throughput and burst tolerance.

#include <algorithm>
#include <chrono>
#include <memory>
#include <mutex>

class Task {
public:
    virtual ~Task() = default;
    virtual int GetLoad() = 0;
    virtual void Run() = 0;
};
using TaskPtr = std::shared_ptr<Task>;

// HeapTimer is assumed to be an external timer component (not shown here)
// whose Schedule(task, delay_seconds) runs the task after the given delay.
class LeakyBucketRateLimiter {
public:
    LeakyBucketRateLimiter(int capacity, int leak_rate)
        : capacity_(capacity), leak_rate_(leak_rate), current_water_(0) {
        last_leak_time_ = std::chrono::steady_clock::now();
    }

    bool TryRun(const TaskPtr& task) {
        std::lock_guard<std::mutex> lock(mtx_);
        Leak();
        if (current_water_ + task->GetLoad() <= capacity_) {
            current_water_ += task->GetLoad(); // add the request to the bucket
            // schedule the task to run once the water ahead of it has leaked
            heap_timer_.Schedule(task, current_water_ / leak_rate_);
            return true;  // allow
        }
        return false;     // bucket full, reject
    }

private:
    void Leak() {
        auto now = std::chrono::steady_clock::now();
        auto elapsed = std::chrono::duration_cast<std::chrono::seconds>(now - last_leak_time_);
        int leaked_water = static_cast<int>(elapsed.count()) * leak_rate_;
        if (leaked_water > 0) {
            current_water_ = std::max(0, current_water_ - leaked_water); // drain
            last_leak_time_ += elapsed; // advance only by the seconds credited
        }
    }

    int capacity_;         // bucket capacity
    int leak_rate_;        // leak rate per second
    int current_water_;    // current water level
    HeapTimer heap_timer_; // timer that executes tasks after the leak delay
    std::mutex mtx_;       // protects the water level
    std::chrono::steady_clock::time_point last_leak_time_; // last leak time
};

Pros

Provides very smooth traffic flow.

Can absorb short bursts depending on bucket size.

Cons

Control is relatively rigid; elasticity is limited.

Admitted requests may incur queueing latency (e.g., with a 1 QPS leak rate, the second of two concurrent requests waits 1 s).
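The queueing latency follows from simple arithmetic: a newly admitted request must wait for the water already ahead of it to drain, roughly water / leak_rate seconds. A minimal sketch with an injected timestamp makes this visible (illustrative names; `offer` returns the expected wait in seconds, or -1 on rejection, instead of scheduling a task):

```cpp
#include <cstdint>

// Minimal leaky-bucket admission sketch with an injected timestamp
// (seconds). Tracks the water level and reports each admitted
// request's expected wait: the water ahead of it divided by the rate.
struct LeakyBucket {
    int capacity;
    int leak_rate;          // units drained per second
    int water = 0;
    int64_t last_leak = 0;

    int64_t offer(int load, int64_t now_sec) {
        water -= (int)((now_sec - last_leak) * leak_rate); // drain since last call
        if (water < 0) water = 0;
        last_leak = now_sec;
        if (water + load > capacity) return -1;  // bucket full, reject
        int64_t wait = water / leak_rate;        // time to drain what's ahead
        water += load;
        return wait;
    }
};
```

With capacity 10 and a 1 unit/s leak rate, back-to-back requests see waits of 0, 1, 2, ... seconds: the output is perfectly smooth, but later arrivals in a burst pay for it in latency.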

5. Token Bucket Algorithm

The token bucket maintains a bucket that is refilled with tokens at a fixed rate. A request consumes a token; if no token is available, the request is rejected. This allows bursts up to the bucket’s capacity while enforcing an average rate.

#include <algorithm>
#include <chrono>
#include <mutex>

class TokenBucketRateLimiter {
public:
    TokenBucketRateLimiter(int capacity, int refill_rate)
        : capacity_(capacity), refill_rate_(refill_rate), current_tokens_(0) {
        last_refill_time_ = std::chrono::steady_clock::now();
    }

    bool TryAcquire(int request_size) {
        std::lock_guard<std::mutex> lock(mtx_);
        Refill_();
        if (current_tokens_ >= request_size) {
            current_tokens_ -= request_size; // consume tokens
            return true;  // allow
        }
        return false;     // not enough tokens
    }

private:
    void Refill_() {
        auto now = std::chrono::steady_clock::now();
        auto elapsed = std::chrono::duration_cast<std::chrono::seconds>(now - last_refill_time_);
        current_tokens_ = std::min(capacity_,
            current_tokens_ + static_cast<int>(elapsed.count()) * refill_rate_);
        last_refill_time_ += elapsed; // advance only by the whole seconds
                                      // credited, so sub-second remainders
                                      // are not silently discarded
    }

    int capacity_;       // bucket capacity
    int refill_rate_;    // tokens added per second
    int current_tokens_; // current token count
    std::mutex mtx_;     // protects the token count
    std::chrono::steady_clock::time_point last_refill_time_; // last refill time
};

Pros

Handles burst traffic while preventing instantaneous overload.

Highly flexible; rate and capacity can be tuned to meet different needs.

Cons

Implementation is more complex than fixed window.
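The burst-then-throttle behaviour is easiest to see in a minimal sketch with an injected timestamp (illustrative names; unlike the class above, the bucket starts full): a bucket of 10 tokens admits a 10-request burst at once, after which throughput is capped by the 1 token/s refill rate.

```cpp
#include <cstdint>

// Minimal token bucket with an injected timestamp (seconds).
// Starts full, so an initial burst up to `capacity` is admitted;
// afterwards the refill rate bounds the sustained throughput.
struct TokenBucket {
    int capacity;
    int refill_rate;        // tokens per second
    int tokens;
    int64_t last_refill = 0;

    TokenBucket(int cap, int rate) : capacity(cap), refill_rate(rate), tokens(cap) {}

    bool allow(int64_t now_sec) {
        tokens += (int)((now_sec - last_refill) * refill_rate); // refill
        if (tokens > capacity) tokens = capacity;               // cap at capacity
        last_refill = now_sec;
        if (tokens > 0) { --tokens; return true; }
        return false;
    }
};
```

Contrast with the leaky bucket: here the burst is served immediately with no queueing delay, and only the average rate is enforced.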

6. Summary

Each rate‑limiting algorithm has its own suitable scenarios, strengths, and weaknesses. Selecting the right algorithm—or a combination of them—requires careful consideration of business requirements and system characteristics to effectively protect services from overload.

Tags: distributed systems, performance, algorithm, backend, C++, rate limiting
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
