
Mastering Rate Limiting: Counter, Sliding Window, Leaky Bucket & Token Bucket Explained

This article explains four common rate‑limiting algorithms—fixed‑window counter, sliding window, leaky bucket, and token bucket—detailing how each works, their pseudo‑code implementations, and when to use them to protect high‑traffic systems such as flash‑sale platforms.

Python Programming Learning Circle

Rate Limiting Overview

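Systems under high concurrency are commonly protected by three techniques: caching, rate limiting (throttling), and service degradation. When traffic spikes, for example during a flash sale, rate limiting protects the system by capping how many requests it accepts in a given period and rejecting or deferring the rest.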

Four Common Rate‑Limiting Algorithms

1. Fixed‑Window Counter

The counter records the number of requests in a fixed time window and resets to zero when the next window begins. Requests within the limit are processed normally, while excess requests are rejected or handled asynchronously.

<code>// Pseudo code
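var windowStart = nowTime - (nowTime % interval);
if (windowStart != currentWindow) {   // a new window has begun: reset the count
    currentWindow = windowStart;
    counter = 0;
}
++counter;
if (counter > limit) {
    return 'System busy, please try later';
}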
</code>
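
The pseudo code above can be turned into a small runnable sketch. The class name `FixedWindowCounter`, its parameters, and the choice to pass the current time in as `now` (rather than reading a clock) are assumptions made for illustration; in a real service the caller might pass `time.monotonic()`.

```python
class FixedWindowCounter:
    """Allow at most `limit` requests per `interval`-second window (illustrative sketch)."""

    def __init__(self, limit, interval):
        self.limit = limit
        self.interval = interval
        self.window_start = None  # start time of the window being counted
        self.counter = 0

    def allow(self, now):
        # Align the timestamp to the start of its fixed window.
        window_start = now - (now % self.interval)
        if window_start != self.window_start:
            # A new window has begun, so the count starts over.
            self.window_start = window_start
            self.counter = 0
        if self.counter >= self.limit:
            return False
        self.counter += 1
        return True


limiter = FixedWindowCounter(limit=2, interval=10)
results = [limiter.allow(t) for t in (0, 1, 2, 10)]
print(results)
```

One known weakness of this approach is the window boundary: a client can send `limit` requests at the very end of one window and `limit` more at the start of the next, briefly doubling the effective rate.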

2. Sliding Window

The sliding window divides a larger interval into smaller cells, each with its own counter. The sum of counters in the current window determines whether a request is accepted.

<code>// Pseudo code
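// cellCounter is a ring of cellNum counters; each cell covers cellSize time units
// cellStamp (assumed helper array) records which absolute cell each slot holds
var cell = Math.floor(nowTime / cellSize);
var cellIndex = cell % cellNum;
if (cellStamp[cellIndex] != cell) {   // slot holds an expired cell: reuse it
    cellStamp[cellIndex] = cell;
    cellCounter[cellIndex] = 0;
}
++cellCounter[cellIndex];
var sum = 0;
for (var i = 0; i < cellNum; ++i) {
    if (cellStamp[i] > cell - cellNum) sum += cellCounter[i];   // skip stale cells
}
if (sum > limit) {
    return 'System busy, please try later';
}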
</code>
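
As a runnable illustration of the same idea, the sketch below keeps a ring buffer of per-cell counters and sums only the cells that still fall inside the window. The class name `SlidingWindowCounter`, the `stamps` list used to detect expired cells, and the explicit `now` argument are assumptions, not part of the original pseudo code.

```python
class SlidingWindowCounter:
    """Allow at most `limit` requests over the last `cell_num` cells (illustrative sketch)."""

    def __init__(self, limit, cell_size, cell_num):
        self.limit = limit
        self.cell_size = cell_size        # duration of one cell, in seconds
        self.cell_num = cell_num          # number of cells per window
        self.counts = [0] * cell_num      # ring buffer of per-cell counters
        self.stamps = [None] * cell_num   # absolute cell number held by each slot

    def allow(self, now):
        cell = int(now // self.cell_size)  # absolute cell number
        index = cell % self.cell_num       # slot in the ring buffer
        if self.stamps[index] != cell:
            # The slot still holds an expired cell, so recycle it.
            self.stamps[index] = cell
            self.counts[index] = 0
        # Sum only the cells that fall inside the current window.
        oldest = cell - self.cell_num + 1
        total = sum(
            count
            for count, stamp in zip(self.counts, self.stamps)
            if stamp is not None and stamp >= oldest
        )
        if total >= self.limit:
            return False
        self.counts[index] += 1
        return True


limiter = SlidingWindowCounter(limit=2, cell_size=1, cell_num=3)
results = [limiter.allow(t) for t in (0, 2.5, 2.9, 3.0, 3.5)]
print(results)
```

Smaller cells track the true sliding window more closely, at the cost of more counters to store and sum on each request.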

3. Leaky Bucket

The leaky bucket models a fixed‑capacity bucket that drains at a constant rate; incoming requests fill the bucket, and overflow is discarded, effectively limiting the request rate.

<code>// Pseudo code
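// Drain the bucket at the constant leak rate for the time elapsed since the last request
var leaked = (nowTime - lastTime) * leakRate;
water = Math.max(0, water - leaked);
lastTime = nowTime;
if (water + 1 > capacity) {   // no room left: this request overflows
    return 'System busy, please try later';
}
++water;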
</code>
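
A minimal runnable sketch of the leaky bucket follows, treating the bucket as a meter: each accepted request adds one unit of "water", and the level drains at a constant rate between calls. The names `LeakyBucket`, `capacity`, and `leak_rate`, and the explicit `now` argument, are assumptions for illustration.

```python
class LeakyBucket:
    """Leaky bucket used as a meter (illustrative sketch)."""

    def __init__(self, capacity, leak_rate):
        self.capacity = capacity    # maximum water level, in requests
        self.leak_rate = leak_rate  # requests drained per second
        self.water = 0.0            # current water level
        self.last_time = None       # time of the previous call

    def allow(self, now):
        if self.last_time is not None:
            # Drain the bucket for the time elapsed since the last call.
            leaked = (now - self.last_time) * self.leak_rate
            self.water = max(0.0, self.water - leaked)
        self.last_time = now
        if self.water + 1 > self.capacity:
            return False  # the request would overflow the bucket
        self.water += 1
        return True


bucket = LeakyBucket(capacity=2, leak_rate=1)
results = [bucket.allow(t) for t in (0, 0, 0, 1, 1)]
print(results)
```

A common alternative is to treat the bucket as a queue, holding accepted requests and releasing them to the backend at the leak rate; the admission test is the same, but the output is smoothed rather than merely capped.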

4. Token Bucket

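Tokens are generated at a steady rate and stored in a bucket of fixed capacity; tokens that arrive while the bucket is full are discarded. A request must acquire a token before being processed, otherwise it is rejected. Because unused tokens accumulate up to the capacity, this allows short bursts while still enforcing an average rate.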
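
The token bucket can be sketched as follows. Rather than running a background timer, this version refills lazily, computing how many tokens would have been generated since the previous call. The names `TokenBucket`, `capacity`, and `refill_rate`, the explicit `now` argument, and the choice to start with a full bucket are assumptions for illustration.

```python
class TokenBucket:
    """Token bucket with lazy refill (illustrative sketch)."""

    def __init__(self, capacity, refill_rate):
        self.capacity = capacity        # maximum number of stored tokens
        self.refill_rate = refill_rate  # tokens generated per second
        self.tokens = float(capacity)   # assumption: the bucket starts full
        self.last_time = None           # time of the previous call

    def allow(self, now):
        if self.last_time is not None:
            # Add the tokens generated since the last call, capped at capacity.
            added = (now - self.last_time) * self.refill_rate
            self.tokens = min(self.capacity, self.tokens + added)
        self.last_time = now
        if self.tokens < 1:
            return False  # no token available
        self.tokens -= 1
        return True


bucket = TokenBucket(capacity=3, refill_rate=1)
results = [bucket.allow(t) for t in (0, 0, 0, 0, 2, 2, 2)]
print(results)
```

In this run the first three requests drain the initial burst allowance, the fourth is rejected, and two more succeed after two seconds of refill. The capacity sets the largest burst and the refill rate sets the long-term average.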
