Understanding Rate Limiting: Counter, Sliding Window, Leaky Bucket, and Token Bucket Algorithms

The article introduces the concept of rate limiting, explains why it is needed both offline (e.g., crowded theme parks) and online (e.g., flash sales), and details four common algorithms—counter, sliding window, leaky bucket, and token bucket—along with their implementation considerations and trade‑offs.

Full-Stack Internet Architecture
Full-Stack Internet Architecture
Full-Stack Internet Architecture
Understanding Rate Limiting: Counter, Sliding Window, Leaky Bucket, and Token Bucket Algorithms

During the National Day holiday the author visited Shanghai Disneyland with his girlfriend, only to encounter massive crowds that made the experience chaotic, illustrating the real‑world problem of uncontrolled traffic.

Just as a theme park needs to limit the number of visitors, online systems must enforce flow control to protect core services during traffic spikes such as Spring Festival train ticket rushes, Double‑11 shopping festivals, or flash sales.

🌴 What Is Rate Limiting

Definition: Rate limiting restricts the number of concurrent requests reaching a system, ensuring it can respond to a subset of users while rejecting excess traffic to maintain overall availability.

🌴 Counter Rate Limiting

This simple algorithm increments a counter for each incoming request and compares the value against a threshold; if the limit is exceeded, the request is rejected. It suffers from a “critical point” issue where bursts can temporarily exceed capacity.

🌴 Sliding Window Rate Limiting

To solve the counter’s precision problem, the sliding window divides the time interval into smaller slots (e.g., 1‑second windows within a 5‑second period) and sums the counts of the most recent slots, providing smoother and more accurate throttling.

🌴 Leaky Bucket Rate Limiting

A leaky bucket placed between the traffic producer and consumer buffers incoming requests; excess bursts are dropped, and the bucket releases requests at a fixed rate, smoothing traffic and preventing sudden overloads.

🌴 Token Bucket Rate Limiting

Tokens are added to a bucket at a constant rate; each request must consume a token before proceeding. If the bucket is empty, the request is delayed or rejected. This algorithm allows bursts while enforcing an average rate, and can be implemented with in‑memory counters, Redis, or libraries such as Guava’s RateLimiter.

In practice, choosing the right algorithm and configuring limits (static or dynamic) requires careful testing and monitoring to balance protection and user experience.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

rate limitingSliding WindowToken Bucketleaky bucketcounter algorithm
Full-Stack Internet Architecture
Written by

Full-Stack Internet Architecture

Introducing full-stack Internet architecture technologies centered on Java

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.