
Mastering Rate Limiting: Concepts, Algorithms, and Real-World Implementations

This article explains the fundamental concepts of rate limiting, including time and resource dimensions, various rule types such as QPS, connection count, bandwidth, black/white lists, and distributed considerations, then details common algorithms like token bucket, leaky bucket, sliding window, and practical implementations using Nginx, Guava, Redis, and Sentinel.

MaGe Linux Operations

Mar 27, 2023


Basic Concepts of Rate Limiting

For typical rate‑limiting scenarios there are two dimensions of information:

Time: The limit applies over a time window, e.g., per second or per minute.

Resource: The limit caps an available resource, such as the maximum request count or the maximum number of concurrent connections.

Combining these, rate limiting restricts resource access within a time window, e.g., at most 100 requests per second. In practice multiple rules are applied simultaneously, including:

QPS and Connection Count Control

Limits can be set at both the IP level and the server level. Real-world deployments often combine several dimensions, such as limiting each IP to fewer than 10 QPS and fewer than 5 connections and each machine to at most 1000 QPS and 200 connections, with further, higher limits for a server group or an entire data center.
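A layered check of this kind could be sketched as follows. This is a minimal single-process illustration, assuming a simple fixed-window counter; the names `FixedWindowLimiter` and `allow_request` and the `"machine"` key are hypothetical, and the limits passed in would come from figures like the ones above.

```python
import time


class FixedWindowLimiter:
    """Counts requests per key within fixed windows (illustrative only)."""

    def __init__(self, limit, window=1.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.state = {}  # key -> (window index, requests counted in that window)

    def _current(self, key):
        index = int(self.clock() // self.window)
        stored_index, count = self.state.get(key, (index, 0))
        if stored_index != index:
            count = 0  # a new window has started, so the count resets
        return index, count

    def would_allow(self, key):
        return self._current(key)[1] < self.limit

    def record(self, key):
        index, count = self._current(key)
        self.state[key] = (index, count + 1)


def allow_request(ip, ip_limiter, machine_limiter):
    """Admit a request only if every dimension still has room."""
    if not ip_limiter.would_allow(ip):
        return False
    if not machine_limiter.would_allow("machine"):
        return False
    # Record the request only after all dimensions agreed to admit it.
    ip_limiter.record(ip)
    machine_limiter.record("machine")
    return True
```

Checking every dimension before recording anything keeps a request that is rejected by one rule from consuming quota under another.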

Bandwidth Control

Bandwidth throttling can differentiate users by group or tag, e.g., regular users download at 100 KB/s while premium members download at 10 MB/s.

Black‑ and White‑List

Blacklists block abusive IPs identified as bots or attackers; whitelists grant unrestricted access to trusted accounts such as large sellers.
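How the two lists interact with the normal limiter can be shown in a few lines. The function name `classify` and the choice to check the blacklist before the whitelist are assumptions made for illustration, not something the text prescribes.

```python
def classify(ip, account, blacklist_ips, whitelist_accounts):
    """Return 'block', 'bypass', or 'limit' for an incoming request."""
    if ip in blacklist_ips:
        return "block"   # known abusive source: reject outright
    if account in whitelist_accounts:
        return "bypass"  # trusted account: skip rate limiting
    return "limit"       # everyone else goes through the normal limiter
```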

Distributed Environment

In a distributed setup the whole cluster is treated as a single entity: a limit applies to the cluster's combined traffic, not to each node separately. Rate-limit state should therefore be stored centrally, typically through gateway-level limiting, middleware limiting (e.g., Redis), or Sentinel in the Spring Cloud ecosystem.

Common Rate‑Limiting Algorithms

Token Bucket

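The token-bucket algorithm has two key components: tokens and a bucket. Tokens are generated at a fixed rate and placed into a bucket with a fixed capacity; once the bucket is full, newly generated tokens are discarded. A request can proceed only if it obtains a token. A request that finds the bucket empty is either dropped or, if an optional buffering queue is configured, held until new tokens appear. Because unused tokens accumulate up to the bucket capacity, short bursts are allowed while the long-run rate stays bounded by the token generation rate.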
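A minimal in-process sketch of the token bucket is shown below. It refills lazily on each call instead of running a background timer, and it simply rejects a request when no token is available (no buffering queue). The class and parameter names are illustrative, and the injectable `clock` exists only to make the behavior easy to test.

```python
import time


class TokenBucket:
    """Token bucket: tokens refill at `rate` per second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity  # start full so an initial burst is allowed
        self.last_refill = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.last_refill
        # Add the tokens earned since the last call; overflow is discarded.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last_refill = now

    def try_acquire(self, tokens=1):
        """Take `tokens` if available and return True; otherwise return False."""
        self._refill()
        if self.tokens >= tokens:
            self.tokens -= tokens
            return True
        return False
```

For example, `TokenBucket(rate=100, capacity=20)` would admit a burst of up to 20 requests at once and then settle at roughly 100 requests per second.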

Leaky Bucket

The leaky-bucket algorithm stores incoming requests in a bucket of fixed capacity and drains them at a constant rate, so the output stays steady no matter how bursty the input is. Requests that arrive while the bucket is full overflow and are dropped.
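The queue form of the leaky bucket might be sketched like this. It assumes a worker calls `drain()` on a timer at least as often as the leak interval; the class and method names are hypothetical, and a production version would also need locking for concurrent callers.

```python
import time
from collections import deque


class LeakyBucket:
    """Leaky bucket as a bounded queue drained at a constant rate."""

    def __init__(self, leak_rate, capacity, clock=time.monotonic):
        self.leak_rate = leak_rate  # requests released per second
        self.capacity = capacity    # maximum number of queued requests
        self.clock = clock
        self.queue = deque()
        self.last_leak = clock()

    def offer(self, request):
        """Queue a request; return False if the bucket is full (request dropped)."""
        if len(self.queue) >= self.capacity:
            return False
        if not self.queue:
            # An idle bucket earns no credit: the leak timer restarts on arrival.
            self.last_leak = self.clock()
        self.queue.append(request)
        return True

    def drain(self):
        """Return the requests that are due to leave at the constant leak rate."""
        now = self.clock()
        due = int((now - self.last_leak) * self.leak_rate)
        released = []
        for _ in range(min(due, len(self.queue))):
            released.append(self.queue.popleft())
        if due:
            # Advance by whole leak intervals so fractional time is not lost.
            self.last_leak += due / self.leak_rate
        return released
```

Unlike the token bucket, this shape never lets a burst through: however quickly requests arrive, they leave at most `leak_rate` per second.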

Sliding Window

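A sliding-window counter is a counting algorithm that aggregates request counts over a moving time interval instead of a fixed one, which reduces the boundary burst that a fixed window permits. The longer the window, the smoother the throttling.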
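One way to sketch a sliding window in process is to keep a log of timestamps. Note that this is the exact log variant, which stores one timestamp per admitted request; a slotted counter would trade some precision for lower memory. The class name and parameters are illustrative.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Admits at most `limit` requests in any trailing `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.timestamps = deque()  # admission times, oldest first

    def allow(self):
        now = self.clock()
        # Evict timestamps that have slid out of the window (now - window, now].
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```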

Typical Rate‑Limiting Solutions

Legality Verification

Measures such as CAPTCHAs and IP blacklists verify that a request is legitimate, guarding against malicious attacks and crawlers.

Guava RateLimiter

Guava's RateLimiter provides in-process rate limiting within a single JVM, so it is not suitable for enforcing a limit across a distributed system.

Gateway‑Level Limiting

Implemented via Nginx, Spring Cloud Gateway, or Zuul. Nginx offers two methods: rate control (limit_req_zone together with limit_req, whose burst parameter absorbs short spikes) and concurrent connection control (limit_conn_zone together with limit_conn).

Middleware Limiting

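Redis can store counters with an expiration time, and because Redis runs a Lua script atomically, a script can check and update a counter in one step so that every node in the cluster enforces the same limit.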
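The counter-with-expiration idea could look roughly like the sketch below. It assumes `client` is a redis-py `redis.Redis` instance (or anything exposing the same `incr` and `expire` methods); the function name and key format are hypothetical.

```python
def allow_fixed_window(client, key, limit, window_seconds):
    """Count a request in Redis and report whether it is within the limit."""
    count = client.incr(key)  # atomic increment; a missing key starts at 1
    if count == 1:
        # The first hit in a window starts the expiry timer for that window.
        client.expire(key, window_seconds)
    return count <= limit
```

INCR and EXPIRE are sent here as two separate commands, so a client that crashes between them leaves a counter with no expiry. Moving both steps into one Lua script is what closes that gap.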

Dedicated Components

Open‑source solutions such as Sentinel offer rich APIs and a visual console for managing limits.

Architectural Considerations

Real projects combine multiple limiting techniques at different layers to maximize resource utilization while maintaining high availability.

Practical Implementation Examples

Tomcat: configure maxThreads to cap concurrent requests.

Nginx: use limit_req_zone with limit_req and its burst parameter for rate limiting.

Nginx: use limit_conn_zone and limit_conn for concurrent connection limits.

Redis sorted sets for sliding‑window counters.

Redis‑Cell for leaky‑bucket implementation.

Guava’s RateLimiter for token‑bucket style limiting.

Note: Redis-based limits work across a distributed system, whereas Guava's RateLimiter applies only within a single JVM.
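As an illustration of the Redis sorted-set approach listed above, the sketch below stores one member per admitted request, scored by its timestamp. It assumes `client` is a redis-py `redis.Redis` instance; the function name, key, and the `now` parameter (present only for testing) are hypothetical. The commands run one by one here, so concurrent callers could slightly overshoot the limit unless the sequence is wrapped in a Lua script or transaction.

```python
import time
import uuid


def allow_sliding_window(client, key, limit, window_seconds, now=None):
    """Admit a request if fewer than `limit` were admitted in the trailing window."""
    now = time.time() if now is None else now
    # Drop members whose score (timestamp) has fallen out of the window.
    client.zremrangebyscore(key, 0, now - window_seconds)
    if client.zcard(key) >= limit:
        return False
    # A unique member name keeps requests with identical timestamps distinct.
    client.zadd(key, {uuid.uuid4().hex: now})
    # Let keys for idle callers expire instead of lingering forever.
    client.expire(key, int(window_seconds) + 1)
    return True
```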


