How to Build a Redis‑Powered Rate Limiter Using Token Bucket and Lua
This article explains the design of a Redis‑based rate‑limiting system, covering core concepts, token‑bucket and window‑based algorithms, Java and Lua implementations, engineering choices, performance benchmarks, and practical lessons learned.
1. Concept
Rate limiting controls inbound and outbound traffic to protect scarce resources (e.g., write APIs, high‑traffic read interfaces). Two core notions are:
Resource : the entity whose access is limited.
Strategy : a limiting algorithm together with configurable parameters.
Break‑circuit strategy – the handling of requests that exceed the configured threshold (author‑defined term).
2. Rate‑Limiting Algorithms
2.1 Instant Concurrency Limit
Limits the number of requests processed simultaneously.
Advantages: directly caps concurrent requests.
Disadvantages: mainly useful for inbound traffic.
AtomicInteger atomic = new AtomicInteger(1);
try {
if (atomic.incrementAndGet() > limit) {
// circuit‑break logic
} else {
// processing logic
}
} finally {
atomic.decrementAndGet();
}2.2 Window‑Based Maximum Request Count
Limits the total number of requests that can occur within a fixed time window.
Advantages: maps directly to QPS (QPS = request count / window duration).
Disadvantages: traffic may burst within the window.
-- Lua script for window‑based limiting
local key = KEYS[1]
local max_window_concurrency = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local curr_window_concurrency = tonumber(redis.call('get', key) or 0)
if curr_window_concurrency + 1 > max_window_concurrency then
return false
else
redis.call('INCRBY', key, 1)
if window > -1 then
redis.call('expire', key, window)
end
return true
end2.3 Token Bucket
If the configured average rate is r , a token is added every 1/r seconds.
The bucket can hold at most b tokens; excess tokens are discarded.
Incoming requests consume v tokens; if a token is available the request passes, otherwise it is rejected or triggers a circuit‑break.
Properties:
Long‑term traffic stabilises to r .
The bucket absorbs short‑term bursts.
Maximum possible transmission rate M (bytes/s) satisfies M > r . The maximum burst duration is Tmax = b / (M‑r), and the total burst volume is Bmax = Tmax * M.
Advantages: smooth traffic flow and burst tolerance.
3. Engineering Implementation
3.1 Technology Stack
MySQL – stores metadata of rate‑limiting policies.
Redis + Lua – executes the token‑bucket algorithm.
Redis serves as both cache and computation medium; metadata resides in MySQL while bucket state lives in Redis.
3.2 Architecture Overview
3.3 Data Structure (Redis Hash)
name : unique identifier of the token bucket.
apps : list of applications permitted to use the bucket.
max_permits : maximum number of tokens the bucket can hold.
rate : token addition rate (tokens per second).
created_by , updated_by : audit fields.
The apps field enables per‑application management and simplifies troubleshooting.
4. Code Issues and Solutions
4.1 Dedicated Token‑Adder Thread
One thread per bucket creates resource overhead.
High token‑addition rates (e.g., r > 1000) cannot be reliably achieved with Java thread scheduling.
4.2 Trigger‑Based Token Addition (Guava‑Inspired)
Tokens are added lazily when a request attempts to acquire them. The elapsed time since the last acquisition determines how many tokens to replenish.
-- Lua function to acquire tokens (trigger‑based)
local function acquire(key, permits, curr_mill_second, context)
local info = redis.pcall('HMGET', key, 'last_mill_second', 'curr_permits', 'max_permits', 'rate', 'apps')
local last = info[1]
local curr = tonumber(info[2])
local max = tonumber(info[3])
local rate = info[4]
local apps = info[5]
if type(apps) == 'boolean' or apps == nil or not contains(apps, context) then
return 0 -- bucket not configured for this app
end
local available = max
if type(last) ~= 'boolean' and last ~= false and last ~= nil then
local added = math.floor(((curr_mill_second - last) / 1000) * rate)
if added > 0 then
available = math.min(curr + added, max)
redis.pcall('HSET', key, 'last_mill_second', curr_mill_second)
end
else
redis.pcall('HSET', key, 'last_mill_second', curr_mill_second)
end
if available - permits >= 0 then
redis.pcall('HSET', key, 'curr_permits', available - permits)
return 1 -- success
else
redis.pcall('HSET', key, 'curr_permits', available)
return -1 -- not enough tokens
end
endFull source code is available at the following repository URL:
https://github.com/wukq/rate-limiter
4.3 Management UI
A unified web interface allows administrators to manage rate‑limit configurations per application, assign permission levels, and edit policies.
5. Performance Test
Environment: AWS Elasticache Redis, 2 CPU, 4 GB RAM.
Single‑thread token acquisition: ~250 req/s.
10 concurrent threads: ~2 000 req/s.
100 concurrent threads: ~5 000 req/s.
6. Conclusion and Limitations
The rate‑limiting system is simple, effective, and integrates with existing permission models. Key limitations include:
Redis is deployed as a single instance without high‑availability clustering.
Enforcement is currently at the application layer only; no API‑gateway integration.
Circuit‑break policies must be defined manually; reusable templates would improve usability.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
