Master Rate Limiting: Counter, Sliding Window, Leaky & Token Bucket Algorithms

This article introduces common rate‑limiting algorithms—counter, sliding‑window, leaky bucket, and token bucket—explains their principles, compares their advantages and drawbacks, and provides Java code examples, including how Sentinel 1.8.0 implements each technique using structures such as LeapArray and WarmUpController.

Ops Development Stories
Ops Development Stories
Ops Development Stories
Master Rate Limiting: Counter, Sliding Window, Leaky & Token Bucket Algorithms

This article explains several common rate‑limiting algorithms—counter, sliding‑window, leaky bucket, and token bucket—and shows how Sentinel 1.8.0 applies them in its source code.

Counter Rate Limiting Algorithm

By using a simple counter we can limit the number of requests accepted per second. For example, with a QPS of 1000 we start a timer on the first request; each subsequent request within the same second increments the counter, and once the counter reaches 1000 all further requests are rejected until the second ends and the counter resets to zero.

Advantages: simple to implement.

Disadvantages: if the first half‑second already consumes the limit, the second half‑second will reject all requests, a phenomenon known as the “burst” or “spike” problem.

public class Counter {
    public long timeStamp = getNowTime();
    public int reqCount = 0;
    public final int limit = 100; // maximum requests in the time window
    public final long interval = 1000; // time window in ms

    public boolean limit() {
        long now = getNowTime();
        if (now < timeStamp + interval) {
            // within the time window
            reqCount++;
            // check if the request count exceeds the limit
            return reqCount <= limit;
        } else {
            timeStamp = now;
            // reset after timeout
            reqCount = 1;
            return true;
        }
    }

    public long getNowTime() {
        return System.currentTimeMillis();
    }
}

Sliding Window Algorithm

The sliding (or rolling) window refines the counter approach by dividing the time window into multiple smaller slots, allowing a smoother count of requests. Each slot has its own counter, and the total requests in the current window are the sum of all slot counters.

By increasing the number of slots, the sliding window becomes smoother and the rate‑limiting statistics become more precise, at the cost of additional memory.

public class SlideWindow {
    private static volatile Map<String, List<Long>> MAP = new ConcurrentHashMap<>();

    private SlideWindow() {}

    public static void main(String[] args) throws InterruptedException {
        while (true) {
            System.out.println(LocalTime.now() + " " + isGo("ListId", 2, 10000L));
            Thread.sleep(1000 * new Random().nextInt(10));
        }
    }

    /**
     * Sliding window rate‑limiting algorithm.
     * @param listId   queue identifier
     * @param count    maximum allowed requests in the window
     * @param timeWindow window size in ms
     * @return true if the request is allowed
     */
    public static synchronized boolean isGo(String listId, int count, long timeWindow) {
        long nowTime = System.currentTimeMillis();
        List<Long> list = MAP.computeIfAbsent(listId, k -> new LinkedList<>());
        if (list.size() < count) {
            list.add(0, nowTime);
            return true;
        }
        Long farTime = list.get(count - 1);
        if (nowTime - farTime <= timeWindow) {
            return false;
        } else {
            list.remove(count - 1);
            list.add(0, nowTime);
            return true;
        }
    }
}

Sentinel implements the sliding window using a LeapArray structure. The core method to obtain the current bucket is shown below.

public WindowWrap<T> currentWindow(long timeMillis) {
    if (timeMillis < 0) {
        return null;
    }
    int idx = calculateTimeIdx(timeMillis);
    long windowStart = calculateWindowStart(timeMillis);
    while (true) {
        WindowWrap<T> old = array.get(idx);
        if (old == null) {
            WindowWrap<T> window = new WindowWrap<>(windowLengthInMs, windowStart, newEmptyBucket(timeMillis));
            if (array.compareAndSet(idx, null, window)) {
                return window;
            } else {
                Thread.yield();
            }
        } else if (windowStart == old.windowStart()) {
            return old;
        } else if (windowStart > old.windowStart()) {
            if (updateLock.tryLock()) {
                try {
                    return resetWindowTo(old, windowStart);
                } finally {
                    updateLock.unlock();
                }
            } else {
                Thread.yield();
            }
        } else { // windowStart < old.windowStart()
            return new WindowWrap<>(windowLengthInMs, windowStart, newEmptyBucket(timeMillis));
        }
    }
}

Leaky Bucket Algorithm

The leaky bucket algorithm is used for traffic shaping and rate limiting. It controls the rate at which data enters the network by allowing a steady outflow, smoothing burst traffic.

public class LeakyBucket {
    public long timeStamp = System.currentTimeMillis(); // current time
    public long capacity; // bucket capacity
    public long rate; // leak rate
    public long water; // current water amount (accumulated requests)

    public boolean grant() {
        long now = System.currentTimeMillis();
        // leak water based on elapsed time
        water = Math.max(0, water - (now - timeStamp) * rate);
        timeStamp = now;
        if ((water + 1) < capacity) {
            water += 1; // add water (request)
            return true;
        } else {
            return false; // bucket full, reject
        }
    }
}

In Sentinel, the leaky bucket logic is implemented in RateLimiterController. The core method decides whether a request can pass based on calculated wait times and queue limits.

public boolean canPass(Node node, int acquireCount, boolean prioritized) {
    if (acquireCount <= 0) {
        return true;
    }
    if (count <= 0) {
        return false;
    }
    long currentTime = TimeUtil.currentTimeMillis();
    long costTime = Math.round(1.0 * acquireCount / count * 1000);
    long expectedTime = costTime + latestPassedTime.get();
    if (expectedTime <= currentTime) {
        latestPassedTime.set(currentTime);
        return true;
    } else {
        long waitTime = costTime + latestPassedTime.get() - TimeUtil.currentTimeMillis();
        if (waitTime > maxQueueingTimeMs) {
            return false;
        } else {
            long oldTime = latestPassedTime.addAndGet(costTime);
            try {
                waitTime = oldTime - TimeUtil.currentTimeMillis();
                if (waitTime > maxQueueingTimeMs) {
                    latestPassedTime.addAndGet(-costTime);
                    return false;
                }
                if (waitTime > 0) {
                    Thread.sleep(waitTime);
                }
                return true;
            } catch (InterruptedException e) {
                // ignore
            }
        }
    }
    return false;
}

Token Bucket Algorithm

The token bucket algorithm is widely used for traffic shaping and rate limiting. Tokens are added to the bucket at a fixed rate; each request consumes a token. If tokens are unavailable, the request is rejected or delayed, allowing bursts up to the bucket capacity.

public class TokenBucket {
    public long timeStamp = System.currentTimeMillis(); // current time
    public long capacity; // bucket capacity
    public long rate; // token addition rate
    public long tokens; // current token count

    public boolean grant() {
        long now = System.currentTimeMillis();
        // add tokens based on elapsed time
        tokens = Math.min(capacity, tokens + (now - timeStamp) * rate);
        timeStamp = now;
        if (tokens < 1) {
            return false; // not enough tokens
        } else {
            tokens -= 1; // consume a token
            return true;
        }
    }
}

Sentinel uses the token bucket in WarmUpController to implement system warm‑up, controlling request flow during the warm‑up period.

public boolean canPass(Node node, int acquireCount, boolean prioritized) {
    long passQps = (long) node.passQps();
    long previousQps = (long) node.previousPassQps();
    syncToken(previousQps);
    long restToken = storedTokens.get();
    if (restToken >= warningToken) {
        long aboveToken = restToken - warningToken;
        double warningQps = Math.nextUp(1.0 / (aboveToken * slope + 1.0 / count));
        if (passQps + acquireCount <= warningQps) {
            return true;
        }
    } else {
        if (passQps + acquireCount <= count) {
            return true;
        }
    }
    return false;
}

Counter vs Sliding Window

The sliding window is essentially a refined counter that divides the time window into multiple slots, providing smoother statistics at the cost of more memory.

Leaky Bucket vs Token Bucket

Both aim to shape traffic, but they limit opposite directions: leaky bucket controls the outflow rate (fixed), while token bucket controls the inflow rate and permits bursts up to the bucket capacity.

Leaky bucket may under‑utilize network resources in good conditions, whereas token bucket can fully exploit available bandwidth while still limiting average rate.

References

https://www.cnblogs.com/linjiqin/p/9707713.html

https://www.cnblogs.com/dijia478/p/13807826.html

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

javaalgorithmsentinelrate limiting
Ops Development Stories
Written by

Ops Development Stories

Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.