Backend Development 8 min read

Mastering Sliding Window Rate Limiting: Sentinel Implementation Explained

This article explains the sliding window rate‑limiting algorithm, its advantages over fixed windows, and provides a detailed step‑by‑step implementation using Sentinel’s data structures, including window partitioning, bucket management, concurrency handling, and calculation of request counts within a moving time interval.

Xingsheng Youxuan Technology Community

Aug 24, 2022

Mastering Sliding Window Rate Limiting: Sentinel Implementation Explained

1. Sliding Window Algorithm Introduction

Rate limiting algorithms are widely used to protect downstream services with weaker configurations from being overwhelmed by upstream traffic, applicable to both HTTP and RPC requests. Common algorithms include counter, token bucket, and leaky bucket.

The simple counter algorithm can limit an interface to, for example, 1000 requests per minute by tracking the request count within defined time intervals.

However, fixed‑window counting suffers from boundary issues: a burst of requests at the end of one minute and the start of the next can exceed the limit without triggering throttling.

To address this, sliding window, token bucket, or leaky bucket algorithms are used; this article focuses on the sliding window implementation, analogous to TCP’s sliding window.

Sliding Window Basics

The sliding window defines a time period (e.g., one minute) divided into N smaller sub‑windows (e.g., 10 sub‑windows of 6 seconds each). Each sub‑window records its own request count. As time progresses, the window slides, updating which sub‑windows are active, providing smoother and more accurate counting—the more sub‑windows, the higher the precision.

2. Sentinel Sliding Window Implementation

2.1 Determining the Target Window

Given the current timestamp, compute the array index (idx) by dividing the elapsed time by the sub‑window length and taking the modulus with the array size.

Compute the start time of the target window as timeMillis - timeMillis % windowLengthInMs. Retrieve the bucket at idx; if absent, create a new window using CAS to ensure thread safety. If the retrieved window’s start time differs from the computed start time, the window is either expired or the server clock has moved backward, requiring an update.

2.2 Window Update

If a window is expired (its start time is older than the current interval), reset its start time and count. If the server clock has moved backward (startTime < window.startTime), a new window is created but not stored, effectively discarding the stale data.

2.3 Calculating the Total Count

Iterate over all windows in the array, discard those whose currentTime - startTime exceeds the total interval, and sum the counts of the remaining valid windows to obtain the request total for the sliding interval.

3. Summary

The sliding window algorithm splits a time interval into multiple fixed‑size sub‑windows, records counts per sub‑window, and slides the window as time advances, offering smoother rate limiting. Sentinel implements this using a fixed‑length array of buckets, handling concurrency with CAS and updating expired windows as needed.

Designing a sliding window with a linked list is an open question, inviting comparison of trade‑offs with Sentinel’s array‑based approach.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Backend concurrency Sentinel rate limiting Sliding Window

Written by

Xingsheng Youxuan Technology Community

Xingsheng Youxuan Technology Official Account

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.