Mastering Sliding Window Rate Limiting in Sentinel-Go: From Theory to Code

This article explores why engineering skills outrank pure algorithm knowledge, explains the challenges of implementing flow‑control algorithms such as token bucket and sliding‑window rate limiting, and walks through Sentinel‑Go's concrete LeapArray implementation with detailed Go code examples.

Xiao Lou's Tech Notes

Algorithm vs. Engineering Implementation

In Sentinel‑Go, a core algorithm is the flow‑control (rate‑limiting) algorithm. While the concept of flow control is familiar, writing a correct implementation is tricky because the practical engineering details differ from the textbook algorithm.

The token‑bucket algorithm is easy to understand: a bucket receives tokens at a constant rate, discarding excess, and a request proceeds only if it can take a token. Implementing this naïvely would require a dedicated thread to refill the bucket, raising concerns about thread stability and failure.
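In practice the dedicated refill thread can be avoided entirely by refilling lazily: on each acquire, top up the bucket based on how much time has elapsed since the last call. The following is a minimal sketch of that idea, not Sentinel-Go's implementation:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// TokenBucket refills lazily on each call instead of using a refill goroutine.
type TokenBucket struct {
	mu         sync.Mutex
	capacity   float64
	tokens     float64
	ratePerSec float64
	last       time.Time
}

func NewTokenBucket(capacity, ratePerSec float64) *TokenBucket {
	return &TokenBucket{capacity: capacity, tokens: capacity, ratePerSec: ratePerSec, last: time.Now()}
}

// TryAcquire tops the bucket up based on elapsed time, then takes one token if available.
func (tb *TokenBucket) TryAcquire() bool {
	tb.mu.Lock()
	defer tb.mu.Unlock()
	now := time.Now()
	tb.tokens += now.Sub(tb.last).Seconds() * tb.ratePerSec
	if tb.tokens > tb.capacity {
		tb.tokens = tb.capacity // discard excess tokens
	}
	tb.last = now
	if tb.tokens >= 1 {
		tb.tokens--
		return true
	}
	return false
}

func main() {
	tb := NewTokenBucket(2, 10) // burst of 2, refill at 10 tokens/s
	fmt.Println(tb.TryAcquire(), tb.TryAcquire(), tb.TryAcquire()) // true true false: bucket drained
}
```

The trade-off is that each acquire does a little arithmetic, but no background thread can stall or die, which addresses exactly the stability concern above.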

Evolution of Time‑Window Strategies

Rate limiting is usually measured in QPS (queries per second). Closely related metrics are TPS (transactions per second) and RPS (requests per second), which are often used interchangeably in this context.

Concurrency‑Based Limiting

Concurrency is an instantaneous metric. By incrementing a global counter at the start of a request and decrementing it at the end, one can easily enforce a concurrency limit without complex code.
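A minimal sketch of such a limiter, using nothing but one atomic counter (this is an illustration of the idea, not Sentinel-Go's code):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// ConcurrencyLimiter caps in-flight requests with a single atomic counter.
type ConcurrencyLimiter struct {
	cur int64
	max int64
}

// Enter increments the counter; if the limit is exceeded it rolls back and rejects.
func (c *ConcurrencyLimiter) Enter() bool {
	if atomic.AddInt64(&c.cur, 1) > c.max {
		atomic.AddInt64(&c.cur, -1)
		return false
	}
	return true
}

// Exit must be called exactly once per successful Enter.
func (c *ConcurrencyLimiter) Exit() { atomic.AddInt64(&c.cur, -1) }

func main() {
	l := &ConcurrencyLimiter{max: 2}
	fmt.Println(l.Enter(), l.Enter(), l.Enter()) // true true false
	l.Exit()
	fmt.Println(l.Enter()) // true again once a request finishes
}
```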

Fixed Time Window

A naive fixed‑window approach would reset a counter every second using a separate thread, but this can lead to bursty traffic that overwhelms the system.
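The boundary burst is easy to demonstrate. The sketch below (a toy model, driven by an explicit timestamp rather than the wall clock) shows a 100 QPS fixed window admitting 200 requests within a 100 ms span straddling the window boundary:

```go
package main

import "fmt"

// FixedWindow derives the window id from the timestamp itself,
// so no separate reset thread is needed.
type FixedWindow struct {
	windowMs  uint64
	limit     int64
	curWindow uint64
	count     int64
}

// AllowAt takes an explicit timestamp so the boundary burst is easy to reproduce.
func (f *FixedWindow) AllowAt(nowMs uint64) bool {
	w := nowMs / f.windowMs
	if w != f.curWindow {
		f.curWindow, f.count = w, 0 // entered a new window: reset the counter
	}
	if f.count < f.limit {
		f.count++
		return true
	}
	return false
}

func main() {
	fw := &FixedWindow{windowMs: 1000, limit: 100}
	passed := 0
	for i := 0; i < 100; i++ { // 100 requests at t=950ms, end of window 0
		if fw.AllowAt(950) {
			passed++
		}
	}
	for i := 0; i < 100; i++ { // 100 more at t=1050ms, start of window 1
		if fw.AllowAt(1050) {
			passed++
		}
	}
	fmt.Println(passed) // 200: double the limit passes within 100 ms
}
```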

Sliding Time Window

To avoid the pitfalls of fixed windows, engineers use a sliding window that looks back from the current time instead of using rigid intervals.

Engineering the Sliding Window

The implementation divides time into small sampling windows and caches a series of these windows. When a request arrives, the system sums the counts of the relevant windows to obtain the total request count.
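Before looking at Sentinel-Go's real implementation, here is a stripped-down sketch of the same idea, assuming a simple ring of sample windows (all names here are illustrative, not Sentinel-Go's):

```go
package main

import "fmt"

type sampleBucket struct {
	start uint64 // start time of the sample window, in ms
	count int64
}

// slidingCounter divides the interval into small sample windows and sums them,
// so the total always reflects roughly the last intervalMs milliseconds.
type slidingCounter struct {
	bucketMs uint64
	buckets  []sampleBucket
}

func newSlidingCounter(intervalMs, sampleCount uint64) *slidingCounter {
	return &slidingCounter{
		bucketMs: intervalMs / sampleCount,
		buckets:  make([]sampleBucket, sampleCount),
	}
}

func (s *slidingCounter) add(nowMs uint64) {
	idx := (nowMs / s.bucketMs) % uint64(len(s.buckets))
	start := nowMs - nowMs%s.bucketMs
	if s.buckets[idx].start != start { // slot holds a stale window: reuse it
		s.buckets[idx] = sampleBucket{start: start}
	}
	s.buckets[idx].count++
}

func (s *slidingCounter) sum(nowMs uint64) int64 {
	oldest := nowMs - nowMs%s.bucketMs - s.bucketMs*uint64(len(s.buckets)-1)
	var total int64
	for _, b := range s.buckets {
		if b.start >= oldest { // only windows inside the sliding range count
			total += b.count
		}
	}
	return total
}

func main() {
	sc := newSlidingCounter(1000, 5) // 1 s window split into 200 ms samples
	sc.add(950)
	sc.add(1050)
	fmt.Println(sc.sum(1050)) // 2: both requests fall inside the last second
}
```

This single-threaded sketch sidesteps the hard part: making the "reuse a stale slot" step safe under concurrency, which is exactly what Sentinel-Go's LeapArray solves below.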

Sentinel‑Go Sliding Window Implementation

Heads-up: the rest of this article is code-heavy.

Sentinel‑Go builds its sliding window on top of LeapArray:

type LeapArray struct {
    bucketLengthInMs uint32                 // length of each bucket, in ms
    sampleCount      uint32                 // number of buckets
    intervalInMs     uint32                 // total window size, in ms
    array            *AtomicBucketWrapArray // bucket array
    updateLock       mutex                  // update lock
}

type AtomicBucketWrapArray struct {
    base   unsafe.Pointer // start address of the array
    length int // fixed length
    data   []*BucketWrap // actual bucket data
}

type BucketWrap struct {
    BucketStart uint64 // bucket start time
    Value       atomic.Value // bucket data, e.g., MetricBucket
}

type MetricBucket struct {
    counter        [base.MetricEventTotal]int64 // counters for different metrics
    minRt          int64 // minimum response time
    maxConcurrency int32 // maximum concurrency
}

Metrics are written to the appropriate bucket, with special handling for response time (RT) metrics.

// Example of adding a count
sn.AddCount(base.MetricEventPass, int64(count))

func (bla *BucketLeapArray) AddCount(event base.MetricEvent, count int64) {
    bla.addCountWithTime(util.CurrentTimeMillis(), event, count)
}

func (bla *BucketLeapArray) addCountWithTime(now uint64, event base.MetricEvent, count int64) {
    b := bla.currentBucketWithTime(now)
    if b == nil {
        return
    }
    b.Add(event, count)
}

func (mb *MetricBucket) Add(event base.MetricEvent, count int64) {
    if event >= base.MetricEventTotal || event < 0 {
        logging.Error(errors.Errorf("Unknown metric event: %v", event), "")
        return
    }
    if event == base.MetricEventRt {
        mb.AddRt(count)
        return
    }
    mb.addCount(event, count)
}

func (mb *MetricBucket) addCount(event base.MetricEvent, count int64) {
    atomic.AddInt64(&mb.counter[event], count)
}

The crucial part is locating the correct bucket based on the current timestamp:

func (bla *BucketLeapArray) currentBucketWithTime(now uint64) *MetricBucket {
    // ① Get bucket for current time
    curBucket, err := bla.data.currentBucketOfTime(now, bla)
    // ... type assertion and return
}

func (la *LeapArray) currentBucketOfTime(now uint64, bg BucketGenerator) (*BucketWrap, error) {
    // ② Compute index = (now / bucketLengthInMs) % la.array.length
    idx := la.calculateTimeIdx(now)
    // ③ Compute bucket start time = now - (now % bucketLengthInMs)
    bucketStart := calculateStartTime(now, la.bucketLengthInMs)
    for {
        old := la.array.get(idx)
        if old == nil { // ④ unused slot, create new bucket
            newWrap := &BucketWrap{BucketStart: bucketStart, Value: atomic.Value{}}
            newWrap.Value.Store(bg.NewEmptyBucket())
            if la.array.compareAndSet(idx, nil, newWrap) {
                return newWrap, nil
            }
            runtime.Gosched()
        } else if bucketStart == atomic.LoadUint64(&old.BucketStart) { // ⑤ current bucket
            return old, nil
        } else if bucketStart > atomic.LoadUint64(&old.BucketStart) { // ⑥ old bucket, reset
            if la.updateLock.TryLock() {
                old = bg.ResetBucketTo(old, bucketStart)
                la.updateLock.Unlock()
                return old, nil
            }
            runtime.Gosched()
        } else { // ⑦ newer bucket (rare case)
            if la.sampleCount == 1 {
                return old, nil
            }
            return nil, errors.New(fmt.Sprintf("Provided time timeMillis=%d is already behind old.BucketStart=%d.", bucketStart, old.BucketStart))
        }
    }
}
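The index and start-time formulas at ② and ③ can be checked with a small standalone sketch. The function names below mirror LeapArray's helpers, but these are simplified free functions written for illustration, not the library's methods:

```go
package main

import "fmt"

// calculateTimeIdx maps a timestamp to a slot in the circular bucket array:
// idx = (now / bucketLengthInMs) % arrayLength.
func calculateTimeIdx(nowMs, bucketLengthMs uint64, arrayLength int) int {
	return int((nowMs / bucketLengthMs) % uint64(arrayLength))
}

// calculateStartTime aligns a timestamp down to its bucket's start:
// start = now - (now % bucketLengthInMs).
func calculateStartTime(nowMs, bucketLengthMs uint64) uint64 {
	return nowMs - nowMs%bucketLengthMs
}

func main() {
	// Assume 200 ms buckets and 5 slots (a 1 s window).
	fmt.Println(calculateTimeIdx(3500, 200, 5)) // (3500/200)%5 = 17%5 = 2
	fmt.Println(calculateStartTime(3500, 200))  // 3500 - 3500%200 = 3400
}
```

Because the array is circular, the slot at index 2 may still hold a bucket from 1000 ms earlier; comparing the stored BucketStart against the computed start time (cases ⑤ and ⑥ above) is what distinguishes "current bucket" from "stale bucket that must be reset".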

To compute QPS, Sentinel‑Go aggregates counts from all buckets within the sliding window range:

qps := stat.InboundNode().GetQPS(base.MetricEventPass)

The method first determines the start and end timestamps of the window and then sums the counters of the buckets that fall inside this range.

func (m *SlidingWindowMetric) getBucketStartRange(timeMs uint64) (start, end uint64) {
    curBucketStartTime := calculateStartTime(timeMs, m.real.BucketLengthInMs())
    end = curBucketStartTime
    start = end - uint64(m.intervalInMs) + uint64(m.real.BucketLengthInMs())
    return
}

For example, with 200 ms buckets and a 1000 ms interval, if the current time is 3500 ms, the current bucket starts at 3400 ms and the window covers bucket start times from 2600 ms to 3400 ms (five buckets, one second in total); the system sums the counts of all buckets whose start times fall within this range.
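This aggregation step can be sketched as follows. The bucket counts here are made up for illustration, and `getBucketStartRange` is re-implemented as a free function with the same formula as above rather than Sentinel-Go's method:

```go
package main

import "fmt"

// getBucketStartRange returns the start times of the oldest and newest buckets
// inside the sliding window ending at timeMs.
func getBucketStartRange(timeMs, bucketLengthMs, intervalMs uint64) (start, end uint64) {
	end = timeMs - timeMs%bucketLengthMs
	start = end - intervalMs + bucketLengthMs
	return
}

func main() {
	// Assumed per-bucket pass counts, keyed by bucket start time (200 ms buckets).
	counts := map[uint64]int64{2400: 9, 2600: 3, 2800: 1, 3000: 4, 3200: 0, 3400: 2}
	start, end := getBucketStartRange(3500, 200, 1000)
	var total int64
	for bucketStart, c := range counts {
		if bucketStart >= start && bucketStart <= end { // bucket lies inside the window
			total += c
		}
	}
	fmt.Println(start, end, total) // 2600 3400 10: the 2400 ms bucket has slid out
}
```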

Conclusion

The article walks through the evolution from simple concurrency limiting to fixed and sliding time windows, and then dives deep into Sentinel‑Go's concrete implementation, highlighting challenges such as memory usage, concurrency control, and accurate bucket management.

While this piece concludes the discussion on sliding‑window rate limiting, the broader "Sentinel‑Go source code series" will continue with topics like object pools and other architectural patterns.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
