
How to Build Redis-Powered Rate Limiting Plugins for Higress Cloud‑Native Gateway

This article explains how to extend the Higress cloud‑native gateway with WebAssembly plugins written in Go, C++ or Rust that call Redis for global rate limiting, token‑based AI usage control, and cookie‑based caching, including full code samples, configuration steps, and test results.

Alibaba Cloud Native

Higress leverages the WebAssembly (Wasm) mechanism to provide highly extensible gateway functionality, allowing developers to write plugins in Go, C++, or Rust. Recent updates add Redis support, enabling stateful plugins that can implement custom request handling such as global rate limiting, token quotas, and cookie‑based caching.

Global Rate Limiting with Redis

The built‑in Sentinel throttling protects backend services, but using a Redis plugin allows multi‑gateway global quota management. The following Go code increments a per‑minute counter in Redis and returns HTTP 429 when the limit is exceeded.

func onHttpRequestHeaders(ctx wrapper.HttpContext, config RedisCallConfig, log wrapper.Log) types.Action {
    now := time.Now()
    minuteAligned := now.Truncate(time.Minute)
    timeStamp := strconv.FormatInt(minuteAligned.Unix(), 10)
    // Increment the counter; if Redis is unreachable, log and continue the request.
    err := config.client.Incr(timeStamp, func(response resp.Value) {
        if response.Error() != nil {
            log.Errorf("call redis error: %v", response.Error())
            proxywasm.ResumeHttpRequest()
        } else {
            ctx.SetContext("timeStamp", timeStamp)
            ctx.SetContext("callTimeLeft", strconv.Itoa(config.qpm-response.Integer()))
            if response.Integer() == 1 {
                // Set a 60‑second TTL on first hit.
                config.client.Expire(timeStamp, 60, func(r resp.Value) {
                    if r.Error() != nil {
                        log.Errorf("call redis error: %v", r.Error())
                    }
                    proxywasm.ResumeHttpRequest()
                })
            } else if response.Integer() > config.qpm {
                proxywasm.SendHttpResponse(429, [][2]string{{"timeStamp", timeStamp}, {"callTimeLeft", "0"}}, []byte("Too many requests\n"), -1)
            } else {
                proxywasm.ResumeHttpRequest()
            }
        }
    })
    if err != nil {
        log.Errorf("Error occurred while calling redis, it seems the redis cluster cannot be found.")
        return types.ActionContinue
    }
    return types.ActionPause
}

The plugin is configured via the Higress UI, and testing confirms that requests are throttled once the per-minute limit is exceeded.
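
The fixed-window logic in the handler above can be separated from the gateway SDK and exercised on its own. The following is a minimal sketch under stated assumptions, not the plugin's actual implementation: counterStore, memStore, windowKey, allow, and demoNow are hypothetical names introduced for illustration, and the in-memory map stands in for the Redis INCR and EXPIRE commands.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// counterStore is a hypothetical stand-in for the two Redis commands the
// handler relies on (INCR and EXPIRE). It is not part of the Higress SDK.
type counterStore interface {
	Incr(key string) int
	Expire(key string, seconds int)
}

// memStore is an in-memory counterStore used only for illustration.
// It records the requested TTL but does not actually expire keys.
type memStore struct {
	counts map[string]int
	ttl    map[string]int
}

func newMemStore() *memStore {
	return &memStore{counts: map[string]int{}, ttl: map[string]int{}}
}

func (m *memStore) Incr(key string) int {
	m.counts[key]++
	return m.counts[key]
}

func (m *memStore) Expire(key string, seconds int) { m.ttl[key] = seconds }

// windowKey returns the Unix timestamp of the start of the minute containing t,
// which is the same key the handler derives with Truncate(time.Minute).
func windowKey(t time.Time) string {
	return strconv.FormatInt(t.Truncate(time.Minute).Unix(), 10)
}

// allow increments the counter for the current window and reports whether the
// request is within the quota, along with the number of calls left.
func allow(s counterStore, now time.Time, qpm int) (ok bool, left int) {
	key := windowKey(now)
	n := s.Incr(key)
	if n == 1 {
		s.Expire(key, 60) // first hit in this window: set a 60-second TTL
	}
	if n > qpm {
		return false, 0
	}
	return true, qpm - n
}

// demoNow is an arbitrary fixed instant so the example is deterministic.
var demoNow = time.Unix(1700000030, 0)

func main() {
	s := newMemStore()
	for i := 1; i <= 4; i++ {
		ok, left := allow(s, demoNow, 3)
		fmt.Println(i, ok, left)
	}
}
```

With a quota of 3, the first three calls in the same minute are admitted and the fourth is rejected. Because every gateway instance increments the same key, the limit applies globally rather than per instance.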

Token Limiting for AI Services

For developers offering AI models, managing user token quotas is critical. After obtaining an API key for the Tongyi Qianwen service, the gateway is set up with appropriate routes. Before forwarding a request, the plugin reads the number of tokens already consumed in the current minute from Redis (in the request-body callback shown below) and rejects the request with HTTP 429 when that count exceeds the quota. When the response arrives, it adds the usage.total_tokens value reported by the model to the same counter.

func onHttpRequestBody(ctx wrapper.HttpContext, config TokenLimiterConfig, body []byte, log wrapper.Log) types.Action {
    now := time.Now()
    minuteAligned := now.Truncate(time.Minute)
    timeStamp := strconv.FormatInt(minuteAligned.Unix(), 10)
    config.client.Get(timeStamp, func(response resp.Value) {
        if response.Error() != nil {
            defer proxywasm.ResumeHttpRequest()
            log.Errorf("Error occurred while calling redis")
        } else {
            tokenUsed := response.Integer()
            if config.tpm < tokenUsed {
                proxywasm.SendHttpResponse(429, [][2]string{{"timeStamp", timeStamp}, {"TokenLeft", fmt.Sprint(config.tpm-tokenUsed)}}, []byte("No token left\n"), -1)
            } else {
                proxywasm.ResumeHttpRequest()
            }
        }
    })
    return types.ActionPause
}

func onHttpResponseBody(ctx wrapper.HttpContext, config TokenLimiterConfig, body []byte, log wrapper.Log) types.Action {
    now := time.Now()
    minuteAligned := now.Truncate(time.Minute)
    timeStamp := strconv.FormatInt(minuteAligned.Unix(), 10)
    tokens := int(gjson.ParseBytes(body).Get("usage").Get("total_tokens").Int())
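    config.client.IncrBy(timeStamp, tokens, func(r resp.Value) {
        if r.Error() != nil {
            defer proxywasm.ResumeHttpResponse()
            log.Errorf("Error occurred while calling redis")
        } else if r.Integer() == tokens {
            // First write in this window: set a 60-second TTL, then resume.
            config.client.Expire(timeStamp, 60, func(_ resp.Value) { defer proxywasm.ResumeHttpResponse() })
        } else {
            // The counter already existed; resume so the response is not left paused.
            proxywasm.ResumeHttpResponse()
        }
    })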
    return types.ActionPause
}

Testing confirms that the plugin blocks requests once the token limit is reached.
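
The accounting performed by the two callbacks can be sketched without the gateway or Redis. This is an illustrative sketch under assumptions: it uses the standard encoding/json package in place of gjson, and usageEnvelope, totalTokens, and tokenWindow are hypothetical names. The admit check mirrors the comparison in the handler above, which rejects only when the consumed count is greater than the quota.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usageEnvelope models only the part of the model response that the
// plugin reads: the usage.total_tokens field.
type usageEnvelope struct {
	Usage struct {
		TotalTokens int `json:"total_tokens"`
	} `json:"usage"`
}

// totalTokens extracts usage.total_tokens from a JSON response body.
// A body without that field yields 0; invalid JSON yields an error.
func totalTokens(body []byte) (int, error) {
	var e usageEnvelope
	if err := json.Unmarshal(body, &e); err != nil {
		return 0, err
	}
	return e.Usage.TotalTokens, nil
}

// tokenWindow is a hypothetical in-memory stand-in for the per-minute
// Redis counter: tpm is the quota, used is the running total.
type tokenWindow struct {
	tpm  int
	used int
}

// admit reports whether a new request may proceed. Like the handler,
// it rejects only when used is greater than tpm.
func (w *tokenWindow) admit() bool { return w.used <= w.tpm }

// record adds the tokens reported in a response body to the counter.
func (w *tokenWindow) record(body []byte) error {
	n, err := totalTokens(body)
	if err != nil {
		return err
	}
	w.used += n
	return nil
}

func main() {
	w := &tokenWindow{tpm: 100}
	fmt.Println("before:", w.admit())
	if err := w.record([]byte(`{"usage":{"total_tokens":120}}`)); err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("used:", w.used, "after:", w.admit())
}
```

Token usage is only known once the response has been produced, so a single large response can push the counter past the quota. The limit then takes effect on the next request in that window.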

Cookie‑Based Caching, Disaster Recovery, and Session Management

Beyond rate limiting, Redis can be used with cookies to cache responses, provide failover when backend services are unavailable, and store authentication data for session management. The following snippet shows how to read a cookie, query Redis, and continue processing based on the result.

func onHttpRequestHeaders(ctx wrapper.HttpContext, config HelloWorldConfig, log wrapper.Log) types.Action {
    cookieHeader, err := proxywasm.GetHttpRequestHeader("cookie")
    if err != nil {
        // Without a cookie there is nothing to look up, so let the request through.
        log.Errorf("error getting cookie header: %v", err)
        return types.ActionContinue
    }
    cookie := CookieHandler(cookieHeader)
    config.client.Get(cookie, func(response resp.Value) {
        if response.Error() != nil {
            log.Errorf("Error occurred while calling redis")
            proxywasm.ResumeHttpRequest()
        } else {
            // Custom business logic based on cached data
            proxywasm.ResumeHttpRequest()
        }
    })
    return types.ActionPause
}
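
The snippet above calls CookieHandler without showing its body. One possible shape for such a helper is sketched below, assuming the Redis key is the value of a single named cookie. The function name cookieValue and the cookie name session_id are hypothetical and do not come from the original plugin.

```go
package main

import (
	"fmt"
	"strings"
)

// cookieValue returns the value of the named cookie from a raw Cookie
// request header such as "a=1; b=2". The boolean is false when the
// cookie is absent.
func cookieValue(header, name string) (string, bool) {
	for _, part := range strings.Split(header, ";") {
		// Each pair looks like " key=value"; trim the space left by "; ".
		k, v, found := strings.Cut(strings.TrimSpace(part), "=")
		if found && k == name {
			return v, true
		}
	}
	return "", false
}

func main() {
	v, ok := cookieValue("theme=dark; session_id=abc123", "session_id")
	fmt.Println(v, ok) // prints "abc123 true"
}
```

Returning a boolean alongside the value lets the caller skip the Redis lookup entirely when the cookie is missing, instead of querying with an empty key.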

Conclusion

By adding Redis call support, Higress plugins gain powerful stateful capabilities, enabling developers to implement global rate limiting, token‑based AI usage control, and cookie‑driven caching or session handling. The expanded functionality opens many possibilities for customized gateway behavior.
