Implementing Per‑User Rate Limiting with Alibaba Cloud Service Mesh (ASM) Traffic Scheduling Suite
This article explains how to use the Alibaba Cloud Service Mesh (ASM) traffic‑scheduling suite to implement traffic‑control scenarios such as per‑user rate limiting, request queuing, and priority scheduling in a Kubernetes environment. It walks through the deployment, configuration, and verification of per‑user rate limiting step by step.
In distributed systems, protecting and scheduling traffic is essential for stability. Common mechanisms include rate limiting, concurrency limits, request queuing, priority scheduling, tenant‑level limiting, and circuit breaking. Traditional middleware is often tightly coupled to business logic, whereas a service mesh offers transparent, non‑intrusive traffic management.
Alibaba Cloud Service Mesh (ASM) is a fully managed, Istio‑compatible mesh that adds a traffic‑scheduling suite capable of unified load dispatch, per‑user limiting, queuing, and other advanced policies without modifying application code.
Step 1 – Deploy demo services
Two sample services (httpbin and sleep) are deployed to demonstrate per‑user limiting. The following manifests are applied:
kubectl apply -f- <<EOF
##################################################################################################
# httpbin Service example.
##################################################################################################
apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/acs/httpbin:latest
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80
EOF
kubectl apply -f- <<EOF
##################################################################################################
# Sleep Service example.
##################################################################################################
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sleep
---
apiVersion: v1
kind: Service
metadata:
  name: sleep
  labels:
    app: sleep
    service: sleep
spec:
  ports:
  - port: 80
    name: http
  selector:
    app: sleep
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sleep
spec:
  replicas: 1
  selector:
    matchLabels:
      app: sleep
  template:
    metadata:
      labels:
        app: sleep
    spec:
      terminationGracePeriodSeconds: 0
      serviceAccountName: sleep
      containers:
      - name: sleep
        image: registry.cn-hangzhou.aliyuncs.com/acs/curl:8.1.2
        command: ["/bin/sleep", "infinity"]
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - mountPath: /etc/sleep/tls
          name: secret-volume
      volumes:
      - name: secret-volume
        secret:
          secretName: sleep-secret
          optional: true
EOF
After deployment, verify connectivity:
kubectl exec -it deploy/sleep -- curl -I http://httpbin:8000/headers
An HTTP 200 response confirms that the services are reachable.
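For scripted checks it can be easier to capture only the status code rather than read the response headers. The following is a minimal sketch of that idea, not part of ASM or kubectl: `classify_status` and `probe_httpbin` are hypothetical helper names, and the sketch assumes kubectl points at the cluster where the demo services were deployed. The curl options used (`-s`, `-o /dev/null`, `-w '%{http_code}'`) are standard curl flags.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status code to a short verdict.
classify_status() {
  case "$1" in
    200) echo "reachable" ;;
    429) echo "rate-limited" ;;
    *)   echo "unexpected: $1" ;;
  esac
}

# Hypothetical helper: probe httpbin from the sleep pod and print only the
# HTTP status code. Assumes the demo services from this step are running.
probe_httpbin() {
  kubectl exec deploy/sleep -- \
    curl -s -o /dev/null -w '%{http_code}' http://httpbin:8000/headers
}

# Usage (requires the cluster): classify_status "$(probe_httpbin)"
```

The functions are only defined here, so sourcing the file has no side effects; the probe runs only when called explicitly.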
Step 2 – Enable ASM traffic‑scheduling suite
Ensure the ASM instance version is ≥ 1.21 and the Kubernetes cluster is attached. Then patch the mesh configuration to enable the adaptive scheduler:
kubectl patch asmmeshconfig default --type=merge --patch='{"spec":{"adaptiveSchedulerConfiguration":{"enabled":true,"schedulerScopes":[{"namespace":"default"}]}}}'
Step 3 – Create a RateLimitingPolicy for per‑user limiting
The policy uses a token‑bucket algorithm and limits traffic based on the http.request.header.user_id label, providing independent token buckets per user.
kubectl apply -f- <<EOF
apiVersion: istio.alibabacloud.com/v1
kind: RateLimitingPolicy
metadata:
  name: ratelimit
  namespace: istio-system
spec:
  rate_limiter:
    bucket_capacity: 2
    fill_amount: 2
    parameters:
      interval: 30s
      limit_by_label_key: http.request.header.user_id
    selectors:
    - agent_group: default
      control_point: ingress
      service: httpbin.default.svc.cluster.local
EOF
Key fields:
| Field | Description |
| --- | --- |
| fill_amount | Number of tokens added each interval (2 tokens every 30 seconds in the example). |
| interval | Time period for token replenishment (30 s). |
| bucket_capacity | Maximum number of tokens the bucket can hold; setting it equal to fill_amount disables burst traffic. |
| limit_by_label_key | Label key used to separate token buckets; here the user_id request header, so each user gets its own bucket. |
| selectors | Target services for the policy (here httpbin.default.svc.cluster.local). |
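As a quick sanity check on these numbers, the sustained per‑user rate implied by a token bucket is fill_amount divided by interval. The sketch below does that arithmetic for the example policy; `sustained_per_minute` is a hypothetical helper name used only for illustration, and it uses integer arithmetic, so it suits values that divide evenly.

```shell
#!/bin/sh
# Sustained per-user rate implied by a token bucket:
#   requests per minute = fill_amount * 60 / interval_seconds
# Hypothetical helper; integer arithmetic only.
sustained_per_minute() {
  fill_amount=$1
  interval_seconds=$2
  echo $(( fill_amount * 60 / interval_seconds ))
}

# Values from the example policy: fill_amount=2, interval=30s.
sustained_per_minute 2 30   # prints 4
```

With the example values, each user_id can sustain roughly four requests per minute over the long run.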
Step 4 – Verify per‑user limiting
Execute the following commands from the sleep pod:
curl -H "user_id: user1" http://httpbin:8000/headers -v
curl -H "user_id: user1" http://httpbin:8000/headers -v
The second request returns HTTP 429 Too Many Requests, confirming the limit for user1. A request from a different user within the same interval succeeds:
curl -H "user_id: user2" http://httpbin:8000/headers -v
The response is HTTP 200, demonstrating that each user has an isolated token bucket.
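To repeat this check with more requests, the status codes can be tallied per user instead of read from verbose output. The sketch below is one possible way to do that; `send_as_user` and `tally_codes` are hypothetical helper names, and it assumes that awk is available in the shell where it runs (for example, inside the sleep pod).

```shell
#!/bin/sh
# Hypothetical helper: read one HTTP status code per line on stdin and
# print how many requests were allowed (200) and how many were limited (429).
tally_codes() {
  awk '
    $1 == 200 { ok++ }
    $1 == 429 { limited++ }
    END { printf "allowed=%d limited=%d\n", ok, limited }
  '
}

# Hypothetical helper: send COUNT requests with the given user_id header,
# printing one status code per line. Intended to run inside the sleep pod.
send_as_user() {
  user=$1
  count=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' \
      -H "user_id: $user" http://httpbin:8000/headers
    i=$((i + 1))
  done
}

# Usage inside the sleep pod:
#   send_as_user user1 3 | tally_codes
#   send_as_user user2 1 | tally_codes
```

Running the two usage lines back to back within one interval should show limited requests for user1 while user2 is still allowed, if the policy behaves as described above.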
Observability
Each ASM traffic‑scheduling policy emits metrics that can be visualized in Grafana dashboards, enabling monitoring of rate‑limit hits, queue lengths, and other events. Integration details are documented in the official ASM guides.
Conclusion
While native Istio limits (global rate limiting, circuit breaking) may not cover complex scenarios, ASM’s traffic‑scheduling suite extends capabilities to include request priority, queuing, concurrency control, per‑user limiting, and progressive rollouts, providing a non‑intrusive foundation for building highly available cloud‑native microservice systems.
