Build a Sidecarless AI Application with Alibaba Cloud Service Mesh ASM – Step‑by‑Step Guide
This guide walks you through creating a sidecarless AI demo on Alibaba Cloud Service Mesh ASM, covering environment setup, multi‑model serving with KServe, PVC storage, InferenceService configuration, business service deployment, gateway and waypoint creation, traffic routing rules, and OIDC single sign‑on integration.
Prerequisites
You need an ACK cluster, an ASM instance (version 1.18.0.131+), and tools like istioctl. Ensure the ASM instance has Ambient Mesh mode enabled and the cluster is added to the instance.
1. Enable Multi‑Model Inference Service
Create a global namespace modelmesh-serving in ASM. Use kubectl to connect to the ASM control plane and apply the following configuration to enable the multi‑model feature:
apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMKServeConfig
metadata:
  name: default
spec:
  enabled: true
  multiModel: true
  tag: v0.11.0

Apply it with kubectl apply -f asmkserveconfig.yaml. A modelmesh-serving namespace with the necessary runtime workloads will appear.
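Once applied, you can confirm the feature took effect; a quick check (a sketch — pod names and counts vary by version):

```shell
# After kubectl apply -f asmkserveconfig.yaml, check the runtime namespace:
kubectl get pods -n modelmesh-serving
# The ModelMesh controller and serving-runtime workloads should be listed here.
```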
2. Prepare Model Files and Declare Inference Services
Download a TensorFlow model from TensorFlow Hub and a PyTorch model (converted to ONNX) from the official tutorial. Organize them as:
$ ls -R
pytorch tensorflow
./pytorch:
style-transfer
./pytorch/style-transfer:
candy.onnx
./tensorflow:
style-transfer
./tensorflow/style-transfer:
saved_model.pb variables
./tensorflow/style-transfer/variables:
variables.data-00000-of-00002 variables.data-00001-of-00002 variables.index

Create a PVC (e.g., my-models-pvc) using a storage class, then copy the model files into the PVC via a temporary pod:
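For reference, the claim itself might look like the following sketch. The storage class name, access mode, and size are illustrative assumptions — substitute whatever shared-storage class your cluster provides:

```yaml
# Hypothetical PVC declaration; storageClassName is environment-specific.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-models-pvc
  namespace: modelmesh-serving
spec:
  accessModes:
    - ReadWriteMany          # assumed: runtimes on multiple nodes read the models
  resources:
    requests:
      storage: 10Gi          # placeholder size
  storageClassName: your-nas-storage-class   # assumption: a shared-storage class
```

With the PVC bound, the temporary pod below mounts it so the model files can be copied in.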
apiVersion: v1
kind: Pod
metadata:
  name: pvc-access
  namespace: modelmesh-serving
spec:
  containers:
    - name: main
      image: ubuntu
      command: ["/bin/sh", "-ec", "sleep 10000"]
      volumeMounts:
        - name: my-pvc
          mountPath: "/mnt/models"
  volumes:
    - name: my-pvc
      persistentVolumeClaim:
        claimName: my-models-pvc

Copy files with:

kubectl cp -n modelmesh-serving tensorflow pvc-access:/mnt/models/
kubectl cp -n modelmesh-serving pytorch pvc-access:/mnt/models/

Verify the copy:

kubectl exec -n modelmesh-serving pvc-access -- ls /mnt/models

Define two InferenceService resources (one for TensorFlow, one for ONNX) in isvc.yaml:
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: tf-style-transfer
  namespace: modelmesh-serving
  annotations:
    serving.kserve.io/deploymentMode: ModelMesh
spec:
  predictor:
    model:
      modelFormat:
        name: tensorflow
      storage:
        parameters:
          type: pvc
          name: my-models-pvc
        path: tensorflow/style-transfer/
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: pt-style-transfer
  namespace: modelmesh-serving
  annotations:
    serving.kserve.io/deploymentMode: ModelMesh
spec:
  predictor:
    model:
      modelFormat:
        name: onnx
      storage:
        parameters:
          type: pvc
          name: my-models-pvc
        path: pytorch/style-transfer/

Apply with kubectl apply -f isvc.yaml. Both services become ready, and the appropriate runtimes (Triton for TensorFlow, OVMS for ONNX) are launched.
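Readiness can be confirmed from the command line (a sketch — exact column output varies by KServe version):

```shell
kubectl apply -f isvc.yaml
kubectl get inferenceservice -n modelmesh-serving
# Expect tf-style-transfer and pt-style-transfer to report READY once the
# runtime pods have loaded the models from the PVC.
```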
3. Deploy Business Services
Create a namespace apsara-demo and apply ai-apps.yaml which defines service accounts, deployments for the AI backend and the two style‑transfer workloads, and corresponding services:
kubectl create namespace apsara-demo
kubectl apply -f ai-apps.yaml

The deployments use images from Alibaba Cloud Container Registry and expose port 8000.
4. Set Up ASM Gateway, Waypoint, and Traffic Rules
Create two ASM ingress gateways (one LoadBalancer on port 80, one ClusterIP on port 8008). Enable Ambient Mesh mode for the apsara-demo namespace via the ASM console.
Deploy a waypoint proxy for the apsara-demo namespace:
istioctl x waypoint apply --service-account style-transfer -n apsara-demo

Verify that the waypoint pod appears.
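The waypoint runs as an ordinary pod in the namespace, so it can be checked like any other workload (a sketch — the pod name is generated):

```shell
# Look for a pod whose name contains "waypoint" in the apsara-demo namespace.
kubectl get pods -n apsara-demo | grep waypoint
```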
Model‑mesh Routing (modelsvc-routing.yaml)
Define a Gateway, VirtualService, DestinationRule, and a JSON‑to‑gRPC transcoder to route requests to the correct runtime based on the x-model-format-* headers:
# modelsvc-routing.yaml (excerpt)
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: grpc-gateway
  namespace: modelmesh-serving
spec:
  selector:
    istio: ingressgateway
  servers:
    - hosts: ['*']
      port:
        name: grpc
        number: 8008
        protocol: GRPC
    - hosts: ['*']
      port:
        name: http
        number: 80
        protocol: HTTP
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: vs-modelmesh-serving-service
  namespace: modelmesh-serving
spec:
  gateways: [grpc-gateway]
  hosts: ['*']
  http:
    - headerToDynamicSubsetKey:
        - header: x-model-format-tensorflow
          key: model.format.tensorflow
        - header: x-model-format-pytorch
          key: model.format.pytorch
      match:
        - port: 8008
      name: default
      route:
        - destination:
            host: modelmesh-serving
            port:
              number: 8033
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: dr-modelmesh-serving-service
  namespace: modelmesh-serving
spec:
  host: modelmesh-serving
  trafficPolicy:
    loadBalancer:
      dynamicSubset:
        subsetSelectors:
          - keys: [model.format.tensorflow]
          - keys: [model.format.pytorch]
---
apiVersion: istio.alibabacloud.com/v1beta1
kind: ASMGrpcJsonTranscoder
metadata:
  name: grpcjsontranscoder-for-kservepredictv2
  namespace: istio-system
spec:
  builtinProtoDescriptor: kserve_predict_v2
  isGateway: true
  portNumber: 8008
  workloadSelector:
    labels:
      istio: ingressgateway

Apply with kubectl apply -f modelsvc-routing.yaml. Note that the DestinationRule host must match the VirtualService destination host (modelmesh-serving) for the dynamic subsets to take effect.
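With the transcoder in place, a client can POST plain JSON to the gateway and have it converted into the KServe Predict v2 gRPC call. A minimal sketch of such a request in Python — the tensor name, shape, and data are placeholder assumptions; the x-model-format-tensorflow header is what the headerToDynamicSubsetKey rule keys on:

```python
import json

# Illustrative KServe Predict v2 REST body; tensor name/shape/data are placeholders.
payload = {
    "inputs": [
        {
            "name": "input_1",    # model-specific input tensor name (assumed)
            "shape": [1, 3],      # placeholder shape
            "datatype": "FP32",
            "data": [0.1, 0.2, 0.3],
        }
    ]
}

# Routing header consumed by headerToDynamicSubsetKey in the VirtualService.
headers = {
    "content-type": "application/json",
    "x-model-format-tensorflow": "true",
}

body = json.dumps(payload)
print(body)
# The request would target the gateway's transcoding port, e.g.:
# POST http://{gateway-address}:8008/v2/models/tf-style-transfer/infer
```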
Application Routing (app-routing.yaml)
Define a gateway for the AI app, a virtual service routing to the backend, and a virtual service that splits traffic between the TensorFlow and PyTorch style‑transfer workloads based on the user_class JWT claim:
# app-routing.yaml (excerpt)
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: ai-app-gateway
  namespace: apsara-demo
spec:
  selector:
    istio: api-ingressgateway
  servers:
    - hosts: ['*']
      port:
        name: http
        number: 8000
        protocol: HTTP
    - hosts: ['*']
      port:
        name: http-80
        number: 80
        protocol: HTTP
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ai-app-vs
  namespace: apsara-demo
spec:
  gateways: [ai-app-gateway]
  hosts: ['*']
  http:
    - route:
        - destination:
            host: ai-backend-svc
            port:
              number: 8000
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: style-transfer-vs
  namespace: apsara-demo
spec:
  hosts: [style-transfer.apsara-demo.svc.cluster.local]
  http:
    - match:
        - headers:
            user_class:
              exact: premium
      route:
        - destination:
            host: style-transfer.apsara-demo.svc.cluster.local
            port:
              number: 8000
            subset: tensorflow
    - route:
        - destination:
            host: style-transfer.apsara-demo.svc.cluster.local
            port:
              number: 8000
            subset: pytorch
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: style-transfer-dr
  namespace: apsara-demo
spec:
  host: style-transfer.apsara-demo.svc.cluster.local
  subsets:
    - name: tensorflow
      labels:
        model-format: tensorflow
    - name: pytorch
      labels:
        model-format: pytorch

Apply with kubectl apply -f app-routing.yaml.
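Once applied, the split can be exercised from outside the mesh. A hedged sketch — the address placeholder is illustrative, and in the full demo the user_class value is derived from the OIDC JWT after sign-on rather than set directly by the client:

```shell
# Substitute the LoadBalancer address of the ASM ingress gateway.
GATEWAY={ASM-gateway-address}

# Reaches the AI backend through ai-app-gateway on port 80.
curl http://$GATEWAY/home
# Requests carrying user_class=premium are routed to the tensorflow subset;
# all other requests fall through to the pytorch subset.
```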
5. Integrate OIDC Single Sign‑On
Link the ASM ingress gateway with an Alibaba Cloud IDaaS OIDC application. Add a custom claim user_type (mapped to user_class) in the IDaaS console, configure the OIDC app to return this claim after login, and enable the integration via the ASM UI.
Result
After completing the steps, the demo AI application is accessible at http://{ASM‑gateway‑address}/home. The sidecarless mesh provides dynamic subset routing, JSON‑to‑gRPC transcoding, and user‑based traffic splitting without requiring sidecar injection, demonstrating how ASM can simplify AI service deployment and management.