Feb 23, 2026 · Cloud Native

Deploying Qwen 3.5 Multimodal Model on Alibaba Cloud ACK with RoleBasedGroup

This guide details how to deploy the open‑source Qwen 3.5‑397B‑A17B multimodal LLM on Alibaba Cloud ACK using the RoleBasedGroup (RBG) engine, covering model preparation, Kubernetes resources, role‑based orchestration, performance tuning, and benchmark testing.

Benchmarking · Cloud Native AI · Kubernetes

24 min read

Alibaba Cloud Developer

Dec 24, 2025 · Artificial Intelligence

Boosting LLM Inference: RoleBasedGroup & Mooncake for Stable, High‑Performance Service

Large language model inference faces memory pressure. By externalizing the KVCache with Mooncake and orchestrating roles via the Kubernetes‑native RoleBasedGroup (RBG), developers can achieve stable, high‑throughput, cost‑effective serving with seamless in‑place upgrades and topology‑aware performance.

AI Infrastructure · KVCache · Kubernetes

21 min read
