Deploy Alibaba's Qwen3.5-397B Model in Minutes with Serverless Function Compute
This guide explains how to quickly deploy the new Qwen3.5-397B-A17B open‑source large model using Alibaba Cloud Function Compute's serverless GPU service, covering model features, deployment steps, required commands, and performance benefits.
Model Overview
Alibaba open‑sourced Qwen3.5‑397B‑A17B, a 397 billion‑parameter large language model that uses only 17 billion active parameters thanks to a hybrid linear‑attention (Gated Delta Networks) and Mixture‑of‑Experts (MoE) architecture. The model supports 201 languages and excels in vision‑language, code generation, and autonomous‑agent tasks.
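To put the MoE sparsity in perspective, a quick back-of-the-envelope calculation (using only the two parameter counts stated above) shows how small a fraction of the weights is active for any single token:

```python
# Back-of-the-envelope MoE sparsity check for Qwen3.5-397B-A17B.
# Only the parameter counts from the announcement are used here.
total_params_b = 397   # total parameters, in billions
active_params_b = 17   # parameters activated per token, in billions

active_fraction = active_params_b / total_params_b
print(f"Active per token: {active_fraction:.1%} of all weights")  # ~4.3%
```

This sparsity is what lets a 397B-parameter model run with the compute footprint of a much smaller dense model.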
Why traditional deployment is difficult
Complex GPU environment configuration.
Labor‑intensive monitoring and maintenance.
Hard to achieve elastic scaling.
Serverless Function Compute (FC) solution
Function Compute provides a serverless GPU service for Qwen3.5, eliminating infrastructure management. Benefits:
One‑click deployment reduces integration time from days to ~5 minutes.
Memory usage drops by ~60%, and inference throughput can increase up to 19×.
Automatic scaling with low operational overhead.
Step‑by‑step deployment
Create an OSS bucket and place the model files under a directory path such as Qwen/Qwen3.5-397B-A17B.
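The directory layout matters because the OSS prefix is later mounted into the function's filesystem. A minimal sketch of that mapping, assuming the bucket is mounted under /mnt/my-model-scope/models (the mount point that appears in the vllm startup command); the helper function here is hypothetical, not official tooling:

```python
import posixpath

# Assumed mount point: the OSS bucket is exposed to the container here,
# matching the path used in the vllm startup command.
MOUNT_POINT = "/mnt/my-model-scope/models"

def mounted_model_path(oss_prefix: str) -> str:
    """Hypothetical helper: in-container path for a model stored under oss_prefix."""
    return posixpath.join(MOUNT_POINT, oss_prefix)

print(mounted_model_path("Qwen/Qwen3.5-397B-A17B"))
# -> /mnt/my-model-scope/models/Qwen/Qwen3.5-397B-A17B
```

Whatever prefix you choose in OSS must line up with the model path you pass to vllm serve in the next step.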
Deploy the "white‑screen" (GUI‑based) setup tool (template ID 283) and wait until it reports ready.
In the FunModel custom deployment console, select the Serverless GPU image, configure resources, and set the startup command:
vllm serve /mnt/my-model-scope/models/Qwen/Qwen3.5-397B-A17B \
--served-model-name Qwen/Qwen3.5-397B-A17B \
--port 9000 \
--trust-remote-code \
--gpu-memory-utilization 0.9 \
--max-model-len 262144 \
--tensor-parallel-size 16 \
--enable-auto-tool-choice \
--tool-call-parser qwen3_coder \
--reasoning-parser qwen3
Start the deployment, wait for the service to become available, then run inference tests.
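Once the service is up, vLLM exposes an OpenAI‑compatible API on the configured port. A minimal sketch of an inference‑test request follows; the endpoint host is a placeholder you must fill in, while the model name and port come from the --served-model-name and --port flags above:

```python
import json

# Placeholder endpoint: replace <your-endpoint> with the address FC assigns.
# Port 9000 matches --port; the model name must match --served-model-name.
BASE_URL = "http://<your-endpoint>:9000/v1/chat/completions"

payload = {
    "model": "Qwen/Qwen3.5-397B-A17B",
    "messages": [
        {"role": "user", "content": "Write a haiku about serverless GPUs."}
    ],
    "max_tokens": 128,
}
print(json.dumps(payload, indent=2))
# Send it with any HTTP client, e.g.:
#   curl -X POST "$BASE_URL" -H 'Content-Type: application/json' -d "$PAYLOAD"
```

A successful response confirms the model is loaded and serving; from there you can point any OpenAI‑compatible client library at the same endpoint.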
Performance comparison
Deployment time: traditional ≈ days, FC ≈ 5 minutes.
Technical barrier: high vs. low.
Ops & iteration cost: high vs. low.
Relevant URLs
FunModel quick start: https://fun-model-docs.devsapp.net/getting-started/
Custom model deployment guide: https://fun-model-docs.devsapp.net/user-guide/custom-model-deployment/
White‑screen tool template: https://functionai.console.aliyun.com/old/template-detail?template=283
FunModel custom deployment console: https://functionai.console.aliyun.com/fun-model/cn-hangzhou/custom-model-create
