Deploy Open-Source LLMs on Alibaba Cloud Function Compute in 10 Minutes
This guide explains how to quickly launch an open‑source large language model from ModelScope on Alibaba Cloud Function Compute, covering the required cloud services, step‑by‑step deployment, reserved‑instance configuration, and how to invoke the model via the provided domain.
Solution Overview
The solution leverages Alibaba Cloud Function Compute to deploy open‑source large models from the ModelScope community, enabling a text‑generation service without the need to own or maintain expensive GPU resources. Deployment can be completed in about ten minutes.
Technical Architecture
Function Compute: provides the serverless runtime for serving the LLM.
NAS file storage: stores the ModelScope model files.
VPC (private network): allows Function Compute to access the NAS securely.
Deploy the Application
Open the Function Compute application template, select ModelScope as the model source, set the Model ID to ZhipuAI/chatglm3-6b and the version to v1.0.2, choose the East China 2 (Shanghai) region and the chat task type, and provide your ModelScope Access Token. Keep the other options at their defaults and click “Create Application”. The model download may take around 15 minutes.
After the app is created, enable idle‑reserved‑instance mode to avoid cold‑start latency. On the function details page, open the Configuration tab, choose Reserved Instances, and create a policy with version LATEST, a reserved instance count of 1, and idle mode enabled.
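After the reserved instance is active, a simple way to confirm that cold starts are gone is to time a request before and after enabling it. The helper below is a minimal sketch; the actual URL and any authentication are assumptions you should adapt to your deployment.

```python
import time
import urllib.request

def timed_get(url: str, timeout: float = 120.0) -> float:
    """Fetch `url` once and return the wall-clock seconds it took.

    The generous default timeout accommodates a cold start, which can
    take around 90 seconds on the first invocation.
    """
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()  # drain the body so the full response is measured
    return time.perf_counter() - start
```

Calling `timed_get` twice in a row against your app's domain should show the second request completing far faster once an idle reserved instance is serving it.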
(Optional) If no NAS is pre‑configured, the system will automatically create or attach a NAS named Alibaba-Fc-V3-Component-Generated. You can bind an existing NAS via the network and storage settings for more control.
Using the LLM Service
When deployment finishes, navigate to the Environment Details page and click the provided domain (e.g., ***.devsapp.net) to access the service. Enter a text prompt and click “Submit”. The first invocation may take ~90 seconds due to cold start; subsequent calls are much faster. The platform automatically returns idle instances to standby after each request.
Important Notes
Do not expose the demo domain publicly to avoid unexpected charges. The domain is intended for learning and testing only; for production use, bind a custom domain as described in the documentation.
Conclusion
This guide demonstrates how to launch an open‑source LLM on Alibaba Cloud Function Compute with minimal setup, leveraging ModelScope models, serverless infrastructure, and reserved‑instance configuration to achieve low‑latency, cost‑effective AI services.
Alibaba Cloud Native
We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.