Deploy Open-Source LLMs on Alibaba Cloud Function Compute in 10 Minutes
This guide explains how to quickly launch an open‑source large language model from ModelScope on Alibaba Cloud Function Compute, covering the required cloud services, step‑by‑step deployment, reserved‑instance configuration, and how to invoke the model via the provided domain.
Solution Overview
The solution leverages Alibaba Cloud Function Compute to deploy open‑source large models from the ModelScope community, enabling a text‑generation service without the need to own or maintain expensive GPU resources. Deployment can be completed in about ten minutes.
Technical Architecture
Function Compute: provides the serverless runtime for serving the LLM.
NAS file storage: stores the ModelScope model files.
VPC (private network): allows Function Compute to access the NAS securely.
Deploy the Application
Open the Function Compute application template, select ModelScope as the model source, set the Model ID to ZhipuAI/chatglm3-6b and the version to v1.0.2, choose the East China 2 (Shanghai) region and the chat task type, and provide your ModelScope Access Token. Keep the other options at their defaults and click “Create Application”. The model download may take around 15 minutes.
After the app is created, enable idle‑reserved‑instance mode to avoid cold‑start latency. On the function details page, open the Configuration tab, choose Reserved Instances, and create a policy with version LATEST, a reserved instance count of 1, and idle mode enabled.
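After the reserved instance is active, a simple way to confirm that cold starts are gone is to time a request before and after enabling it. The helper below is a minimal sketch; the actual URL and any authentication are assumptions you should adapt to your deployment.

```python
import time
import urllib.request

def timed_get(url: str, timeout: float = 120.0) -> float:
    """Fetch `url` once and return the wall-clock seconds it took.

    The generous default timeout accommodates a cold start, which can
    take around 90 seconds on the first invocation.
    """
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()  # drain the body so the full response is measured
    return time.perf_counter() - start
```

Calling `timed_get` twice in a row against your app's domain should show the second request completing far faster once an idle reserved instance is serving it.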
(Optional) If no NAS is pre‑configured, the system will automatically create or attach a NAS named Alibaba-Fc-V3-Component-Generated. You can bind an existing NAS via the network and storage settings for more control.
Using the LLM Service
When deployment finishes, navigate to the Environment Details page and click the provided domain (e.g., ***.devsapp.net) to access the service. Enter a text prompt and click “Submit”. The first invocation may take ~90 seconds due to cold start; subsequent calls are much faster. The platform automatically returns idle instances to standby after each request.
Important Notes
Do not expose the demo domain publicly to avoid unexpected charges. The domain is intended for learning and testing only; for production use, bind a custom domain as described in the documentation.
Conclusion
This guide demonstrates how to launch an open‑source LLM on Alibaba Cloud Function Compute with minimal setup, leveraging ModelScope models, serverless infrastructure, and reserved‑instance configuration to achieve low‑latency, cost‑effective AI services.
Alibaba Cloud Native
We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.