Cloud Computing 10 min read

Deploy Qwen2.5 LLM on Alibaba Cloud Function Compute: A Step‑by‑Step Guide

This guide explains how to deploy the Qwen2.5 large language model on Alibaba Cloud Function Compute using Ollama and Open WebUI, covering model selection, resource configuration, deployment steps, interface setup, multilingual capabilities, and automatic scaling for high‑concurrency workloads.

Alibaba Cloud Native
Alibaba Cloud Native
Alibaba Cloud Native
Deploy Qwen2.5 LLM on Alibaba Cloud Function Compute: A Step‑by‑Step Guide

Solution Overview

Qwen2.5 is a large‑scale language and multimodal model that excels at long‑text processing, integrates extensive domain knowledge, supports more than 29 languages, and provides strong coding and mathematics capabilities.

Deploying Qwen2.5 on Alibaba Cloud Function Compute (FC)

Function Compute offers a serverless environment with automatic scaling and flexible GPU billing (pay‑as‑you‑go, tiered, ultra‑fast). By deploying Qwen2.5 through Ollama and exposing it via Open WebUI, users can handle high‑concurrency inference while only paying for active resources.

Step 1 – Deploy the Ollama Application

Open the Ollama template URL and create a new application.

https://fcnext.console.aliyun.com/applications/create?template=ollama-qwen2_5&deployType=template-direct&from=solution

Select the desired Qwen2.5 model size (1.5B, 3B, or 7B) from the Model Name dropdown.

Leave all other configuration items at their defaults and click Create and Deploy Default Environment . The application is deployed as shown below.

Ollama deployment screenshot
Ollama deployment screenshot

Step 2 – Deploy the Open WebUI Application

Open the Open WebUI template URL and create a new application.

https://fcnext.console.aliyun.com/applications/create?template=fc-open-webui&deployType=template-direct

In Advanced Configuration > Region , select the same region used for the Ollama app.

Enable authentication for production use.

Enter the internal HTTP trigger address of the Ollama service (obtained in Step 3) as the Open WebUI endpoint.

Click Create and Deploy Default Environment . After deployment, open the provided domain to access Open WebUI.

Open WebUI deployment screenshot
Open WebUI deployment screenshot
Warning: The region selected for Open WebUI must match the region of the Ollama application.

Step 3 – Retrieve the Ollama Internal HTTP Trigger Address

Open the Function Compute console, locate the Ollama application, and open its detail page.

In the Function Resources section, click the function name to view details.

Hover over the HTTP Trigger entry and copy the displayed internal URL.

Internal HTTP trigger address
Internal HTTP trigger address

Using Open WebUI to Call Qwen2.5

After logging into Open WebUI, select Select a model and choose the deployed Qwen2.5 model. You can then interact with the model via the chat box.

Example capabilities:

Multilingual response (e.g., self‑introduction in French).

Enhanced coding and mathematics reasoning thanks to domain‑expert fine‑tuning.

Document summarization: upload a local file (e.g., 百炼手机详细参数.docx ) and prompt “Summarize document content”. The model extracts key information.

Configuring Open WebUI Language to Simplified Chinese

Click the settings icon (top‑right) and choose Settings .

Navigate to General > Language and select Chinese(简体中文) .

Save the changes; the interface refreshes in Chinese.

Language setting screenshot
Language setting screenshot

Automatic Scaling of Function Compute

FC automatically scales the number of Ollama function instances based on request volume. When traffic spikes, new instances are created; idle instances are terminated after 3–5 minutes, reducing cost while maintaining low latency.

Scaling diagram
Scaling diagram

Related Links

Ollama template URL:

https://fcnext.console.aliyun.com/applications/create?template=ollama-qwen2_5&deployType=template-direct&from=solution

Open WebUI template URL:

https://fcnext.console.aliyun.com/applications/create?template=fc-open-webui&deployType=template-direct

Function Compute console: https://fcnext.console.aliyun.com/applications Sample document for summarization:

https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20240701/geijms/%E7%99%BE%E7%82%BC%E7%B3%BB%E5%88%97%E6%89%8B%E6%9C%BA%E4%BA%A7%E5%93%81%E4%BB%8B%E7%BB%8D.docx
Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

cloud computingFunction ComputeAI Model DeploymentOllamaQwen2.5Open WebUI
Alibaba Cloud Native
Written by

Alibaba Cloud Native

We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.