Deploy Qwen2.5 LLM on Alibaba Cloud Function Compute: A Step‑by‑Step Guide
This guide explains how to deploy the Qwen2.5 large language model on Alibaba Cloud Function Compute using Ollama and Open WebUI, covering model selection, resource configuration, deployment steps, interface setup, multilingual capabilities, and automatic scaling for high‑concurrency workloads.
Solution Overview
Qwen2.5 is a large‑scale language and multimodal model that excels at long‑text processing, integrates extensive domain knowledge, supports more than 29 languages, and provides strong coding and mathematics capabilities.
Deploying Qwen2.5 on Alibaba Cloud Function Compute (FC)
Function Compute offers a serverless environment with automatic scaling and flexible GPU billing (pay‑as‑you‑go, tiered, ultra‑fast). By deploying Qwen2.5 through Ollama and exposing it via Open WebUI, users can handle high‑concurrency inference while only paying for active resources.
Step 1 – Deploy the Ollama Application
Open the Ollama template URL and create a new application.
https://fcnext.console.aliyun.com/applications/create?template=ollama-qwen2_5&deployType=template-direct&from=solutionSelect the desired Qwen2.5 model size (1.5B, 3B, or 7B) from the Model Name dropdown.
Leave all other configuration items at their defaults and click Create and Deploy Default Environment . The application is deployed as shown below.
Step 2 – Deploy the Open WebUI Application
Open the Open WebUI template URL and create a new application.
https://fcnext.console.aliyun.com/applications/create?template=fc-open-webui&deployType=template-directIn Advanced Configuration > Region , select the same region used for the Ollama app.
Enable authentication for production use.
Enter the internal HTTP trigger address of the Ollama service (obtained in Step 3) as the Open WebUI endpoint.
Click Create and Deploy Default Environment . After deployment, open the provided domain to access Open WebUI.
Warning: The region selected for Open WebUI must match the region of the Ollama application.
Step 3 – Retrieve the Ollama Internal HTTP Trigger Address
Open the Function Compute console, locate the Ollama application, and open its detail page.
In the Function Resources section, click the function name to view details.
Hover over the HTTP Trigger entry and copy the displayed internal URL.
Using Open WebUI to Call Qwen2.5
After logging into Open WebUI, select Select a model and choose the deployed Qwen2.5 model. You can then interact with the model via the chat box.
Example capabilities:
Multilingual response (e.g., self‑introduction in French).
Enhanced coding and mathematics reasoning thanks to domain‑expert fine‑tuning.
Document summarization: upload a local file (e.g., 百炼手机详细参数.docx ) and prompt “Summarize document content”. The model extracts key information.
Configuring Open WebUI Language to Simplified Chinese
Click the settings icon (top‑right) and choose Settings .
Navigate to General > Language and select Chinese(简体中文) .
Save the changes; the interface refreshes in Chinese.
Automatic Scaling of Function Compute
FC automatically scales the number of Ollama function instances based on request volume. When traffic spikes, new instances are created; idle instances are terminated after 3–5 minutes, reducing cost while maintaining low latency.
Related Links
Ollama template URL:
https://fcnext.console.aliyun.com/applications/create?template=ollama-qwen2_5&deployType=template-direct&from=solutionOpen WebUI template URL:
https://fcnext.console.aliyun.com/applications/create?template=fc-open-webui&deployType=template-directFunction Compute console: https://fcnext.console.aliyun.com/applications Sample document for summarization:
https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20240701/geijms/%E7%99%BE%E7%82%BC%E7%B3%BB%E5%88%97%E6%89%8B%E6%9C%BA%E4%BA%A7%E5%93%81%E4%BB%8B%E7%BB%8D.docxSigned-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Alibaba Cloud Native
We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
