Deploy the QwQ‑32B LLM on Alibaba Cloud Function Compute with CAP in Minutes

This guide walks you through deploying the open‑source QwQ‑32B model on Alibaba Cloud Function Compute using the Cloud Application Platform (CAP), covering architecture, required services, account setup, step‑by‑step deployment, cost considerations, model interaction via Open WebUI and Chatbox, scaling configuration, and resource cleanup.

Alibaba Cloud Native
Alibaba Cloud Native
Alibaba Cloud Native
Deploy the QwQ‑32B LLM on Alibaba Cloud Function Compute with CAP in Minutes

Solution Overview

This guide shows how to deploy the open‑source QwQ‑32B large language model on Alibaba Cloud Function Compute (FC) using the Cloud Application Platform (CAP). Two serverless functions are created: Ollama to host the QwQ‑32B‑GGUF model and Open WebUI to provide a web UI for interactive chat.

Architecture

One CAP project that runs the Ollama and Open WebUI functions.

A NAS file system attached to the project for persisting the model files.

Architecture diagram
Architecture diagram

Prerequisites

Alibaba Cloud account (register at https://account.aliyun.com/register/qr_register.htm if you do not have one).

Enable Function Compute service in the console (https://fcnext.console.aliyun.com/) and grant the required permissions.

Deployment Procedure

1. Create CAP Project

Open the CAP console: https://cap.console.aliyun.com/create-project?template=194&from=solution.

Select the "QwQ‑32B" template, choose a region (e.g., China Beijing 2), keep default settings, and click Deploy Project .

Confirm the deployment; the provisioning takes about 10–12 minutes.

Deployed environment
Deployed environment

2. Locate Access URL

After deployment, open the CAP project details page and copy the generated HTTP endpoint.

Open the endpoint in a browser; the Open WebUI interface appears.

Example application address
Example application address

3. Test the Model

Enter a prompt such as "Who are you?" in the text box and submit. The model returns a response.

Model interaction UI
Model interaction UI

4. Adjust Ollama Scaling

In the CAP console, edit the Ollama function configuration and increase the reserved instance count to enable horizontal scaling.

Scaling configuration
Scaling configuration

5. Access via Chatbox Client

Copy the Ollama API endpoint shown in the CAP console.

Download the Chatbox client (https://chatboxai.app/zh#download) and install it (the screenshots show macOS M3).

In Chatbox, open Settings and configure:

Save the configuration.

Chatbox settings
Chatbox settings

Now you can type queries (e.g., "Who are you?" ) in Chatbox and receive responses from the deployed QwQ‑32B model.

Chatbox conversation
Chatbox conversation

Resource Cleanup

When the demo is finished, delete the CAP project to stop incurring charges:

Log in to the CAP console (https://cap.console.aliyun.com/).

Navigate to Projects , locate the deployed project, click Delete , and confirm the deletion.

Delete CAP project
Delete CAP project

Cost Considerations

The free trial quotas for Function Compute and NAS are sufficient for the steps in this guide. If the trial limits are exceeded, the estimated cost is below ¥9 per hour, but actual charges depend on the number of GPU instances and runtime duration. Always verify the final bill in the Alibaba Cloud console.

Reference URLs

Account registration: https://account.aliyun.com/register/qr_register.htm

Function Compute console: https://fcnext.console.aliyun.com/

Free trial details: https://help.aliyun.com/document_detail/2665971.html

NAS file storage: https://free.aliyun.com/?searchKey=%E6%96%87%E4%BB%B6%E5%AD%98%E5%82%A8+NAS

CAP project template: https://cap.console.aliyun.com/create-project?template=194&from=solution

Chatbox download: https://chatboxai.app/zh#download

function computeOllamaCAPOpen WebUIQwQ-32B
Alibaba Cloud Native
Written by

Alibaba Cloud Native

We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.