Deploy DeepSeek‑R1 on Tencent Cloud with Ollama: A Complete Step‑by‑Step Guide
This guide walks you through preparing a Tencent Cloud account, creating a Cloud Studio workspace, installing Ollama, downloading and running the DeepSeek-R1 large language model, interacting with it via the terminal or API, and stopping the service, while noting resource limits and model version choices.
Prerequisites
A verified Tencent Cloud account with sufficient balance.
Ollama (installed inside the Cloud Studio workspace in the steps below; no local installation is required).
Step‑by‑Step Deployment
Register and log in to Tencent Cloud.
Open the Tencent Cloud portal at https://cloud.tencent.com/ and complete real‑name verification.
Create a Cloud Studio workspace.
Navigate to the Cloud Studio dashboard: https://ide.cloud.tencent.com/dashboard/gpuworkspace.
Click “Create Workspace”.
Select the “ollama” template.
Confirm the configuration; each month you receive 10,000 free minutes (≈166.7 hours).
After the workspace is created, wait for the instance to start and then click “Enter Workspace”.
In the VSCode‑like interface, open the terminal.
Install Ollama by running:
curl -fsSL https://ollama.com/install.sh | sh
Verify the installation:
ollama --version
Download and start the DeepSeek-R1 model (8B parameters) with:
ollama run deepseek-r1:8b
You can choose other model sizes, e.g., 14B or 70B, by changing the tag (deepseek-r1:14b, deepseek-r1:70b).
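To confirm the pull completed, you can run ollama list in the terminal, or query the server over HTTP. A minimal sketch in Python, assuming the workspace template's port 6399 (used in the API section below); /api/tags is Ollama's standard model-listing endpoint:
import requests

# List locally available models to confirm the deepseek-r1 pull completed.
# Port 6399 matches this workspace's template; stock Ollama listens on 11434.
tags = requests.get("http://0.0.0.0:6399/api/tags", timeout=10).json()
for model in tags.get("models", []):
    print(model["name"], model.get("size", "?"), "bytes")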
Once the model finishes downloading, Ollama launches it in interactive mode; you can type questions directly in the terminal.
Example interaction:
Hello, DeepSeek-R1!
Response:
Hello! I am DeepSeek-R1, glad to be of service.
To call the model via the API, use the workspace's default port 6399. Example with curl:
curl http://0.0.0.0:6399/api/chat -d '{
  "model": "deepseek-r1:8b",
  "stream": false,
  "messages": [{ "role": "user", "content": "Hello, DeepSeek-R1!" }]
}'
Note that the model name must match the tag you pulled (deepseek-r1:8b here), and "stream": false returns a single JSON object instead of the default newline-delimited stream.
Python example:
import requests

response = requests.post(
    "http://0.0.0.0:6399/api/chat",
    json={
        "model": "deepseek-r1:8b",
        "stream": False,  # a single JSON response rather than a stream
        "messages": [{"role": "user", "content": "Hello, DeepSeek-R1!"}],
    },
)
print(response.json())
To stop the model, run:
ollama stop deepseek-r1:8b
Or exit interactive mode with /bye.
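Both API examples above disable streaming for simplicity. By default, /api/chat streams newline-delimited JSON chunks as tokens are generated; a minimal sketch of consuming that stream (same assumed port 6399 and model tag):
import json
import requests

# Stream tokens as they are generated; each response line is one JSON chunk.
with requests.post(
    "http://0.0.0.0:6399/api/chat",
    json={
        "model": "deepseek-r1:8b",
        "messages": [{"role": "user", "content": "Hello, DeepSeek-R1!"}],
    },
    stream=True,
) as response:
    for line in response.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk["message"]["content"], end="", flush=True)
        if chunk.get("done"):  # the final chunk carries timing stats
            break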
Important Notes
Resource management: the free 10,000 minutes per month amounts to roughly 166.7 hours, or about one week of continuous 24‑hour usage (see the quick calculation after these notes); shut down the workspace when you are finished.
Model version selection: choose a smaller model (e.g., deepseek-r1:1.5b) if GPU resources are limited.
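For quick budgeting, the quota arithmetic as a one-off script (the 10,000-minute figure comes from the free tier described above):
free_minutes = 10_000                 # monthly free quota
hours = free_minutes / 60             # ~166.7 hours
days_continuous = hours / 24          # ~6.9 days of 24x7 usage
print(f"{hours:.1f} hours = {days_continuous:.1f} days of continuous usage")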
Restarting the Model Later
Log into the Tencent Cloud instance and open the terminal.
Ensure Ollama is installed (install again if necessary with curl -fsSL https://ollama.com/install.sh | sh).
Navigate to the Ollama directory (default /opt/ollama) with cd /opt/ollama.
Start the desired model, e.g., ollama run deepseek-r1:8b, or ollama run deepseek-r1:14b for a larger version.
Interact as before by typing queries directly in the terminal.
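If you are unsure whether the Ollama server came back up after a restart, a quick health check against its version endpoint helps (again assuming the template's port 6399; GET /api/version is part of Ollama's standard API):
import requests

# Probe the Ollama server; a refused connection means it is not running yet.
try:
    r = requests.get("http://0.0.0.0:6399/api/version", timeout=5)
    print("Ollama is up, version:", r.json().get("version"))
except requests.ConnectionError:
    print("Ollama is not running; rerun the install script or start it with 'ollama serve'.")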
Sample Advanced Interaction
Ask the model to generate code, such as a Python function to add two numbers:
Please generate a Python function that computes the sum of two numbers.
The model replies with:
def add(a, b):
    return a + b
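The same request works over the API. One practical wrinkle: DeepSeek-R1 models emit their chain-of-thought inside <think>...</think> tags, which you will usually want to strip when extracting code programmatically. A minimal sketch, again assuming port 6399 and the deepseek-r1:8b tag:
import re
import requests

# Ask for code via the API and strip the <think>...</think> reasoning block
# that DeepSeek-R1 prepends to its answers.
response = requests.post(
    "http://0.0.0.0:6399/api/chat",
    json={
        "model": "deepseek-r1:8b",
        "stream": False,
        "messages": [{
            "role": "user",
            "content": "Write a Python function that returns the sum of two numbers.",
        }],
    },
)
answer = response.json()["message"]["content"]
print(re.sub(r"<think>.*?</think>", "", answer, flags=re.DOTALL).strip())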