Deploy DeepSeek‑R1 on Tencent Cloud with Ollama: A Complete Step‑by‑Step Guide
This guide walks you through preparing a Tencent Cloud account, creating a Cloud Studio workspace, installing Ollama, downloading and running the DeepSeek-R1 large language model, interacting with it via the terminal or API, and stopping the service, while noting resource limits and model version choices.
Prerequisites
A verified Tencent Cloud account with sufficient balance.
Ollama (installed inside the Cloud Studio workspace in the steps below; no local installation is required).
Step‑by‑Step Deployment
Register and log in to Tencent Cloud.
Open the Tencent Cloud portal at https://cloud.tencent.com/ and complete real‑name verification.
Create a Cloud Studio workspace.
Navigate to the Cloud Studio dashboard: https://ide.cloud.tencent.com/dashboard/gpuworkspace.
Click “Create Workspace”.
Select the “ollama” template.
Confirm the configuration; each month you receive 10,000 free minutes (≈166.7 hours).
After the workspace is created, wait for the instance to start and then click “Enter Workspace”.
In the VSCode‑like interface, open the terminal.
Install Ollama by running:
curl -fsSL https://ollama.com/install.sh | sh
Verify the installation:
ollama --version
Download and start the DeepSeek-R1 model (8B parameters) with:
ollama run deepseek-r1:8b
You can choose other model sizes, e.g., 14B or 70B, by changing the tag (deepseek-r1:14b, deepseek-r1:70b).
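To confirm the pull completed, you can run ollama list in the terminal, or query the server over HTTP. A minimal sketch in Python, assuming the workspace template's port 6399 (used in the API section below); /api/tags is Ollama's standard model-listing endpoint:
import requests

# List locally available models to confirm the deepseek-r1 pull completed.
# Port 6399 matches this workspace's template; stock Ollama listens on 11434.
tags = requests.get("http://0.0.0.0:6399/api/tags", timeout=10).json()
for model in tags.get("models", []):
    print(model["name"], model.get("size", "?"), "bytes")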
Once the model finishes downloading, Ollama launches it in interactive mode; you can type questions directly in the terminal.
Example interaction:
Hello, DeepSeek-R1!
Response:
Hello! I am DeepSeek-R1, glad to be of service.
To call the model via the API, use the workspace's default port 6399. Example with curl:
curl http://0.0.0.0:6399/api/chat -d '{
  "model": "deepseek-r1:8b",
  "stream": false,
  "messages": [{ "role": "user", "content": "Hello, DeepSeek-R1!" }]
}'
Note that the model name must match the tag you pulled (deepseek-r1:8b here), and "stream": false returns a single JSON object instead of the default newline-delimited stream.
Python example:
import requests

response = requests.post(
    "http://0.0.0.0:6399/api/chat",
    json={
        "model": "deepseek-r1:8b",
        "stream": False,  # a single JSON response rather than a stream
        "messages": [{"role": "user", "content": "Hello, DeepSeek-R1!"}],
    },
)
print(response.json())
To stop the model, run:
ollama stop deepseek-r1:8b
Or exit interactive mode with /bye.
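Both API examples above disable streaming for simplicity. By default, /api/chat streams newline-delimited JSON chunks as tokens are generated; a minimal sketch of consuming that stream (same assumed port 6399 and model tag):
import json
import requests

# Stream tokens as they are generated; each response line is one JSON chunk.
with requests.post(
    "http://0.0.0.0:6399/api/chat",
    json={
        "model": "deepseek-r1:8b",
        "messages": [{"role": "user", "content": "Hello, DeepSeek-R1!"}],
    },
    stream=True,
) as response:
    for line in response.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk["message"]["content"], end="", flush=True)
        if chunk.get("done"):  # the final chunk carries timing stats
            break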
Important Notes
Resource management: the free 10,000 minutes per month amounts to roughly 166.7 hours, or about one week of continuous 24‑hour usage (see the quick calculation after these notes); shut down the workspace when you are finished.
Model version selection: choose a smaller model (e.g., deepseek-r1:1.5b) if GPU resources are limited.
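For quick budgeting, the quota arithmetic as a one-off script (the 10,000-minute figure comes from the free tier described above):
free_minutes = 10_000                 # monthly free quota
hours = free_minutes / 60             # ~166.7 hours
days_continuous = hours / 24          # ~6.9 days of 24x7 usage
print(f"{hours:.1f} hours = {days_continuous:.1f} days of continuous usage")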
Restarting the Model Later
Log into the Tencent Cloud instance and open the terminal.
Ensure Ollama is installed (install again if necessary with curl -fsSL https://ollama.com/install.sh | sh).
Navigate to the Ollama directory (default /opt/ollama) with cd /opt/ollama.
Start the desired model, e.g., ollama run deepseek-r1:8b, or ollama run deepseek-r1:14b for a larger version.
Interact as before by typing queries directly in the terminal.
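If you are unsure whether the Ollama server came back up after a restart, a quick health check against its version endpoint helps (again assuming the template's port 6399; GET /api/version is part of Ollama's standard API):
import requests

# Probe the Ollama server; a refused connection means it is not running yet.
try:
    r = requests.get("http://0.0.0.0:6399/api/version", timeout=5)
    print("Ollama is up, version:", r.json().get("version"))
except requests.ConnectionError:
    print("Ollama is not running; rerun the install script or start it with 'ollama serve'.")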
Sample Advanced Interaction
Ask the model to generate code, such as a Python function to add two numbers:
Please generate a Python function that computes the sum of two numbers.
The model replies with:
def add(a, b):
    return a + b
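The same request works over the API. One practical wrinkle: DeepSeek-R1 models emit their chain-of-thought inside <think>...</think> tags, which you will usually want to strip when extracting code programmatically. A minimal sketch, again assuming port 6399 and the deepseek-r1:8b tag:
import re
import requests

# Ask for code via the API and strip the <think>...</think> reasoning block
# that DeepSeek-R1 prepends to its answers.
response = requests.post(
    "http://0.0.0.0:6399/api/chat",
    json={
        "model": "deepseek-r1:8b",
        "stream": False,
        "messages": [{
            "role": "user",
            "content": "Write a Python function that returns the sum of two numbers.",
        }],
    },
)
answer = response.json()["message"]["content"]
print(re.sub(r"<think>.*?</think>", "", answer, flags=re.DOTALL).strip())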