
Deploy and Test DeepSeek Large Language Models on Tencent Cloud TI in Minutes

This guide walks you through quickly deploying DeepSeek series models on the Tencent Cloud TI platform, covering model selection, resource planning, step‑by‑step service creation, free online trial, API testing via built‑in tools or curl, and managing inference services for both large and compact models.

Tencent Tech

Quick Deployment of DeepSeek Models on Tencent Cloud TI

This article explains how to use the Tencent Cloud TI platform to rapidly deploy the DeepSeek family of large language models, enabling interactive chat experiences or API‑based integration into AI applications.

DeepSeek Model Overview

DeepSeek‑V3: A 671‑billion‑parameter Mixture‑of‑Experts model pretrained on 14.8 trillion high‑quality tokens, excelling in knowledge Q&A, content generation, and intelligent customer service.

DeepSeek‑R1: A high‑performance reasoning model derived from DeepSeek‑V3‑Base, strong in mathematics, code generation, and logical reasoning.

DeepSeek‑R1‑Distill: A distilled version of DeepSeek‑R1 with fewer parameters, lower inference cost, and comparable benchmark performance.

All models are available in the TI model marketplace. For HCCPNV6 instances, contact your Tencent Cloud sales representative or pre‑sales architect.

Limited Free Trial

The platform offers a time‑limited free online experience for DeepSeek‑R1 and DeepSeek‑R1‑Distill‑Qwen‑1.5B, allowing developers to directly compare the largest and smallest models in the series.

Model Deployment Practice

We demonstrate deployment using the smallest model, DeepSeek‑R1‑Distill‑Qwen‑1.5B. Other models follow the same workflow, with adjustments to compute resources.

Prerequisites

Model : Select a model from the TI “Large Model Square”.

Resources: The 1.5B model runs on a single mid‑range GPU. Billing options include pay‑as‑you‑go (recommended for short trials) or monthly subscription (requires pre‑purchased CVM instances).

Note : Deploying DeepSeek‑R1 or V3 requires HCCPNV6 instances, which must be enabled by your sales contact.

Step 1: Deploy Model Service

Log in to the Tencent Cloud TI console and open the “Large Model Square”.

Click the “DeepSeek Series Models” card to view details.

Choose “Create Online Service” and navigate to the service creation page.

Configure the service:

Service name (e.g., demo‑DeepSeek‑R1‑Distill‑Qwen‑1_5B).

Machine source: “Purchase from TIONE – Pay‑as‑you‑go” (or select a CVM instance).

Deployment type: “Standard Deployment”.

Service instance: select “Image”.

Model source: “Built‑in Model / DeepSeek‑R1‑Distill‑Qwen‑1.5B”.

Compute spec: a single mid‑range GPU (see resource guide).

Authorize the service agreement and click “Start Service”.

Step 2: Experience Model Output

After deployment, the service status shows “Running”. Use the “Online Experience” button to interact with the model via a web UI.

Step 3: Call the Inference API

You can test the API using the TI built‑in tool or a curl command. Example request body:

<code>{
    "model": "ms-xxxxxxxx",
    "messages": [
        {
            "role": "user",
            "content": "Describe your understanding of artificial intelligence."
        }
    ]
}</code>

Replace ms-xxxxxxxx with your service group ID: the segment prefixed with "ms-" in the invocation address (调用地址) URL shown in the console.

Example curl command:

<code>curl -X POST https://ms-xxxxxxxx-xxxxxxxx.gw.ap-shanghai.ti.tencentcs.com/ms-xxxxxxxx/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "ms-xxxxxxxx",
    "messages": [
        {"role": "user", "content": "Describe your understanding of artificial intelligence."}
    ]
}'</code>
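The same call can be made programmatically. Below is a minimal Python sketch using only the standard library; the endpoint and model ID are placeholders copied from the curl example above, and the response parsing assumes the service returns an OpenAI‑style chat‑completions schema (suggested by the /v1/chat/completions path) — verify the actual response shape against your service's documentation.

```python
import json
from urllib import request

# Placeholder values -- replace with the invocation address and the
# "ms-"-prefixed service group ID from your TI console.
ENDPOINT = "https://ms-xxxxxxxx-xxxxxxxx.gw.ap-shanghai.ti.tencentcs.com/ms-xxxxxxxx/v1/chat/completions"
MODEL_ID = "ms-xxxxxxxx"

def build_chat_request(model, user_content):
    """Build the JSON request body shown in the example above."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_content}],
    }

def call_chat_api(endpoint, body):
    """POST the request body as JSON and return the parsed response."""
    req = request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example usage (requires a running service):
# body = build_chat_request(MODEL_ID, "Describe your understanding of artificial intelligence.")
# reply = call_chat_api(ENDPOINT, body)
# print(reply["choices"][0]["message"]["content"])  # assumes OpenAI-style schema
```

If the service requires authentication, add the corresponding header (for example, an Authorization header) per the TI console's API documentation.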

Step 4: Manage the Inference Service

Visit “Model Service > Online Service > Service Details” to stop, restart, delete, view logs, monitor metrics, and update configurations.

Deployment Tips for Different Models

For larger models (DeepSeek‑R1, V3), allocate appropriate GPU resources as described in the “Large Model Inference Resource Guide”. HCCPNV6 instances are required and must be provisioned through your sales contact.

Size Comparison of Models

Using the deployed DeepSeek‑R1‑Distill‑Qwen‑1.5B and DeepSeek‑R1, we asked the same question about a ball's location after moving a cup. The larger DeepSeek‑R1 correctly inferred that the ball fell out of the inverted cup, while the smaller model incorrectly kept the ball inside the cup. The smaller model, however, responded faster and consumed fewer resources.

Deployment times: ~1‑2 minutes for the distilled model, ~9‑10 minutes for DeepSeek‑R1 (due to model loading).


Tags: Model Deployment, DeepSeek, Large Language Model, AI Inference, Tencent Cloud
Written by Tencent Tech, Tencent's official tech account, delivering quality technical content for developers.