Deploy DeepSeek on JD Cloud GPU and Chat with It via Ollama & Chatbox

This guide walks you through preparing a JD Cloud GPU instance, installing NVIDIA drivers, deploying Ollama, running the DeepSeek LLM (including model download and execution), configuring the Chatbox graphical client for interactive queries, and optionally feeding local documents into AnythingLLM for a private knowledge base.


1. Prepare GPU instance on JD Cloud

Requirements: NVIDIA RTX 3090‑class or Tesla P40 GPU, CPU with AVX2/AVX‑512, at least 16 GB RAM, Ubuntu 22.04 LTS.

Create a GPU instance (Tesla P40) in the JD Cloud console, open port 11434 in the security group, set a root password, and launch the VM.

2. Install NVIDIA driver

apt update
ubuntu-drivers devices
apt install nvidia-driver-550 -y
reboot
nvidia-smi

Verify that nvidia-smi lists the Tesla P40 GPUs and driver version 550.xx with no running processes.
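
The verification step can be scripted. A minimal sketch that extracts just the driver version with nvidia-smi's query flags, and degrades gracefully when the driver is not installed yet (the 550 branch is assumed from the install step above):

```shell
# Print the installed NVIDIA driver version, or a notice if the driver
# (and hence nvidia-smi) is not present yet.
if command -v nvidia-smi >/dev/null 2>&1; then
  DRIVER_VERSION="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
  echo "driver: ${DRIVER_VERSION}"
else
  DRIVER_VERSION=""
  echo "nvidia-smi not found - driver not installed yet"
fi
```

On a correctly provisioned instance this prints a 550.xx version string; any other output means the install or reboot step needs to be repeated.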

3. Deploy Ollama

3.1 Download binary

cd /usr/local/src
wget https://myserver.s3.cn-north-1.jdcloud-oss.com/ollama-linux-amd64.tgz
tar -C /usr -xzf ollama-linux-amd64.tgz

3.2 Create systemd service

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
Environment="OLLAMA_HOST=0.0.0.0:11434"
User=ollama
Group=ollama
Restart=always
RestartSec=3

[Install]
WantedBy=default.target

Save the unit file as /etc/systemd/system/ollama.service, then create the dedicated service user and enable the daemon:

useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
usermod -a -G ollama root
systemctl daemon-reload
systemctl enable ollama
systemctl start ollama
systemctl status ollama

Confirm the process is listening on 0.0.0.0:11434 (e.g., ss -ltnp).
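
Beyond checking the listening socket, the service can be smoke-tested over HTTP. A sketch using Ollama's lightweight /api/version endpoint (assumes the daemon runs on the default port 11434 set in the unit file; adjust the host if testing from another machine):

```shell
# Smoke-test the Ollama HTTP API; /api/version returns the server
# version as JSON when the daemon is up.
OLLAMA_URL="http://localhost:11434"
if RESPONSE="$(curl -sf --max-time 3 "${OLLAMA_URL}/api/version" 2>/dev/null)"; then
  echo "ollama reachable: ${RESPONSE}"
else
  RESPONSE=""
  echo "ollama not reachable at ${OLLAMA_URL}"
fi
```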

4. Run DeepSeek model via Ollama

Pull the desired model. Example for the 8‑billion‑parameter version (≈4.9 GB):

ollama run deepseek-r1:8b

For a lighter 1.5 B version (≈1.1 GB), use ollama run deepseek-r1:1.5b. Once the download completes, the command drops into an interactive CLI where queries can be typed.
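
Besides the interactive CLI, the model can be queried programmatically through Ollama's REST API. A minimal sketch that builds a non-streaming request for the /api/generate endpoint (the prompt text is an arbitrary example; the server from section 3 must be running for the final curl call):

```shell
# Build a one-shot, non-streaming generate request for the Ollama API.
MODEL="deepseek-r1:8b"
PROMPT="Explain what a GPU does in one sentence."
PAYLOAD=$(printf '{"model":"%s","prompt":"%s","stream":false}' "$MODEL" "$PROMPT")
echo "$PAYLOAD"
# Send it to the running daemon (uncomment once the service is up):
# curl -s http://localhost:11434/api/generate -d "$PAYLOAD"
```

With "stream":false the server returns a single JSON object containing the full response, which is easier to handle in scripts than the default line-delimited stream.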

5. Access the model with a graphical client (Chatbox)

5.1 Install Chatbox

Download the installer for your platform (Windows .exe or macOS .zip) from https://chatboxai.app/zh#download and run it.

5.2 Configure connection

In Chatbox Settings → API select “Ollama API” and set the endpoint to http://your-host:11434. Choose the model name (e.g., deepseek-r1:8b) and optionally increase the context window.
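
The model name Chatbox expects is whatever the server reports via ollama list; the same list is exposed over HTTP, which is handy when Chatbox runs on a different machine than the server. A sketch against Ollama's /api/tags endpoint (assumes the daemon from section 3 is reachable on the default port):

```shell
# List the models the Ollama server has pulled, via its HTTP API.
TAGS_URL="http://localhost:11434/api/tags"
echo "querying: ${TAGS_URL}"
# Uncomment once the daemon is reachable:
# curl -s "$TAGS_URL"
```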

5.3 Test queries

Start a conversation and ask any question; the responses confirm that the DeepSeek model is served correctly.

6. Optional: Build a local knowledge base with AnythingLLM

6.1 Install AnythingLLM

git clone https://github.com/Mintplex-Labs/anything-llm
cd anything-llm
# Docker example
docker compose up -d
# or npm
npm install
npm start

6.2 Connect to Ollama

In the AnythingLLM UI set the LLM provider to “Ollama” and point the endpoint to http://your-host:11434. Set the embedder to Ollama as well.

6.3 Upload documents

Upload PDFs, Word, Excel, or PPT files. AnythingLLM extracts the text, generates embeddings through the configured Ollama embedder, and stores them for retrieval‑augmented generation.
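
With the embedder set to Ollama, embeddings are produced through Ollama's /api/embeddings endpoint; issuing the same call directly is a quick way to verify the embedder works before uploading large documents. A sketch (the model tag and sample text are illustrative; the daemon must be running for the final curl call):

```shell
# Build a direct embedding request against the Ollama API.
EMBED_PAYLOAD='{"model":"deepseek-r1:8b","prompt":"hello world"}'
echo "$EMBED_PAYLOAD"
# Uncomment once the daemon is reachable; returns a JSON vector:
# curl -s http://localhost:11434/api/embeddings -d "$EMBED_PAYLOAD"
```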

6.4 Query the knowledge base

Ask questions that require information from the uploaded documents; the model will prioritize the local knowledge base over generic responses.

Tags: DeepSeek, Ollama, AnythingLLM, Chatbox, LLM deployment, GPU cloud
Written by JD Tech Talk

Official JD Tech public account delivering best practices and technology innovation.