How to Deploy DeepSeek LLM Locally on JD Cloud GPU with Ollama and Chatbox
Learn step‑by‑step how to prepare a JD Cloud GPU instance, install GPU drivers, deploy Ollama, run DeepSeek‑R1 models, configure graphical clients like Chatbox on Windows and macOS, and optionally feed local data using AnythingLLM to build an offline knowledge base.
1. JD Cloud GPU Host Preparation
DeepSeek models have varying hardware requirements; a dedicated GPU (e.g., NVIDIA RTX 3090 or equivalent) is recommended, and the CPU should support AVX2/AVX‑512 for optimal performance.
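Before provisioning, you can confirm AVX2/AVX‑512 support on any candidate Linux CPU with a quick flag check (a small sketch; output depends on the machine):

```shell
# Check for AVX2 / AVX-512 support in the CPU feature flags (Linux)
grep -m1 -o 'avx2' /proc/cpuinfo || echo "AVX2 not reported"
# List any AVX-512 extensions, one per line (empty if unsupported)
grep -o 'avx512[a-z]*' /proc/cpuinfo | sort -u
```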
1.1 Create GPU Instance
Select billing mode, region, CPU architecture, Ubuntu 22.04 LTS image, and a Tesla P40 GPU. The following screenshots illustrate the process.
Configure instance type, storage, network, security group (allow port 11434), and set a password before confirming purchase.
After creation, verify the instance runs successfully.
1.2 Install GPU Driver
Connect via SSH and run the following commands on Ubuntu 22.04 to install the recommended NVIDIA driver (550) and reboot.
<code>root@deepseek-vm:~# apt update
root@deepseek-vm:~# ubuntu-drivers devices
# Identify the recommended driver, e.g., nvidia-driver-550
root@deepseek-vm:~# apt install nvidia-driver-550 -y
root@deepseek-vm:~# reboot
root@deepseek-vm:~# nvidia-smi # verify driver and GPU status</code>
Sample nvidia-smi output shows two Tesla P40 GPUs with no memory in use.
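For scripted checks, nvidia-smi's query mode emits the same information as machine-readable CSV:

```shell
# Print GPU index, model name, driver version, and memory per card as CSV
nvidia-smi --query-gpu=index,name,driver_version,memory.total,memory.used \
  --format=csv
```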
2. Deploy Ollama
Ollama is an open‑source LLM service that simplifies local deployment.
2.1 Download Binary Package
Download the Linux amd64 tarball (v0.5.7) from the provided JD Cloud OSS link or from the official GitHub releases.
<code>root@deepseek-vm:~# cd /usr/local/src/
root@deepseek-vm:/usr/local/src# wget https://myserver.s3.cn-north-1.jdcloud-oss.com/ollama-linux-amd64.tgz</code>
2.2 Install and Run Ollama
<code>root@deepseek-vm:/usr/local/src# tar -C /usr -xzf ollama-linux-amd64.tgz
root@deepseek-vm:/usr/local/src# ollama serve # start the service</code>
On first run, Ollama generates a new SSH key pair and logs startup information, including the listening address 127.0.0.1:11434.
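By default Ollama binds only to 127.0.0.1, so remote clients such as Chatbox cannot reach it. The OLLAMA_HOST environment variable changes the bind address (the same setting appears in the systemd unit in section 2.4; port 11434 must also be open in the security group):

```shell
# Bind Ollama to all interfaces so remote clients can connect
OLLAMA_HOST=0.0.0.0:11434 ollama serve

# From another machine, confirm the API answers
curl http://<your-host>:11434/
```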
2.3 Verify Ollama Is Running
<code>root@deepseek-vm:~# ollama -v
ollama version is 0.5.7
root@deepseek-vm:~# curl -s http://127.0.0.1:11434/ # API health check
Ollama is running</code>
2.4 Create Systemd Service
<code># /etc/systemd/system/ollama.service
[Unit]
Description=Ollama Service
After=network-online.target
[Service]
ExecStart=/usr/bin/ollama serve
Environment="OLLAMA_HOST=0.0.0.0:11434"
User=ollama
Group=ollama
Restart=always
RestartSec=3
# systemd does not expand $PATH; spell the search path out explicitly
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
[Install]
WantedBy=default.target</code>
The unit runs as User=ollama, which the tarball install does not create. Create the service account, then enable and start the service:
<code>root@deepseek-vm:~# useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
root@deepseek-vm:~# systemctl daemon-reload
root@deepseek-vm:~# systemctl enable ollama
root@deepseek-vm:~# systemctl start ollama
root@deepseek-vm:~# systemctl status ollama</code>
3. Run DeepSeek Model via Ollama
Use Ollama to pull and run DeepSeek‑R1 models. The 1.5B model requires ~1.1 GB download; the 8B model requires ~4.9 GB.
<code># Run 1.5B model
root@deepseek-vm:~# ollama run deepseek-r1:1.5b
# Run 8B model
root@deepseek-vm:~# ollama run deepseek-r1:8b</code>
After the download completes, an interactive CLI appears for chatting with the model.
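Besides the interactive CLI, the same model is reachable over Ollama's REST API, which is what graphical clients like Chatbox use under the hood:

```shell
# One-shot, non-streaming completion against the 1.5B model
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "What is Kubernetes?",
  "stream": false
}'
```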
4. Graphical Client – Chatbox
Chatbox provides a cross‑platform GUI for interacting with Ollama‑served models.
4.1 Windows Installation
Download the Windows installer, run it, choose installation scope and path, then launch Chatbox.
4.2 Configure Chatbox
In Settings, select "Ollama API", set the API endpoint to http://<your-host>:11434, and choose the desired DeepSeek model version.
4.3 Test Conversation
Ask sample questions such as "Are there aliens?" or "Explain Kubernetes"; Chatbox displays the model's responses.
4.4 macOS Installation
Download the macOS zip, extract it, move the app to /Applications, and trust the developer in System Settings.
Configure the same Ollama endpoint as on Windows and test the conversation.
5. Local Data Feeding with AnythingLLM
AnythingLLM can ingest local documents and answer questions through the Ollama‑served DeepSeek model, enabling a private, offline knowledge base.
5.1 Install AnythingLLM
Download the Docker compose file or binary from the official repository and start the service.
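As one concrete option, AnythingLLM publishes a Docker image; the image name and flags below follow the project's documented invocation, but check them against the current README before use:

```shell
# Persist AnythingLLM state on the host
export STORAGE_LOCATION=$HOME/anythingllm
mkdir -p "$STORAGE_LOCATION" && touch "$STORAGE_LOCATION/.env"

docker run -d -p 3001:3001 \
  --cap-add SYS_ADMIN \
  -v "$STORAGE_LOCATION:/app/server/storage" \
  -v "$STORAGE_LOCATION/.env:/app/server/.env" \
  -e STORAGE_DIR="/app/server/storage" \
  mintplexlabs/anythingllm
```

The UI is then available at http://<your-host>:3001.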
5.2 Basic Configuration
Set the LLM preference to DeepSeek (via Ollama), choose "Query" mode, and configure the embedder to use Ollama.
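The same preferences can be pre-seeded through environment variables instead of the UI. The variable names below follow AnythingLLM's .env conventions and may change between releases, so treat this as a sketch:

```shell
# server/.env — point AnythingLLM's LLM and embedder at the local Ollama
LLM_PROVIDER='ollama'
OLLAMA_BASE_PATH='http://127.0.0.1:11434'
OLLAMA_MODEL_PREF='deepseek-r1:8b'
EMBEDDING_ENGINE='ollama'
EMBEDDING_BASE_PATH='http://127.0.0.1:11434'
# The embedder needs an embedding-capable model pulled in Ollama beforehand
```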
5.3 Upload Documents
Upload PDFs, TXT, Word, Excel, or PPT files; the system extracts text and indexes it.
5.4 Query the Knowledge Base
Ask questions that are answered using the uploaded documents; if the query is outside the knowledge base, the model falls back to its general knowledge.
This guide provides a complete end‑to‑end workflow for deploying DeepSeek locally, exposing it via Ollama, interacting through a graphical client, and extending its capabilities with a private document‑based knowledge base.
JD Cloud Developers
JD Cloud Developers is JD Technology Group's platform for technical sharing and communication among AI, cloud computing, IoT, and related developers. It publishes JD product technical information, industry content, and tech event news.