Deploy DeepSeek Locally on Ubuntu: Build Your Private AI Assistant

This guide explains why you might run a large language model locally (privacy, no network latency, and no per-token costs), then covers hardware requirements, installing Ollama, pulling a suitable DeepSeek‑R1 model, testing it with a coding prompt, and optionally adding a web UI via Docker.


Why Run a Large Model Locally?

Privacy: All code, documents, and conversations stay on your disk.

No network latency: Response time depends on your local hardware (mainly the GPU), not on network conditions.

Free: No token billing; you can call the model 24/7.

🛠️ Preparation: Check Your Hardware

Before starting, open a terminal and run nvidia-smi to verify your GPU.

Entry‑level (7B model): At least 8 GB VRAM (e.g., RTX 3060/4060) or 16 GB unified memory.

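Advanced (32B/70B model): 24 GB+ VRAM (RTX 3090/4090 or multi‑GPU). A single 24 GB card suits the 32B model; the 70B model generally needs multiple GPUs or partial CPU offload.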

CPU‑only: Possible but expect high fan noise and slower inference.

Note: Ensure proprietary GPU drivers are installed. On Ubuntu you can use "Software & Updates → Additional Drivers".
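The hardware tiers above can be turned into a small check. The following is a minimal sketch, not an official tool: the function names are hypothetical, the thresholds simply mirror this article's 8 GB and 24 GB tiers, and they are set slightly below the nominal sizes on the assumption that nvidia-smi may report a total a little under the advertised capacity.

```python
import subprocess

# Thresholds in MiB, taken from the article's tiers (assumption: set a bit
# below 24 GB and 8 GB because reported totals can be slightly lower).
TIERS = [
    (23 * 1024, "deepseek-r1:32b"),
    (7 * 1024 + 512, "deepseek-r1:7b"),
]
FALLBACK = "deepseek-r1:1.5b"


def parse_vram_mib(output: str) -> list[int]:
    """Parse one integer (MiB) per line, skipping blank or non-numeric lines."""
    return [int(line.strip()) for line in output.splitlines() if line.strip().isdigit()]


def suggest_model(vram_mib: int) -> str:
    """Map a VRAM total in MiB to the largest model tag whose tier it meets."""
    for minimum, tag in TIERS:
        if vram_mib >= minimum:
            return tag
    return FALLBACK


def query_vram_mib() -> list[int]:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_vram_mib(result.stdout)
```

On a machine with working drivers, `suggest_model(max(query_vram_mib()))` returns a tag for the largest single GPU. It does not account for unified memory or for splitting a model across several GPUs.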

🚀 Step 1: Install the AI Runtime – Ollama

Ollama is the de‑facto tool for running LLMs on Linux, comparable in simplicity to Docker.

curl -fsSL https://ollama.com/install.sh | sh

After installation, verify the service:

systemctl status ollama

If you see active (running), the engine is up.
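Besides systemctl, the Ollama HTTP API can be probed directly. This is a minimal sketch that assumes Ollama is listening on its default address, http://localhost:11434, and uses the /api/version endpoint; the helper names are hypothetical.

```python
import json
import urllib.error
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def parse_version(body: bytes) -> str:
    """Extract the version string from the JSON returned by /api/version."""
    return json.loads(body)["version"]


def ollama_version(base_url: str = OLLAMA_URL, timeout: float = 3.0):
    """Return the server version, or None if the API cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            return parse_version(resp.read())
    except (urllib.error.URLError, OSError):
        return None
```

Calling `ollama_version()` returns a version string when the service is up and `None` when nothing answers on that port.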

📥 Step 2: Pull a DeepSeek Model

Ollama’s model library includes the DeepSeek‑R1 series, which excels at reasoning and code generation. Choose a model size that matches your VRAM:

7B (recommended for most users): Balanced speed and capability.

ollama run deepseek-r1:7b

1.5B (lightweight): Fits older laptops.

ollama run deepseek-r1:1.5b

32B/70B (performance beasts): For high‑end GPUs.

ollama run deepseek-r1:32b

Ollama will automatically download the model weights (several GB) and then drop you into an interactive prompt.
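To see roughly why these sizes map to different VRAM tiers, a back-of-the-envelope estimate helps. The sketch below assumes about 4 bits per weight (a common quantization level, used here as an assumption rather than the exact format of any particular download) and counts the weights only. Real downloads are usually somewhat larger, and inference needs extra memory for context.

```python
def approx_weights_gb(params_billion: float, bits_per_weight: float = 4.0) -> float:
    """Rough size of the weights alone, in GB (10^9 bytes).

    params_billion * 1e9 weights * bits_per_weight / 8 bytes, divided by 1e9.
    """
    return params_billion * bits_per_weight / 8


# Print the estimate for the sizes mentioned in this step.
for size in (1.5, 7, 32, 70):
    print(f"{size}B -> ~{approx_weights_gb(size):.1f} GB of weights")
```

Under this assumption the 7B model needs about 3.5 GB for weights, the 32B model about 16 GB, and the 70B model about 35 GB, which is more than a single 24 GB card holds.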

💬 Step 3: Hello, DeepSeek!

When the terminal shows the >>> prompt, you can start a conversation. For example:

>>> Write a quicksort algorithm in Python and explain its time complexity.

The model shows its chain‑of‑thought reasoning inside <think> tags and then returns the Python code with an explanation; this visible reasoning is the R1 series’ hallmark. Exit with /bye.
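The same conversation can be driven from a script through Ollama's HTTP API. This is a minimal sketch, assuming the default address http://localhost:11434 and a non-streaming call to /api/generate; `split_think` and `generate` are hypothetical helper names. Depending on the Ollama version, the reasoning may not arrive inline as a <think> block, so the splitter falls back to treating the whole text as the answer.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default Ollama address

# Matches the first <think>...</think> block, across newlines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no <think> block exists."""
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


def generate(prompt: str, model: str = "deepseek-r1:7b") -> str:
    """Send one non-streaming request to /api/generate and return the raw text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())["response"]
```

With the model pulled and the service running, `split_think(generate("Write a quicksort algorithm in Python."))` returns the reasoning and the final answer as separate strings.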

🎨 Advanced: Add a Graphical UI with Open WebUI

If you prefer a ChatGPT‑like web interface or want to upload documents, run Open WebUI in Docker:

# Run Open WebUI container
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main

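After the container starts, open http://localhost:3000 in a browser and register an admin account (data is stored locally). The previously downloaded deepseek-r1:7b model should then appear in the model selector. If the list is empty, check that the container can reach Ollama on the host; by default Ollama listens only on 127.0.0.1:11434.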
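Open WebUI can only offer the models that Ollama reports. A minimal sketch for checking that list from the host, assuming the default Ollama address and the /api/tags endpoint; the helper names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumption: default Ollama address


def model_names(tags_payload: dict) -> list[str]:
    """Pull the model names out of a decoded /api/tags response."""
    return [model["name"] for model in tags_payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Fetch /api/tags from Ollama and return the names of the local models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.loads(resp.read()))
```

If `list_local_models()` includes deepseek-r1:7b but the web UI shows nothing, the problem is more likely the connection between the container and the host than a missing download.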

🔮 Summary

Use the AI workstation to write code, debug logs, or translate documents.

The next article will cover Retrieval‑Augmented Generation (RAG): feeding Ubuntu documentation into DeepSeek to turn it into a true Ubuntu operations expert.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Ubuntu
