Deploy DeepSeek Locally on Ubuntu: Build Your Private AI Assistant
This guide explains why you might run a large language model locally (privacy, no network latency, and no token costs), then covers hardware requirements, installing Ollama, pulling a suitable DeepSeek-R1 model, testing it with a coding prompt, and optionally adding a web UI via Docker.
Why Run a Large Model Locally?
Privacy: All code, documents, and conversations stay on your disk.
No network latency: Response time depends on your local hardware, not on network conditions or a remote service's load.
Free: No token billing; you can call the model 24/7.
🛠️ Preparation: Check Your Hardware
Before starting, open a terminal and run nvidia-smi to verify your GPU.
Entry‑level (7B model): At least 8 GB VRAM (e.g., RTX 3060/4060) or 16 GB unified memory.
Advanced (32B/70B models): 24 GB+ VRAM (RTX 3090/4090) for 32B; 70B typically needs a multi-GPU setup.
CPU‑only: Possible but expect high fan noise and slower inference.
Note: Ensure proprietary GPU drivers are installed. On Ubuntu you can use "Software & Updates → Additional Drivers".
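If you would rather read the VRAM figure programmatically than from the nvidia-smi table, one option is the tool's CSV query mode. The following is a minimal sketch, not a definitive implementation: the helper names parse_vram_mib and query_vram_mib are illustrative, and it assumes nvidia-smi is on the PATH with a working driver.

```python
import shutil
import subprocess


def parse_vram_mib(csv_output: str) -> list:
    """Parse the output of
    `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits`,
    which prints one integer (MiB) per GPU, one GPU per line.
    Lines that are not plain integers are skipped."""
    values = []
    for line in csv_output.splitlines():
        line = line.strip()
        if line.isdigit():
            values.append(int(line))
    return values


def query_vram_mib() -> list:
    """Return total VRAM per GPU in MiB, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return parse_vram_mib(result.stdout)


if __name__ == "__main__":
    gpus = query_vram_mib()
    if not gpus:
        print("No NVIDIA GPU detected; expect CPU-only inference.")
    for index, mib in enumerate(gpus):
        print(f"GPU {index}: {mib / 1024:.1f} GiB VRAM")
```

An empty result usually means the driver is missing or the machine has no NVIDIA GPU, in which case the CPU-only caveat above applies.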
🚀 Step 1: Install the AI Runtime – Ollama
Ollama is one of the most widely used tools for running LLMs on Linux, comparable in simplicity to Docker. Install it with the official script:
curl -fsSL https://ollama.com/install.sh | sh
After installation, verify the service:
systemctl status ollama
If you see active (running), the engine is up.
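Besides systemctl, you can confirm that the Ollama HTTP server is answering. The sketch below assumes the default listen address http://localhost:11434; the function name ollama_is_up is illustrative. It only checks that something replies with HTTP 200 at that address, so treat it as a quick smoke test rather than a full health check.

```python
import urllib.request


def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url.

    The default URL is an assumption: it is Ollama's default address,
    and it will differ if OLLAMA_HOST has been changed.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections, and timeouts
        # are all subclasses of OSError.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_up())
```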
📥 Step 2: Pull a DeepSeek Model
Ollama’s model library includes the DeepSeek‑R1 series, which excels at reasoning and code generation. Choose a model size that matches your VRAM:
7B (recommended for most users): Balanced speed and capability.
ollama run deepseek-r1:7b
1.5B (lightweight): Fits older laptops.
ollama run deepseek-r1:1.5b
32B/70B (performance beasts): For high-end GPUs.
ollama run deepseek-r1:32b
Ollama will automatically download the model weights (several GB) and then drop you into an interactive prompt.
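To make the "match the model to your VRAM" advice concrete, here is a rough back-of-the-envelope sketch. It is a rule of thumb under stated assumptions, not a measurement: it assumes weights quantized to about 4.5 bits each and a 20% margin for the KV cache and runtime buffers. The function names are illustrative, and the deepseek-r1:70b tag is inferred from the "32B/70B" tier above.

```python
from typing import Optional

# Parameter counts in billions for the tags discussed above.
# "deepseek-r1:70b" is assumed from the 32B/70B tier.
MODEL_PARAMS_B = {
    "deepseek-r1:1.5b": 1.5,
    "deepseek-r1:7b": 7.0,
    "deepseek-r1:32b": 32.0,
    "deepseek-r1:70b": 70.0,
}


def estimate_vram_gb(params_b: float, bits_per_weight: float = 4.5,
                     overhead: float = 1.2) -> float:
    """Estimate memory in GB: parameters * bits / 8, times an overhead factor.

    bits_per_weight=4.5 and overhead=1.2 are assumptions, not measured values.
    """
    return params_b * bits_per_weight / 8 * overhead


def largest_fitting_tag(vram_gb: float) -> Optional[str]:
    """Return the largest tag whose estimate fits in vram_gb, or None."""
    fitting = [
        (size, tag)
        for tag, size in MODEL_PARAMS_B.items()
        if estimate_vram_gb(size) <= vram_gb
    ]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for vram in (4, 8, 24, 48):
        print(f"{vram} GB -> {largest_fitting_tag(vram)}")
```

Long contexts increase the KV cache well beyond the assumed margin, so leave extra headroom if you plan to paste large files into the prompt.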
💬 Step 3: Hello, DeepSeek!
When the terminal shows the >>> prompt, you can start a conversation. For example:
>>> Write a quicksort algorithm in Python and explain its time complexity.
The model returns Python code and shows its chain-of-thought reasoning inside <think> tags, a hallmark of the R1 series. Exit with /bye.
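The same conversation can be scripted against Ollama's local REST API instead of the interactive prompt. The sketch below is a hedged example: it assumes the default address http://localhost:11434 and that the reasoning arrives inline between <think> and </think> in the response text (some Ollama versions can return it in a separate field instead). The helper names split_reasoning and generate are illustrative.

```python
import json
import re
import urllib.request
from typing import Tuple

# Matches the first <think>...</think> block, including newlines.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Separate the <think>...</think> reasoning from the final answer.

    Returns (reasoning, answer); reasoning is "" if no block is present.
    """
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


def generate(prompt: str, model: str = "deepseek-r1:7b",
             base_url: str = "http://localhost:11434") -> str:
    """POST to /api/generate with streaming disabled and return the reply text."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    # Generous timeout: local generation can take minutes on modest hardware.
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.loads(response.read())["response"]
```

With the Ollama service running, calling split_reasoning(generate("Write a quicksort algorithm in Python and explain its time complexity.")) gives you the reasoning and the answer as two separate strings, which is handy if you only want to log or display the final code.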
🎨 Advanced: Add a Graphical UI with Open WebUI
If you prefer a ChatGPT‑like web interface or want to upload documents, run Open WebUI in Docker:
# Run Open WebUI container
docker run -d -p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
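ghcr.io/open-webui/open-webui:main
After the container starts, open http://localhost:3000 in a browser and register an admin account (data is stored locally). The previously downloaded deepseek-r1:7b model should then be available for selection. If no models are listed, check that the container can reach Ollama, which by default listens only on 127.0.0.1.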
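When the web UI shows no models, it helps to ask Ollama directly which models it has installed. The sketch below queries the /api/tags endpoint on the assumed default address http://localhost:11434; the helper names model_names and fetch_model_names are illustrative.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list:
    """Extract model names from the JSON object returned by GET /api/tags."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def fetch_model_names(base_url: str = "http://localhost:11434") -> list:
    """Return installed model names, or [] if the server cannot be reached."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
            return model_names(json.loads(response.read()))
    except OSError:
        # Covers refused connections, timeouts, and HTTP errors.
        return []


if __name__ == "__main__":
    names = fetch_model_names()
    print("Installed models:", names if names else "none found (is Ollama running?)")
```

If deepseek-r1:7b appears here but not in Open WebUI, the problem is most likely the connection between the container and the host rather than the model download.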
🔮 Summary
You now have a local AI workstation that can write code, analyze logs, or translate documents.
The next article will cover Retrieval-Augmented Generation (RAG) to feed Ubuntu documentation into DeepSeek, turning it into a capable Ubuntu operations assistant.
