How to Deploy DeepSeek‑R1 Locally with Ollama and Dify: A Step‑by‑Step Guide
This article walks through the entire process of deploying the DeepSeek‑R1 large language model on a personal machine, covering hardware requirements, Ollama installation, model download, service startup, remote access configuration, and visual UI integration with Dify, complete with concrete commands and screenshots.
1. Understanding DeepSeek‑R1
The author begins by summarizing DeepSeek’s positioning: DeepSeek‑R1 (released 2025‑01‑20) is a 671B‑parameter model with a 128K context window, marketed as comparable to OpenAI‑o1. It ships in two forms: the full‑size model and a set of distilled variants ranging from 1.5B to 70B parameters. Because the full model demands extreme hardware, the tutorial uses the 32B distilled variant.
2. Preparation
2.1 Hardware
The author lists a personal workstation used for the demo: 32 GB RAM, Tesla T4 16 GB GPU, 32‑core CPU, tLinux 3.1 OS, and a 1 TB SSD. Minimum requirements for other platforms are also enumerated (e.g., Windows GTX 1650 4 GB, Linux GTX 1660 6 GB, Mac M2 Air 8 GB).
2.2 Software
Ollama is introduced as an open‑source local LLM runtime written in Go, analogous to Docker for model serving. Installation on Linux is performed with a single command:

curl -fsSL https://ollama.com/install.sh | sh

Verification is done via ollama -h. The model is then pulled from Ollama’s registry:
# Download model
ollama pull deepseek-r1:32b

Downloading may take several hours depending on network speed, as shown by a screenshot of the progress bar.
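Before moving on, the local cache can be confirmed with Ollama’s built‑in listing command (a standard check, not shown in the article); the 32b tag should appear with its size on disk:

# Confirm the model was downloaded and cached
ollama list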
3. Deploying the Model
With the model cached, the service is started:
# Start Ollama API service
ollama serve
# Run the model
ollama run deepseek-r1:32b

Testing the endpoint uses curl:
curl http://localhost:11434/api/generate -d '{
"model": "deepseek-r1:32b",
"prompt":"Why is the sky blue?"
}'

The response contains the generated answer, confirming the model is operational.
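One detail worth noting: /api/generate streams the answer back as a sequence of JSON lines by default. For a single consolidated JSON object, which is easier to read in a terminal, the same request can set "stream": false, a minimal variation on the command above:

# Same request, but ask for one JSON object instead of a stream
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:32b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'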
To allow other machines to reach the service, the systemd unit file is edited to set OLLAMA_HOST=0.0.0.0:11434, followed by systemctl restart ollama. The author also shows how to stop/start the service with systemctl.
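The article shows this edit in a screenshot rather than as text; a minimal sketch of the usual approach, assuming Ollama’s install script registered the ollama systemd service, looks like this:

# Create a drop-in override for the service
sudo systemctl edit ollama

# In the editor that opens, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0:11434"

# Apply the change and restart
sudo systemctl daemon-reload
sudo systemctl restart ollama

# Verify reachability from another machine (replace <server-ip>)
curl http://<server-ip>:11434/api/tags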
4. Adding a Visual UI with Dify
Dify (GitHub: https://github.com/langgenius/dify) is recommended for a web UI. Installation uses Docker Compose:
# Clone and start Dify
git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env
docker-compose up -d

After the containers are up, the user accesses http://localhost/install to create an admin account. In the Dify dashboard, the model supplier is set to Ollama and the model name to DeepSeek-32B, as illustrated by a screenshot.
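A caveat the screenshots gloss over: because Dify itself runs inside Docker, entering http://localhost:11434 as the Ollama base URL will usually fail, since localhost resolves to the Dify container rather than the host. Assuming the default Compose setup, the host’s LAN IP (or host.docker.internal on Docker Desktop) is the safer choice, and container health can be checked first:

# Confirm all Dify containers are up before opening the dashboard
docker-compose ps

# Base URL examples for the Ollama provider in Dify:
#   http://<host-lan-ip>:11434         (Linux host)
#   http://host.docker.internal:11434  (Docker Desktop)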
Finally, a new “Chat Assistant” application is created, the DeepSeek‑32B model is selected, and the chat interface is demonstrated with screenshots showing successful responses and reasonable latency.
The author notes additional Dify features such as knowledge‑base ingestion for building AI‑powered customer service bots, and promises further AI‑related tutorials.