Build a RAG Vector Database with DeepSeek on a Cloud Host – Step‑by‑Step Guide

This tutorial explains how to deploy the DeepSeek‑r1:1.5b model on a cloud server using Ollama, create a retrieval‑augmented generation (RAG) vector database with the mxbai‑embed‑large embedding model, and build an interactive AI application that answers questions from uploaded PDFs.

Huawei Cloud Developer Alliance

RAG (Retrieval‑Augmented Generation) combines information retrieval with generative AI to improve large language model (LLM) answer accuracy.

In this tutorial we deploy the DeepSeek‑r1:1.5b model on a cloud host using Ollama, build a vector database with the mxbai‑embed‑large embedding model, and create a RAG‑enabled application.

Case Overview

Estimated duration: 60 minutes.

Steps

1. Install Ollama on the cloud host.

2. Use Ollama to pull and run the DeepSeek model and the mxbai-embed-large embedding model.

3. Clone the project code and configure it to use the locally deployed DeepSeek model.

4. Upload the dataset and build the RAG vector database.
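Conceptually, the vector database built in the last step stores one embedding per document chunk and answers a query by nearest-neighbour search over those embeddings. A minimal sketch of that retrieval step, using toy hand-written vectors in place of real mxbai-embed-large embeddings (all names and vectors here are illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "vector database": (chunk text, embedding) pairs. In the real app the
# embeddings come from the mxbai-embed-large model served by Ollama.
store = [
    ("Supervised learning maps inputs to labels.", [1.0, 0.0, 0.0]),
    ("Ollama serves models over a local HTTP API.", [0.1, 0.9, 0.1]),
    ("Reinforcement learning optimises rewards.",   [0.7, 0.1, 0.3]),
]

def retrieve(query_embedding, k=2):
    """Return the k chunks most similar to the query embedding."""
    ranked = sorted(store, key=lambda item: cosine(query_embedding, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]
```

A query embedding pointing toward the machine-learning chunks (e.g. `retrieve([1.0, 0.0, 0.1])`) ranks those two chunks above the unrelated one, which is exactly the behaviour the RAG application relies on before handing the retrieved text to the LLM.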

Install Ollama

Run the following command in the cloud host terminal:

curl -fsSL https://ollama.com/install.sh | sh

Deploy DeepSeek Model

Pull and run the DeepSeek model, then pull the embedding model used for the vector database:

ollama run deepseek-r1:1.5b
ollama pull mxbai-embed-large
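Once the model is running, Ollama also serves it over a local HTTP API on port 11434, which is how the RAG application talks to it. A minimal sketch of calling the generate endpoint from Python (the helper names are ours, and `generate` assumes a running Ollama server on the host):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt: str, model: str = "deepseek-r1:1.5b") -> dict:
    """Request body for a non-streaming Ollama generate call."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "deepseek-r1:1.5b") -> str:
    """POST the prompt to a locally running Ollama server and return the reply."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Setting `"stream": False` returns the whole answer in one JSON object instead of a stream of partial tokens, which keeps the example simple.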

Create Virtual Environment

Open CodeArts IDE for Python, create a new project named “RAG”, enable the option to create and activate a virtual environment, and open a terminal in the project; the (venv) prefix in the prompt indicates that the virtual environment is active.

Build RAG Vector Database

Clone the repository, enter the directory, and install dependencies:

git clone https://github.com/paquino11/chatpdf-rag-deepseek-r1
cd chatpdf-rag-deepseek-r1
pip install -r requirements.txt

Configure RAG Application

Modify rag.py to set the default LLM and embedding models:

def __init__(self, llm_model: str = "deepseek-r1:1.5b", embedding_model: str = "mxbai-embed-large"):
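In context, this signature simply records which locally served Ollama models the pipeline should use. A stripped-down sketch of how such a class might hold that configuration (the class name and attribute names are illustrative, not necessarily what rag.py actually uses):

```python
class ChatPDF:
    """Illustrative stand-in for the RAG class configured in rag.py."""

    def __init__(self, llm_model: str = "deepseek-r1:1.5b",
                 embedding_model: str = "mxbai-embed-large"):
        # The LLM answers questions; the embedding model vectorises PDF chunks.
        self.llm_model = llm_model
        self.embedding_model = embedding_model
```

With these defaults in place, the rest of the application can instantiate the class without arguments and still target the models deployed earlier with Ollama.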

Run the Application

Start the Streamlit UI:

streamlit run app.py

The browser opens an interface where users can upload PDF documents, adjust retrieval settings (number of results, similarity threshold), view chat history, and ask questions.
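The two retrieval settings map directly onto the vector search: “number of results” caps how many chunks are returned, and “similarity threshold” discards weak matches before they reach the LLM. A sketch of that filter over (chunk, score) pairs (function and variable names are ours):

```python
def filter_results(scored_chunks, k=5, threshold=0.5):
    """Keep at most k chunks whose similarity score meets the threshold,
    ordered from most to least similar."""
    kept = [(text, score) for text, score in scored_chunks if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:k]

# Example: one chunk falls below the threshold and is dropped.
scored = [("chunk A", 0.91), ("chunk B", 0.42), ("chunk C", 0.77)]
```

Raising the threshold trades recall for precision: fewer chunks reach the prompt, but each is more likely to be relevant to the question.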

Example Query

After uploading a PDF containing AI fundamentals, ask “What are the core techniques of machine learning?”; the system returns a concise answer retrieved from the vector store.

The tutorial ends after demonstrating a successful RAG query.

Written by

Huawei Cloud Developer Alliance

The Huawei Cloud Developer Alliance creates a tech sharing platform for developers and partners, gathering Huawei Cloud product knowledge, event updates, expert talks, and more. Together we continuously innovate to build the cloud foundation of an intelligent world.
