Build a RAG Vector Database with DeepSeek on a Cloud Host – Step‑by‑Step Guide
This tutorial explains how to deploy the DeepSeek‑r1:1.5b model on a cloud server using Ollama, create a retrieval‑augmented generation (RAG) vector database with the mxbai‑embed‑large embedding model, and build an interactive AI application that answers questions from uploaded PDFs.
RAG (Retrieval‑Augmented Generation) combines information retrieval with generative AI: passages relevant to a question are retrieved from a document store and passed to the large language model (LLM) as context, which improves the accuracy of its answers.
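To make the retrieve-then-augment idea concrete, here is a toy sketch. It is not the project's code: it scores passages by shared words instead of real embeddings, and the function names (score, retrieve, augment) and sample passages are made up for illustration.

```python
def score(question: str, passage: str) -> int:
    """Toy relevance: number of distinct lowercase words shared by both texts."""
    q_words = set(question.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words)


def retrieve(question: str, passages: list[str]) -> str:
    """Return the passage with the highest toy relevance score."""
    return max(passages, key=lambda p: score(question, p))


def augment(question: str, context: str) -> str:
    """Place the retrieved passage in front of the question for the LLM."""
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )


# Hypothetical sample data standing in for chunks of an uploaded PDF.
passages = [
    "Ollama runs large language models on a local or cloud host.",
    "A vector database stores embeddings for similarity search.",
]
question = "What does a vector database store?"

best = retrieve(question, passages)
prompt = augment(question, best)
print(prompt)
```

In a real RAG pipeline the word-overlap score would be replaced by similarity between embedding vectors, and the prompt would be sent to the LLM.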
Case Overview
Estimated duration: 60 minutes.
Steps
Install Ollama on the cloud host.
Use Ollama to pull and run the DeepSeek model and the mxbai‑embed‑large embedding model.
Clone the project code and configure it to use the locally served DeepSeek model.
Upload the dataset and build the RAG vector database.
Install Ollama
Run the following command in the cloud host terminal:
curl -fsSL https://ollama.com/install.sh | sh
Deploy DeepSeek Model
Pull and run the model with Ollama:
ollama run deepseek-r1:1.5b
Create Virtual Environment
Open CodeArts IDE for Python, create a new project named “RAG”, enable the “active” setting, and open a terminal where the virtual environment is activated (indicated by (venv)).
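Before building the vector database, it can help to confirm from the activated environment that Ollama is serving both models. The sketch below assumes Ollama listens on its default address, http://localhost:11434, and uses its /api/tags endpoint to list installed models. The helper names (missing_models, fetch_tags) are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default; change if Ollama is bound elsewhere
REQUIRED = ["deepseek-r1:1.5b", "mxbai-embed-large"]


def missing_models(tags_payload: dict, required: list[str]) -> list[str]:
    """Return the required models that do not appear in an /api/tags response."""
    installed = [m["name"] for m in tags_payload.get("models", [])]
    missing = []
    for name in required:
        # Ollama reports untagged pulls with a ":latest" suffix.
        if name not in installed and f"{name}:latest" not in installed:
            missing.append(name)
    return missing


def fetch_tags(base_url: str = OLLAMA_URL) -> dict:
    """Query the running Ollama server for its installed models."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


# On the cloud host, with Ollama running:
#   print(missing_models(fetch_tags(), REQUIRED))
```

If mxbai‑embed‑large is reported missing, running ollama pull mxbai-embed-large in the terminal fetches it.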
Build RAG Vector Database
Clone the repository, enter the directory, and install dependencies:
git clone https://github.com/paquino11/chatpdf-rag-deepseek-r1
cd chatpdf-rag-deepseek-r1
pip install -r requirements.txt
Configure RAG Application
Modify rag.py to set the default LLM and embedding models:
def __init__(self, llm_model: str = "deepseek-r1:1.5b", embedding_model: str = "mxbai-embed-large"):
Run the Application
Start the Streamlit UI:
streamlit run app.py
The browser opens an interface where users can upload PDF documents, adjust retrieval settings (number of results, similarity threshold), view chat history, and ask questions.
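The two retrieval settings can be read as a filter followed by a cut-off: chunks scoring below the similarity threshold are dropped, and at most the requested number of results is kept. The sketch below shows that idea with cosine similarity over tiny made-up vectors. It is an assumption about how such settings typically behave, not the project's actual implementation, and select_chunks is a hypothetical name.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_chunks(query_vec, chunks, k=3, threshold=0.5):
    """Keep chunks scoring at least `threshold`, best first, at most `k` of them."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:k]


# Hypothetical chunks with 2-dimensional stand-in embeddings.
chunks = [
    ("Supervised learning uses labeled data.", [1.0, 0.0]),
    ("Clustering groups unlabeled data.", [0.6, 0.8]),
    ("The cafeteria opens at noon.", [0.0, 1.0]),
]
query_vec = [1.0, 0.0]

results = select_chunks(query_vec, chunks, k=2, threshold=0.5)
for similarity, text in results:
    print(f"{similarity:.2f}  {text}")
```

Raising the threshold trades recall for precision: fewer chunks reach the model, but those that do are closer to the question.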
Example Query
After uploading a PDF containing AI fundamentals, ask “What are the core techniques of machine learning?” The system returns a concise answer that the model generates from passages retrieved from the vector store.
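For readers curious about the generation step behind such a query, the sketch below builds a request for Ollama's /api/generate endpoint using only the standard library. It bypasses the project's rag.py entirely. The default address, the helper names, and the sample prompt are assumptions for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # assumed default Ollama address


def build_generate_request(prompt: str, model: str = "deepseek-r1:1.5b",
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def send(request: urllib.request.Request, timeout: float = 120) -> str:
    """Send the request and return the generated text from the JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["response"]


# Hypothetical prompt: retrieved context followed by the user's question.
prompt = (
    "Context:\nMachine learning includes supervised and unsupervised methods.\n\n"
    "Question: What are the core techniques of machine learning?"
)
request = build_generate_request(prompt)
# answer = send(request)  # run on a host where Ollama is serving the model
```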
The tutorial ends after demonstrating a successful RAG query.
Huawei Cloud Developer Alliance
