Artificial Intelligence 9 min read

Build a RAG‑Powered Smart Q&A Assistant with Milvus, DeepSeek, and PAI LangStudio

This step‑by‑step guide shows how to assemble a Retrieval‑Augmented Generation (RAG) system using Alibaba Cloud Milvus vector search, the DeepSeek large language model, and PAI LangStudio, covering instance creation, data upload, model deployment, connection setup, flow design, and service invocation.

Alibaba Cloud Big Data AI Platform

Feb 25, 2025

Build a RAG‑Powered Smart Q&A Assistant with Milvus, DeepSeek, and PAI LangStudio

Why struggle with massive data but no real "intelligence"? This guide shows how to quickly assemble a Retrieval‑Augmented Generation (RAG) system using Alibaba Cloud Milvus vector search, DeepSeek large language model, and PAI LangStudio.

Create Milvus instance

Follow the official guide to create a Milvus instance; ensure the instance and later services are in the same region.

Upload knowledge base to OSS

Provide example corpora for finance (PDF news) and medical (CSV disease descriptions) with the following links:

Financial news PDF: https://x.sm.cn/3aqO6uZ

Disease introduction CSV: https://x.sm.cn/1k0Z4Rv

Deploy DeepSeek and Embedding models

In the PAI console, go to Quick Start > ModelGallery , select the DeepSeek‑R1‑Distill‑Qwen‑7B model and the bge‑m3 embedding model, and deploy them.

Obtain the VPC access address and token for each deployed service.

Create connections

In LangStudio, create a new LLM service connection (EAS service) and fill in base_url and api_key with the values from the deployed DeepSeek model.

Similarly create an Embedding service connection using the corresponding base_url and api_key from the embedding model.

Create vector database connection

Create a Milvus database connection. Key parameters:

uri : Milvus instance address, e.g., http://<Milvus internal address> token : login credentials in the form <username>:<password> database : database name, default is

default

Create offline knowledge base

Build a knowledge‑base index that parses, chunks, and vectorizes the corpus into Milvus. Detailed configuration is available in the official documentation.

Build and run the RAG application flow

In LangStudio, create a new application flow using the RAG template. Configure the following key nodes:

index_lookup : set registered_index (the created knowledge‑base index), query (user question), and top_k (number of results).

generate_answer : select the LLM service connection, set model to default, and define max_tokens (e.g., 1000).

Start the runtime, debug, and run the flow.

Debug/run the flow.

View the trace or topology via “View Trace”.

Deploy the application flow

Deploy with default settings; configure instance count (e.g., 1 for testing) and VPC to match the Milvus instance.

Call the service for conversation

After deployment, use PAI‑EAS online debugging to send a request. Example request body:

{
    "question": "请根据最新的新闻报道，分析美国科技行业目前投资分管性如何，是否存在泡沫，给出是或否的具体回答"
}

Send the request and view the generated answer.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

LLM RAG Milvus vector search DeepSeek AI Tutorial PAI LangStudio

Written by

Alibaba Cloud Big Data AI Platform

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.