Build a RAG‑Powered Smart Q&A Assistant with Milvus, DeepSeek, and PAI LangStudio
This step‑by‑step guide shows how to assemble a Retrieval‑Augmented Generation (RAG) system using Alibaba Cloud Milvus vector search, the DeepSeek large language model, and PAI LangStudio, covering instance creation, data upload, model deployment, connection setup, flow design, and service invocation.
Why struggle with massive data but no real "intelligence"? This guide shows how to quickly assemble a Retrieval‑Augmented Generation (RAG) system using Alibaba Cloud Milvus vector search, DeepSeek large language model, and PAI LangStudio.
Create Milvus instance
Follow the official guide to create a Milvus instance; ensure the instance and later services are in the same region.
Upload knowledge base to OSS
Provide example corpora for finance (PDF news) and medical (CSV disease descriptions) with the following links:
Financial news PDF: https://x.sm.cn/3aqO6uZ
Disease introduction CSV: https://x.sm.cn/1k0Z4Rv
Deploy DeepSeek and Embedding models
In the PAI console, go to Quick Start > ModelGallery , select the DeepSeek‑R1‑Distill‑Qwen‑7B model and the bge‑m3 embedding model, and deploy them.
Obtain the VPC access address and token for each deployed service.
Create connections
In LangStudio, create a new LLM service connection (EAS service) and fill in base_url and api_key with the values from the deployed DeepSeek model.
Similarly create an Embedding service connection using the corresponding base_url and api_key from the embedding model.
Create vector database connection
Create a Milvus database connection. Key parameters:
uri : Milvus instance address, e.g., http://<Milvus internal address> token : login credentials in the form <username>:<password> database : database name, default is
defaultCreate offline knowledge base
Build a knowledge‑base index that parses, chunks, and vectorizes the corpus into Milvus. Detailed configuration is available in the official documentation.
Build and run the RAG application flow
In LangStudio, create a new application flow using the RAG template. Configure the following key nodes:
index_lookup : set registered_index (the created knowledge‑base index), query (user question), and top_k (number of results).
generate_answer : select the LLM service connection, set model to default, and define max_tokens (e.g., 1000).
Start the runtime, debug, and run the flow.
Debug/run the flow.
View the trace or topology via “View Trace”.
Deploy the application flow
Deploy with default settings; configure instance count (e.g., 1 for testing) and VPC to match the Milvus instance.
Call the service for conversation
After deployment, use PAI‑EAS online debugging to send a request. Example request body:
{
"question": "请根据最新的新闻报道,分析美国科技行业目前投资分管性如何,是否存在泡沫,给出是或否的具体回答"
}Send the request and view the generated answer.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
