Build a RAG-Powered Q&A App with Alibaba Cloud Milvus, DashScope & PAI
This guide walks you through creating a Retrieval‑Augmented Generation (RAG) question‑answering application by integrating Alibaba Cloud Milvus vector search, DashScope embedding models, and PAI EAS LLM services, covering prerequisites, service deployment, configuration, Python code setup, and execution steps.
Background
Alibaba Cloud Milvus vector search service (Milvus edition) is a fully managed cloud service, fully compatible with open-source Milvus, that offers scalable AI vector similarity search that is easy to use, secure, and cost-effective, backed by a rich ecosystem. It supports multimodal search, RAG, recommendation, content risk detection, and more.
Prerequisites
Milvus instance created.
PAI (EAS) service enabled and default workspace created.
DashScope service enabled and API‑KEY obtained.
Usage Limits
Milvus instance and PAI (EAS) must be in the same region.
Python 3.8+ required for DashScope.
Solution Architecture
The workflow includes knowledge‑base preprocessing (text splitting), storing embeddings in Milvus, vector similarity retrieval, and RAG dialogue verification using LangChain and a PAI LLM.
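The data flow above can be sketched end to end with a toy, standard-library-only example. Everything here is a stand-in: the bag-of-words `embed` function substitutes for a DashScope embedding call, the in-memory list substitutes for a Milvus collection, and the final prompt assembly substitutes for the PAI LLM call. The point is only to show how splitting, embedding, retrieval, and prompt construction connect.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a DashScope embedding call: a bag-of-words "vector".
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. "Knowledge base": pre-split chunks (Milvus would store their embeddings).
chunks = [
    "Milvus is a vector database built for similarity search.",
    "IVF_FLAT divides vectors into cluster units before searching.",
    "DashScope provides text embedding models such as text-embedding-v2.",
]
index = [(c, embed(c)) for c in chunks]

# 2. Retrieval: rank stored chunks by similarity to the question's embedding.
question = "What is IVF_FLAT?"
q_vec = embed(question)
best = max(index, key=lambda pair: cosine(q_vec, pair[1]))[0]

# 3. Generation: splice the retrieved context into the LLM prompt.
prompt = f"Use the following context to answer.\n{best}\nQuestion: {question}"
print(prompt)
```

In the real application, LangChain wires these same three stages together; the rest of this guide replaces each stand-in with the managed service.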
Operation Steps
Step 1: Deploy the dialogue model service
Log in to the PAI console.
Navigate to Model Deployment > Model Online Service (EAS).
Click “Deploy Service”.
Select the “RAG Dialogue System” model and configure basic parameters (service name, model source, model type, instance count, GPU resources, etc.).
Configure Milvus connection details (version, address, proxy port, account, password, database and collection names, VPC, subnet, security group).
Click “Deploy”. When the service status becomes “Running”, the deployment is successful.
Obtain the VPC address and token from the service overview page.
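Before wiring the service into the script, you can sanity-check the endpoint with a small request. PAI-EAS services authenticate by passing the service token in the `Authorization` header. The sketch below, using only the standard library, builds such a request; the URL, token, and payload shape are placeholders from your own service overview page (the exact request body depends on the deployed model), and the request is only constructed here, not sent.

```python
import json
import urllib.request

EAS_URL = "http://your-service.vpc.pai-eas.aliyuncs.com/"  # placeholder: VPC address from the overview page
EAS_TOKEN = "your_token"                                   # placeholder: token from the overview page

payload = json.dumps({"prompt": "Hello"}).encode("utf-8")  # payload shape depends on the deployed service
req = urllib.request.Request(
    EAS_URL,
    data=payload,
    headers={
        "Authorization": EAS_TOKEN,          # EAS expects the raw token in this header
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(req)  # uncomment to actually call the service
print(req.get_method(), req.get_header("Authorization"))
```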
Step 2: Create and run the Python script
(Optional) Create an ECS instance with public network access to run the script, or run it locally.
Install dependencies:
pip3 install pymilvus langchain langchain-community dashscope beautifulsoup4
Create milvusr-llm.py with the following content:
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Milvus
from langchain.schema.runnable import RunnablePassthrough
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import DashScopeEmbeddings
from langchain_community.llms.pai_eas_endpoint import PaiEasEndpoint
COLLECTION_NAME = 'doc_qa_db'
DIMENSION = 768  # embedding dimension; not used below (the vector store infers it from the data)
loader = WebBaseLoader([
'https://milvus.io/docs/overview.md',
'https://milvus.io/docs/release_notes.md',
# ... other Milvus docs URLs
])
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=0)
all_splits = text_splitter.split_documents(docs)
embeddings = DashScopeEmbeddings(model="text-embedding-v2", dashscope_api_key="your_api_key")
connection_args = {"host":"c-xxxx.milvus.aliyuncs.com","port":"19530","user":"your_user","password":"your_password"}
vector_store = Milvus.from_documents(
    all_splits,
    embedding=embeddings,
    collection_name=COLLECTION_NAME,
    connection_args=connection_args,
    drop_old=True,
)
query = "What are the main components of Milvus?"
docs = vector_store.similarity_search(query)
print(len(docs))
llm = PaiEasEndpoint(eas_service_url="your_pai_eas_url", eas_service_token="your_token")
retriever = vector_store.as_retriever()
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.
{context}
Question: {question}
Helpful Answer:"""
rag_prompt = PromptTemplate.from_template(template)
rag_chain = ({"context": retriever, "question": RunnablePassthrough()} | rag_prompt | llm)
print(rag_chain.invoke("Explain IVF_FLAT in Milvus."))
Replace the placeholders (host, user, password, API-KEY, service URL, token) with your actual values.
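The last lines of the script compose the chain with LangChain's pipe operator: the mapping step fans the input question out to a `context` key (via the retriever) and a `question` key (passed through unchanged), then feeds that dict into the prompt template and the LLM. A standard-library-only mimic of that composition, with hypothetical stand-in classes (`Step`, `fake_retriever`, `fake_llm` are not LangChain APIs), shows the data flow:

```python
class Step:
    """Minimal stand-in for a LangChain runnable: wraps a function and supports `|`."""
    def __init__(self, fn):
        self.fn = fn
    def __or__(self, other):
        # Chaining two steps composes their functions left to right.
        return Step(lambda x: other.fn(self.fn(x)))
    def invoke(self, x):
        return self.fn(x)

# Fans the question out to both keys, like
# {"context": retriever, "question": RunnablePassthrough()}.
fake_retriever = lambda q: "IVF_FLAT clusters vectors into nlist buckets."
mapping = Step(lambda q: {"context": fake_retriever(q), "question": q})

prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}\nAnswer:")
fake_llm = Step(lambda p: f"[LLM answer for prompt of {len(p)} chars]")

chain = mapping | prompt | fake_llm
print(chain.invoke("Explain IVF_FLAT in Milvus."))
```

In the real script, `retriever`, `rag_prompt`, and `llm` play these three roles, and `rag_chain.invoke(...)` triggers the same left-to-right flow.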
Run the script:
python3 milvusr-llm.py
Key Parameters
COLLECTION_NAME : Name of the Milvus collection (default “doc_qa_db”).
model : Embedding model name, e.g., “text-embedding-v2”.
dashscope_api_key : API‑KEY for DashScope.
connection_args : Host (public address), port (Proxy Port), user, and password of the Milvus instance.
eas_service_url : URL of the deployed PAI EAS service.
eas_service_token : Token for the PAI EAS service.
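Two further knobs worth understanding are the splitter's chunk_size and chunk_overlap (the script uses 1024 and 0). A simplified, standard-library-only illustration of overlapping fixed-size windows gives the intuition; the real RecursiveCharacterTextSplitter additionally prefers to cut at paragraph and sentence boundaries rather than at exact character offsets:

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    # Fixed-size sliding window; the step shrinks as the overlap grows,
    # so consecutive chunks share their last/first `chunk_overlap` characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "abcdefghij" * 10  # 100 characters
chunks = split_text(doc, chunk_size=40, chunk_overlap=10)
print(len(chunks), [len(c) for c in chunks])  # → 4 [40, 40, 40, 10]
```

A nonzero overlap keeps sentences that straddle a chunk boundary retrievable from both sides, at the cost of storing some text twice.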
Result
Running the script prints the number of retrieved documents and a concise answer such as “IVF_FLAT is a type of index in Milvus that divides vector data into nlist cluster units …”.
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.