Build a RAG-Powered Q&A App with Alibaba Cloud Milvus, DashScope & PAI

This guide walks you through creating a Retrieval‑Augmented Generation (RAG) question‑answering application by integrating Alibaba Cloud Milvus vector search, DashScope embedding models, and PAI EAS LLM services, covering prerequisites, service deployment, configuration, Python code setup, and execution steps.


Background

Alibaba Cloud Milvus vector search service (Milvus edition) is a fully managed cloud service that is fully compatible with open‑source Milvus, offering scalable AI vector similarity search that combines ease of use, security, and low cost with broad ecosystem support. It supports multimodal search, RAG, recommendation, content risk detection, and more.

Prerequisites

Milvus instance created.

PAI (EAS) service enabled and default workspace created.

DashScope service enabled and API‑KEY obtained.

Usage Limits

Milvus instance and PAI (EAS) must be in the same region.

Python 3.8+ required for DashScope.
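A quick sanity check for the Python requirement can be run before installing anything; this is a small sketch, not part of the official setup:

```python
import sys

# The DashScope SDK requires Python 3.8 or later; fail fast if the
# interpreter running this script is older.
if sys.version_info < (3, 8):
    raise RuntimeError(
        f"Python 3.8+ required, found {sys.version.split()[0]}"
    )
print("Python version OK:", sys.version.split()[0])
```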

Solution Architecture

The workflow includes knowledge‑base preprocessing (text splitting), storing embeddings in Milvus, vector similarity retrieval, and RAG dialogue verification using LangChain and a PAI LLM.
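The splitting step later uses LangChain's RecursiveCharacterTextSplitter with chunk_size=1024 and chunk_overlap=0. To make those two parameters concrete, here is a deliberately simplified fixed-size splitter (the real splitter additionally tries to break on paragraph and sentence boundaries):

```python
def split_text(text: str, chunk_size: int = 1024, chunk_overlap: int = 0) -> list[str]:
    """Naive fixed-size splitter illustrating chunk_size/chunk_overlap.
    Each chunk starts (chunk_size - chunk_overlap) characters after the
    previous one, so overlapping chunks share a suffix/prefix."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "x" * 2500
chunks = split_text(doc, chunk_size=1024, chunk_overlap=0)
print(len(chunks), [len(c) for c in chunks])  # 3 [1024, 1024, 452]
```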

Solution architecture diagram

Operation Steps

Step 1: Deploy the dialogue model service

Log in to the PAI console.

Navigate to Model Deployment > Model Online Service (EAS).

Click “Deploy Service”.

Select the “RAG Dialogue System” model and configure basic parameters (service name, model source, model type, instance count, GPU resources, etc.).

Configure Milvus connection details (version, address, proxy port, account, password, database and collection names, VPC, subnet, security group).

Click “Deploy”. When the service status becomes “Running”, the deployment is successful.

Obtain the VPC address and token from the service overview page.
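Once you have the service URL and token, you can verify connectivity from code. EAS authenticates requests via the service token in the Authorization header; the JSON payload shape below ({"prompt": ...}) is an assumption for illustration — check the request format of your deployed RAG service in the PAI console:

```python
import json
from urllib import request

def build_eas_request(service_url: str, token: str, prompt: str) -> request.Request:
    """Build an authenticated POST request for a PAI EAS endpoint.
    The token goes in the Authorization header; the body schema is a
    placeholder and must match your service's actual request format."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return request.Request(
        service_url,
        data=body,
        headers={"Authorization": token, "Content-Type": "application/json"},
    )

req = build_eas_request("http://your-service.pai-eas.aliyuncs.com/", "your_token", "ping")
print(req.get_header("Authorization"))
# To actually send it:
#   with request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```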

Step 2: Create and run the Python script

(Optional) Create an ECS instance with public network for running the script, or run locally.

Install dependencies:

pip3 install pymilvus langchain dashscope beautifulsoup4

Create milvusr-llm.py with the following content:

from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Milvus
from langchain.schema.runnable import RunnablePassthrough
from langchain.prompts import PromptTemplate
from langchain_community.embeddings import DashScopeEmbeddings
from langchain_community.llms.pai_eas_endpoint import PaiEasEndpoint

COLLECTION_NAME = 'doc_qa_db'
DIMENSION = 768

loader = WebBaseLoader([
    'https://milvus.io/docs/overview.md',
    'https://milvus.io/docs/release_notes.md',
    # ... other Milvus docs URLs
])
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1024, chunk_overlap=0)
all_splits = text_splitter.split_documents(docs)

embeddings = DashScopeEmbeddings(model="text-embedding-v2", dashscope_api_key="your_api_key")
connection_args = {"host":"c-xxxx.milvus.aliyuncs.com","port":"19530","user":"your_user","password":"your_password"}
vector_store = Milvus.from_documents(
    all_splits,
    embedding=embeddings,
    collection_name=COLLECTION_NAME,
    connection_args=connection_args,
    drop_old=True,  # drop any existing collection with the same name
)

query = "What are the main components of Milvus?"
docs = vector_store.similarity_search(query)
print(len(docs))

llm = PaiEasEndpoint(eas_service_url="your_pai_eas_url", eas_service_token="your_token")
retriever = vector_store.as_retriever()
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.
{context}
Question: {question}
Helpful Answer:"""
rag_prompt = PromptTemplate.from_template(template)
rag_chain = ({"context": retriever, "question": RunnablePassthrough()} | rag_prompt | llm)

print(rag_chain.invoke("Explain IVF_FLAT in Milvus."))

Replace placeholders (host, user, password, API‑KEY, service URL, token) with your actual values.
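Rather than hardcoding credentials in the script, you may prefer to read them from environment variables. The variable names below are hypothetical — use whatever scheme fits your setup and export the values before running:

```python
import os

# Hypothetical variable names; export them before running, e.g.:
#   export DASHSCOPE_API_KEY=sk-...
#   export MILVUS_HOST=c-xxxx.milvus.aliyuncs.com
# The second argument to .get() falls back to the placeholder values
# used in the script above.
dashscope_api_key = os.environ.get("DASHSCOPE_API_KEY", "your_api_key")
connection_args = {
    "host": os.environ.get("MILVUS_HOST", "c-xxxx.milvus.aliyuncs.com"),
    "port": os.environ.get("MILVUS_PORT", "19530"),
    "user": os.environ.get("MILVUS_USER", "your_user"),
    "password": os.environ.get("MILVUS_PASSWORD", "your_password"),
}
eas_service_url = os.environ.get("PAI_EAS_URL", "your_pai_eas_url")
eas_service_token = os.environ.get("PAI_EAS_TOKEN", "your_token")
```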

Run the script:

python3 milvusr-llm.py

Key Parameters

COLLECTION_NAME : Name of the Milvus collection ("doc_qa_db" in this example).

model : Embedding model name, e.g., “text-embedding-v2”.

dashscope_api_key : API‑KEY for DashScope.

connection_args : Host (public address), port (Proxy Port), user, and password of the Milvus instance.

eas_service_url : URL of the deployed PAI EAS service.

eas_service_token : Token for the PAI EAS service.

Result

Running the script prints the number of retrieved documents and a concise answer such as “IVF_FLAT is a type of index in Milvus that divides vector data into nlist cluster units …”.

Written by

Alibaba Cloud Big Data AI Platform

The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
