How to Build a Retrieval‑Augmented Generation (RAG) System with Alibaba Cloud Milvus and PAI

This guide walks you through setting up Alibaba Cloud Milvus, configuring public access, deploying a RAG system via PAI, uploading a knowledge base, interacting with the model through the Web UI, and inspecting vector collections with Attu, all with step‑by‑step instructions and configuration details.

Alibaba Cloud Big Data AI Platform

Background

Alibaba Cloud Milvus (a vector retrieval service) is a fully managed service that is 100% compatible with open‑source Milvus. It adds scalability, ease of use, security, low cost, and ecosystem integration, which suits it to large‑scale vector similarity search in scenarios such as multimodal search, RAG, recommendation, and content risk detection.

Prerequisites

A Milvus instance with public network access enabled.

PAI (EAS) activated, with a default workspace created.

Usage Limits

Milvus instances and PAI (EAS) must reside in the same region.

Procedure

Step 1: Deploy RAG System via PAI

Log in to the PAI console, select the workspace, and navigate to Model Deployment → Model Online Service (EAS).

Click “Deploy Service” and choose the large‑model RAG dialogue system.

Configure key parameters (service name, model source, model type, instance count, GPU resources, Milvus connection details, VPC, subnet, security group). Use default values where appropriate.

| Parameter | Description |
| --- | --- |
| Service Name | Customizable name. |
| Model Source | Default open‑source public model. |
| Model Type | Example uses Qwen1.5‑1.8b. |
| Instance Count | Default 1. |
| GPU Resource | Select as needed, e.g., ml.gu7i.c16m30.1‑gu30. |
| Milvus Version | Choose Milvus. |
| Milvus Address | Internal address of the Milvus instance. |
| Proxy Port | Proxy port of the Milvus instance. |
| Account | Set to root. |
| Password | Root password set during Milvus creation. |
| Database Name | Usually "default"; can create others. |
| Collection Name | New or existing collection matching RAG requirements. |
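Before entering the Milvus connection details in the PAI form, it can help to confirm them from a machine that can reach the instance. The following is a minimal sketch, not part of the original guide. It assumes the `pymilvus` package is installed and that the instance accepts a `user:password` token. `MilvusSettings` and `list_collections` are hypothetical names introduced here, and the port value 19530 is the open‑source Milvus default, so check the proxy port shown for your own instance.

```python
from dataclasses import dataclass


@dataclass
class MilvusSettings:
    """Connection values mirroring the parameters above (all placeholders)."""

    host: str                 # Milvus Address: address of the instance
    port: int = 19530         # Proxy Port: assumed default, verify in the console
    user: str = "root"        # Account
    password: str = ""        # Password set when the instance was created
    db_name: str = "default"  # Database Name

    @property
    def uri(self) -> str:
        # URI form expected by pymilvus.MilvusClient
        return f"http://{self.host}:{self.port}"

    @property
    def token(self) -> str:
        # pymilvus accepts "user:password" as the token for password auth
        return f"{self.user}:{self.password}"


def list_collections(settings: MilvusSettings) -> list:
    """Connect and return collection names; raises if the instance is unreachable."""
    # Imported lazily so the settings helper works without pymilvus installed.
    from pymilvus import MilvusClient

    client = MilvusClient(
        uri=settings.uri, token=settings.token, db_name=settings.db_name
    )
    return client.list_collections()
```

For example, `list_collections(MilvusSettings(host="<milvus-address>", password="<root-password>"))` should return a list of names if the address, port, and credentials are correct. The placeholders must be replaced with your own values.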

Step 2: Upload Knowledge Base via RAG WebUI

Open the WebUI from the Model Online Service page and configure the embedding model (model name and dimension are auto‑filled).

Test the Milvus connection by clicking “Connect Milvus”.

In the Upload tab, set semantic chunk parameters:

| Parameter | Description |
| --- | --- |
| Chunk Size | Size of each chunk in bytes (default 500). |
| Chunk Overlap | Overlap between adjacent chunks (default 10). |
| Process with QA Extraction Model | Enable to automatically extract QA pairs from uploaded documents. |

Upload files (e.g., poems.txt) in the Files tab, then click Upload. The system cleans the data, performs semantic chunking, and stores the resulting chunks in the configured Milvus collection.
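To see how Chunk Size and Chunk Overlap interact, the sketch below splits text with a fixed‑size sliding window. It is a simplified illustration only: the service performs semantic chunking, whose exact algorithm is not described here, and this sketch counts characters rather than bytes. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 10) -> list[str]:
    """Split text into windows of chunk_size that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, chunk_overlap=1):
        print(piece)
```

With a chunk size of 4 and an overlap of 1, each chunk starts with the last character of the previous one. A larger overlap preserves more context across chunk boundaries at the cost of storing more redundant text.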

Step 3: Interact with the RAG System via WebUI Chat

Select a Prompt strategy in the Chat tab: LLM only, Retrieval only, or combined Retrieval‑plus‑LLM (RAG). Choose the desired LLM and start a conversation to receive model answers.
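The three strategies differ only in which components handle the question. The sketch below illustrates that difference with caller‑supplied stand‑ins for the retriever and the LLM. It does not reproduce the WebUI's internal logic or prompt template; the function `answer`, the strategy labels, and the prompt wording are all assumptions made for illustration.

```python
from typing import Callable, List


def answer(
    question: str,
    strategy: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Route a question through one of three strategies: llm, retrieval, or rag."""
    if strategy not in ("llm", "retrieval", "rag"):
        raise ValueError(f"unknown strategy: {strategy}")

    if strategy == "llm":
        # LLM only: the model answers from its own knowledge.
        return generate(question)

    chunks = retrieve(question)  # nearest chunks from the vector store
    if strategy == "retrieval":
        # Retrieval only: return the matching chunks without generation.
        return "\n".join(chunks)

    # RAG: pass the retrieved chunks to the model as context.
    context = "\n".join(chunks)
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)
```

Retrieval only is useful for checking whether the knowledge base returns relevant chunks before involving the model, while LLM only gives a baseline to compare against the combined mode.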

Step 4: View Knowledge Base Chunks with Attu

Use the Attu graphical tool (open via the Milvus console) to inspect the automatically created collection, view stored vectors, and verify chunking.
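The same check can be scripted as a complement to Attu. The following is a hedged sketch that assumes `pymilvus` is installed. It also assumes that `describe_collection` returns a dictionary with a `fields` list and that `get_collection_stats` returns a `row_count`, which matches typical pymilvus output but should be verified against your version. The helper names `summarize_collection` and `inspect_collection` are hypothetical.

```python
def summarize_collection(description: dict, stats: dict) -> dict:
    """Condense describe_collection and get_collection_stats results."""
    fields = {}
    for field in description.get("fields", []):
        params = field.get("params") or {}
        # Vector fields carry a "dim" parameter; other fields map to None.
        fields[field["name"]] = params.get("dim")
    return {
        "name": description.get("collection_name"),
        "row_count": int(stats.get("row_count", 0)),
        "fields": fields,
    }


def inspect_collection(uri: str, token: str, collection_name: str,
                       db_name: str = "default") -> dict:
    """Fetch schema and row count for one collection from a live instance."""
    # Imported lazily so summarize_collection works without pymilvus installed.
    from pymilvus import MilvusClient

    client = MilvusClient(uri=uri, token=token, db_name=db_name)
    return summarize_collection(
        client.describe_collection(collection_name),
        client.get_collection_stats(collection_name),
    )
```

A row count that grows after each upload, together with a vector field whose dimension matches the embedding model shown in the WebUI, suggests that chunking and storage worked as expected.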

Related Information

For more details about Milvus, see the official documentation at https://x.sm.cn/FNXU3m7.
