Build a RAG-Powered Knowledge Base with Spring Boot, Milvus, and Ollama

This guide builds a Retrieval‑Augmented Generation (RAG) system with Spring Boot 3.4.2, the Milvus vector database, and the bge‑m3 embedding model served by Ollama. It covers environment setup, dependency configuration, vector store operations, and integration with a large language model that turns similarity search results into a refined answer.

Spring Full-Stack Practical Cases

Apr 10, 2025

1. Introduction

1.1 What is RAG?

Retrieval‑Augmented Generation (RAG) combines a large language model (LLM) with an external knowledge base to improve the accuracy and relevance of generated text.

RAG enables the model to retrieve relevant documents from a set of files and incorporate that information into its responses, rather than relying solely on its pre‑trained knowledge.
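
The retrieve-then-augment flow can be sketched in plain Java without any framework. This is only an illustration under stated assumptions: `RagFlowSketch`, `retrieve`, and `augment` are hypothetical names, keyword overlap stands in for real vector similarity, and the final LLM call is left out.

```java
import java.util.List;

class RagFlowSketch {

    // Step 1 (retrieve): keep documents that share at least one word with the question.
    // Keyword overlap is a stand-in; a real system compares embedding vectors instead.
    static List<String> retrieve(String question, List<String> knowledgeBase) {
        List<String> questionWords = List.of(question.toLowerCase().split("\\W+"));
        return knowledgeBase.stream()
                .filter(doc -> List.of(doc.toLowerCase().split("\\W+")).stream()
                        .anyMatch(questionWords::contains))
                .toList();
    }

    // Step 2 (augment): put the retrieved text into the prompt sent to the model.
    static String augment(String question, List<String> context) {
        return question
                + "\n\nUse the following information to answer the question:\n"
                + String.join("\n", context);
    }

    public static void main(String[] args) {
        List<String> knowledgeBase = List.of(
                "Milvus listens on port 19530 by default.",
                "Ollama serves models over HTTP.");
        String question = "What port does Milvus use?";
        // Step 3 (generate) would send this prompt to the LLM.
        System.out.println(augment(question, retrieve(question, knowledgeBase)));
    }
}
```

Only the Milvus sentence reaches the prompt here, because it is the only document that shares words with the question.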

1.2 What is a vector database?

A vector database stores embeddings (numeric vectors) and performs similarity search instead of exact matching. It allows you to find items with similar semantic meaning, such as images or sentences.

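Example: an embedding model converts the text "I love Spring full‑stack case source code" into a vector such as [0.24, -0.56, 0.89]. Real embeddings have far more dimensions; bge‑m3, the model used in this guide, produces 1024‑dimensional vectors.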
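
To make "similarity" concrete, the sketch below computes cosine similarity, one common way to compare two embeddings. It is a minimal illustration, not a description of how Milvus works internally: `CosineSimilarityDemo` is a hypothetical class name, and the `near` and `far` vectors are made-up values.

```java
class CosineSimilarityDemo {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1.0 mean "same direction".
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] query = {0.24, -0.56, 0.89};  // the example vector from the text
        double[] near = {0.20, -0.50, 0.90};   // made-up vector for a similar sentence
        double[] far = {-0.70, 0.60, -0.10};   // made-up vector for an unrelated sentence
        System.out.printf("near: %.3f%n", cosine(query, near));
        System.out.printf("far:  %.3f%n", cosine(query, far));
    }
}
```

The `near` vector scores higher than the `far` one, and a vector database uses that kind of ordering to rank its results.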

1.3 Milvus Overview

Milvus is a popular open‑source vector database. For this tutorial we only need to know how to use it; detailed documentation is available at https://milvus.io/docs/zh .

2. Practical Example

2.1 Environment Preparation

Install Milvus (standalone) with the official embed script. The script starts Milvus in a Docker container, so Docker must be running:

# Download script
$ curl -sfL https://raw.githubusercontent.com/milvus-io/milvus/master/scripts/standalone_embed.sh -o standalone_embed.sh
# Start container
$ bash standalone_embed.sh start

Then pull the bge‑m3 embedding model with Ollama:

$ ollama pull bge-m3:latest

2.2 Project Configuration

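Add the following Maven dependencies. The two Spring AI starters omit a version, so their versions must be managed elsewhere in the build (for example through the Spring AI BOM):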

<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-milvus-store-spring-boot-starter</artifactId>
</dependency>
<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
<dependency>
  <groupId>com.alibaba.cloud.ai</groupId>
  <artifactId>spring-ai-alibaba-starter</artifactId>
  <version>1.0.0-M6.1</version>
</dependency>

Key configuration (YAML style):

spring:
  ai:
    dashscope:
      api-key: sk-xxxooo
      base-url: https://dashscope.aliyuncs.com/compatible-mode/v1
      chat:
        options:
          model: qwen-turbo
      embedding:
        enabled: false
---
spring:
  ai:
    ollama:
      chat:
        enabled: false
      base-url: http://localhost:11111  # Ollama listens on 11434 by default; match your instance
      embedding:
        enabled: true
        model: bge-m3:latest
---
spring:
  ai:
    vectorstore:
      milvus:
        client:
          host: localhost
          port: 19530
          username: root
          password: root
        initialize-schema: true
        embedding-dimension: 1024  # must match the embedding model; bge-m3 outputs 1024 dimensions

2.3 Vector Store Operations

Service to save documents and perform similarity search:

@Service
public class DocumentService {
  private final VectorStore vectorStore;
  public DocumentService(VectorStore vectorStore) {
    this.vectorStore = vectorStore;
  }
  // Save sample texts
  public void save() {
    List<Document> documents = List.of(
        new Document("banana"),
        new Document("apple"),
        new Document("orange"),
        new Document("strawberry"),
        new Document("Java"),
        new Document("python"),
        new Document("C#"),
        new Document("tiger"));
    this.vectorStore.add(documents);
  }
  // Similarity search
  public List<Document> query(String prompt, int topK) {
    SearchRequest request = SearchRequest.builder()
        .query(prompt)
        .topK(topK)
        .build();
    return this.vectorStore.similaritySearch(request);
  }
}
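
To show what the `topK` parameter does, here is a small in-memory analogue of the search step. It is a sketch under loud assumptions: `TopKDemo` is a hypothetical class, the 2‑D vectors are hand-assigned rather than produced by an embedding model, and squared Euclidean distance is used as the metric for simplicity.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TopKDemo {

    // Hand-assigned 2-D vectors (hypothetical): [fruit-ness, programming-language-ness].
    static final Map<String, double[]> STORE = new LinkedHashMap<>();
    static {
        STORE.put("banana", new double[] {0.9, 0.1});
        STORE.put("apple", new double[] {0.8, 0.2});
        STORE.put("Java", new double[] {0.1, 0.9});
        STORE.put("python", new double[] {0.2, 0.8});
    }

    // Squared Euclidean distance; smaller means more similar.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Return the topK stored texts closest to the query vector.
    static List<String> search(double[] query, int topK) {
        return STORE.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> distance(query, e.getValue())))
                .limit(topK)
                .map(Map.Entry::getKey)
                .toList();
    }

    public static void main(String[] args) {
        System.out.println(search(new double[] {1.0, 0.0}, 2)); // prints [banana, apple]
        System.out.println(search(new double[] {0.0, 1.0}, 2)); // prints [Java, python]
    }
}
```

In the real service the embedding model produces the vectors and Milvus does the ranking; the sketch only mirrors the "sort by closeness, keep the first topK" behaviour.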

Controller exposing endpoints:

@RestController
@RequestMapping("/rag")
public class RagController {
  private final DocumentService documentService;
  public RagController(DocumentService documentService) {
    this.documentService = documentService;
  }
  @GetMapping("/save")
  public ResponseEntity<String> save() {
    this.documentService.save();
    return ResponseEntity.ok("success");
  }
  @GetMapping("/{topK}")
  public ResponseEntity<List<Document>> query(@PathVariable Integer topK, @RequestParam String prompt) {
    return ResponseEntity.ok(this.documentService.query(prompt, topK));
  }
}

2.4 Combine with LLM

Configure a ChatClient bean:

@Configuration
public class ChatConfig {
  @Bean
  ChatClient chatClient(ChatClient.Builder builder) {
    return builder.defaultAdvisors(List.of(new SimpleLoggerAdvisor()))
                  .build();
  }
}

Endpoint that retrieves relevant documents, builds a prompt, and calls the LLM:

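// Added to RagController; assumes VectorStore and ChatClient fields are injected via the constructor.
// Also needs imports for java.util.Map and java.util.stream.Collectors.
@GetMapping("/query/{topK}")
public ResponseEntity<String> queryLLM(@PathVariable Integer topK,
                                        @RequestParam String prompt) {
  // Retrieve the topK documents most similar to the question
  SearchRequest request = SearchRequest.builder()
      .query(prompt)
      .topK(topK)
      .build();
  List<Document> docs = this.vectorStore.similaritySearch(request);
  // Join the document texts so the prompt carries content only, not ids or metadata
  // (getText() is the accessor in recent Spring AI milestones; older ones used getContent())
  String contents = docs.stream()
      .map(Document::getText)
      .collect(Collectors.joining("\n"));
  // A text block keeps the multi-line template valid Java
  PromptTemplate template = new PromptTemplate("""
      {userMessage}

      Use the following information to answer the question:
      {contents}
      """);
  Prompt finalPrompt = template.create(Map.of("userMessage", prompt, "contents", contents));
  String result = this.chatClient.prompt(finalPrompt).call().content();
  return ResponseEntity.ok(result);
}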

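Calling /rag/save stores the sample texts in Milvus, and /rag/{topK}?prompt=... returns the topK most similar documents. /rag/query/{topK}?prompt=... runs the same search, then passes the results to the LLM, which filters them and formats the final answer.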
