Stop Memorizing Docs: Build a Spring AI RAG System That Instantly Understands Business
This article walks through creating a Retrieval‑Augmented Generation (RAG) powered Q&A service in Java using Spring AI, covering the rationale for choosing Spring AI over LangChain, required environment, Maven setup, configuration, document ingestion, Advisor‑based query handling, testing, and practical limitations of RAG implementations.
Why Spring AI fits Java better than LangChain
LangChain is flexible in Python, but LangChain-style development in Java often means assembling chains by hand, working with unclear lifecycle and configuration, and losing the benefits of Spring Boot auto-configuration. Spring AI follows Spring design principles instead: starter-based auto-configuration, POJO + Builder APIs, high-level abstractions (Advisor, VectorStore), enterprise capabilities (observability, security, consistent configuration), and models and vector stores that can be swapped freely.
Spring AI design philosophy
Integration experience: starter + auto‑configuration, ready‑to‑use.
API style: POJO + Builder, matches Java intuition.
Abstraction level: Advisor / VectorStore high‑level wrappers.
Enterprise capabilities: observability, security, configuration consistency.
Portability: model and vector store can be replaced freely.
RAG definition and Spring AI implementation
Retrieval‑Augmented Generation lets an LLM “look at” business data before answering. Standard workflow: Retrieve → Augment → Generate.
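The Retrieve and Augment steps can be sketched in plain Java without any framework. This is only an illustration under stated assumptions: the class RagFlowSketch, the Chunk record, and the hand-written two-dimensional vectors are hypothetical stand-ins for a real embedding model and vector store, and the Generate step (sending the prompt to an LLM) is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Toy Retrieve -> Augment flow; vectors are hand-written, not real embeddings. */
final class RagFlowSketch {

    /** A stored text chunk with its (hypothetical) embedding vector. */
    record Chunk(String text, double[] vector) {}

    /** Cosine similarity between two vectors of equal length. */
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Retrieve: return the topK chunks most similar to the query vector. */
    static List<Chunk> retrieve(List<Chunk> store, double[] query, int topK) {
        List<Chunk> sorted = new ArrayList<>(store);
        sorted.sort(Comparator.comparingDouble((Chunk c) -> cosine(c.vector(), query)).reversed());
        return sorted.subList(0, Math.min(topK, sorted.size()));
    }

    /** Augment: place the retrieved text ahead of the user question. */
    static String augment(String question, List<Chunk> context) {
        StringBuilder prompt = new StringBuilder("Answer using only the context below.\n\nContext:\n");
        for (Chunk c : context) {
            prompt.append("- ").append(c.text()).append('\n');
        }
        return prompt.append("\nQuestion: ").append(question).toString();
    }
}
```

In this sketch, calling retrieve with a query vector and then passing the hits to augment yields the prompt text that would be sent to the model in the Generate step.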
Spring AI’s Advisor API wraps these three steps, eliminating manual retrieval and prompt construction.
User Question
↓
QuestionAnswerAdvisor
↓
VectorStore retrieval
↓
Prompt auto‑augmentation
↓
LLM generates answer

Environment and prerequisites
Java 21+
Spring Boot
Maven
OpenAI API key
PostgreSQL with pgvector extension
Project initialization (Maven)
Add Spring AI BOM:
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

Core dependencies:
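<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- OpenAI support (starter name as of Spring AI 1.0.0) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-openai</artifactId>
    </dependency>
    <!-- PGVector vector store (starter name as of Spring AI 1.0.0) -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-vector-store-pgvector</artifactId>
    </dependency>
    <!-- QuestionAnswerAdvisor lives in this module in 1.0.0 -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-advisors-vector-store</artifactId>
    </dependency>
    <!-- Document reader -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pdf-document-reader</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-jpa</artifactId>
    </dependency>
    <dependency>
        <groupId>org.postgresql</groupId>
        <artifactId>postgresql</artifactId>
        <scope>runtime</scope>
    </dependency>
    <!-- Lombok, needed for the @Slf4j annotation used in the ingestion service -->
    <dependency>
        <groupId>org.projectlombok</groupId>
        <artifactId>lombok</artifactId>
        <optional>true</optional>
    </dependency>
</dependencies>

Configuration (application.yml)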
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o-mini
      embedding:
        options:
          model: text-embedding-3-small
    # Vector store settings belong under the same spring.ai key
    vectorstore:
      pgvector:
        initialize-schema: true
  datasource:
    url: jdbc:postgresql://localhost:5432/ragdb
    username: user
    password: password
  jpa:
    hibernate:
      ddl-auto: update

Document ingestion (ETL pipeline)
Directory layout:
src/main/java/com/icoderoad/rag
src/main/resources/data/spring-ai-info.txt

The ingestion service reads the text file, splits it into token chunks, and stores the chunks in the vector store:
package com.icoderoad.rag.service;

import lombok.extern.slf4j.Slf4j;
import org.springframework.ai.document.Document;
import org.springframework.ai.reader.TextReader;
import org.springframework.ai.transformer.splitter.TokenTextSplitter;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.CommandLineRunner;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

import java.util.List;

@Slf4j
@Service
public class IngestionService implements CommandLineRunner {

    private final VectorStore vectorStore;

    @Value("classpath:/data/spring-ai-info.txt")
    private Resource source;

    public IngestionService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    // Runs on every application start, so a restart ingests the same file again.
    @Override
    public void run(String... args) {
        log.info(">>> Starting RAG document ingestion");

        // Extract: read the text file into Document objects
        TextReader reader = new TextReader(source);
        List<Document> documents = reader.get();

        // Transform: split into token-sized chunks (default splitter settings)
        TokenTextSplitter splitter = new TokenTextSplitter();
        List<Document> chunks = splitter.apply(documents);

        // Load: embed the chunks and write them to the vector store
        vectorStore.accept(chunks);

        log.info(">>> Ingestion completed, {} chunks stored", chunks.size());
    }
}

RAG query endpoint (Advisor core)
package com.icoderoad.rag.controller;

import org.springframework.ai.chat.client.ChatClient;
// In Spring AI 1.0.0 this class is in the spring-ai-advisors-vector-store module
import org.springframework.ai.chat.client.advisor.vectorstore.QuestionAnswerAdvisor;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class RagController {

    private final ChatClient chatClient;

    public RagController(ChatClient.Builder builder, VectorStore vectorStore) {
        // Every prompt sent through this client is augmented with retrieved chunks
        this.chatClient = builder
                .defaultAdvisors(QuestionAnswerAdvisor.builder(vectorStore).build())
                .build();
    }

    // The request parameter is named "question"
    @GetMapping("/rag/query")
    public String query(@RequestParam(defaultValue = "What is Spring AI?") String question) {
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}

Running and testing
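Execute:
curl -G "http://localhost:8080/rag/query" --data-urlencode "question=What is the primary goal of Spring AI?"

The parameter must be named question to match the controller; with any other name the endpoint silently falls back to its default question. The answer is grounded in the ingested documents: the advisor places the retrieved chunks in the prompt and instructs the model to answer from that context. This strongly constrains the model, but it is a prompt-level instruction rather than a hard guarantee.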
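The same call can be made from Java. The sketch below is a minimal client built on the JDK's java.net.http package; the class name RagQueryClient and the base URL argument are illustrative assumptions, while the path /rag/query and the parameter name question come from the controller above. The point it illustrates is that the question has to be URL-encoded, since it contains spaces and a question mark.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical client for the /rag/query endpoint. */
final class RagQueryClient {

    /** Builds the request URI, URL-encoding the question so spaces and '?' survive. */
    static URI buildUri(String baseUrl, String question) {
        String encoded = URLEncoder.encode(question, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/rag/query?question=" + encoded);
    }

    /** Sends the GET request and returns the response body (the model's answer). */
    static String ask(String baseUrl, String question) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(buildUri(baseUrl, question)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

With the application running locally, RagQueryClient.ask("http://localhost:8080", "What is the primary goal of Spring AI?") would return the answer text as a string.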
Practical limitations and engineering considerations
Low‑quality documents produce poor answers (garbage‑in‑garbage‑out).
Improper chunking or confusing business context reduces relevance.
Even strong LLMs cannot compensate for bad data; ETL and data governance are required.
Vector retrieval may need re‑ranking, multi‑query expansion, or hybrid vector‑keyword search in production.
Spring AI supplies building blocks; overall architecture must still be designed.
Advisor handles simple Q&A efficiently. As a rough rule of thumb, it covers about 80 % of scenarios; the remaining 20 %, such as complex multi-step reasoning, may require custom agents or hand-crafted pipelines.
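To make the hybrid vector-keyword point concrete, one common way to combine two result lists is reciprocal rank fusion (RRF). The sketch below is not a Spring AI API and not necessarily what a given production system uses; RankFusionSketch, the document ids, and the constant k = 60 (a conventional default) are assumptions chosen for illustration. It takes ranked id lists, for example one from vector search and one from keyword search, and merges them into a single ranking.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Merges several ranked result lists with reciprocal rank fusion. */
final class RankFusionSketch {

    /**
     * Each list contributes 1 / (k + rank) to a document's score, with rank
     * starting at 1. Documents are returned from highest to lowest total score;
     * ties are broken by id so the output is deterministic.
     */
    static List<String> fuse(List<List<String>> rankings, int k) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (k + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        fused.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return fused;
    }
}
```

A document that appears near the top of both lists ends up ahead of one that ranks first in only a single list, which is the behavior hybrid search usually wants.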
Final takeaways
RAG is a practical technology today.
Spring AI provides the most natural RAG solution for the Java ecosystem.
The main bottleneck is data quality and engineering effort, not the model itself.
