Boost Your Java Apps with LangChain4j: A Hands‑On RAG Guide
This article walks Java developers through the fundamentals of Retrieval‑Augmented Generation (RAG), explains the LangChain4j framework, compares large‑model development with traditional Java coding, and provides step‑by‑step code examples for environment setup, document splitting, embedding, vector‑store operations, and LLM interaction.
Introduction
ChatGPT and other large language models (LLMs) are trained on a fixed snapshot of data, so on their own they know nothing about events after their training cutoff or about private documents. Retrieval‑Augmented Generation (RAG) addresses this by retrieving relevant private or recent documents and passing them to the model before generation.
What is RAG
RAG combines traditional information retrieval (IR) with generative LLMs. The workflow consists of four steps: receive a user request, retrieve relevant document fragments, augment the original query with these fragments, and let the LLM generate a final answer.
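The four steps can be sketched in plain Java. This is a minimal illustration, not LangChain4j code: the class name RagPipelineSketch, the augment helper, and the use of java.util.function.Function for the retriever and the generator are all assumptions made for the example.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical skeleton of the four RAG steps; all names are illustrative.
final class RagPipelineSketch {

    private final Function<String, List<String>> retriever; // step 2: query -> fragments
    private final Function<String, String> generator;       // step 4: prompt -> answer

    RagPipelineSketch(Function<String, List<String>> retriever,
                      Function<String, String> generator) {
        this.retriever = retriever;
        this.generator = generator;
    }

    // Step 3: combine the retrieved fragments with the original question.
    static String augment(String question, List<String> fragments) {
        return "Context:\n" + String.join("\n", fragments)
                + "\nQuestion:\n" + question;
    }

    // Step 1 is the call itself: the user request arrives as `question`.
    String answer(String question) {
        List<String> fragments = retriever.apply(question);
        String prompt = augment(question, fragments);
        return generator.apply(prompt);
    }
}
```

In a real application the retriever would wrap a vector-store search and the generator would wrap a chat-model call; here both are plain functions so the control flow stays visible.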
Retrieval can use relational databases, full‑text search engines, or vector databases; vector stores are preferred because they support similarity search rather than exact keyword matching.
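To make "similarity search" concrete, the sketch below ranks stored vectors by cosine similarity to a query vector. It is an assumption-laden illustration: the class and method names are hypothetical, the scan is brute force, and production vector stores such as Chroma typically use approximate indexes rather than comparing against every vector.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical brute-force similarity search over in-memory vectors.
final class SimilaritySearchSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|); 1.0 means same direction.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0.0; // zero vector: treat as unrelated
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the indices of the k stored vectors most similar to the query, best first.
    static List<Integer> topK(double[] query, List<double[]> stored, int k) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < stored.size(); i++) indices.add(i);
        // Sort by descending similarity (negated score, ascending order).
        indices.sort(Comparator.comparingDouble((Integer i) -> -cosine(query, stored.get(i))));
        return indices.subList(0, Math.min(k, indices.size()));
    }
}
```

Because the ranking depends only on the angle between vectors, two texts can match without sharing a single keyword, provided the embedding model places them close together.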
LangChain4j Overview
LangChain4j is an open-source Java library inspired by the Python LangChain framework; it is a separate project rather than an official port. It abstracts LLM interaction, prompt handling, document splitting, embedding, and vector‑store management, allowing developers to focus on business logic.
Large‑Model Development vs. Traditional Java Development
In large‑model development, the focus is on data preparation, model selection/fine‑tuning, prompt engineering, and integrating LLMs into existing systems. Traditional Java development emphasizes system architecture, modular design, and algorithm implementation.
Practical Experience
Environment Setup
Windows: Install Python and verify it with python --version. Then install Chroma, the vector store, with pip install chromadb and start a local server with chroma run.
macOS: Install Python via Homebrew (brew install python) or download it from python.org, verify the installation, then install and start Chroma the same way.
Integrating LangChain4j
<properties>
    <langchain4j.version>0.31.0</langchain4j.version>
</properties>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-core</artifactId>
    <version>${langchain4j.version}</version>
</dependency>
... (additional dependencies for OpenAI, embeddings, Chroma, etc.)
Project Structure
LangChain
├── core
│   ├── src/main/java/cn/jdl/tech_and_data/ka
│   │   ├── ChatWithMemory
│   │   ├── Constants
│   │   ├── Main
│   │   ├── RagChat
│   │   └── Utils
│   └── resources
│       ├── log4j2.xml
│       └── 笑话.txt
├── pom.xml
└── parent [learn.langchain.parent]
Knowledge Acquisition
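Load a local text file (here 笑话.txt, a file of jokes) from the classpath as the knowledge base:
URL docUrl = Main.class.getClassLoader().getResource("笑话.txt");
// getDocument is a project helper whose body is not shown in this article.
Document document = getDocument(docUrl);
Document Splitting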
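Use DocumentSplitters.recursive(150, 10, new OpenAiTokenizer()) to split the document into segments of at most 150 tokens, with up to 10 tokens of overlap between neighboring segments so that text cut at a boundary still appears with some surrounding context.
Tokens are the units a model's tokenizer produces from raw text (GPT‑4o, for example, uses a byte-pair-encoding tokenizer). Splitting is necessary because both embedding models and LLMs limit input length, and smaller segments tend to make retrieval more precise.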
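The effect of the segment size and overlap parameters can be seen in a simplified sliding-window splitter. This sketch is not the recursive splitter: it treats whitespace-separated words as stand-ins for tokens, and it never looks for paragraph or sentence boundaries. The class name OverlapSplitterSketch and its split method are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sliding-window splitter; words stand in for real tokens.
final class OverlapSplitterSketch {

    // Splits text into windows of at most maxTokens words, where each window
    // starts with the last `overlap` words of the previous window.
    static List<String> split(String text, int maxTokens, int overlap) {
        if (maxTokens <= 0 || overlap < 0 || overlap >= maxTokens) {
            throw new IllegalArgumentException("require 0 <= overlap < maxTokens");
        }
        List<String> segments = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) return segments;

        String[] words = trimmed.split("\\s+");
        int step = maxTokens - overlap; // how far the window advances each time
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + maxTokens, words.length);
            segments.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) break; // the last window reached the end
        }
        return segments;
    }
}
```

For example, split("a b c d e f g", 4, 1) yields the two segments "a b c d" and "d e f g": the word "d" is repeated so that neither segment loses the context around the cut.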
Embedding
Create an OpenAI embedding model (text-embedding-ada-002, which produces 1536‑dimensional vectors) and embed each segment:
// No model name is set here, so the builder's default model applies.
OpenAiEmbeddingModel embeddingModel = new OpenAiEmbeddingModel.OpenAiEmbeddingModelBuilder()
    .apiKey(API_KEY)
    .baseUrl(BASE_URL)
    .build();
Embedding embedding = embeddingModel.embed(text).content();
Vector Store Storage
Start a Chroma instance, create a collection, and store each TextSegment together with its embedding:
Client client = new Client(CHROMA_URL);
EmbeddingFunction embeddingFunction = new OpenAIEmbeddingFunction(API_KEY, OPEN_AI_MODULE_NAME);
client.createCollection(CHROMA_DB_DEFAULT_COLLECTION_NAME, null, true, embeddingFunction);
EmbeddingStore<TextSegment> store = ChromaEmbeddingStore.builder()
    .baseUrl(CHROMA_URL)
    .collectionName(CHROMA_DB_DEFAULT_COLLECTION_NAME)
    .build();
// segments holds the TextSegment objects produced by the splitting step.
segments.forEach(s -> {
    Embedding e = embeddingModel.embed(s).content();
    store.add(e, s);
});
Vector Store Retrieval
Embed the user query, then search the collection for the most similar segment:
Embedding queryEmbedding = embeddingModel.embed(queryText).content();
EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
    .queryEmbedding(queryEmbedding)
    .maxResults(1)
    .build();
EmbeddingSearchResult<TextSegment> result = store.search(request);
// get(0) throws if the collection returned no matches; check matches().isEmpty() first in real code.
TextSegment matched = result.matches().get(0).embedded();
LLM Interaction
Build a prompt that injects the retrieved context and the original question, then call the OpenAI chat model:
// A Java string literal cannot span lines, so the template is concatenated with explicit \n.
PromptTemplate template = PromptTemplate.from(
    "Based on the following information answer the question:\n"
    + "{{context}}\n"
    + "Question:\n"
    + "{{question}}");
Map<String, Object> vars = new HashMap<>();
vars.put("context", matched.text());
vars.put("question", QUESTION);
Prompt prompt = template.apply(vars);
UserMessage userMessage = prompt.toUserMessage();
OpenAiChatModel chatModel = OpenAiChatModel.builder()
    .apiKey(API_KEY)
    .baseUrl(BASE_URL)
    .modelName(OPEN_AI_MODULE_NAME)
    .temperature(0.0) // the builder takes a Double, so an int literal does not compile
    .build();
Response<AiMessage> response = chatModel.generate(userMessage);
// content() returns an AiMessage; text() extracts the answer string.
String answer = response.content().text();
Testing
The article demonstrates both a plain LLM call (without RAG) and a RAG‑enhanced call using the ice‑cream joke dataset, showing how the retrieved segment improves answer relevance.
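The only difference between the two calls is the prompt the model receives. The sketch below builds both prompts in plain Java so they can be compared side by side. It is an illustration under assumptions: PromptRenderSketch and its methods are hypothetical, the placeholder substitution is a simple stand-in for what a prompt template does rather than LangChain4j's implementation, and the plain call is assumed to send the question unchanged.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that renders the plain and the RAG-enhanced prompt.
final class PromptRenderSketch {

    // Same wording as the template in the LLM Interaction step.
    static final String RAG_TEMPLATE =
            "Based on the following information answer the question:\n"
            + "{{context}}\n"
            + "Question:\n"
            + "{{question}}";

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{\\{(\\w+)\\}\\}");

    // Replaces every {{name}} in one pass, so values are never re-scanned for placeholders.
    static String render(String template, Map<String, String> vars) {
        return PLACEHOLDER.matcher(template).replaceAll(m -> {
            String value = vars.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("missing variable: " + m.group(1));
            }
            return Matcher.quoteReplacement(value); // keep '$' and '\' literal
        });
    }

    // Plain call (assumption): the question is sent as-is, without retrieved context.
    static String plainPrompt(String question) {
        return question;
    }

    // RAG call: the retrieved segment is injected ahead of the question.
    static String ragPrompt(String context, String question) {
        return render(RAG_TEMPLATE, Map.of("context", context, "question", question));
    }
}
```

Printing plainPrompt(question) next to ragPrompt(retrievedText, question) during testing makes it easy to confirm that the retrieved segment actually reached the model.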
Conclusion and Outlook
The hands‑on example illustrates the complete pipeline—from environment preparation to vector‑store retrieval and LLM generation—using LangChain4j for RAG. Continued exploration will unlock more sophisticated applications of RAG in enterprise scenarios.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
