Building Java LLM Applications with LangChain4j: A Hands‑On Guide
This tutorial walks through the fundamentals of large language models, prompt engineering, and word embeddings, then shows how to set up a LangChain‑based LLM stack in Java using LangChain4j, covering core modules, memory, retrieval, chains, agents, and complete code examples.
Introduction
This guide demonstrates how to build Java applications that leverage large language models (LLMs) using the LangChain framework (specifically the community‑maintained LangChain4j library). It covers the essential concepts of LLMs, prompt engineering, embeddings, and the core LangChain modules such as model I/O, memory, retrieval, chains, and agents.
Background
Large Language Models
LLMs are probabilistic neural networks trained on massive unlabelled text corpora via self‑supervised learning. After pre‑training, they can be fine‑tuned or prompted for downstream tasks such as translation, summarisation, or question answering.
Prompt Engineering
Prompt engineering structures textual instructions (including few‑shot examples) to steer an LLM toward a desired behaviour. Techniques like chain‑of‑thought prompting improve reasoning and safety.
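A few-shot prompt is just carefully structured text. As a plain-Java sketch (the task and examples here are invented for illustration, not taken from any library):

```java
// Builds a few-shot prompt: two worked examples steer the model
// toward the desired "input -> output" format before the real query.
public class FewShotPrompt {

    static String buildPrompt(String word) {
        StringBuilder sb = new StringBuilder();
        sb.append("Convert each word to its plural form.\n");
        sb.append("Word: mouse -> Plural: mice\n");      // example 1
        sb.append("Word: child -> Plural: children\n");  // example 2
        sb.append("Word: ").append(word).append(" -> Plural:");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildPrompt("goose"));
    }
}
```

The examples implicitly define the output format, so the model's completion can be parsed mechanically.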
Word Embeddings
Word embeddings map tokens to dense real‑valued vectors (e.g., Word2Vec, GloVe). Embeddings enable semantic search and retrieval‑augmented generation (RAG) by providing a vector representation of text.
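Semantic similarity between two embeddings is typically measured with cosine similarity. A minimal sketch with toy three-dimensional vectors (real models use hundreds of dimensions; the values below are invented):

```java
// Cosine similarity between two embedding vectors: values near 1.0
// indicate semantically similar text, values near 0 unrelated text.
public class CosineSimilarity {

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] king  = {0.8, 0.6, 0.1};  // toy embeddings
        double[] queen = {0.7, 0.7, 0.2};
        double[] car   = {0.1, 0.2, 0.9};
        System.out.printf("king~queen: %.3f%n", cosine(king, queen));
        System.out.printf("king~car:   %.3f%n", cosine(king, car));
    }
}
```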
LangChain for Java (LangChain4j)
LangChain4j is an unofficial Java port of LangChain, compatible with Java 8+ and Spring Boot 2/3. Add the library from Maven Central:
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j</artifactId>
<version>0.23.0</version>
</dependency>

Core LangChain Modules
Model I/O
Prompt templates generate concrete prompts, and output parsers extract structured data from model responses.
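Under the hood, applying a template amounts to substituting `{{variable}}` placeholders. A minimal plain-Java re-implementation of the idea (not LangChain4j's actual code):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of {{variable}} substitution, the core of a prompt template.
public class MiniTemplate {

    static String apply(String template, Map<String, Object> vars) {
        String result = template;
        for (Map.Entry<String, Object> e : vars.entrySet()) {
            result = result.replace("{{" + e.getKey() + "}}", String.valueOf(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> vars = new HashMap<>();
        vars.put("adjective", "funny");
        vars.put("content", "computers");
        System.out.println(apply("Tell me a {{adjective}} joke about {{content}}.", vars));
        // -> Tell me a funny joke about computers.
    }
}
```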
PromptTemplate promptTemplate = PromptTemplate.from("Tell me a {{adjective}} joke about {{content}}.");
Map<String, Object> vars = new HashMap<>();
vars.put("adjective", "funny");
vars.put("content", "computers");
Prompt prompt = promptTemplate.apply(vars);

Memory
Memory stores prior interactions so the LLM can reference earlier context. The example uses a token‑window memory implementation.
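TokenWindowChatMemory evicts old messages by token budget. The same idea, by message count, as a self-contained plain-Java sketch (not the library's implementation):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of window-based chat memory: when the stored message count
// exceeds the window, the oldest messages are evicted so the prompt
// sent to the model stays within budget.
public class WindowMemory {
    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemory(int maxMessages) { this.maxMessages = maxMessages; }

    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst(); // evict oldest first
        }
    }

    List<String> messages() { return new ArrayList<>(messages); }

    public static void main(String[] args) {
        WindowMemory memory = new WindowMemory(2);
        memory.add("user: Hello, my name is Kumar");
        memory.add("ai: Hello Kumar!");
        memory.add("user: What is my name?");
        System.out.println(memory.messages()); // first message was evicted
    }
}
```

Note that with a window of 2 the introduction "my name is Kumar" is already gone by the third turn, which is exactly why the window size (or token budget) must be chosen with the conversation's needs in mind.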
ChatMemory chatMemory = TokenWindowChatMemory.withMaxTokens(300, new OpenAiTokenizer(GPT_3_5_TURBO));
chatMemory.add(userMessage("Hello, my name is Kumar"));
AiMessage answer = model.generate(chatMemory.messages()).content();
System.out.println(answer.text()); // Hello Kumar! How can I help you today?
chatMemory.add(answer);
chatMemory.add(userMessage("What is my name?"));
AiMessage answer2 = model.generate(chatMemory.messages()).content();
System.out.println(answer2.text()); // Your name is Kumar.
chatMemory.add(answer2);

Retrieval (RAG)
RAG fetches relevant external documents, converts them to embeddings, and supplies the most similar chunks to the LLM.
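Stripped of the library types, retrieval is "rank stored chunks by similarity to the query embedding and keep the top k". A naive sketch with hand-made two-dimensional toy embeddings (invented for illustration):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Naive in-memory retrieval: score every stored chunk against the
// query embedding by cosine similarity and return the k best matches.
public class NaiveRetriever {

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    static List<String> findRelevant(Map<String, double[]> store, double[] query, int k) {
        return store.entrySet().stream()
                .sorted((x, y) -> Double.compare(cosine(y.getValue(), query),
                                                 cosine(x.getValue(), query)))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, double[]> store = new LinkedHashMap<>();
        store.put("Simpson went fishing in the lake.", new double[]{0.9, 0.1});
        store.put("The weather report predicts rain.", new double[]{0.1, 0.9});
        double[] queryEmbedding = {0.8, 0.2}; // toy embedding for "Who is Simpson?"
        System.out.println(findRelevant(store, queryEmbedding, 1));
    }
}
```

Real embedding stores replace the linear scan with approximate nearest-neighbour indexes, but the contract is the same.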
Document doc = FileSystemDocumentLoader.loadDocument("simpson's_adventures.txt");
DocumentSplitter splitter = DocumentSplitters.recursive(100, 0, new OpenAiTokenizer(GPT_3_5_TURBO));
List<TextSegment> segments = splitter.split(doc);
EmbeddingModel embeddingModel = new AllMiniLmL6V2EmbeddingModel();
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
EmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
store.addAll(embeddings, segments);
String question = "Who is Simpson?";
Embedding qEmb = embeddingModel.embed(question).content();
List<EmbeddingMatch<TextSegment>> matches = store.findRelevant(qEmb, 3, 0.7);

Advanced Applications
Chains
Chains compose multiple components (retriever, memory, prompt) into a single workflow. The following builds a conversational retrieval chain.
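Conceptually a chain is function composition: retrieve context, render the prompt, call the model. A plain-Java sketch with the retriever and the LLM stubbed out (the strings are invented for illustration):

```java
import java.util.function.Function;

// A "chain" as function composition: retrieve -> render prompt -> call model.
// Both the retriever and the model are stubs standing in for real components.
public class MiniChain {

    static final Function<String, String> RETRIEVE =
            question -> "Simpson is a fisherman who lives by the lake."; // stub retriever
    static final Function<String, String> MODEL =
            prompt -> "[model answer based on: " + prompt + "]";         // stub LLM

    static String execute(String question) {
        String information = RETRIEVE.apply(question);
        String prompt = "Answer the following question: " + question + "\n"
                + "Base your answer on the following information:\n" + information;
        return MODEL.apply(prompt);
    }

    public static void main(String[] args) {
        System.out.println(execute("Who is Simpson?"));
    }
}
```

The builder below wires up the same pipeline with real components, plus chat memory so follow-up questions keep their context.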
ConversationalRetrievalChain chain = ConversationalRetrievalChain.builder()
.chatLanguageModel(chatModel)
.retriever(EmbeddingStoreRetriever.from(store, embeddingModel))
.chatMemory(MessageWindowChatMemory.withMaxMessages(10))
.promptTemplate(PromptTemplate.from(
        "Answer the following question to the best of your ability: {{question}}\n"
        + "Base your answer on the following information:\n{{information}}"))
.build();
String answer = chain.execute("Who is Simpson?");

Agents
Agents treat the LLM as a reasoning engine that can decide which tools to invoke. The example defines a simple calculator tool and wires it into an AI service.
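The runtime side of tool use is a dispatch loop: the model emits a tool name plus arguments, and the framework invokes the matching registered method. A plain-Java sketch with the model's decision hard-coded (no real LLM involved):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of agent tool dispatch: tools are registered by name, and a
// (here hard-coded) "model decision" is routed to the matching tool.
public class ToolDispatch {

    static final Map<String, Function<String[], Integer>> TOOLS = new HashMap<>();
    static {
        TOOLS.put("stringLength", argv -> argv[0].length());
        TOOLS.put("add", argv -> Integer.parseInt(argv[0]) + Integer.parseInt(argv[1]));
    }

    static int dispatch(String tool, String... args) {
        return TOOLS.get(tool).apply(args);
    }

    public static void main(String[] args) {
        // Pretend the LLM planned: stringLength("language"), stringLength("model"), add.
        int a = dispatch("stringLength", "language");
        int b = dispatch("stringLength", "model");
        System.out.println(dispatch("add", String.valueOf(a), String.valueOf(b))); // prints 13
    }
}
```

In LangChain4j this loop is handled for you: the `@Tool` annotations below describe each method to the model, and the AI service executes whichever calls the model requests.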
static class Calculator {
    @Tool("Calculates the length of a string")
    int stringLength(String s) { return s.length(); }

    @Tool("Calculates the sum of two numbers")
    int add(int a, int b) { return a + b; }
}
interface Assistant { String chat(String userMessage); }
Assistant assistant = AiServices.builder(Assistant.class)
.chatLanguageModel(OpenAiChatModel.withApiKey("YOUR_OPENAI_API_KEY"))
.tools(new Calculator())
.chatMemory(MessageWindowChatMemory.withMaxMessages(10))
.build();
String question = "What is the sum of the numbers of letters in the words \"language\" and \"model\"?";
String answer = assistant.chat(question);
System.out.println(answer); // The sum ... is 13.

Note: LLMs may struggle with complex arithmetic or temporal reasoning; providing explicit tools mitigates these limitations.
Conclusion
The tutorial covered the fundamental building blocks for Java‑based LLM applications: prompt templating, memory management, retrieval‑augmented generation, composable chains, and tool‑enabled agents. By integrating these components through LangChain4j, developers can construct robust, maintainable AI‑driven software.