Building Java LLM Applications with LangChain4j: A Hands‑On Guide
This tutorial walks through the fundamentals of large language models, prompt engineering, and word embeddings, then shows how to set up a LangChain‑based LLM stack in Java using LangChain4j, covering core modules, memory, retrieval, chains, agents, and complete code examples.
Introduction
This guide demonstrates how to build Java applications that leverage large language models (LLMs) using the LangChain framework (specifically the community‑maintained LangChain4j library). It covers the essential concepts of LLMs, prompt engineering, embeddings, and the core LangChain modules such as model I/O, memory, retrieval, chains, and agents.
Background
Large Language Models
LLMs are probabilistic neural networks trained on massive unlabelled text corpora via self‑supervised learning. After pre‑training, they can be fine‑tuned or prompted for downstream tasks such as translation, summarisation, or question answering.
Prompt Engineering
Prompt engineering structures textual instructions (including few‑shot examples) to steer an LLM toward a desired behaviour. Techniques like chain‑of‑thought prompting improve reasoning and safety.
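A few-shot prompt is just carefully structured text. As a plain-Java sketch (the task and examples here are invented for illustration, not taken from any library):

```java
// Builds a few-shot prompt: two worked examples steer the model
// toward the desired "input -> output" format before the real query.
public class FewShotPrompt {

    static String buildPrompt(String word) {
        StringBuilder sb = new StringBuilder();
        sb.append("Convert each word to its plural form.\n");
        sb.append("Word: mouse -> Plural: mice\n");      // example 1
        sb.append("Word: child -> Plural: children\n");  // example 2
        sb.append("Word: ").append(word).append(" -> Plural:");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildPrompt("goose"));
    }
}
```

The examples implicitly define the output format, so the model's completion can be parsed mechanically.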
Word Embeddings
Word embeddings map tokens to dense real‑valued vectors (e.g., Word2Vec, GloVe). Embeddings enable semantic search and retrieval‑augmented generation (RAG) by providing a vector representation of text.
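Semantic similarity between two embeddings is typically measured with cosine similarity. A minimal sketch with toy three-dimensional vectors (real models use hundreds of dimensions; the values below are invented):

```java
// Cosine similarity between two embedding vectors: values near 1.0
// indicate semantically similar text, values near 0 unrelated text.
public class CosineSimilarity {

    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] king  = {0.8, 0.6, 0.1};  // toy embeddings
        double[] queen = {0.7, 0.7, 0.2};
        double[] car   = {0.1, 0.2, 0.9};
        System.out.printf("king~queen: %.3f%n", cosine(king, queen));
        System.out.printf("king~car:   %.3f%n", cosine(king, car));
    }
}
```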
LangChain for Java (LangChain4j)
LangChain4j is an unofficial Java port of LangChain, compatible with Java 8+ and Spring Boot 2/3. Add the library from Maven Central:
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j</artifactId>
<version>0.23.0</version>
</dependency>

Core LangChain Modules
Model I/O
Prompt templates generate concrete prompts, and output parsers extract structured data from model responses.
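Under the hood, applying a template amounts to substituting `{{variable}}` placeholders. A minimal plain-Java re-implementation of the idea (not LangChain4j's actual code):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of {{variable}} substitution, the core of a prompt template.
public class MiniTemplate {

    static String apply(String template, Map<String, Object> vars) {
        String result = template;
        for (Map.Entry<String, Object> e : vars.entrySet()) {
            result = result.replace("{{" + e.getKey() + "}}", String.valueOf(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> vars = new HashMap<>();
        vars.put("adjective", "funny");
        vars.put("content", "computers");
        System.out.println(apply("Tell me a {{adjective}} joke about {{content}}.", vars));
        // -> Tell me a funny joke about computers.
    }
}
```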
PromptTemplate promptTemplate = PromptTemplate.from("Tell me a {{adjective}} joke about {{content}}.");
Map<String, Object> vars = new HashMap<>();
vars.put("adjective", "funny");
vars.put("content", "computers");
Prompt prompt = promptTemplate.apply(vars);

Memory
Memory stores prior interactions so the LLM can reference earlier context. The example uses a token‑window memory implementation.
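TokenWindowChatMemory evicts old messages by token budget. The same idea, by message count, as a self-contained plain-Java sketch (not the library's implementation):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch of window-based chat memory: when the stored message count
// exceeds the window, the oldest messages are evicted so the prompt
// sent to the model stays within budget.
public class WindowMemory {
    private final int maxMessages;
    private final Deque<String> messages = new ArrayDeque<>();

    WindowMemory(int maxMessages) { this.maxMessages = maxMessages; }

    void add(String message) {
        messages.addLast(message);
        while (messages.size() > maxMessages) {
            messages.removeFirst(); // evict oldest first
        }
    }

    List<String> messages() { return new ArrayList<>(messages); }

    public static void main(String[] args) {
        WindowMemory memory = new WindowMemory(2);
        memory.add("user: Hello, my name is Kumar");
        memory.add("ai: Hello Kumar!");
        memory.add("user: What is my name?");
        System.out.println(memory.messages()); // first message was evicted
    }
}
```

Note that with a window of 2 the introduction "my name is Kumar" is already gone by the third turn, which is exactly why the window size (or token budget) must be chosen with the conversation's needs in mind.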
ChatMemory chatMemory = TokenWindowChatMemory.withMaxTokens(300, new OpenAiTokenizer(GPT_3_5_TURBO));
chatMemory.add(userMessage("Hello, my name is Kumar"));
AiMessage answer = model.generate(chatMemory.messages()).content();
System.out.println(answer.text()); // Hello Kumar! How can I help you today?
chatMemory.add(answer);
chatMemory.add(userMessage("What is my name?"));
AiMessage answer2 = model.generate(chatMemory.messages()).content();
System.out.println(answer2.text()); // Your name is Kumar.
chatMemory.add(answer2);

Retrieval (RAG)
RAG fetches relevant external documents, converts them to embeddings, and supplies the most similar chunks to the LLM.
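Stripped of the library types, retrieval is "rank stored chunks by similarity to the query embedding and keep the top k". A naive sketch with hand-made two-dimensional toy embeddings (invented for illustration):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Naive in-memory retrieval: score every stored chunk against the
// query embedding by cosine similarity and return the k best matches.
public class NaiveRetriever {

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    static List<String> findRelevant(Map<String, double[]> store, double[] query, int k) {
        return store.entrySet().stream()
                .sorted((x, y) -> Double.compare(cosine(y.getValue(), query),
                                                 cosine(x.getValue(), query)))
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, double[]> store = new LinkedHashMap<>();
        store.put("Simpson went fishing in the lake.", new double[]{0.9, 0.1});
        store.put("The weather report predicts rain.", new double[]{0.1, 0.9});
        double[] queryEmbedding = {0.8, 0.2}; // toy embedding for "Who is Simpson?"
        System.out.println(findRelevant(store, queryEmbedding, 1));
    }
}
```

Real embedding stores replace the linear scan with approximate nearest-neighbour indexes, but the contract is the same.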
Document doc = FileSystemDocumentLoader.loadDocument("simpson's_adventures.txt");
DocumentSplitter splitter = DocumentSplitters.recursive(100, 0, new OpenAiTokenizer(GPT_3_5_TURBO));
List<TextSegment> segments = splitter.split(doc);
EmbeddingModel embeddingModel = new AllMiniLmL6V2EmbeddingModel();
List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
EmbeddingStore<TextSegment> store = new InMemoryEmbeddingStore<>();
store.addAll(embeddings, segments);
String question = "Who is Simpson?";
Embedding qEmb = embeddingModel.embed(question).content();
List<EmbeddingMatch<TextSegment>> matches = store.findRelevant(qEmb, 3, 0.7);

Advanced Applications
Chains
Chains compose multiple components (retriever, memory, prompt) into a single workflow. The following builds a conversational retrieval chain.
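Conceptually a chain is function composition: retrieve context, render the prompt, call the model. A plain-Java sketch with the retriever and the LLM stubbed out (the strings are invented for illustration):

```java
import java.util.function.Function;

// A "chain" as function composition: retrieve -> render prompt -> call model.
// Both the retriever and the model are stubs standing in for real components.
public class MiniChain {

    static final Function<String, String> RETRIEVE =
            question -> "Simpson is a fisherman who lives by the lake."; // stub retriever
    static final Function<String, String> MODEL =
            prompt -> "[model answer based on: " + prompt + "]";         // stub LLM

    static String execute(String question) {
        String information = RETRIEVE.apply(question);
        String prompt = "Answer the following question: " + question + "\n"
                + "Base your answer on the following information:\n" + information;
        return MODEL.apply(prompt);
    }

    public static void main(String[] args) {
        System.out.println(execute("Who is Simpson?"));
    }
}
```

The builder below wires up the same pipeline with real components, plus chat memory so follow-up questions keep their context.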
ConversationalRetrievalChain chain = ConversationalRetrievalChain.builder()
.chatLanguageModel(chatModel)
.retriever(EmbeddingStoreRetriever.from(store, embeddingModel))
.chatMemory(MessageWindowChatMemory.withMaxMessages(10))
.promptTemplate(PromptTemplate.from(
        "Answer the following question to the best of your ability: {{question}}\n"
        + "Base your answer on the following information:\n{{information}}"))
.build();
String answer = chain.execute("Who is Simpson?");

Agents
Agents treat the LLM as a reasoning engine that can decide which tools to invoke. The example defines a simple calculator tool and wires it into an AI service.
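The runtime side of tool use is a dispatch loop: the model emits a tool name plus arguments, and the framework invokes the matching registered method. A plain-Java sketch with the model's decision hard-coded (no real LLM involved):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of agent tool dispatch: tools are registered by name, and a
// (here hard-coded) "model decision" is routed to the matching tool.
public class ToolDispatch {

    static final Map<String, Function<String[], Integer>> TOOLS = new HashMap<>();
    static {
        TOOLS.put("stringLength", argv -> argv[0].length());
        TOOLS.put("add", argv -> Integer.parseInt(argv[0]) + Integer.parseInt(argv[1]));
    }

    static int dispatch(String tool, String... args) {
        return TOOLS.get(tool).apply(args);
    }

    public static void main(String[] args) {
        // Pretend the LLM planned: stringLength("language"), stringLength("model"), add.
        int a = dispatch("stringLength", "language");
        int b = dispatch("stringLength", "model");
        System.out.println(dispatch("add", String.valueOf(a), String.valueOf(b))); // prints 13
    }
}
```

In LangChain4j this loop is handled for you: the `@Tool` annotations below describe each method to the model, and the AI service executes whichever calls the model requests.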
static class Calculator {
    @Tool("Calculates the length of a string")
    int stringLength(String s) { return s.length(); }

    @Tool("Calculates the sum of two numbers")
    int add(int a, int b) { return a + b; }
}
interface Assistant { String chat(String userMessage); }
Assistant assistant = AiServices.builder(Assistant.class)
.chatLanguageModel(OpenAiChatModel.withApiKey("YOUR_OPENAI_API_KEY"))
.tools(new Calculator())
.chatMemory(MessageWindowChatMemory.withMaxMessages(10))
.build();
String question = "What is the sum of the numbers of letters in the words \"language\" and \"model\"?";
String answer = assistant.chat(question);
System.out.println(answer); // The sum ... is 13.

Note: LLMs may struggle with complex arithmetic or temporal reasoning; providing explicit tools mitigates these limitations.
Conclusion
The tutorial covered the fundamental building blocks for Java‑based LLM applications: prompt templating, memory management, retrieval‑augmented generation, composable chains, and tool‑enabled agents. By integrating these components through LangChain4j, developers can construct robust, maintainable AI‑driven software.