Unlocking LangChain: A Complete Guide to Building LLM Applications
This article introduces LangChain, explains its architecture and core components, and provides step‑by‑step Python examples for chat models, embeddings, prompts, indexes, chains, memory, agents, and practical use‑cases such as QA bots, web search, summarization, and persistent vector stores.
1. LangChain Overview
Language models, especially large language models (LLMs) like ChatGPT, are reshaping AI. LangChain is a framework built around LLMs that simplifies creating applications such as chatbots and intelligent Q&A tools.
1.1 History
LangChain was created by Harrison Chase and open‑sourced in October 2022. After gaining attention on GitHub, it quickly became a startup and secured a $20‑25 million Series A round led by Sequoia, valuing the company at $200 million.
1.2 Why It’s Popular
Both Python and Node.js versions have attracted massive interest: the Python package earned over 54 k stars in six months, while the Node.js version gathered 7 k stars in four months, making it accessible to front‑end developers who may not know Python.
LangChain addresses several real‑world LLM pain points:
Training data is often stale (e.g., up to September 2021).
Token limits prevent processing large documents such as a 300‑page PDF.
LLMs cannot fetch up‑to‑date information from the internet.
LLMs cannot directly connect to external data sources.
As a “glue” layer, LangChain dramatically speeds up development, similar to how jQuery simplified front‑end work.
1.3 LLM Application Architecture
LangChain integrates LLMs, vector databases, prompt layers, external knowledge, and tools, enabling developers to freely compose LLM‑driven applications.
2. LangChain Components
2.1 Models
2.1.1 Chat Models
LangChain provides a standard interface for chat models, which operate on message objects (AIMessage, HumanMessage, SystemMessage, ChatMessage) rather than raw text.
# Import OpenAI chat model and message types
from langchain.chat_models import ChatOpenAI
from langchain.schema import (AIMessage, HumanMessage, SystemMessage)
# Initialize chat object
chat = ChatOpenAI(openai_api_key="...")
# Send a message
chat([HumanMessage(content="Translate this sentence from English to French: I love programming.")])
Chat models support multiple messages, system prompts, and batch processing.
# Batch processing example
batch_messages = [
    [SystemMessage(content="You are a helpful assistant that translates English to French."),
     HumanMessage(content="I love programming.")],
    [SystemMessage(content="You are a helpful assistant that translates English to French."),
     HumanMessage(content="I love artificial intelligence.")],
]
result = chat.generate(batch_messages)
LangChain also offers caching (in‑memory or database) to reduce repeated API calls.
# Set up SQLite cache
import os, langchain
from langchain.cache import SQLiteCache
os.environ["OPENAI_API_KEY"] = "your_apikey"
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")
llm = ChatOpenAI()
result1 = llm.predict("tell me a joke")
result2 = llm.predict("tell me a joke")  # identical prompt: served from the cache, no API call
2.1.2 Embeddings
Embeddings convert text into high‑dimensional vectors for similarity search and retrieval, without requiring fine‑tuning.
# Initialize OpenAI embeddings and embed a query
import os
from langchain.embeddings.openai import OpenAIEmbeddings
os.environ["OPENAI_API_KEY"] = "your_apikey"
embeddings = OpenAIEmbeddings()
res = embeddings.embed_query("hello world")
2.1.3 Large Language Models
LangChain integrates many LLM providers, allowing developers to switch models easily.
2.2 Prompts
2.2.1 Prompt Templates
PromptTemplates let you dynamically construct prompts based on user input.
from langchain import PromptTemplate
from langchain.llms import OpenAI

def generate_store_names(features):
    prompt = PromptTemplate.from_template(
        "I am opening a new store with features {features}. Suggest 10 names."
    )
    llm = OpenAI(temperature=0.8)
    response = llm.generate([prompt.format(features=features)])
    return [gen[0].text.strip() for gen in response.generations]

store_names = generate_store_names("fashion, creative, unique")
print(store_names)
2.2.2 Few‑Shot Examples
Few‑shot prompting provides example pairs to guide the model.
from langchain import PromptTemplate, FewShotPromptTemplate
from langchain.llms import OpenAI
examples = [{"word": "黑", "antonym": "白"}, {"word": "伤心", "antonym": "开心"}]
example_template = """Word: {word}
Antonym: {antonym}
"""
example_prompt = PromptTemplate(input_variables=["word", "antonym"], template=example_template)
few_shot = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Provide the antonym for each word.",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
    example_separator="\n",
)
prompt = few_shot.format(input="粗")
llm = OpenAI(temperature=0.9)
print(llm(prompt))
2.2.3 Example Selectors
When many examples exist, ExampleSelector chooses the most informative subset, e.g., based on length.
from langchain.prompts import PromptTemplate, FewShotPromptTemplate
from langchain.prompts.example_selector import LengthBasedExampleSelector
examples = [{"word": "happy", "antonym": "sad"}, {"word": "tall", "antonym": "short"}, ...]
example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template="Word: {word}\nAntonym: {antonym}\n",
)
selector = LengthBasedExampleSelector(examples=examples, example_prompt=example_prompt, max_length=25)
dynamic_prompt = FewShotPromptTemplate(
    example_selector=selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
    example_separator="\n",
)
print(dynamic_prompt.format(input="big and huge and massive"))
2.3 Indexes
Indexes structure documents for LLM interaction. Key sub‑components include:
Document Loaders – convert CSV, JSON, PDF, HTML, etc., into text.
Text Splitters – break long texts into manageable chunks (CharacterTextSplitter, RecursiveCharacterTextSplitter, MarkdownTextSplitter, etc.).
Vector Stores – store embeddings for similarity search (FAISS, Milvus, Pinecone, Chroma, AnalyticDB, Annoy, etc.).
Retrievers – expose a get_relevant_documents method to fetch relevant documents for a query.
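The four sub‑components above form a pipeline: load a document, split it into chunks, store the chunks, and retrieve the most relevant ones at query time. As a toy illustration in plain Python (not the LangChain API itself), with word overlap standing in for real embedding similarity:

```python
# Toy sketch of the load -> split -> store -> retrieve pipeline.
# Word-overlap scoring is a stand-in for real embedding similarity.

def split_text(text: str, chunk_size: int = 40) -> list[str]:
    # Crude character-based splitter, in the spirit of CharacterTextSplitter
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

class ToyRetriever:
    def __init__(self, chunks: list[str]):
        self.chunks = chunks  # the "vector store"

    def get_relevant_documents(self, query: str, k: int = 1) -> list[str]:
        # Score each chunk by how many query words it shares
        q = set(query.lower().split())
        scored = sorted(self.chunks,
                        key=lambda c: len(q & set(c.lower().split())),
                        reverse=True)
        return scored[:k]

document = ("LangChain integrates LLMs with vector stores. "
            "Retrievers fetch the most relevant chunks for a query.")
retriever = ToyRetriever(split_text(document))
print(retriever.get_relevant_documents("relevant chunks for a query"))
```

A real setup swaps in a Document Loader for the raw string, an embedding model for the overlap score, and FAISS/Chroma/etc. for the chunk list, but the control flow is the same.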
2.4 Chains
2.4.1 LLMChain
Wraps an LLM with a prompt template to produce a single output.
from langchain import PromptTemplate, OpenAI, LLMChain
prompt = "What is a good name for a company that makes {product}?"
llm = OpenAI(temperature=0)
chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(prompt))
print(chain.run({"product": "colorful socks"}))
2.4.2 SimpleSequentialChain
Chains multiple LLMChains sequentially, passing the output of one as the input of the next.
from langchain import OpenAI, PromptTemplate, LLMChain, SimpleSequentialChain
# First chain – generate synopsis
synopsis_prompt = PromptTemplate.from_template("You are a playwright. Title: {title}\nSynopsis:")
synopsis_chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=synopsis_prompt)
# Second chain – write review
review_prompt = PromptTemplate.from_template("You are a NYT critic. Play synopsis: {synopsis}\nReview:")
review_chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=review_prompt)
overall = SimpleSequentialChain(chains=[synopsis_chain, review_chain], verbose=True)
print(overall.run("Tragedy at sunset on the beach"))
2.4.3 SequentialChain
Allows multiple inputs/outputs with explicit variable naming.
from langchain.chains import SequentialChain
# Synopsis chain with explicit output key
synopsis_chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=PromptTemplate(...), output_key="synopsis")
# Review chain consumes "synopsis"
review_chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=PromptTemplate(...), output_key="review")
overall = SequentialChain(
    chains=[synopsis_chain, review_chain],
    input_variables=["title", "era"],
    output_variables=["synopsis", "review"],
)
print(overall({"title": "Tragedy at sunset on the beach", "era": "Victorian England"}))
2.4.4 TransformChain
Applies a custom Python function to transform inputs before passing them to another chain.
from langchain import OpenAI, PromptTemplate, LLMChain
from langchain.chains import TransformChain, SimpleSequentialChain

def shorten(inputs: dict) -> dict:
    # Keep only the first three lines of the input text
    text = inputs["text"]
    short = "\n".join(text.split("\n")[:3])
    return {"output_text": short}

transform = TransformChain(input_variables=["text"], output_variables=["output_text"], transform=shorten)
prompt = PromptTemplate(input_variables=["output_text"], template="Summarize this text:\n{output_text}\nSummary:")
llm_chain = LLMChain(llm=OpenAI(), prompt=prompt)
seq = SimpleSequentialChain(chains=[transform, llm_chain])
print(seq.run(long_document))  # long_document: a long text string loaded elsewhere
2.5 Memory
ConversationBufferMemory – stores the full chat history in memory.
ConversationBufferWindowMemory – keeps only the last *k* turns.
ConversationTokenBufferMemory – evicts based on token count.
ConversationSummaryMemory – stores a summary of the conversation.
ConversationSummaryBufferMemory – combines summary with token‑based eviction.
VectorStoreRetrieverMemory – saves past turns in a vector DB and retrieves the most similar ones.
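To make the eviction strategies concrete, here is a minimal plain‑Python sketch (not the LangChain classes themselves) of how a buffer‑window memory keeps only the last k turns visible to the prompt:

```python
# Minimal sketch of ConversationBufferWindowMemory-style eviction:
# store every turn, but render only the last k into the prompt.

class ToyWindowMemory:
    def __init__(self, k: int = 2):
        self.k = k
        self.turns: list[tuple[str, str]] = []  # (human, ai) pairs

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def load_memory_variables(self) -> str:
        # Only the last k turns reach the model; older ones are evicted
        window = self.turns[-self.k:]
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in window)

memory = ToyWindowMemory(k=2)
memory.save_context("Hi", "Hello!")
memory.save_context("What is LangChain?", "A framework for LLM apps.")
memory.save_context("Does it have memory?", "Yes, several kinds.")
print(memory.load_memory_variables())  # the first turn no longer appears
```

The token‑buffer and summary variants differ only in the eviction rule: counting tokens instead of turns, or replacing evicted turns with an LLM‑generated summary.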
2.6 Agents
Agents decide which tools or chains to invoke based on user input.
2.6.1 Action Agents
At each step, the agent uses the outputs of previous actions to choose the next action.
2.6.2 Plan‑and‑Execute Agents
The agent first plans the entire sequence of steps and then executes them without replanning.
Receive user input.
Plan the full order of actions.
Execute actions sequentially, passing each output to the next.
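The three steps above can be sketched as a toy plan‑and‑execute loop in plain Python (not LangChain's agent classes; both tool functions are hypothetical stand‑ins):

```python
# Toy plan-and-execute loop: plan all steps up front, then run them in
# order, feeding each step's output into the next.

def search(query: str) -> str:
    # Stand-in for a web-search tool
    return f"results for '{query}'"

def summarize(text: str) -> str:
    # Stand-in for an LLM summarization step
    return f"summary of {text}"

TOOLS = {"search": search, "summarize": summarize}

def plan(user_input: str) -> list[str]:
    # A real planner would ask an LLM for this; here the plan is fixed
    return ["search", "summarize"]

def execute(user_input: str) -> str:
    steps = plan(user_input)          # 1. plan the full order of actions
    result = user_input
    for step in steps:                # 2. execute actions sequentially
        result = TOOLS[step](result)  # 3. pass each output to the next step
    return result

print(execute("LangChain agents"))
# -> summary of results for 'LangChain agents'
```

An action agent differs precisely in that `plan` would be re-run after every step, using the observation so far to pick the next tool.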
3. LangChain Practical Projects
3.1 Simple Q&A
import os
os.environ["OPENAI_API_KEY"] = "your_api_key"
from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003", max_tokens=1024)
print(llm("How do you evaluate artificial intelligence?"))
3.2 Google Search via SerpAPI
import os
os.environ["OPENAI_API_KEY"] = "your_api_key"
os.environ["SERPAPI_API_KEY"] = "your_api_key"
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
print(agent.run("What is the date today? What happened on this day in history?"))
3.3 Summarizing Long Texts
import os
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain
from langchain import OpenAI
os.environ["OPENAI_API_KEY"] = "your_api_key"
loader = TextLoader("static/open.txt")
documents = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=0)
chunks = splitter.split_documents(documents)
llm = OpenAI(model_name="text-davinci-003", max_tokens=1500)
chain = load_summarize_chain(llm, chain_type="refine", verbose=True)
print(chain.run(chunks))
3.4 Building a Local Knowledge‑Base QA Bot
import os
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain import OpenAI, VectorDBQA
from langchain.document_loaders import DirectoryLoader
os.environ["OPENAI_API_KEY"] = "your_api_key"
loader = DirectoryLoader("static", glob="**/*.txt")
documents = loader.load()
splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=0)
split_docs = splitter.split_documents(documents)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(split_docs, embeddings)
qa = VectorDBQA.from_chain_type(llm=OpenAI(), chain_type="stuff", vectorstore=vectorstore, return_source_documents=True)
print(qa({"query": "What is the annual revenue?"}))
3.5 Persistent Vector Store with Chroma
from langchain.vectorstores import Chroma
# Persist embeddings
vectorstore = Chroma.from_documents(documents, embeddings, persist_directory="D:/vector_store")
vectorstore.persist()
# Load later
vectorstore = Chroma(persist_directory="D:/vector_store", embedding_function=embeddings)
3.6 Open‑Source Projects Based on LangChain
Curated list of notable LangChain projects (e.g., LangChain‑ChatGLM‑Webui).
4. Conclusion
LangChain continues to evolve, expanding its capabilities for complex LLM workflows and real‑world problem solving. While the guide may not cover every nuance, it provides a solid foundation for developers to experiment and build powerful AI‑driven applications.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
MoonWebTeam
Official account of MoonWebTeam. All members are former front‑end engineers from Tencent, and the account shares valuable team tech insights, reflections, and other information.