Unlocking LangChain: A Complete Guide to Building LLM Applications
This article introduces LangChain, explains its architecture and core components, and provides step‑by‑step Python examples for chat models, embeddings, prompts, indexes, chains, memory, agents, and practical use‑cases such as QA bots, web search, summarization, and persistent vector stores.
1. LangChain Overview
Language models, especially large language models (LLMs) like ChatGPT, are reshaping AI. LangChain is a framework built around LLMs that simplifies creating applications such as chatbots and intelligent Q&A tools.
1.1 History
LangChain was created by Harrison Chase and open‑sourced in October 2022. After gaining attention on GitHub, it quickly became a startup and secured a $20‑25 million Series A round led by Sequoia, valuing the company at $200 million.
1.2 Why It’s Popular
Both Python and Node.js versions have attracted massive interest: the Python package earned over 54 k stars in six months, while the Node.js version gathered 7 k stars in four months, making it accessible to front‑end developers who may not know Python.
LangChain addresses several real‑world LLM pain points:
Training data is often stale (e.g., up to September 2021).
Token limits prevent processing large documents such as a 300‑page PDF.
LLMs cannot fetch up‑to‑date information from the internet.
LLMs cannot directly connect to external data sources.
As a “glue” layer, LangChain dramatically speeds up development, similar to how jQuery simplified front‑end work.
1.3 LLM Application Architecture
LangChain integrates LLMs, vector databases, prompt layers, external knowledge, and tools, enabling developers to freely compose LLM‑driven applications.
2. LangChain Components
2.1 Models
2.1.1 Chat Models
LangChain provides a standard interface for chat models, which operate on message objects (AIMessage, HumanMessage, SystemMessage, ChatMessage) rather than raw text.
# Import OpenAI chat model and message types
from langchain.chat_models import ChatOpenAI
from langchain.schema import (AIMessage, HumanMessage, SystemMessage)
# Initialize chat object
chat = ChatOpenAI(openai_api_key="...")
# Send a message
chat([HumanMessage(content="Translate this sentence from English to French: I love programming.")])
Chat models support multiple messages, system prompts, and batch processing.
# Batch processing example
batch_messages = [
    [SystemMessage(content="You are a helpful assistant that translates English to French."),
     HumanMessage(content="I love programming.")],
    [SystemMessage(content="You are a helpful assistant that translates English to French."),
     HumanMessage(content="I love artificial intelligence.")],
]
result = chat.generate(batch_messages)
LangChain also offers caching (in‑memory or database) to reduce repeated API calls.
# Set up SQLite cache
import os, langchain
from langchain.cache import SQLiteCache
os.environ["OPENAI_API_KEY"] = "your_apikey"
langchain.llm_cache = SQLiteCache(database_path=".langchain.db")
llm = ChatOpenAI()
result1 = llm.predict("tell me a joke")
result2 = llm.predict("tell me a joke")  # identical prompt: served from the cache, no API call
2.1.2 Embeddings
Embeddings convert text into high‑dimensional vectors for similarity search and retrieval, without requiring fine‑tuning.
# Initialize OpenAI embeddings and embed a query
import os
from langchain.embeddings.openai import OpenAIEmbeddings
os.environ["OPENAI_API_KEY"] = "your_apikey"
embeddings = OpenAIEmbeddings()
res = embeddings.embed_query("hello world")
2.1.3 Large Language Models
LangChain integrates many LLM providers, allowing developers to switch models easily.
2.2 Prompts
2.2.1 Prompt Templates
PromptTemplates let you dynamically construct prompts based on user input.
from langchain import PromptTemplate
from langchain.llms import OpenAI

def generate_store_names(features):
    prompt = PromptTemplate.from_template(
        "I am opening a new store with features {features}. Suggest 10 names."
    )
    llm = OpenAI(temperature=0.8)
    response = llm.generate([prompt.format(features=features)])
    return [gen[0].text.strip() for gen in response.generations]

store_names = generate_store_names("fashion, creative, unique")
print(store_names)
2.2.2 Few‑Shot Examples
Few‑shot prompting provides example pairs to guide the model.
from langchain import PromptTemplate, FewShotPromptTemplate
from langchain.llms import OpenAI
examples = [{"word": "黑", "antonym": "白"}, {"word": "伤心", "antonym": "开心"}]
example_template = """Word: {word}
Antonym: {antonym}
"""
example_prompt = PromptTemplate(input_variables=["word", "antonym"], template=example_template)
few_shot = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Provide the antonym for each word.",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
    example_separator="\n",
)
prompt = few_shot.format(input="粗")
llm = OpenAI(temperature=0.9)
print(llm(prompt))
2.2.3 Example Selectors
When many examples exist, ExampleSelector chooses the most informative subset, e.g., based on length.
from langchain.prompts import PromptTemplate, FewShotPromptTemplate
from langchain.prompts.example_selector import LengthBasedExampleSelector
examples = [{"word": "happy", "antonym": "sad"}, {"word": "tall", "antonym": "short"}, ...]
example_prompt = PromptTemplate(
    input_variables=["word", "antonym"],
    template="Word: {word}\nAntonym: {antonym}\n",
)
selector = LengthBasedExampleSelector(examples=examples, example_prompt=example_prompt, max_length=25)
dynamic_prompt = FewShotPromptTemplate(
    example_selector=selector,
    example_prompt=example_prompt,
    prefix="Give the antonym of every input",
    suffix="Word: {input}\nAntonym:",
    input_variables=["input"],
    example_separator="\n",
)
print(dynamic_prompt.format(input="big and huge and massive"))
2.3 Indexes
Indexes structure documents for LLM interaction. Key sub‑components include:
Document Loaders – convert CSV, JSON, PDF, HTML, etc., into text.
Text Splitters – break long texts into manageable chunks (CharacterTextSplitter, RecursiveCharacterTextSplitter, MarkdownTextSplitter, etc.).
Vector Stores – store embeddings for similarity search (FAISS, Milvus, Pinecone, Chroma, AnalyticDB, Annoy, etc.).
Retrievers – expose a get_relevant_documents method to fetch relevant documents for a query.
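The four sub‑components above form a pipeline: load a document, split it into chunks, store the chunks, and retrieve the most relevant ones at query time. As a toy illustration in plain Python (not the LangChain API itself), with word overlap standing in for real embedding similarity:

```python
# Toy sketch of the load -> split -> store -> retrieve pipeline.
# Word-overlap scoring is a stand-in for real embedding similarity.

def split_text(text: str, chunk_size: int = 40) -> list[str]:
    # Crude character-based splitter, in the spirit of CharacterTextSplitter
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

class ToyRetriever:
    def __init__(self, chunks: list[str]):
        self.chunks = chunks  # the "vector store"

    def get_relevant_documents(self, query: str, k: int = 1) -> list[str]:
        # Score each chunk by how many query words it shares
        q = set(query.lower().split())
        scored = sorted(self.chunks,
                        key=lambda c: len(q & set(c.lower().split())),
                        reverse=True)
        return scored[:k]

document = ("LangChain integrates LLMs with vector stores. "
            "Retrievers fetch the most relevant chunks for a query.")
retriever = ToyRetriever(split_text(document))
print(retriever.get_relevant_documents("relevant chunks for a query"))
```

A real setup swaps in a Document Loader for the raw string, an embedding model for the overlap score, and FAISS/Chroma/etc. for the chunk list, but the control flow is the same.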
2.4 Chains
2.4.1 LLMChain
Wraps an LLM with a prompt template to produce a single output.
from langchain import PromptTemplate, OpenAI, LLMChain
prompt = "What is a good name for a company that makes {product}?"
llm = OpenAI(temperature=0)
chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(prompt))
print(chain.run({"product": "colorful socks"}))
2.4.2 SimpleSequentialChain
Chains multiple LLMChains sequentially, passing the output of one as the input of the next.
from langchain import OpenAI, PromptTemplate, LLMChain, SimpleSequentialChain
# First chain – generate synopsis
synopsis_prompt = PromptTemplate.from_template("You are a playwright. Title: {title}\nSynopsis:")
synopsis_chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=synopsis_prompt)
# Second chain – write review
review_prompt = PromptTemplate.from_template("You are a NYT critic. Play synopsis: {synopsis}\nReview:")
review_chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=review_prompt)
overall = SimpleSequentialChain(chains=[synopsis_chain, review_chain], verbose=True)
print(overall.run("Tragedy at sunset on the beach"))
2.4.3 SequentialChain
Allows multiple inputs/outputs with explicit variable naming.
from langchain.chains import SequentialChain
# Synopsis chain with explicit output key
synopsis_chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=PromptTemplate(...), output_key="synopsis")
# Review chain consumes "synopsis"
review_chain = LLMChain(llm=OpenAI(temperature=0.7), prompt=PromptTemplate(...), output_key="review")
overall = SequentialChain(
    chains=[synopsis_chain, review_chain],
    input_variables=["title", "era"],
    output_variables=["synopsis", "review"],
)
print(overall({"title": "Tragedy at sunset on the beach", "era": "Victorian England"}))
2.4.4 TransformChain
Applies a custom Python function to transform inputs before passing them to another chain.
from langchain import OpenAI, PromptTemplate, LLMChain
from langchain.chains import TransformChain, SimpleSequentialChain

def shorten(inputs: dict) -> dict:
    # Keep only the first three lines of the input text
    text = inputs["text"]
    short = "\n".join(text.split("\n")[:3])
    return {"output_text": short}

transform = TransformChain(input_variables=["text"], output_variables=["output_text"], transform=shorten)
prompt = PromptTemplate(input_variables=["output_text"], template="Summarize this text:\n{output_text}\nSummary:")
llm_chain = LLMChain(llm=OpenAI(), prompt=prompt)
seq = SimpleSequentialChain(chains=[transform, llm_chain])
print(seq.run(long_document))  # long_document: a long text string loaded elsewhere
2.5 Memory
ConversationBufferMemory – stores the full chat history in memory.
ConversationBufferWindowMemory – keeps only the last *k* turns.
ConversationTokenBufferMemory – evicts based on token count.
ConversationSummaryMemory – stores a summary of the conversation.
ConversationSummaryBufferMemory – combines summary with token‑based eviction.
VectorStoreRetrieverMemory – saves past turns in a vector DB and retrieves the most similar ones.
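To make the eviction strategies concrete, here is a minimal plain‑Python sketch (not the LangChain classes themselves) of how a buffer‑window memory keeps only the last k turns visible to the prompt:

```python
# Minimal sketch of ConversationBufferWindowMemory-style eviction:
# store every turn, but render only the last k into the prompt.

class ToyWindowMemory:
    def __init__(self, k: int = 2):
        self.k = k
        self.turns: list[tuple[str, str]] = []  # (human, ai) pairs

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def load_memory_variables(self) -> str:
        # Only the last k turns reach the model; older ones are evicted
        window = self.turns[-self.k:]
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in window)

memory = ToyWindowMemory(k=2)
memory.save_context("Hi", "Hello!")
memory.save_context("What is LangChain?", "A framework for LLM apps.")
memory.save_context("Does it have memory?", "Yes, several kinds.")
print(memory.load_memory_variables())  # the first turn no longer appears
```

The token‑buffer and summary variants differ only in the eviction rule: counting tokens instead of turns, or replacing evicted turns with an LLM‑generated summary.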
2.6 Agents
Agents decide which tools or chains to invoke based on user input.
2.6.1 Action Agents
At each step, the agent uses the outputs of previous actions to choose the next action.
2.6.2 Plan‑and‑Execute Agents
The agent first plans the entire sequence of steps and then executes them without replanning.
Receive user input.
Plan the full order of actions.
Execute actions sequentially, passing each output to the next.
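The three steps above can be sketched as a toy plan‑and‑execute loop in plain Python (not LangChain's agent classes; both tool functions are hypothetical stand‑ins):

```python
# Toy plan-and-execute loop: plan all steps up front, then run them in
# order, feeding each step's output into the next.

def search(query: str) -> str:
    # Stand-in for a web-search tool
    return f"results for '{query}'"

def summarize(text: str) -> str:
    # Stand-in for an LLM summarization step
    return f"summary of {text}"

TOOLS = {"search": search, "summarize": summarize}

def plan(user_input: str) -> list[str]:
    # A real planner would ask an LLM for this; here the plan is fixed
    return ["search", "summarize"]

def execute(user_input: str) -> str:
    steps = plan(user_input)          # 1. plan the full order of actions
    result = user_input
    for step in steps:                # 2. execute actions sequentially
        result = TOOLS[step](result)  # 3. pass each output to the next step
    return result

print(execute("LangChain agents"))
# -> summary of results for 'LangChain agents'
```

An action agent differs precisely in that `plan` would be re-run after every step, using the observation so far to pick the next tool.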
3. LangChain Practical Projects
3.1 Simple Q&A
import os
os.environ["OPENAI_API_KEY"] = "your_api_key"
from langchain.llms import OpenAI
llm = OpenAI(model_name="text-davinci-003", max_tokens=1024)
print(llm("How do you evaluate artificial intelligence?"))
3.2 Google Search via SerpAPI
import os
os.environ["OPENAI_API_KEY"] = "your_api_key"
os.environ["SERPAPI_API_KEY"] = "your_api_key"
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.llms import OpenAI
llm = OpenAI(temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
print(agent.run("What is the date today? What happened on this day in history?"))
3.3 Summarizing Long Texts
import os
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.summarize import load_summarize_chain
from langchain import OpenAI
os.environ["OPENAI_API_KEY"] = "your_api_key"
loader = TextLoader("static/open.txt")
documents = loader.load()
splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=0)
chunks = splitter.split_documents(documents)
llm = OpenAI(model_name="text-davinci-003", max_tokens=1500)
chain = load_summarize_chain(llm, chain_type="refine", verbose=True)
print(chain.run(chunks))
3.4 Building a Local Knowledge‑Base QA Bot
import os
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.text_splitter import CharacterTextSplitter
from langchain import OpenAI, VectorDBQA
from langchain.document_loaders import DirectoryLoader
os.environ["OPENAI_API_KEY"] = "your_api_key"
loader = DirectoryLoader("static", glob="**/*.txt")
documents = loader.load()
splitter = CharacterTextSplitter(chunk_size=100, chunk_overlap=0)
split_docs = splitter.split_documents(documents)
embeddings = OpenAIEmbeddings()
vectorstore = Chroma.from_documents(split_docs, embeddings)
qa = VectorDBQA.from_chain_type(llm=OpenAI(), chain_type="stuff", vectorstore=vectorstore, return_source_documents=True)
print(qa({"query": "What is the annual revenue?"}))
3.5 Persistent Vector Store with Chroma
from langchain.vectorstores import Chroma
# Persist embeddings
vectorstore = Chroma.from_documents(documents, embeddings, persist_directory="D:/vector_store")
vectorstore.persist()
# Load later
vectorstore = Chroma(persist_directory="D:/vector_store", embedding_function=embeddings)
3.6 Open‑Source Projects Based on LangChain
Curated list of notable LangChain projects (e.g., LangChain‑ChatGLM‑Webui).
4. Conclusion
LangChain continues to evolve, expanding its capabilities for complex LLM workflows and real‑world problem solving. While the guide may not cover every nuance, it provides a solid foundation for developers to experiment and build powerful AI‑driven applications.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
MoonWebTeam
Official account of MoonWebTeam. All members are former front‑end engineers from Tencent, and the account shares valuable team tech insights, reflections, and other information.