An Introduction to LangChain: Core Components, Usage Patterns, and Practical Code Examples
This article explains what LangChain is, outlines its core components such as Models, Indexes, Chains, Memory and Agents, and demonstrates how to build LLM‑driven applications with detailed Python code snippets, visual diagrams, and future development suggestions.
What is LangChain?
LangChain is an open‑source framework for building applications powered by large language models (LLMs). It can be thought of as the "Spring" for LLMs or a plug‑in system for ChatGPT, providing utilities that connect LLMs to external data sources and enable tool‑using agents.
Core Components
Models
LangChain does not ship its own LLMs; instead it defines a generic interface to any LLM. Two main model types are supported: LLM (text‑in, text‑out) and Chat Models (chat‑message list in/out). PromptTemplate helps construct reusable prompts.
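To make the two interfaces concrete, here is a minimal, library-free sketch; the class names are illustrative stand-ins, not real LangChain classes:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ChatMessage:
    role: str      # "system", "human", or "ai"
    content: str

class ToyLLM:
    """LLM interface: text in, text out."""
    def __call__(self, prompt: str) -> str:
        return f"echo: {prompt}"

class ToyChatModel:
    """Chat Model interface: message list in, one message out."""
    def __call__(self, messages: List[ChatMessage]) -> ChatMessage:
        return ChatMessage(role="ai", content=f"echo: {messages[-1].content}")

toy_llm = ToyLLM()
print(toy_llm("hello"))                                   # a plain string

toy_chat = ToyChatModel()
print(toy_chat([ChatMessage("human", "hello")]).content)  # a ChatMessage
```

Everything else in LangChain (prompts, chains, agents) is written against these two call shapes, which is why providers can be swapped freely.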
from langchain import PromptTemplate
prompt_template = '''As a senior editor, please write a summary of the text between >>> and <<<.\n>>> {text} <<<'''
prompt = PromptTemplate(template=prompt_template, input_variables=["text"])
print(prompt.format_prompt(text="I love Beijing Tiananmen"))
Indexes
Indexes integrate external data with LLMs. Typical steps include loading documents with Document Loaders, splitting text with Text Splitters, storing vectors in a Vectorstore, and retrieving relevant chunks.
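The splitting step can be illustrated with a toy fixed-width splitter. The real RecursiveCharacterTextSplitter splits on the separator list recursively and measures length in tokens, but the chunk_size/chunk_overlap mechanics are the same idea:

```python
def toy_split(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Fixed-width splitter: with overlap, each chunk repeats the tail of the previous one."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "abcdefghij"
print(toy_split(text, chunk_size=4, chunk_overlap=0))  # ['abcd', 'efgh', 'ij']
print(toy_split(text, chunk_size=4, chunk_overlap=2))  # ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

Overlap trades some redundancy for context continuity: a sentence cut at a chunk boundary still appears whole in one of the chunks.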
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    model_name="gpt-3.5-turbo",
    allowed_special="all",
    separators=["\n\n", "\n", "。", ","],
    chunk_size=7000,
    chunk_overlap=0,
)
docs = text_splitter.create_documents(["Your text goes here"])
print(docs)
Vectorstore
Embeddings convert text into vectors for semantic search. Common vectorstores include FAISS and Chroma. Local embedding models (e.g., text2vec‑base‑chinese) can reduce API costs.
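Under the hood, semantic retrieval is nearest-neighbor search over vectors. A toy version with bag-of-characters "embeddings" (a stand-in for a real embedding model and for FAISS, which does the same thing at scale over dense float vectors):

```python
from collections import Counter
import math

def toy_embed(text: str) -> Counter:
    """Bag-of-characters vector; a real embedding model returns dense floats."""
    return Counter(text)

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# A tiny "vectorstore": document -> vector
store = {doc: toy_embed(doc) for doc in
         ["FAISS stores vectors for fast search.",
          "Chains link LangChain components.",
          "Agents call external tools."]}

query = toy_embed("vector store")
best = max(store, key=lambda doc: cosine(store[doc], query))
print(best)  # the chunk most similar to the query
```

A real vectorstore adds approximate-nearest-neighbor indexing so this max-over-all-documents scan stays fast for millions of chunks.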
from langchain.embeddings import HuggingFaceEmbeddings
embeddings_model = HuggingFaceEmbeddings(model_name="text2vec-base-chinese", cache_folder="path/to/local/model")
embeddings = embeddings_model.embed_documents(["I love Beijing Tiananmen!", "Hello world!"])
print(embeddings)
Chains
Chains link components together. Main chain types are LLMChain, SequentialChain, RouterChain, and Document‑focused chains (Stuff, Refine, MapReduce, MapRerank). Example of an LLMChain that extracts keywords and sentiment:
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.output_parsers import ResponseSchema, StructuredOutputParser
keyword_schema = ResponseSchema(name="keyword", description="list of keywords in the review")
emotion_schema = ResponseSchema(name="emotion", description="sentiment of the review: 1 for positive, 0 for neutral, -1 for negative")
output_parser = StructuredOutputParser.from_response_schemas([keyword_schema, emotion_schema])
format_instructions = output_parser.get_format_instructions()
prompt_template_txt = '''As a senior customer-service agent, identify the keywords in the text between >>> and <<<, and whether its sentiment is positive, negative, or neutral.\n>>> {text} <<<\nRESPONSE:\n{format_instructions}'''
prompt = PromptTemplate(template=prompt_template_txt, input_variables=["text"], partial_variables={"format_instructions": format_instructions})
llm_chain = LLMChain(prompt=prompt, llm=llm)
comment = "Delivery was fast, but the packaging arrived damaged."  # example review to analyze
result = llm_chain.run(comment)
print(result)
Memory
Memory stores conversation history so that chains can maintain context across turns. Examples include ConversationSummaryMemory, ConversationBufferWindowMemory, and ConversationBufferMemory.
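Conceptually, buffer memory just accumulates the transcript and prepends it to each new prompt so the model can "remember" earlier turns. A minimal sketch of that mechanism (illustrative, not the real ConversationBufferMemory class):

```python
class ToyBufferMemory:
    """Accumulates (human, ai) turns and renders them as a prompt prefix."""
    def __init__(self):
        self.turns = []

    def save(self, user: str, ai: str) -> None:
        self.turns.append((user, ai))

    def as_prompt_prefix(self) -> str:
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)

memory = ToyBufferMemory()
memory.save("My name is tiger", "Nice to meet you, tiger!")
memory.save("1+1=?", "2")

# The next LLM call sees the whole history, so it can answer from context.
prompt = memory.as_prompt_prefix() + "\nHuman: What is my name?\nAI:"
print(prompt)
```

The window and summary variants bound this ever-growing prefix by keeping only the last k turns, or by replacing old turns with an LLM-written summary.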
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
conversation = ConversationChain(llm=llm, memory=memory, verbose=True)
print(conversation.predict(input="My name is tiger"))
print(conversation.predict(input="1+1=?"))
print(conversation.predict(input="What is my name?"))
Agents
Agents use an LLM as a reasoning engine that decides which tool to call at each step, overcoming LLM limitations such as stale knowledge and weak arithmetic. Popular open-source agent projects include AutoGPT, BabyAGI, and AgentGPT. In LangChain, an agent is built by initializing with an AgentType (e.g., ZERO_SHOT_REACT_DESCRIPTION) and a set of tools.
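The loop behind ZERO_SHOT_REACT_DESCRIPTION can be sketched without LangChain: the LLM picks a tool, the framework executes it, and the observation is fed back until the LLM produces a final answer. The "LLM" here is a scripted stand-in, and the tool names are illustrative:

```python
from datetime import date
from typing import Optional, Tuple

def calculator(expr: str) -> str:
    """Toy stand-in for the llm-math tool."""
    return str(eval(expr, {"__builtins__": {}}))

def today(_: str) -> str:
    """Toy date tool, like the custom @tool below."""
    return str(date.today())

TOOLS = {"Calculator": calculator, "Time": today}

def scripted_llm(question: str, observation: Optional[str]) -> Tuple[str, str]:
    """Stand-in for the LLM's Thought/Action step in the ReAct loop."""
    if observation is not None:
        return ("Final Answer", observation)
    if any(ch.isdigit() for ch in question):
        return ("Calculator", question)
    return ("Time", question)

def run_agent(question: str) -> str:
    action, arg = scripted_llm(question, None)
    while action != "Final Answer":
        observation = TOOLS[action](arg)   # the framework executes the chosen tool
        action, arg = scripted_llm(question, observation)
    return arg

print(run_agent("45 * 54"))  # delegated to the Calculator tool
```

The real agent replaces `scripted_llm` with an actual model prompted to emit Thought/Action/Observation steps, which is why each tool needs a good natural-language description.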
from langchain.agents import load_tools, initialize_agent, tool
from langchain.agents.agent_types import AgentType
from datetime import date
@tool
def time(text: str) -> str:
    """Returns today's date."""
    return str(date.today())

tools = load_tools(["llm-math"], llm=llm)
tools.append(time)
agent_math = initialize_agent(agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, tools=tools, llm=llm, verbose=True)
print(agent_math("Compute 45 * 54"))
print(agent_math("What is today's date?"))
Practical Applications
Examples include document summarization using a Refine chain, and Retrieval‑augmented QA with FAISS vectorstore and custom prompts.
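The refine strategy itself is a simple loop: summarize the first chunk, then fold each subsequent chunk into the running summary with one LLM call per chunk. A toy sketch in which the summarizer is a stand-in for the LLM call:

```python
def toy_summarize(text: str) -> str:
    """Stand-in for the LLM: keep only the first sentence."""
    return text.split(".")[0] + "."

def toy_refine(summary: str, new_chunk: str) -> str:
    """Stand-in for the refine prompt: merge new material into the summary."""
    return summary + " " + toy_summarize(new_chunk)

def refine_chain(chunks):
    summary = toy_summarize(chunks[0])
    for chunk in chunks[1:]:
        summary = toy_refine(summary, chunk)   # one LLM call per chunk
    return summary

chunks = ["LangChain is a framework. It has many parts.",
          "Chains link components. They are composable.",
          "Agents use tools. Tools extend LLMs."]
print(refine_chain(chunks))
```

Because each step only sees the running summary plus one new chunk, refine handles documents far longer than the model's context window, at the cost of sequential (non-parallel) LLM calls.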
# Summarize with Refine chain
loader = PlaywrightURLLoader(urls=["https://content.jr.jd.com/article/index.html?pageId=708258989"])
data = loader.load()
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(...)
docs = text_splitter.split_documents(data)
chain = load_summarize_chain(llm, chain_type="refine", question_prompt=PROMPT, refine_prompt=REFINE_PROMPT)
result = chain.run(docs)
print(result)
# Retrieval QA
loader = WebBaseLoader("https://in.m.jd.com/help/app/register_info.html")
data = loader.load()
docs = text_splitter.split_documents(data)
vectorstore = FAISS.from_documents(docs, embeddings)
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever(), chain_type_kwargs={"prompt": QA_CHAIN_PROMPT})
result = qa_chain({"query": "user registration eligibility"})
print(result["result"])
Future Directions
LangChain will continue to evolve rapidly alongside LLM advances, with two notable trends: low‑code visual orchestration tools (e.g., LangFlow) and more powerful agents that can dramatically broaden LLM application scenarios.