
An Overview of LangChain: Core Concepts, Components, and Practical Applications

This article introduces LangChain—a Python framework for building LLM‑driven applications—explains its core components (models, indexes, chains, memory, and agents), provides practical code examples for document summarization and retrieval‑augmented QA, and closes with future development directions.

JD Retail Technology

What Is LangChain?

LangChain is a framework for developing applications powered by large language models (LLMs). It can be thought of as the "Spring" of the LLM ecosystem or an open‑source version of the ChatGPT plugin system. Its two core capabilities are connecting LLMs to external data sources and enabling LLMs to interact with their environment through agents and tools.

Core Components

Models

LangChain does not provide its own LLMs; instead, it offers a unified interface to access various LLM providers, making it easy to swap underlying models or define custom ones. Two main model categories are:

LLM: Takes a text string as input and returns a text string (e.g., OpenAI's text-davinci-003).

Chat Models: Accept a list of chat messages and return chat messages (e.g., ChatGPT, Claude).
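The difference between the two categories is just the interface shape. A toy, framework‑free sketch of the two signatures (all names below are illustrative stand‑ins, not LangChain APIs):

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str      # "system" | "user" | "assistant"
    content: str

# LLM interface shape: text string in, text string out (like text-davinci-003).
def completion_model(prompt: str) -> str:
    return f"echo: {prompt}"

# Chat-model interface shape: a list of role-tagged messages in,
# one new assistant message out (like gpt-3.5-turbo).
def chat_model(messages: list) -> Message:
    last = messages[-1].content
    return Message(role="assistant", content=f"echo: {last}")

print(completion_model("hello"))
print(chat_model([Message(role="user", content="hello")]).content)
```

LangChain's unified interface lets either shape be swapped in behind the same chain code.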

Prompt construction is simplified with PromptTemplate, which enables reusable prompts.

from langchain import PromptTemplate

# Template: "As a senior editor, write a summary of the text between >>> and <<<."
prompt_template = '''作为一个资深编辑,请针对 >>> 和 <<< 中间的文本写一段摘要。
>>> {text} <<<'''
prompt = PromptTemplate(template=prompt_template, input_variables=["text"])
print(prompt.format(text="我爱北京天安门"))  # sample input: "I love Beijing's Tiananmen"

Indexes

Indexes integrate external data sources to retrieve answers. The typical workflow includes loading documents, splitting text, storing vectors, and retrieving relevant chunks.

Document Loaders: load raw documents from sources such as web pages, files, and databases.

Text Splitters: split long documents into chunks that fit within the model's context window.

Vectorstores: store embedding vectors of the chunks and support similarity search.

Retrievers: fetch the chunks most relevant to a given query.

from langchain.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load a web page as a Document, then split it into ~800-token chunks,
# preferring to break on paragraph, line, and Chinese punctuation boundaries.
loader = WebBaseLoader("https://in.m.jd.com/help/app/register_info.html")
data = loader.load()
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    model_name="gpt-3.5-turbo",
    allowed_special="all",
    separators=["\n\n", "\n", "。", ","],
    chunk_size=800,
    chunk_overlap=0)
docs = text_splitter.split_documents(data)
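The loader and splitter above cover the first two steps of the workflow. The remaining two, storing vectors and retrieving chunks, can be sketched without the framework using a toy embedding function; `embed`, `ToyVectorStore`, and the sample chunks below are all illustrative stand‑ins for a real embedding model and vectorstore:

```python
import math

def embed(text: str) -> list:
    # Toy embedding: character-frequency vector over a-z (a real system
    # would use a learned sentence-embedding model).
    vocab = "abcdefghijklmnopqrstuvwxyz"
    return [text.lower().count(c) for c in vocab]

def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class ToyVectorStore:
    def __init__(self):
        self.items = []                       # (embedding, chunk) pairs

    def add(self, chunk: str):
        self.items.append((embed(chunk), chunk))

    def similarity_search(self, query: str, k: int = 1):
        q = embed(query)
        scored = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [chunk for _, chunk in scored[:k]]

store = ToyVectorStore()
for chunk in ["registration rules", "shipping policy", "return policy"]:
    store.add(chunk)
print(store.similarity_search("how do I register?", k=1))
```

In LangChain these two roles are filled by a Vectorstore (e.g., FAISS) and a Retriever obtained from it.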

Chains

Chains link components together to simplify complex workflows. Major chain types include:

LLMChain: Combines a PromptTemplate, an LLM, and an OutputParser to produce structured outputs.

SequentialChain and SimpleSequentialChain: Execute multiple chains in a predefined order.

RouterChain: Dynamically selects the next chain based on input, using either a zero‑shot ReAct approach or embedding‑based routing.
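The sequential variants are easy to picture without the framework: each chain's output becomes the next chain's input. A toy sketch of that composition (`run_sequential` and both step functions are illustrative stand‑ins, not LangChain APIs):

```python
# Each step stands in for one chain; LangChain adds prompt formatting
# and an LLM call inside every step.
def run_sequential(steps, text: str) -> str:
    for step in steps:
        text = step(text)
    return text

summarize = lambda t: t[:20]                # stand-in for a "summarize" chain
translate = lambda t: f"[translated] {t}"   # stand-in for a "translate" chain

result = run_sequential([summarize, translate], "A long customer review ...")
print(result)
```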

from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

llm = OpenAI(temperature=0)

# Schemas: "keyword" = list of keywords in the review;
# "emotion" = sentiment, 1 positive / 0 neutral / -1 negative.
keyword_schema = ResponseSchema(name="keyword", description="评论的关键词列表")
emotion_schema = ResponseSchema(name="emotion", description="评论的情绪,正向为1,中性为0,负向为-1")
output_parser = StructuredOutputParser.from_response_schemas([keyword_schema, emotion_schema])
format_instructions = output_parser.get_format_instructions()

# Prompt: "As a senior customer-service agent, extract the keywords from the text
# between >>> and <<< and classify its sentiment as positive, negative, or neutral."
prompt_template_txt = '''作为资深客服,请针对 >>> 和 <<< 中间的文本识别其中的关键词,以及包含的情绪是正向、负向还是中性。
>>> {text} <<<
RESPONSE:
{format_instructions}'''
prompt = PromptTemplate(template=prompt_template_txt,
                        input_variables=["text"],
                        partial_variables={"format_instructions": format_instructions})
llm_chain = LLMChain(prompt=prompt, llm=llm)
comment = "物流很快,包装也很好,下次还会再买!"  # sample review: "Fast delivery, nice packaging, will buy again!"
result = llm_chain.run(comment)
data = output_parser.parse(result)
print(f"keyword={data['keyword']}, emotion={data['emotion']}")

Memory

Memory components store conversation history, enabling multi‑turn interactions. Common memory types include ConversationBufferMemory (keeps the full history), ConversationBufferWindowMemory (keeps only the last k turns), and ConversationSummaryMemory (keeps a rolling summary).

from langchain.chains import ConversationChain
from langchain.llms import OpenAI
from langchain.memory import ConversationBufferMemory

llm = OpenAI(temperature=0)
memory = ConversationBufferMemory()
conversation = ConversationChain(llm=llm, memory=memory, verbose=True)
print(conversation.predict(input="我的姓名是tiger"))   # "My name is tiger"
print(conversation.predict(input="1+1=?"))
print(conversation.predict(input="我的姓名是什么"))     # "What is my name?"
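Where the buffer memory above keeps every turn, the windowed variant bounds the prompt by keeping only the last k turns. A toy, framework‑free sketch of that windowing behavior (`WindowMemory` is an illustrative stand‑in, not a LangChain class):

```python
from collections import deque

class WindowMemory:
    def __init__(self, k: int):
        self.turns = deque(maxlen=k)   # oldest turns fall off automatically

    def save(self, human: str, ai: str):
        self.turns.append((human, ai))

    def load(self) -> list:
        return list(self.turns)

mem = WindowMemory(k=2)
mem.save("My name is tiger", "Nice to meet you!")
mem.save("1+1=?", "2")
mem.save("What is my name?", "tiger")
print(mem.load())   # only the 2 most recent turns survive
```

With k=2 the first turn has been evicted, which is exactly why a windowed memory can forget a name given early in a long conversation.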

Agents

Agents act as autonomous executors that can invoke tools to extend LLM capabilities. Typical components are the Agent itself, a set of Tools, optional ToolKits, and an Agent Executor. Agents can be initialized with different AgentType values such as ZERO_SHOT_REACT_DESCRIPTION or CHAT_CONVERSATIONAL_REACT_DESCRIPTION .

from langchain.agents import load_tools, initialize_agent, tool
from langchain.agents.agent_types import AgentType
from langchain.llms import OpenAI
from datetime import date

llm = OpenAI(temperature=0)

@tool
def time(text: str) -> str:
    """返回今天的日期。"""  # "Returns today's date." -- the docstring doubles as the tool description
    return str(date.today())

tools = load_tools(['llm-math'], llm=llm)
tools.append(time)
agent_math = initialize_agent(agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
                              tools=tools, llm=llm, verbose=True)
print(agent_math("计算45 * 54"))   # "Compute 45 * 54"
print(agent_math("今天是哪天?"))   # "What day is it today?"
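Under the hood the executor runs a ReAct‑style loop: the LLM proposes a tool and an input, the executor calls the tool, appends the observation, and repeats until the LLM emits a final answer. A toy sketch with a scripted stand‑in for the LLM (all names below are illustrative, not LangChain internals):

```python
def react_loop(llm_decide, tools, question, max_steps=5):
    scratchpad = []                                    # (action, observation) history
    for _ in range(max_steps):
        action = llm_decide(question, scratchpad)
        if action["tool"] == "final_answer":
            return action["input"]
        observation = tools[action["tool"]](action["input"])
        scratchpad.append((action, observation))
    raise RuntimeError("agent did not finish")

tools = {"calculator": lambda expr: str(eval(expr))}   # toy llm-math stand-in

def scripted_llm(question, scratchpad):
    if not scratchpad:                                 # first step: call the calculator
        return {"tool": "calculator", "input": "45 * 54"}
    return {"tool": "final_answer", "input": scratchpad[-1][1]}

print(react_loop(scripted_llm, tools, "What is 45 * 54?"))
```

A real agent replaces `scripted_llm` with an LLM call whose prompt contains the tool descriptions and the scratchpad so far.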

Practical Implementations

Document Summarization

With a refine chain, the document is loaded, split into chunks, and summarized iteratively: each chunk is folded into the summary produced so far.

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.document_loaders import PlaywrightURLLoader
from langchain.chains.summarize import load_summarize_chain
from langchain.text_splitter import RecursiveCharacterTextSplitter

llm = OpenAI(temperature=0)
loader = PlaywrightURLLoader(urls=["https://content.jr.jd.com/article/index.html?pageId=708258989"])
data = loader.load()
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    model_name="gpt-3.5-turbo",
    allowed_special="all",
    separators=["\n\n", "\n", "。", ","],
    chunk_size=7000,
    chunk_overlap=0)
# First-pass prompt: "As a senior editor, write a summary of the text between >>> and <<<."
PROMPT = PromptTemplate(
    template="""作为一个资深编辑,请针对 >>> 和 <<< 中间的文本写一段摘要。\n>>> {text} <<<""",
    input_variables=["text"])
# Refine prompt: "As a senior editor, given the existing summary {existing_answer},
# improve it using the text between >>> and <<<."
REFINE_PROMPT = PromptTemplate(
    template="""作为一个资深编辑,基于已有的一段摘要:{existing_answer},针对 >>> 和 <<< 中间的文本完善现有的摘要。\n>>> {text} <<<""",
    input_variables=["existing_answer", "text"])
chain = load_summarize_chain(llm, chain_type="refine",
                             question_prompt=PROMPT, refine_prompt=REFINE_PROMPT, verbose=False)
docs = text_splitter.split_documents(data)
result = chain.run(docs)
print(result)
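Stripped of prompts and LLM calls, the refine strategy is a simple fold over the chunks. A toy sketch (`refine_summarize` and the two stand‑in functions are illustrative, not the chain's real implementation):

```python
# Summarize the first chunk, then fold each later chunk into the running
# summary -- the same control flow load_summarize_chain runs with real prompts.
def refine_summarize(chunks, first_pass, refine):
    summary = first_pass(chunks[0])
    for chunk in chunks[1:]:
        summary = refine(summary, chunk)
    return summary

first_pass = lambda chunk: f"Summary({chunk})"          # stand-in for question_prompt
refine = lambda summary, chunk: f"{summary}+{chunk}"    # stand-in for refine_prompt

print(refine_summarize(["c1", "c2", "c3"], first_pass, refine))
```

The upside is that each LLM call only needs one chunk plus the current summary in context; the downside is that the calls are sequential and cannot be parallelized.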

Retrieval‑Augmented QA

Documents are loaded, split, embedded with a local HuggingFace model, stored in a FAISS vectorstore, and queried via a customized QA prompt.

from langchain.chains import RetrievalQA
from langchain.document_loaders import WebBaseLoader
from langchain.embeddings.huggingface import HuggingFaceEmbeddings
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

llm = OpenAI(temperature=0)
loader = WebBaseLoader("https://in.m.jd.com/help/app/register_info.html")
data = loader.load()
text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    model_name="gpt-3.5-turbo",
    allowed_special="all",
    separators=["\n\n", "\n", "。", ","],
    chunk_size=800,
    chunk_overlap=0)
docs = text_splitter.split_documents(data)
# Embed the chunks with a local Chinese sentence-embedding model and index them in FAISS.
embeddings = HuggingFaceEmbeddings(model_name="text2vec-base-chinese", cache_folder="model")
vectorstore = FAISS.from_documents(docs, embeddings)
# QA prompt: "Answer the question using the context below. If you don't know the answer,
# say so rather than making one up. Use at most three sentences, keep the answer
# concise, and always end with 'Thanks for asking!'"
template = """请使用下面提供的背景信息来回答最后的问题。 如果你不知道答案,请直接说不知道,不要试图凭空编造答案。\n回答时最多使用三个句子,保持回答尽可能简洁。 回答结束时,请一定要说\"谢谢你的提问!\"\n{context}\n问题: {question}\n有用的回答:"""
QA_CHAIN_PROMPT = PromptTemplate(input_variables=["context", "question"], template=template)
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever(),
                                       return_source_documents=True,
                                       chain_type_kwargs={"prompt": QA_CHAIN_PROMPT})
result = qa_chain({"query": "用户注册资格"})   # query: "user registration eligibility"
print(result["result"])
print(len(result["source_documents"]))

Future Directions

LangChain’s rapid development and active community (over 1,200 contributors) suggest two major growth areas: low‑code visual tools such as LangFlow that further lower the barrier to LLM app development, and more powerful agents that act as the “SQL” of large language models, dramatically expanding application scenarios.

Tags: python · LLM · LangChain · Agents · PromptTemplate · VectorStore
Written by

JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
