Understanding Retrieval-Augmented Generation (RAG) and Building a Personal Knowledge Base with ERNIE SDK and LangChain
The article explains Retrieval-Augmented Generation (RAG), its workflow, advantages, comparison with fine-tuning, and provides a step-by-step implementation using Baidu's ERNIE SDK, LangChain, and ChromaDB to build a personal knowledge base that answers queries with retrieved context.
RAG (Retrieval-Augmented Generation) is a technique introduced by Facebook AI Research in 2020, combining a pretrained retriever with a seq2seq generator to improve knowledge‑intensive NLP tasks.
The workflow consists of three steps: Retrieval (fetch relevant documents from an external knowledge source), Utilization (inject the retrieved context into the prompt), and Generation (produce an answer grounded in that context). An analogy with writing about puppies, looking up reference material before composing the text, illustrates the process.
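The three steps above can be sketched end to end in plain Python. Everything here (the toy corpus, the word-overlap retriever, the stubbed generator) is an illustrative assumption, not the ERNIE pipeline built later in the article:

```python
import re

# Toy corpus standing in for an external knowledge source (illustrative only).
KNOWLEDGE_BASE = [
    "Puppies need their first vaccinations at six to eight weeks of age.",
    "RAG retrieves supporting documents before the model generates an answer.",
    "Without grounding context, large models are prone to hallucination.",
]

def words(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Step 1, Retrieval: rank documents by naive word overlap with the query."""
    return sorted(KNOWLEDGE_BASE,
                  key=lambda doc: len(words(query) & words(doc)),
                  reverse=True)[:k]

def utilize(query: str, docs: list[str]) -> str:
    """Step 2, Utilization: inject the retrieved context into the prompt."""
    context = "\n".join(docs)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer using only the context."

def generate(prompt: str) -> str:
    """Step 3, Generation: placeholder for a real LLM call."""
    return f"[model answer grounded in a {len(prompt)}-character prompt]"

query = "How does RAG answer a question?"
print(generate(utilize(query, retrieve(query))))
```

A production system replaces the overlap scoring with dense-embedding similarity and the stub with a chat-completion call, but the retrieve/utilize/generate shape stays the same.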
Typical scenarios for RAG include QA systems, document generation and automatic summarization, intelligent assistants, information retrieval, and knowledge‑graph population.
Advantages of RAG: leveraging external knowledge, timely data updates, explainability, high customizability, security & privacy control, and reduced training cost.
Compared with fine‑tuning, RAG offers greater generality, knowledge citation, instant updates, and interpretability, while fine‑tuning may excel on specific tasks but requires more data and training.
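Since RAG retrieval (and the ChromaDB setup below) rests on embedding similarity, a minimal cosine-similarity sketch may help. The 3-dimensional vectors are made-up toy values; real ERNIE text embeddings have hundreds of dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity = dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings"; a real system obtains these from an embedding model.
doc_vectors = {
    "doc_about_vc": [0.9, 0.1, 0.0],
    "doc_about_pets": [0.0, 0.2, 0.9],
}
query_vector = [0.8, 0.2, 0.1]  # pretend embedding of a VC-related question

best = max(doc_vectors,
           key=lambda name: cosine_similarity(query_vector, doc_vectors[name]))
print(best)  # the VC document scores highest
```

Vector stores such as ChromaDB index these vectors so that the nearest-neighbor search scales beyond a handful of documents.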
Implementation example using ERNIE SDK and LangChain to build a personal knowledge base:
!pip install --upgrade erniebot

import erniebot

# Configure the ERNIE SDK for AI Studio and test the embedding API.
erniebot.api_type = "aistudio"
erniebot.access_token = "<your access token>"

response = erniebot.Embedding.create(
    model="ernie-text-embedding",
    input=["..."]
)
print(response.get_result())

!pip install chromadb

import os
import erniebot
from typing import Dict, List, Optional

import chromadb
from chromadb.api.types import Documents, EmbeddingFunction, Embeddings
def embed_query(content):
    # Note: the SDK class is erniebot.Embedding (capitalized), not erniebot.embedding.
    response = erniebot.Embedding.create(
        model="ernie-text-embedding",
        input=[content])
    result = response.get_result()
    print(result)
    return result
class ErnieEmbeddingFunction(EmbeddingFunction):
    def __call__(self, input: Documents) -> Embeddings:
        embeddings = []
        for text in input:
            result = embed_query(text)
            try:
                embedding = result[0]
                embeddings.append(embedding)
            except (IndexError, TypeError, KeyError) as e:
                print(f"Error processing text: {text}, Error: {e}")
        return embeddings
chroma_client = chromadb.PersistentClient(path="chromac")
collection = chroma_client.create_collection(name="demo", embedding_function=ErnieEmbeddingFunction())
print(collection)

import uuid

from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.document_loaders import TextLoader

# Load the lecture transcript and split it into 600-character chunks
# with a 20-character overlap.
loader = TextLoader('./AI大课逐字稿.txt', encoding='utf-8')
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=600, chunk_overlap=20)
docs = text_splitter.split_documents(documents)

docs_list = []
metadatas = []
ids = []
for item in docs:
    docs_list.append(item.page_content)
    metadatas.append({"source": "AI大课逐字稿"})
    ids.append(str(uuid.uuid4()))
collection.add(documents=docs_list, metadatas=metadatas, ids=ids)

# Query (translated): "The lecturer said there are two wrong mindsets
# when meeting VCs; what are they?"
query = "讲师说见VC有两种错误的思维方式,分别是什么"
results = collection.query(query_texts=[query], n_results=2)
content = results['documents'][0]  # top-2 chunks for the first (and only) query
# Prompt (translated): "User question: {query} / [Knowledge]: {content} /
# Answer the user's question based on the knowledge in [Knowledge]."
prompt = f"""用户问题:{query}
【知识点】:{content}
根据【知识点】里的知识点回答用户问题"""
response = erniebot.ChatCompletion.create(
    model="ernie-4.0",
    messages=[{"role": "user", "content": prompt}]
)
print(response.get_result())

# Wrap the retrieve-and-answer flow in a reusable function.
def main(query):
    results = collection.query(query_texts=[query], n_results=2)
    content = results['documents'][0]
    prompt = f"""用户问题:{query}
【知识点】:{content}
根据【知识点】里的知识点回答用户问题"""
    response = erniebot.ChatCompletion.create(
        model="ernie-4.0",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.get_result()
query = input("请输入您要查询的问题:")  # prompt (translated): "Please enter your question:"
print(main(query))

The full code is available at https://aistudio.baidu.com/projectdetail/7431640 .
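For intuition about the chunk_size and chunk_overlap parameters used in the splitting step above, here is a toy character-level splitter. It is a simplified stand-in written for this sketch, not LangChain's actual CharacterTextSplitter (which splits on separators first):

```python
def split_text(text: str, chunk_size: int = 600, chunk_overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks, where each chunk repeats the last
    `chunk_overlap` characters of the previous one (toy splitter)."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    step = chunk_size - chunk_overlap  # how far the window advances each time
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks

chunks = split_text("abcdefghij" * 100, chunk_size=100, chunk_overlap=10)
print(len(chunks), len(chunks[0]))
```

The overlap means a sentence cut at a chunk boundary still appears whole in one of the two neighboring chunks, which is why retrieval quality usually improves with a small nonzero chunk_overlap.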
Baidu Geek Talk