Building a Retrieval‑Augmented Generation QA Bot to Keep LLMs Up‑to‑Date

This article explains how to create a RAG‑based intelligent QA system that fetches the latest documentation (e.g., PlantUML) before querying Gemini, detailing knowledge‑base creation, embedding, vector store management, LangChain integration, and deployment tips.

Ops Development & AI Practice

Large language models often suffer from outdated knowledge, which can lead to incorrect or deprecated outputs. To address this, the author built a Retrieval‑Augmented Generation (RAG) system that automatically pulls the newest official documents and feeds them to Gemini before answering queries.

RAG Concept Overview

The RAG workflow mimics giving an intern the latest reports before writing a paper: a retrieval step extracts relevant passages from a knowledge base (PDFs, web pages, etc.), then the LLM generates answers using both the user question and the retrieved context.

Implementation Details

1. Knowledge Base Creation and Freshness

PDF source: The system starts with a PDF (e.g., the PlantUML reference).

Automatic update detection: get_pdf_hash computes the SHA-256 hash of the PDF. At startup, load_or_create_vectorstore compares the hash stored in chroma_db with the current hash.

If the hashes match, the existing vector store is loaded instantly; otherwise the old store is deleted, the PDF is re-loaded with PyPDFLoader, split into chunks via RecursiveCharacterTextSplitter, and re-indexed.
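The freshness check above can be sketched with the standard library alone. get_pdf_hash and the stored-hash comparison follow the article; the hash file location and the rebuild_index callback (standing in for the PyPDFLoader → splitter → Chroma pipeline) are illustrative assumptions:

```python
import hashlib
from pathlib import Path

def get_pdf_hash(pdf_path: str) -> str:
    """Compute the SHA-256 hash of the PDF file's bytes."""
    return hashlib.sha256(Path(pdf_path).read_bytes()).hexdigest()

def load_or_create_vectorstore(pdf_path: str, store_dir: str, rebuild_index):
    """Reuse the existing index if the PDF is unchanged; otherwise rebuild.

    `rebuild_index` stands in for the real load/split/embed pipeline
    described in the article (an assumption for this sketch).
    """
    store = Path(store_dir)
    hash_file = store / "pdf.sha256"      # illustrative location for the stored hash
    current = get_pdf_hash(pdf_path)

    if hash_file.exists() and hash_file.read_text() == current:
        return "loaded-existing"          # fast path: hashes match, reuse index
    # Hashes differ (or first run): drop the stale index and re-index.
    store.mkdir(parents=True, exist_ok=True)
    rebuild_index(pdf_path, store_dir)
    hash_file.write_text(current)
    return "rebuilt"
```

Because the decision rests only on the file hash, an unchanged PDF never pays the re-indexing cost at startup.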

2. Embedding Generation

Text chunks are transformed into dense vectors using the HuggingFace model sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2. The setup_embeddings function initializes the model and returns an embedding object, which stores vectors in a Chroma vector database.
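Under the hood, retrieval ranks chunks by vector similarity, typically cosine similarity. A toy sketch with hand-made three-dimensional vectors (the real MiniLM embeddings are much higher-dimensional; the chunk ids are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, indexed_chunks, k=2):
    """Return the k chunk ids most similar to the query vector,
    mirroring what vectorstore.as_retriever(search_kwargs={"k": 2}) does."""
    scored = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [chunk_id for chunk_id, _ in scored[:k]]
```

A production vector store like Chroma adds persistence and approximate-nearest-neighbor search on top of this basic idea.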

3. Retrieval‑Augmented Generation Chain

The LangChain RetrievalQA chain connects the vector store and the LLM. The create_qa_chain function builds a retriever (vectorstore.as_retriever(search_kwargs={"k": 2})) and configures the chain with chain_type="stuff" and return_source_documents=True for transparency.
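A sketch of create_qa_chain under these assumptions: langchain is installed and a Gemini LLM object has already been configured. The format_sources helper is an illustrative addition for printing the citations that return_source_documents=True makes available:

```python
def create_qa_chain(llm, vectorstore):
    """Wire the retriever and LLM into a RetrievalQA chain (sketch)."""
    from langchain.chains import RetrievalQA  # lazy import: requires langchain

    retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",            # stuff all retrieved chunks into one prompt
        retriever=retriever,
        return_source_documents=True,  # keep sources for citation
    )

def format_sources(source_documents):
    """Render retrieved chunks as citation lines (illustrative helper)."""
    lines = []
    for doc in source_documents:
        page = doc.metadata.get("page", "?")
        snippet = doc.page_content[:60].replace("\n", " ")
        lines.append(f"- page {page}: {snippet}...")
    return "\n".join(lines)
```

With chain_type="stuff", all k retrieved chunks are concatenated into a single prompt, which is simple and works well when k is small (here k=2).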

4. End‑to‑End Execution

The main function orchestrates the workflow: initialize embeddings, load/create the vector store, set up the LLM, create the QA chain, and finally ask a sample question (e.g., "How to draw a JSON data diagram?"). The system retrieves relevant PDF fragments, sends them plus the query to Gemini, and returns an answer with source citations.
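The driver logic can be reduced to a small wrapper. ask() below is a hypothetical helper that formats the chain's result dict; the "result" and "source_documents" keys match RetrievalQA's output, and the chain object is injected so any compatible implementation works:

```python
def ask(qa_chain, question: str) -> str:
    """Invoke the chain and append source citations to the answer.

    `qa_chain` is anything exposing .invoke({"query": ...}) and returning
    {"result": ..., "source_documents": [...]}, as RetrievalQA does.
    """
    out = qa_chain.invoke({"query": question})
    sources = ", ".join(
        str(doc.metadata.get("source", "?")) for doc in out["source_documents"]
    )
    return f"{out['result']}\n\nSources: {sources}"
```

In the full script, main would build the real chain (embeddings → vector store → LLM → QA chain) and then call ask(qa_chain, "How to draw a JSON data diagram?").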

Practical Results

Running the script yields answers that incorporate the latest PlantUML syntax, eliminating the problem of the model using deprecated commands.

Future Extensions

Swap PDF_FILE_PATH for any other PDF (Ethereum whitepaper, API docs, personal notes).

Support additional loaders for .txt, .md, or web crawling.

Wrap the system as a FastAPI/Flask service for broader consumption.

Explore advanced retrievers like Parent Document Retriever for large corpora.

Conclusion

RAG equips LLMs with a continuously updatable external knowledge source, improving answer accuracy and traceability while keeping implementation complexity low.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

LLM · LangChain · RAG · Embedding · Gemini · AI assistant · Vector Store
Written by

Ops Development & AI Practice

DevSecOps engineer sharing experiences and insights on AI, Web3, and Claude code development. Aims to help solve technical challenges, improve development efficiency, and grow through community interaction. Feel free to comment and discuss.
