Boost Text2SQL Accuracy with Retrieval‑Augmented Generation and LangChain
This article explains how Retrieval‑Augmented Generation (RAG) can improve LLM‑based Text2SQL conversion, covering RAG fundamentals, LangChain implementation steps, practical enhancements for SQL agents, and future directions for integrating domain knowledge.
1. Introduction
In the previous article, the author introduced an LLM agent for Text2SQL and mentioned Retrieval‑Augmented Generation (RAG) as another way to improve large‑model applications. This article explores four aspects of using RAG to enhance LLM‑based Text2SQL conversion.
2. RAG Overview
2.1 Concept
RAG (Retrieval‑Augmented Generation) first retrieves relevant information from a large document collection and then feeds it to the LLM, improving answer quality and reducing hallucinations. It allows developers to attach external knowledge bases without retraining the whole model.
2.2 Problems RAG Solves for LLMs
Model hallucination due to lack of factual grounding.
Knowledge cutoff and inability to access real‑time or private data.
Data‑security concerns when using cloud LLMs.
2.3 Architecture
RAG consists of a retrieval step (vector database) and a generation step (LLM with an augmented prompt). The workflow includes data preparation (extraction, splitting, embedding, storage) and the application phase (query, retrieval, prompt injection, answer generation).
2.3.1 Data Preparation
Offline process: data extraction, text splitting, embedding, and storing vectors in a database such as FAISS, Chroma, Milvus, Elasticsearch, etc.
Data extraction – load various formats, filter, compress, and extract metadata.
Text splitting – consider token limits and semantic completeness; methods include sentence splitting, fixed‑length splitting with overlap, etc.
Embedding – convert text to vectors using models like OpenAI embeddings; fine‑tune for specialized vocabularies if needed.
Data storage – write vectors to a vector store (FAISS, Chroma, Milvus, Elasticsearch, …).
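As a concrete illustration of fixed‑length splitting with overlap, here is a minimal, framework‑free sketch (the chunk size and overlap values are purely illustrative):

```python
def split_fixed_length(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into fixed-length chunks; each chunk overlaps the previous
    one by `overlap` characters so that content cut at a boundary still
    appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_fixed_length("RAG retrieves relevant context before generation. " * 3,
                            chunk_size=40, overlap=8)
print(len(chunks))  # → 5
```

A production pipeline would typically also try to respect sentence boundaries and token limits, which is what splitters such as RecursiveCharacterTextSplitter handle.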
2.3.2 Application Phase
When a user asks a question, the system retrieves the most relevant chunks, injects them into a prompt, and lets the LLM generate an answer.
Retrieval – similarity search or full‑text search in the vector store.
Prompt injection – combine task description, retrieved context, and user query; optionally add extra instructions.
Generation – LLM produces concise, context‑aware output.
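The retrieval-and-injection steps above can be sketched without any framework; the toy three‑dimensional "embeddings" below are hypothetical stand‑ins for a real embedding model and vector store:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy corpus: (chunk text, hypothetical precomputed embedding) pairs.
corpus = [
    ("FAISS is a library for vector similarity search.", [0.9, 0.1, 0.0]),
    ("Chroma is an open-source embedding database.", [0.8, 0.2, 0.1]),
    ("SQL GROUP BY aggregates rows sharing a value.", [0.0, 0.1, 0.9]),
]

def retrieve(query_vec, k=2):
    """Return the k chunks most similar to the query vector."""
    ranked = sorted(corpus, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_prompt(question, query_vec):
    """Inject retrieved chunks into the prompt ahead of the user question."""
    context = "\n".join(retrieve(query_vec))
    return (f"Answer using only the context below.\n"
            f"Context:\n{context}\n"
            f"Question: {question}")

print(build_prompt("How does GROUP BY work?", [0.0, 0.2, 0.8]))
```

In practice, the retrieve step is a top‑k similarity search in a store such as FAISS, Chroma, or Milvus, and the assembled prompt is sent to the LLM for generation.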
3. Implementing RAG with LangChain
3.1 LangChain RAG Modules
LangChain provides components to build a RAG pipeline, demonstrated with a simple web‑text QA app.
3.2 Step‑by‑Step Code
Load documents with WebBaseLoader, split them using RecursiveCharacterTextSplitter, embed with OpenAIEmbeddings, store in Chroma, retrieve with as_retriever, create a prompt template, and run the chain with ChatOpenAI. The full code snippets are shown below.
```python
import bs4
from langchain_community.document_loaders import WebBaseLoader
# ... (rest of code) ...
```

4. Using RAG to Optimize Text2SQL
4.1 Existing Issues
The Text2SQL agent from the previous article failed to capture the requirement of finding "customers who placed orders in two consecutive months".
4.2 RAG Solution
Introduce a few‑shot example database stored in a vector store, retrieve similar examples, and let the agent incorporate them via a custom tool.
```python
few_shots = {
    "How many employees are there?": "SELECT COUNT(*) FROM employee;",
    # "Which customers in the orders table placed orders in two consecutive months?"
    "在订单表中,连续两个月都下订单的客户有哪些?": "SELECT ...",
    # ... other examples ...
}
```

Build a FAISS retriever over these examples, create a retriever_tool, and add it to the SQL agent.
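Conceptually, the retriever just returns the stored question most similar to the user's, and the agent adapts its SQL accordingly. Here is a framework‑free sketch, where word‑overlap similarity stands in for the article's FAISS embedding search and the English keys mirror the few‑shot dictionary above:

```python
few_shots = {
    "How many employees are there?": "SELECT COUNT(*) FROM employee;",
    "Which customers placed orders in two consecutive months?": "SELECT ...",  # SQL elided, as in the article
}

def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two questions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def most_similar_example(question: str) -> tuple[str, str]:
    """Return the (stored question, SQL) pair closest to the user question."""
    best = max(few_shots, key=lambda q: jaccard(question, q))
    return best, few_shots[best]

q, sql = most_similar_example("How many employees does the company have?")
print(sql)  # → SELECT COUNT(*) FROM employee;
```

Embedding-based retrieval generalizes far better than word overlap, which is why the article stores the examples in a vector store rather than matching strings directly.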
```python
tool_description = """This tool will help you understand similar examples to adapt them to the user question in Chinese."""
retriever_tool = create_retriever_tool(
    retriever,
    name="sql_get_similar_examples",
    description=tool_description,
)
custom_tool_list = [retriever_tool]
# ... create_sql_agent with extra_tools and custom suffix ...
```

Running the agent with the query "连续两个月都有下订单的客户有哪些?" ("Which customers placed orders in two consecutive months?") returns the correct SQL and results, as shown in the screenshots.
4.3 Additional Queries
Changing the wording of the question still yields accurate SQL thanks to the few‑shot retrieval.
5. Future Plans
The article concludes that RAG effectively mitigates LLM hallucinations and knowledge gaps for Text2SQL tasks. Future work includes enriching metadata, integrating domain‑specific knowledge bases, and further automating prompt engineering.
Data Thinking Notes
Sharing insights on data architecture, governance, and middle platforms, exploring AI in data, and linking data with business scenarios.