Building a RAG Application with Baidu Vector Database and Qianfan Embedding
This article is a comprehensive tutorial on building a Retrieval-Augmented Generation (RAG) application with Baidu's Vector Database and the Qianfan embedding service. It covers configuring credentials, creating a document database and vector table, loading and chunking PDFs, generating and storing embeddings, and performing scalar, vector, and hybrid similarity searches, leaving the results ready for answer generation with the Wenxin LLM.
RAG Introduction
RAG is an advanced natural language processing method that combines information retrieval with text generation to improve the performance of question-answering systems and chatbots. The workflow consists of:
Document Loading: load documents from various sources as the knowledge base.
Document Splitting: divide documents into smaller chunks for better retrieval accuracy.
Embedding Generation: create vector representations that capture semantic information.
Writing to the Vector Database: store embeddings for efficient similarity search.
Query Generation: generate relevant queries from user input.
Document Retrieval: retrieve relevant documents using similarity search.
Context Integration: combine retrieved content with the original query.
Answer Generation: produce the final response using the augmented context.
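The retrieval side of this workflow can be sketched end to end in a few lines of pure Python. The toy bag-of-characters embedding below is only a stand-in for the Qianfan Embedding API (which returns a 384-dimension semantic vector), and the list of (chunk, vector) pairs stands in for the vector database:

```python
import math

def toy_embed(text, dim=16):
    """Toy bag-of-characters embedding, normalized to unit length.
    A stand-in for the Qianfan Embedding API's semantic vectors."""
    vec = [0.0] * dim
    for ch in text.lower():
        vec[ord(ch) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    # Vectors are pre-normalized, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# "Writing to the vector database": here, just a list of (chunk, vector) pairs.
chunks = ["Baidu Vector Database stores embeddings.",
          "Qianfan provides embedding and LLM services.",
          "PDF documents are split into chunks."]
index = [(c, toy_embed(c)) for c in chunks]

# "Document retrieval": rank stored chunks by similarity to the query vector.
query_vec = toy_embed("Which service generates embeddings?")
best = max(index, key=lambda item: cosine(query_vec, item[1]))
print(best[0])
```

In the real application, the retrieved chunk would then be concatenated with the user's question (context integration) and sent to the LLM for answer generation.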
Environment Setup
Vector Database Environment: Create a Baidu Vector Database instance through the console, ensuring the region and availability zone match the BCC VPC. Obtain the access address, account (root), and API key from the instance details page.
Qianfan Embedding Model: Enable the paid service for Qianfan Embedding model through the billing management console. Retrieve the Access Key and Secret Key from the personal center's security authentication settings.
Client Environment: Set up a BCC compute instance (c5 2c4g) with CentOS. Install Python 3.9 and required packages including langchain, pymochow (Baidu vector database SDK), qianfan (Baidu AI platform SDK), and pdfplumber for PDF document loading.
Implementation Code
The tutorial provides complete Python code for:
Configuration Setup: Create config.py with credentials and endpoint configuration
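A config.py along these lines collects the credentials gathered during environment setup. All values below are placeholders, and the variable names are this sketch's own convention, not something required by the pymochow or qianfan SDKs:

```python
# config.py -- credentials and endpoints for the tutorial's services.
# All values are placeholders; replace them with your own.

# Baidu Vector Database instance (from the instance details page)
DB_ENDPOINT = "http://your-mochow-instance-address"  # access address
DB_ACCOUNT = "root"                                  # default account
DB_API_KEY = "your-vectordb-api-key"

# Qianfan platform (from the security authentication settings)
QIANFAN_ACCESS_KEY = "your-access-key"
QIANFAN_SECRET_KEY = "your-secret-key"
```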
Database Creation: Create a "document" database using pymochow client
Table Creation: Create a "chunks" table with fields for id (UINT64, primary key), source (STRING), author (STRING), and vector (FLOAT_VECTOR, 384 dimensions). Configure HNSW vector index with L2 metric type.
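The table layout described above can be written down as plain data. This mirrors the fields and index settings named in the tutorial; it is a description of the schema, not the pymochow call syntax:

```python
# Schema of the "chunks" table as described in the tutorial.
# A plain-data description, not the pymochow SDK's own schema objects.
CHUNKS_TABLE_SCHEMA = {
    "database": "document",
    "table": "chunks",
    "fields": [
        {"name": "id",     "type": "UINT64",       "primary_key": True},
        {"name": "source", "type": "STRING"},
        {"name": "author", "type": "STRING"},
        {"name": "vector", "type": "FLOAT_VECTOR", "dimension": 384},
    ],
    "indexes": [
        # HNSW graph index over the vector field, using L2 distance.
        {"field": "vector", "index_type": "HNSW", "metric_type": "L2"},
    ],
}
```

Note that the vector dimension (384) must match the output size of the embedding model, and the index's metric type determines how similarity is scored at query time.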
Data Loading and Processing: Load PDF documents using PDFPlumberLoader, split text into chunks using RecursiveCharacterTextSplitter (chunk_size=384, chunk_overlap=0), generate embeddings using Qianfan Embedding API, and write vectorized data to the database.
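The splitting step can be illustrated with a simplified fixed-size splitter. langchain's RecursiveCharacterTextSplitter additionally tries to break on natural separators (paragraphs, sentences); the sketch below only reproduces the size/overlap behavior, with chunk_size=384 and chunk_overlap=0 as in the tutorial:

```python
def split_text(text, chunk_size=384, chunk_overlap=0):
    """Greedy fixed-size splitter: a simplified stand-in for langchain's
    RecursiveCharacterTextSplitter, which also respects separators."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Stand-in for text extracted from a PDF by pdfplumber.
document = "Some extracted PDF text. " * 100
chunks = split_text(document)
print(len(chunks), max(len(c) for c in chunks))
```

Each resulting chunk would then be sent to the Qianfan Embedding API, and the (id, source, author, vector) rows written to the "chunks" table.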
Document Retrieval: Demonstrate three retrieval methods: Scalar-based retrieval (query by primary key), Vector-based retrieval (ANN search using HNSW algorithm), and Hybrid retrieval (combining vector similarity with scalar filters).
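The three retrieval modes can be sketched over an in-memory list of rows. Brute-force L2 distance stands in for the HNSW ANN search; the field names match the tutorial's table, but the data and helper functions are this sketch's own:

```python
import math

# In-memory stand-in for the "chunks" table; real queries go through pymochow.
rows = [
    {"id": 1, "source": "a.pdf", "author": "zhang", "vector": [0.1, 0.2, 0.3]},
    {"id": 2, "source": "b.pdf", "author": "li",    "vector": [0.9, 0.8, 0.7]},
    {"id": 3, "source": "a.pdf", "author": "zhang", "vector": [0.2, 0.2, 0.2]},
]

def l2(a, b):
    """L2 (Euclidean) distance, the metric configured on the HNSW index."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# 1) Scalar-based retrieval: look up a row by primary key.
by_id = next(r for r in rows if r["id"] == 2)

# 2) Vector-based retrieval: nearest neighbor by L2 distance
#    (exhaustive here; HNSW returns an approximate answer efficiently).
query = [0.1, 0.2, 0.25]
nearest = min(rows, key=lambda r: l2(r["vector"], query))

# 3) Hybrid retrieval: apply the scalar filter, then rank by vector distance.
filtered = [r for r in rows if r["author"] == "zhang"]
hybrid = min(filtered, key=lambda r: l2(r["vector"], query))

print(by_id["source"], nearest["id"], hybrid["id"])
```

Hybrid retrieval is what makes metadata fields like source and author worth storing alongside the vectors: they let a query narrow the candidate set before (or while) similarity ranking is applied.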
The code examples include detailed implementations for connecting to Baidu Vector Database, creating tables with appropriate schemas, loading and processing PDF documents, generating embeddings, and performing various types of similarity searches. The tutorial concludes by mentioning that context integration and answer generation can be implemented using Baidu's Wenxin large language models.
Sohu Tech Products
A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.