How to Deploy and Query JD’s Open‑Source Vearch Vector Database for LLM Retrieval
This article walks through the practical use of JD’s self‑developed Vearch vector database—covering cluster creation, space setup, data insertion, and both text and vector search—illustrating how it integrates with LangChain and OpenAI embeddings to enable retrieval‑augmented generation for large language models.
Background
Vector databases have become essential infrastructure for large‑model (LLM) applications such as Retrieval‑Augmented Generation (RAG). Vearch, an open‑source vector store originally built for deep‑learning workloads, offers distributed storage and similarity search capabilities and is already integrated into the LangChain framework.
Project Overview
The author’s team needed a knowledge base for a test‑case generation project. They chose Vearch because it provides a ready‑to‑use managed cluster and GPU‑accelerated vector computation, sparing the team from provisioning and managing GPU resources themselves.
Cluster Creation
Vearch consists of three components: Master (metadata management), Router (request routing and result merging), and PS (vector storage and retrieval). A cluster is provisioned through the internal console, yielding a Master address and a Router address. The Master handles schema operations, while the Router handles data insertion, deletion, and search.
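The division of labor matters in practice: schema requests go to the Master address, while data requests go to the Router address. The sketch below makes that routing rule explicit; the addresses and the `endpoint_for` helper are hypothetical placeholders, not part of the Vearch SDK.

```python
# Hypothetical helper illustrating which address each operation targets.
# MASTER_ADDR and ROUTER_ADDR stand in for the addresses returned by the
# console; the routing rule follows the division of labor described above.
MASTER_ADDR = "http://master_server"   # schema (space) operations
ROUTER_ADDR = "http://router_server"   # insert / delete / search

def endpoint_for(operation: str) -> str:
    """Return the base address that should receive a given operation."""
    schema_ops = {"create_space", "drop_space", "describe_space"}
    data_ops = {"insert", "delete", "search"}
    if operation in schema_ops:
        return MASTER_ADDR
    if operation in data_ops:
        return ROUTER_ADDR
    raise ValueError(f"unknown operation: {operation}")
```

For example, `endpoint_for("create_space")` resolves to the Master address, while `endpoint_for("search")` resolves to the Router address.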
Importing the SDK
Initially the project used the older langchain_community.vectorstores.vearch.Vearch SDK, which only supports fixed fields. The team switched to the latest GitHub SDK by copying its source file into the project and importing it with:
from ..(path)/Vearch_file import Vearch
Space (Table) Creation
A space named delta_llm_embedding is created via a curl -XPUT request to the Master address. The JSON payload defines three partitions, three replicas, the HNSW index with InnerProduct metric, and two fields: text (string) and text_embedding (vector, dimension 1536). The dimension matches the output of OpenAI’s text‑embedding‑ada‑002 model.
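Before issuing the request, the same payload can be assembled as a plain Python dict, which makes it easy to validate or parameterize; `build_space_schema` is a hypothetical helper that simply mirrors the curl request that follows, not an SDK function.

```python
import json

def build_space_schema(name: str, dim: int = 1536) -> dict:
    """Build the space-definition payload described above as a dict."""
    return {
        "name": name,
        "partition_num": 3,
        "replica_num": 3,
        "engine": {
            "index_size": 1,
            "metric_type": "InnerProduct",
            "retrieval_type": "HNSW",
            "retrieval_param": {"nlinks": 32, "efSearch": 64, "efConstruction": 160},
        },
        "properties": {
            "text": {"type": "string", "index": True},
            "text_embedding": {
                "dimension": dim,  # must match the embedding model's output size
                "type": "vector",
                "store_param": {"cache_size": 2048, "compress": {"rate": 16}},
            },
        },
    }

payload = json.dumps(build_space_schema("delta_llm_embedding"))
```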
curl -XPUT -H "content-type: application/json" -d'{
    "name": "delta_llm_embedding",
    "partition_num": 3,
    "replica_num": 3,
    "engine": {
        "index_size": 1,
        "metric_type": "InnerProduct",
        "retrieval_type": "HNSW",
        "retrieval_param": {"nlinks": 32, "efSearch": 64, "efConstruction": 160}
    },
    "properties": {
        "text": {"type": "string", "index": true},
        "text_embedding": {
            "dimension": 1536,
            "type": "vector",
            "store_param": {"cache_size": 2048, "compress": {"rate": 16}}
        }
    }
}' http://master_server/space/db/_create
Data Insertion
Data can be inserted either by using the SDK’s vearch_cluster.from_documents helper (which creates the space automatically) or by manually constructing the payload. The SDK expects the field names text and text_embedding, so mismatched names cause errors. Example of manual insertion using the SDK’s loop:
for text, metadata, embed in zip(texts, metadatas, embeddings):
    profiles = {}
    profiles["text"] = text
    for f in meta_field_list:
        profiles[f] = metadata[f]
    embed_np = np.array(embed)
    profiles["text_embedding"] = {"feature": (embed_np / np.linalg.norm(embed_np)).tolist()}
    self.vearch.insert_one(self.using_db_name, self.using_table_name, profiles)
When the insertion succeeds, Vearch returns a JSON response containing _id, which can later be used for retrieval.
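The per-document payload built in the loop above can be isolated into a standalone function for testing. The sketch below uses pure Python instead of NumPy so the normalization is easy to verify; `build_profile` and `l2_normalize` are hypothetical names, not SDK functions.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def build_profile(text, metadata, embedding, meta_fields):
    """Assemble one insertion payload: text, metadata, normalized vector."""
    profile = {"text": text}
    for f in meta_fields:
        profile[f] = metadata[f]
    # Vearch expects the vector under a "feature" key.
    profile["text_embedding"] = {"feature": l2_normalize(embedding)}
    return profile

p = build_profile("hello", {"source": "doc1"}, [3.0, 4.0], ["source"])
# The [3, 4] vector has norm 5, so its normalized form is [0.6, 0.8].
```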
Retrieval
Two search modes are supported:
Text search – uses similarity_search, which first embeds the query with the same embedding model and then calls similarity_search_by_vector.
Vector search – directly supplies a pre‑computed embedding to similarity_search_by_vector.
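Both modes ultimately score candidates with the space's InnerProduct metric, and because the embeddings were L2-normalized at insert time, the inner product of two stored vectors equals the cosine similarity of the originals. A standalone pure-Python check of that equivalence (not SDK code):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity of two raw (unnormalized) vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    """L2-normalize, as done before insertion into Vearch."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
an, bn = normalize(a), normalize(b)
# Inner product of normalized vectors equals cosine of the originals.
assert abs(dot(an, bn) - cosine(a, b)) < 1e-12
```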
Both methods score results with the InnerProduct metric on the stored vectors, which is equivalent to cosine similarity here because the embeddings are L2‑normalized at insert time; higher dimensionality increases the cost of each comparison. Example code for text search:
question = "What marketing strategies are included?"
results = vearch_cluster.similarity_search(query=question, k=1)
Example code for vector search:
question_embedding = [0.0403, -0.0073, 0.0265, ...]
results = vearch_cluster.similarity_search_by_vector(query=question_embedding, k=1)
Summary
Vearch provides a functional, open‑source solution for vector storage and similarity search that integrates smoothly with LangChain and OpenAI embeddings. It meets the basic requirements of the project with acceptable performance and a low learning curve, though the API‑centric workflow can feel less convenient for users accustomed to traditional relational databases. The experience also shows that Vearch is already being used in other business scenarios such as recommendation and deduplication, indicating its broader applicability.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
