How to Deploy and Query JD’s Open‑Source Vearch Vector Database for LLM Retrieval
This article walks through the practical use of JD’s self‑developed Vearch vector database—covering cluster creation, space setup, data insertion, and both text and vector search—illustrating how it integrates with LangChain and OpenAI embeddings to enable retrieval‑augmented generation for large language models.
Background
Vector databases have become essential infrastructure for large‑model (LLM) applications such as Retrieval‑Augmented Generation (RAG). Vearch, an open‑source vector store originally built for deep‑learning workloads, offers distributed storage and similarity search capabilities and is already integrated into the LangChain framework.
Project Overview
The author’s team needed a knowledge base for a test‑case generation project. They chose Vearch because it provides a ready‑to‑use managed cluster and GPU‑accelerated vector computation, sparing the team from provisioning and managing GPU resources themselves.
Cluster Creation
Vearch consists of three components: Master (metadata management), Router (request routing and result merging), and PS (vector storage and retrieval). A cluster is provisioned through the internal console, yielding a Master address and a Router address. The Master handles schema operations, while the Router handles data insertion, deletion, and search.
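The division of labor matters in practice: schema requests go to the Master address, while data requests go to the Router address. The sketch below makes that routing rule explicit; the addresses and the `endpoint_for` helper are hypothetical placeholders, not part of the Vearch SDK.

```python
# Hypothetical helper illustrating which address each operation targets.
# MASTER_ADDR and ROUTER_ADDR stand in for the addresses returned by the
# console; the routing rule follows the division of labor described above.
MASTER_ADDR = "http://master_server"   # schema (space) operations
ROUTER_ADDR = "http://router_server"   # insert / delete / search

def endpoint_for(operation: str) -> str:
    """Return the base address that should receive a given operation."""
    schema_ops = {"create_space", "drop_space", "describe_space"}
    data_ops = {"insert", "delete", "search"}
    if operation in schema_ops:
        return MASTER_ADDR
    if operation in data_ops:
        return ROUTER_ADDR
    raise ValueError(f"unknown operation: {operation}")
```

For example, `endpoint_for("create_space")` resolves to the Master address, while `endpoint_for("search")` resolves to the Router address.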
Importing the SDK
Initially the project used the older langchain_community.vectorstores.vearch.Vearch SDK, which only supports fixed fields. The team switched to the latest GitHub SDK by copying its source file into the project and importing it with:
from ..(path)/Vearch_file import Vearch
Space (Table) Creation
A space named delta_llm_embedding is created via a curl -XPUT request to the Master address. The JSON payload defines three partitions, three replicas, the HNSW index with InnerProduct metric, and two fields: text (string) and text_embedding (vector, dimension 1536). The dimension matches the output of OpenAI’s text‑embedding‑ada‑002 model.
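Before issuing the request, the same payload can be assembled as a plain Python dict, which makes it easy to validate or parameterize; `build_space_schema` is a hypothetical helper that simply mirrors the curl request that follows, not an SDK function.

```python
import json

def build_space_schema(name: str, dim: int = 1536) -> dict:
    """Build the space-definition payload described above as a dict."""
    return {
        "name": name,
        "partition_num": 3,
        "replica_num": 3,
        "engine": {
            "index_size": 1,
            "metric_type": "InnerProduct",
            "retrieval_type": "HNSW",
            "retrieval_param": {"nlinks": 32, "efSearch": 64, "efConstruction": 160},
        },
        "properties": {
            "text": {"type": "string", "index": True},
            "text_embedding": {
                "dimension": dim,  # must match the embedding model's output size
                "type": "vector",
                "store_param": {"cache_size": 2048, "compress": {"rate": 16}},
            },
        },
    }

payload = json.dumps(build_space_schema("delta_llm_embedding"))
```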
curl -XPUT -H "content-type: application/json" -d'{
    "name": "delta_llm_embedding",
    "partition_num": 3,
    "replica_num": 3,
    "engine": {
        "index_size": 1,
        "metric_type": "InnerProduct",
        "retrieval_type": "HNSW",
        "retrieval_param": {"nlinks": 32, "efSearch": 64, "efConstruction": 160}
    },
    "properties": {
        "text": {"type": "string", "index": true},
        "text_embedding": {
            "dimension": 1536,
            "type": "vector",
            "store_param": {"cache_size": 2048, "compress": {"rate": 16}}
        }
    }
}' http://master_server/space/db/_create
Data Insertion
Data can be inserted either by using the SDK’s vearch_cluster.from_documents helper (which creates the space automatically) or by manually constructing the payload. The SDK expects the field names text and text_embedding, so mismatched names cause errors. Example of manual insertion using the SDK’s loop:
for text, metadata, embed in zip(texts, metadatas, embeddings):
    profiles = {}
    profiles["text"] = text
    for f in meta_field_list:
        profiles[f] = metadata[f]
    embed_np = np.array(embed)
    profiles["text_embedding"] = {"feature": (embed_np / np.linalg.norm(embed_np)).tolist()}
    self.vearch.insert_one(self.using_db_name, self.using_table_name, profiles)
When the insertion succeeds, Vearch returns a JSON response containing _id, which can later be used for retrieval.
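The per-document payload built in the loop above can be isolated into a standalone function for testing. The sketch below uses pure Python instead of NumPy so the normalization is easy to verify; `build_profile` and `l2_normalize` are hypothetical names, not SDK functions.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (L2 norm of 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def build_profile(text, metadata, embedding, meta_fields):
    """Assemble one insertion payload: text, metadata, normalized vector."""
    profile = {"text": text}
    for f in meta_fields:
        profile[f] = metadata[f]
    # Vearch expects the vector under a "feature" key.
    profile["text_embedding"] = {"feature": l2_normalize(embedding)}
    return profile

p = build_profile("hello", {"source": "doc1"}, [3.0, 4.0], ["source"])
# The [3, 4] vector has norm 5, so its normalized form is [0.6, 0.8].
```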
Retrieval
Two search modes are supported:
Text search – uses similarity_search, which first embeds the query with the same embedding model and then calls similarity_search_by_vector.
Vector search – directly supplies a pre‑computed embedding to similarity_search_by_vector.
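Both modes ultimately score candidates with the space's InnerProduct metric, and because the embeddings were L2-normalized at insert time, the inner product of two stored vectors equals the cosine similarity of the originals. A standalone pure-Python check of that equivalence (not SDK code):

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity of two raw (unnormalized) vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    """L2-normalize, as done before insertion into Vearch."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
an, bn = normalize(a), normalize(b)
# Inner product of normalized vectors equals cosine of the originals.
assert abs(dot(an, bn) - cosine(a, b)) < 1e-12
```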
Both methods score results with the InnerProduct metric on the stored vectors, which is equivalent to cosine similarity here because the embeddings are L2‑normalized at insert time; higher dimensionality increases the cost of each comparison. Example code for text search:
question = "What marketing strategies are included?"
results = vearch_cluster.similarity_search(query=question, k=1)
Example code for vector search:
question_embedding = [0.0403, -0.0073, 0.0265, ...]
results = vearch_cluster.similarity_search_by_vector(query=question_embedding, k=1)
Summary
Vearch provides a functional, open‑source solution for vector storage and similarity search that integrates smoothly with LangChain and OpenAI embeddings. It meets the basic requirements of the project with acceptable performance and a low learning curve, though the API‑centric workflow can feel less convenient for users accustomed to traditional relational databases. The experience also shows that Vearch is already being used in other business scenarios such as recommendation and deduplication, indicating its broader applicability.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
