How to Build a Fast, Accurate AI‑Powered Knowledge Base with Amazon OpenSearch and DeepSeek

This article walks through using Amazon OpenSearch Service’s vector search and ML connector together with the DeepSeek large language model to create a low‑cost, high‑efficiency enterprise knowledge base, covering architecture, step‑by‑step deployment, RAG pipeline configuration, and conversational search extensions.

Amazon Cloud Developers

Feb 5, 2026

Problem Statement

Enterprise knowledge bases often suffer from information silos and low retrieval efficiency, limiting the usefulness of natural‑language queries.

Technical Solution

Combine Amazon OpenSearch Service vector indexing with the open‑source DeepSeek large language model (LLM) to implement Retrieval‑Augmented Generation (RAG). OpenSearch 2.11 introduces a built‑in RAG processor for search pipelines. In this setup, a neural query runs the vector similarity search, and the RAG processor passes the retrieved snippets to an LLM, invoked through the ML Connector, for answer generation.

Key Components

ML Connector : registers a DeepSeek model and an embedding model as remote SageMaker endpoints.

Ingestion Pipeline : loads documents into an OpenSearch index and vectorizes them using the embedding model.

Search Pipeline (RAG processor) : defined as my-conversation-search-pipeline-deepseek-zh, executes a neural query, retrieves top‑k documents, builds a prompt, and calls the DeepSeek LLM.
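To make the ML Connector component more concrete, here is a minimal sketch of the request bodies for creating a SageMaker connector and registering it as a remote model. The role ARN, region, endpoint name, and the request_body template are placeholders that do not come from this article, and the field names follow the ML Commons connector format as assumed here, so check them against the documentation for your OpenSearch version.

```python
import json


def build_sagemaker_connector(name: str, role_arn: str, region: str, endpoint_name: str) -> dict:
    """Body for POST /_plugins/_ml/connectors/_create (all values are placeholders)."""
    url = (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )
    return {
        "name": name,
        "description": "Remote model hosted on a SageMaker endpoint",
        "version": 1,
        "protocol": "aws_sigv4",
        "credential": {"roleArn": role_arn},
        "parameters": {"region": region, "service_name": "sagemaker"},
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "headers": {"content-type": "application/json"},
                "url": url,
                # Assumed payload shape; the real one depends on the model container.
                "request_body": '{ "inputs": "${parameters.inputs}" }',
            }
        ],
    }


def build_remote_model(name: str, connector_id: str) -> dict:
    """Body for POST /_plugins/_ml/models/_register, pointing at an existing connector."""
    return {"name": name, "function_name": "remote", "connector_id": connector_id}


connector = build_sagemaker_connector(
    name="deepseek-llm-connector",                               # hypothetical name
    role_arn="arn:aws:iam::123456789012:role/example-ml-role",   # placeholder ARN
    region="us-east-1",                                          # placeholder region
    endpoint_name="deepseek-endpoint",                           # placeholder endpoint
)
print(json.dumps(connector, indent=2))
```

The same two bodies would be built twice in this architecture, once for the DeepSeek LLM endpoint and once for the embedding endpoint, each with its own request_body template.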

Implementation Steps

Register the DeepSeek LLM and the embedding model with OpenSearch using the ML Connector and link them to SageMaker endpoints.

Create an Ingestion Pipeline that reads source documents, invokes the embedding connector, and stores the vectors in an index (e.g., opensearch_kl_index).

Configure the Search Pipeline to use the neural query type, specify the embedding model ID, and set the LLM parameters.

Clients send queries to the index via the Search Pipeline; the pipeline vectorizes the query, retrieves relevant documents, and forwards the combined context to DeepSeek.
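The steps above can be sketched as three request bodies: an ingest pipeline, a k-NN index, and a search pipeline with the RAG processor. The field names text and text_embedding match the query shown later in this article; the ingest pipeline name, the vector dimension, and the prompt strings are assumptions made for illustration only.

```python
import json

INDEX_NAME = "opensearch_kl_index"
SEARCH_PIPELINE = "my-conversation-search-pipeline-deepseek-zh"
INGEST_PIPELINE = "kb-embedding-ingest-pipeline"  # hypothetical name, not from the article


def ingest_pipeline_body(embedding_model_id: str) -> dict:
    """Body for PUT /_ingest/pipeline/<name>: embed `text` into `text_embedding`."""
    return {
        "description": "Vectorize the text field at ingest time",
        "processors": [
            {
                "text_embedding": {
                    "model_id": embedding_model_id,
                    "field_map": {"text": "text_embedding"},
                }
            }
        ],
    }


def index_body(dimension: int) -> dict:
    """Body for PUT /<index>: a k-NN index that runs the ingest pipeline by default."""
    return {
        "settings": {"index.knn": True, "default_pipeline": INGEST_PIPELINE},
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                # The dimension must match the embedding model's output size.
                "text_embedding": {"type": "knn_vector", "dimension": dimension},
            }
        },
    }


def search_pipeline_body(llm_model_id: str) -> dict:
    """Body for PUT /_search/pipeline/<name>: attach the RAG response processor."""
    return {
        "response_processors": [
            {
                "retrieval_augmented_generation": {
                    "tag": "deepseek_rag",
                    "description": "Answer questions from retrieved snippets",
                    "model_id": llm_model_id,
                    "context_field_list": ["text"],
                    # Illustrative prompts; tune them for your own knowledge base.
                    "system_prompt": "You are a helpful assistant.",
                    "user_instructions": "Answer using only the provided context.",
                }
            }
        ]
    }


print(json.dumps(search_pipeline_body("<DeepSeekModelId>"), indent=2))
```

Setting default_pipeline on the index is one way to make every indexed document pass through the embedding step; passing the pipeline name on each indexing request would work as well.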

RAG Processor Example

Direct call to the DeepSeek model, without retrieval:

POST _plugins/_ml/models/<DeepSeekModelId>/_predict
{
  "parameters": {
    "inputs": "OpenSearch Serverless 是什么，和 OpenSearch 集群模式有什么区别，使用 OpenSearch Serverless，还需要管理服务器资源么？"
  }
}

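Knowledge‑base query using the RAG processor. The question asks what OpenSearch Serverless is, how it differs from OpenSearch cluster mode, and whether you still need to manage server resources when using it: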

GET opensearch_kl_index/_search?search_pipeline=my-conversation-search-pipeline-deepseek-zh
{
  "query": {
    "neural": {
      "text_embedding": {
        "query_text": "OpenSearch Serverless 是什么，和 OpenSearch 集群模式有什么区别，使用 OpenSearch Serverless，还需要管理服务器资源么？",
        "model_id": "<Embedding Model ID>",
        "k": 5
      }
    }
  },
  "size": 2,
  "_source": ["text"],
  "ext": {
    "generative_qa_parameters": {
      "llm_model": "bedrock/claude",
      "llm_question": "OpenSearch Serverless 是什么，和 OpenSearch 集群模式有什么区别，使用 OpenSearch Serverless，还需要管理服务器资源么？",
      "context_size": 5,
      "timeout": 100
    }
  }
}

The response contains two parts: (1) the retrieved document snippets and (2) the LLM‑generated answer that references those snippets.
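A small sketch of how a client might separate those two parts. It assumes the generated answer is returned under ext.retrieval_augmented_generation.answer and that the snippets sit in the text field of each hit; the sample response is a hand-written stand-in with placeholder strings, not real output from the demo.

```python
from typing import List, Optional, Tuple


def split_rag_response(response: dict) -> Tuple[List[str], Optional[str]]:
    """Return (retrieved snippets, generated answer) from a RAG search response."""
    hits = response.get("hits", {}).get("hits", [])
    snippets = [hit.get("_source", {}).get("text", "") for hit in hits]
    answer = (
        response.get("ext", {})
        .get("retrieval_augmented_generation", {})
        .get("answer")
    )
    return snippets, answer


# Hand-written stand-in for a response; the strings are placeholders.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"text": "<snippet 1>"}},
            {"_source": {"text": "<snippet 2>"}},
        ]
    },
    "ext": {"retrieval_augmented_generation": {"answer": "<generated answer>"}},
}

snippets, answer = split_rag_response(sample_response)
print(len(snippets), "snippets;", "answer:", answer)
```

Returning None for a missing answer lets the caller distinguish a plain search response from one where the RAG processor actually ran.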

Deployment Options

Notebook deployment : run the Jupyter notebook RAG-DeepSeek-AOS-deploy.ipynb in SageMaker Studio. Notebook URL: https://gitee.com/turk/opensearch-ds-rag-demo/blob/069a3a18c9b49f5cbf0e7f79b21b40394aea51b6/opensearch-rag/notebook/RAG-DeepSeek-AOS-deploy.ipynb

CloudFormation deployment : use the template opensearch-ml-connector-cf-fixed.yaml and the Python‑dependency zip to provision IAM roles, ML connectors, model registrations, a pre‑populated index, and the RAG search pipeline.
Template URL: https://gitee.com/turk/opensearch-ds-rag-demo/blob/master/opensearch-rag/cfn/opensearch-ml-connector-cf-fixed.yaml
Dependencies zip URL: https://gitee.com/turk/opensearch-ds-rag-demo/raw/master/opensearch-rag/cfn/python-dependencies.zip

CloudFormation Output (Key Resources)

Model IDs for the DeepSeek LLM and the embedding model.

Knowledge‑base index opensearch_kl_index with sample data.

Search Pipeline my-conversation-search-pipeline-deepseek-zh.

Verification Commands

POST _plugins/_ml/models/<DeepSeekModelId>/_predict
{
  "parameters": {"inputs": "..."}
}

GET /_search/pipeline/my-conversation-search-pipeline-deepseek-zh

GET opensearch_kl_index/_search?search_pipeline=my-conversation-search-pipeline-deepseek-zh
{...}
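When scripting these checks, the pipeline verification could look like the sketch below. It assumes the GET /_search/pipeline/<name> response is a JSON object keyed by pipeline name, and the sample response is hand-written for illustration rather than captured from a real cluster.

```python
PIPELINE_NAME = "my-conversation-search-pipeline-deepseek-zh"


def rag_processors(pipeline_response: dict, pipeline_name: str) -> list:
    """Return the RAG processor configs found in a GET /_search/pipeline/<name> response."""
    definition = pipeline_response.get(pipeline_name, {})
    return [
        processor["retrieval_augmented_generation"]
        for processor in definition.get("response_processors", [])
        if "retrieval_augmented_generation" in processor
    ]


# Hand-written stand-in for the GET response; the model ID is a placeholder.
sample_pipeline_response = {
    PIPELINE_NAME: {
        "response_processors": [
            {
                "retrieval_augmented_generation": {
                    "model_id": "<DeepSeekModelId>",
                    "context_field_list": ["text"],
                }
            }
        ]
    }
}

found = rag_processors(sample_pipeline_response, PIPELINE_NAME)
print("RAG processor configured:", bool(found))
```

An empty list means either the pipeline does not exist or it has no RAG processor, which is worth checking before debugging the query itself.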

Demo Application

Clone the repository and set up a Python virtual environment:

git clone https://gitee.com/turk/opensearch-ds-rag-demo.git
cd opensearch-ds-rag-demo/opensearch-rag/qa_app
python3 -m venv .
source bin/activate
pip install -r requirements.txt
cp .env.example .env

Configure .env with the OpenSearch endpoint, credentials, index name, and embedding model ID:

OPENSEARCH_HOST=<opensearch-endpoint>
OPENSEARCH_PORT=443
OPENSEARCH_USER=<username>
OPENSEARCH_PASSWORD=<password>
OPENSEARCH_INDEX=opensearch_kl_index
OPENSEARCH_EMBEDDING_MODEL_ID=<embedding-model-id>
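The demo app's own configuration code is not shown in this article. As an illustration only, the sketch below reads the same variable names from an environment mapping and fails early when one is missing; the Settings class, load_settings function, and the example host are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_KEYS = (
    "OPENSEARCH_HOST",
    "OPENSEARCH_PORT",
    "OPENSEARCH_USER",
    "OPENSEARCH_PASSWORD",
    "OPENSEARCH_INDEX",
    "OPENSEARCH_EMBEDDING_MODEL_ID",
)


@dataclass(frozen=True)
class Settings:
    host: str
    port: int
    user: str
    password: str
    index: str
    embedding_model_id: str

    @property
    def base_url(self) -> str:
        return f"https://{self.host}:{self.port}"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, raising KeyError if any are unset."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return Settings(
        host=env["OPENSEARCH_HOST"],
        port=int(env["OPENSEARCH_PORT"]),
        user=env["OPENSEARCH_USER"],
        password=env["OPENSEARCH_PASSWORD"],
        index=env["OPENSEARCH_INDEX"],
        embedding_model_id=env["OPENSEARCH_EMBEDDING_MODEL_ID"],
    )


# Example with placeholder values instead of the real process environment.
example = load_settings(
    {
        "OPENSEARCH_HOST": "search-example.us-east-1.es.amazonaws.com",
        "OPENSEARCH_PORT": "443",
        "OPENSEARCH_USER": "<username>",
        "OPENSEARCH_PASSWORD": "<password>",
        "OPENSEARCH_INDEX": "opensearch_kl_index",
        "OPENSEARCH_EMBEDDING_MODEL_ID": "<embedding-model-id>",
    }
)
print(example.base_url)
```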

Run the single‑turn Q&A UI:

python3 app.py

For multi‑turn conversational search, create a memory store:

POST /_plugins/_ml/memory/
{ "name": "Conversation about DeepSeek Demo" }

Use the returned memory_id in .env (e.g., OPENSEARCH_MEMORY_ID=<memory-id>) and start the conversational demo:

python3 app_conversational.py

Example conversation:

Question: "OpenSearch 有支持收集日志的工具么？" (Does OpenSearch have a tool that supports log collection?)

Follow‑up: "这个工具能收集哪些日志？" (Which logs can this tool collect?)
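The follow-up only makes sense if the earlier turn travels with the request. One way to sketch this is to attach the memory ID to the generative_qa_parameters of an existing search body. The memory_id and message_size parameter names are assumed from the conversational search feature in recent OpenSearch versions; the helper itself is hypothetical and not taken from app_conversational.py.

```python
import copy


def add_memory(search_body: dict, memory_id: str, message_size: int = 5) -> dict:
    """Return a copy of a RAG search body that reads and writes the given memory."""
    body = copy.deepcopy(search_body)  # leave the caller's dict untouched
    qa_params = body.setdefault("ext", {}).setdefault("generative_qa_parameters", {})
    qa_params["memory_id"] = memory_id
    # Number of previous messages to include as conversation context.
    qa_params["message_size"] = message_size
    return body


single_turn_body = {
    "ext": {
        "generative_qa_parameters": {
            "llm_model": "bedrock/claude",
            "llm_question": "这个工具能收集哪些日志？",
            "context_size": 5,
            "timeout": 100,
        }
    }
}

conversational_body = add_memory(single_turn_body, "<memory-id>")
print(conversational_body["ext"]["generative_qa_parameters"]["memory_id"])
```

Copying the body rather than mutating it keeps the single-turn request reusable, for example when comparing answers with and without conversation history.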

Retrieve the stored messages:

GET /_plugins/_ml/memory/<memory-id>/messages
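To inspect what was stored, the response can be flattened into question and answer pairs. This sketch assumes the response carries a messages array whose items have input and response fields; the sample below is hand-written, with the two questions from the example conversation and placeholder answers.

```python
from typing import List, Tuple


def to_transcript(messages_response: dict) -> List[Tuple[str, str]]:
    """Flatten a memory messages response into (question, answer) pairs."""
    return [
        (message.get("input", ""), message.get("response", ""))
        for message in messages_response.get("messages", [])
    ]


# Hand-written stand-in for the API response; answers are placeholders.
sample_messages = {
    "messages": [
        {"input": "OpenSearch 有支持收集日志的工具么？", "response": "<answer 1>"},
        {"input": "这个工具能收集哪些日志？", "response": "<answer 2>"},
    ]
}

for question, reply in to_transcript(sample_messages):
    print("Q:", question)
    print("A:", reply)
```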

Summary of Technical Benefits

Vector indexing enables semantic similarity matching, overcoming keyword‑only limitations.

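The RAG processor integrates retrieval and generation in a single search pipeline, so a client gets the retrieved snippets and the generated answer from one search request instead of orchestrating a separate LLM call.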

ML Connector abstracts model invocation, eliminating custom code for embedding and LLM calls.

Memory API provides context retention for multi‑turn dialogues.
