Build Chinese Vector Search with Alibaba Cloud AI and Elasticsearch Inference APIs

This guide walks through creating sparse and dense vector inference endpoints in Elasticsearch with Alibaba Cloud AI services. It then shows how to generate embeddings, run completion, rerank results, and build RAG workflows for accurate Chinese-language search.

Alibaba Cloud Big Data AI Platform

Dec 17, 2024

Sparse Vector

Create a sparse‑embedding inference endpoint:

PUT _inference/sparse_embedding/alibabacloud_ai_search_sparse
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "<api_key>",
    "service_id": "ops-text-sparse-embedding-001",
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}

Test the endpoint with a Chinese sentence:

POST _inference/alibabacloud_ai_search_sparse
{
  "input": "阿里巴巴（中国）有限公司成立于2007年03月26日，法定代表人蒋芳"
}

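The response is a sparse vector: a set of tokens, each paired with a weight. The Chinese tokens appear as Unicode escape sequences in the JSON output. Unlike ELSER, which supports English only, this model handles Chinese text.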
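To work with such a response in application code, the escaped tokens can be decoded with a standard JSON parser. The sketch below is illustrative only: the response shape (a `sparse_embedding` list whose entries carry an `embedding` map) is an assumption about what the endpoint returns, and the tokens and weights are made up.

```python
import json

# Hypothetical, shortened response in the shape the inference API is assumed
# to return for a sparse_embedding task; tokens and weights are invented.
raw_response = r'''
{
  "sparse_embedding": [
    {
      "is_truncated": false,
      "embedding": {
        "\u963f\u91cc\u5df4\u5df4": 0.42,
        "\u6210\u7acb": 0.31,
        "2007": 0.18
      }
    }
  ]
}
'''

def top_tokens(response_text, k=3):
    """Return the k highest-weighted (token, weight) pairs of the first input."""
    body = json.loads(response_text)  # json.loads decodes the \uXXXX escapes
    embedding = body["sparse_embedding"][0]["embedding"]
    return sorted(embedding.items(), key=lambda kv: kv[1], reverse=True)[:k]

for token, weight in top_tokens(raw_response):
    print(token, weight)
```

Sorting by weight makes it easy to see which terms the model considers most important for the sentence.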

Dense Vector

Create a dense‑embedding inference endpoint:

PUT _inference/text_embedding/alibabacloud_ai_search_embeddings
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "<api_key>",
    "service_id": "ops-text-embedding-001",
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}

Generate a dense vector:

POST _inference/alibabacloud_ai_search_embeddings
{
  "input": "阿里巴巴（中国）有限公司成立于2007年03月26日，法定代表人蒋芳"
}

Dense vectors are floating‑point arrays that can be scalar‑quantized to save memory and speed up search.
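The idea behind scalar quantization can be sketched in a few lines: each 32-bit float is mapped to an 8-bit integer, so a vector needs roughly a quarter of the memory. The helper names below are hypothetical, and this per-vector min/max scheme is only an illustration of the principle, not a reproduction of how Elasticsearch quantizes vectors internally.

```python
def quantize_int8(vector):
    """Map floats to integers in [-127, 127] using the vector's own min and max."""
    lo, hi = min(vector), max(vector)
    scale = (hi - lo) / 254 or 1.0  # fall back to 1.0 for constant vectors
    codes = [round((x - lo) / scale) - 127 for x in vector]
    return codes, lo, scale

def dequantize_int8(codes, lo, scale):
    """Approximately recover the original floats from the integer codes."""
    return [(c + 127) * scale + lo for c in codes]

vector = [0.12, -0.5, 0.33, 0.9]  # made-up values standing in for an embedding
codes, lo, scale = quantize_int8(vector)
restored = dequantize_int8(codes, lo, scale)
print(codes)
print(restored)
```

The restored values differ from the originals by at most half a quantization step, which is the precision traded for the smaller footprint.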

Completion

Create a completion inference endpoint:

PUT _inference/completion/alibabacloud_ai_search_completion
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "<api_key>",
    "service_id": "ops-qwen-turbo",
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}

Example request:

POST _inference/completion/alibabacloud_ai_search_completion
{
  "input": "阿里巴巴（中国）有限公司是什么时候成立的?"
}

The large language model returns a natural-language answer drawn from its training data alone; no retrieved context is involved at this stage.

Rerank

Create a rerank inference endpoint:

PUT _inference/rerank/alibabacloud_ai_search_rerank
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "<api_key>",
    "service_id": "ops-bge-reranker-larger",
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}

Provide a query and two candidate documents; the endpoint returns a relevance score for each document, so the more relevant one can be ranked first.
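A minimal sketch of such a call, assuming the rerank task takes a `query` plus an `input` list and answers with a `rerank` list of `index` and `relevance_score` entries. The helper names are hypothetical, the second document is a placeholder, and the scores in the sample response are invented.

```python
import json

def build_rerank_request(query, documents):
    """Body for POST _inference/rerank/alibabacloud_ai_search_rerank."""
    return {"query": query, "input": list(documents)}

def best_document(documents, response):
    """Pick the document with the highest relevance_score from a rerank response."""
    top = max(response["rerank"], key=lambda item: item["relevance_score"])
    return documents[top["index"]], top["relevance_score"]

docs = [
    "阿里巴巴（中国）有限公司成立于2007年03月26日，法定代表人蒋芳",
    "<second candidate document>",  # placeholder, not from the article
]
body = build_rerank_request("阿里巴巴（中国）有限公司是什么时候成立的?", docs)
print(json.dumps(body, ensure_ascii=False, indent=2))

# Hypothetical response; the scores are invented for illustration.
sample_response = {
    "rerank": [
        {"index": 0, "relevance_score": 0.93},
        {"index": 1, "relevance_score": 0.07},
    ]
}
print(best_document(docs, sample_response))
```

The `index` values refer back to positions in the submitted `input` list, which is why the helper keeps the original documents at hand.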

RAG Application

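Store business or private data in Elasticsearch, which serves as the vector database. For each question, search the index first, then pass the retrieved documents as context to the completion endpoint so the answer is grounded in your own data, which reduces hallucinations.

Create an index whose semantic_text field uses the sparse endpoint: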

PUT alibaba_sparse
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "alibabacloud_ai_search_sparse"
      }
    }
  }
}
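Documents written to inference_field are embedded at index time by the configured endpoint. One way to load several documents at once is the _bulk API, whose body is newline-delimited JSON. The sketch below only builds that body; the helper name is hypothetical and the second document is a placeholder.

```python
import json

def bulk_body(index, texts, field="inference_field"):
    """Build an NDJSON body for POST _bulk: one action line plus one source line per text."""
    lines = []
    for text in texts:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps({field: text}, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline

payload = bulk_body(
    "alibaba_sparse",
    [
        "阿里巴巴（中国）有限公司成立于2007年03月26日，法定代表人蒋芳",
        "<text of the second sample document>",  # placeholder, not from the article
    ],
)
print(payload)
```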

Index two sample documents and run a semantic search:

GET alibaba_sparse/_search
{
  "query": {
    "semantic": {
      "field": "inference_field",
      "query": "阿里云是什么时候成立的？"
    }
  }
}

The result shows the Alibaba Cloud document ranked first.
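The retrieve-then-complete flow described for the RAG application can be sketched as follows. The helper names and the prompt wording are hypothetical, and the hits are hand-written to resemble the hits.hits array of a search response. One labeled assumption: depending on the Elasticsearch version, a semantic_text field may appear in _source as a plain string or as an object with a text key, so the helper accepts both.

```python
import json

def hit_text(hit, field="inference_field"):
    """Return the original text of a search hit (string or {"text": ...} shape)."""
    value = hit["_source"][field]
    return value["text"] if isinstance(value, dict) else value

def build_rag_prompt(question, hits, max_docs=3):
    """Combine the top hits and the user question into a single prompt."""
    passages = [hit_text(hit) for hit in hits[:max_docs]]
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Hypothetical hits, shaped like the "hits.hits" array of a _search response.
sample_hits = [
    {"_source": {"inference_field": "阿里巴巴（中国）有限公司成立于2007年03月26日，法定代表人蒋芳"}},
]
question = "阿里巴巴（中国）有限公司是什么时候成立的?"

# Body for POST _inference/completion/alibabacloud_ai_search_completion
completion_request = {"input": build_rag_prompt(question, sample_hits)}
print(json.dumps(completion_request, ensure_ascii=False, indent=2))
```

Capping the number of passages with max_docs keeps the prompt short; the right limit depends on the model's context window.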

Creating a Dense Index for RAG

PUT alibaba_dense
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "alibabacloud_ai_search_embeddings"
      }
    }
  }
}

After indexing similar documents, a semantic search for "Alibaba 的法人是谁？" retrieves the Alibaba document even though the query mixes English and Chinese, demonstrating cross-language matching.

Automatic Chunking

The semantic_text field automatically splits long texts into chunks for vectorization. Create a large‑text index:

PUT alibaba_dense_large_text
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "alibabacloud_ai_search_embeddings"
      }
    }
  }
}

Insert a lengthy Alibaba Group description and query it; the system returns relevant chunks, confirming automatic chunking works.
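To make the idea of chunking concrete, here is a generic sliding-window splitter. It is not the strategy Elasticsearch applies to semantic_text fields; the function name, window size, and overlap are assumptions chosen for illustration. It counts characters rather than words because Chinese text has no whitespace between words.

```python
def chunk_text(text, size=200, overlap=20):
    """Split text into windows of `size` characters, each overlapping the previous by `overlap`."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[start:start + size] for start in range(0, max(len(text) - overlap, 1), step)]

# Synthetic 500-character string standing in for a long document.
long_text = "".join(chr(0x4E00 + i) for i in range(500))
chunks = chunk_text(long_text)
print([len(c) for c in chunks])
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, provided it is shorter than the overlap.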
