Integrate Alibaba Cloud AI Search with Elasticsearch: A Step‑by‑Step Guide

This tutorial shows how to configure Elasticsearch's open inference API to connect to Alibaba Cloud AI Search. It covers setting up the text generation, rerank, sparse vector, and dense vector services, with example requests for each that you can use as building blocks for RAG and semantic search applications.

Alibaba Cloud Big Data AI Platform

Integrating Alibaba Cloud AI Search with Elasticsearch

Elastic's open inference API now integrates with Alibaba Cloud AI Search, so Elasticsearch users can store and query dense and sparse vectors generated by models hosted on the Alibaba Cloud AI Search platform. The integration also supports semantic reranking and text generation with models such as Tongyi Qianwen (Qwen).

Prerequisites

You need an Alibaba Cloud account, a workspace, and an API key for the AI Search platform.

1. Create an Inference API Endpoint for Text Embedding

Use the alibabacloud-ai-search service in Elasticsearch and configure the endpoint with your workspace, host, and service ID.

PUT _inference/text_embedding/ali_ai_embeddings
{
    "service": "alibabacloud-ai-search",
    "service_settings": {
        "api_key": "<api_key>",
        "service_id": "ops-text-embedding-001",
        "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
        "workspace": "default"
    }
}
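All endpoints in this guide share the same request shape and differ only in task type, inference ID, and service ID. The following is a minimal Python sketch of that pattern. The helper name `build_endpoint_request` is hypothetical, and the host and API key are the article's placeholders, not working values.

```python
import json


def build_endpoint_request(task_type, inference_id, service_id, host, api_key,
                           workspace="default"):
    """Return (method, path, body) for creating an inference endpoint.

    Hypothetical helper: it only assembles the request shown above and
    does not contact Elasticsearch.
    """
    body = {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": api_key,
            "service_id": service_id,
            "host": host,
            "workspace": workspace,
        },
    }
    return "PUT", f"_inference/{task_type}/{inference_id}", body


method, path, body = build_endpoint_request(
    task_type="text_embedding",
    inference_id="ali_ai_embeddings",
    service_id="ops-text-embedding-001",
    host="xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",  # placeholder
    api_key="<api_key>",  # placeholder
)
print(method, path)
print(json.dumps(body, indent=4))
```

Swapping in `completion`, `rerank`, or `sparse_embedding` as the task type, together with the matching service ID, yields the other requests used later in this guide.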

The response includes the created endpoint details, such as inference_id, task_type, dimensions, and similarity metric.

{
  "inference_id": "ali_ai_embeddings",
  "task_type": "text_embedding",
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "similarity": "dot_product",
    "dimensions": 1536,
    "service_id": "ops-text-embedding-001",
    "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default",
    "rate_limit": {"requests_per_minute": 10000}
  },
  "task_settings": {}
}
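The dimensions and similarity values in this response are what an index needs in order to store the resulting vectors. As a rough sketch, the Python below derives a `dense_vector` field mapping from a trimmed copy of the response. The field name `content_embedding` and the helper name are assumptions made for illustration.

```python
import json

# Trimmed copy of the endpoint-creation response shown above.
endpoint = {
    "inference_id": "ali_ai_embeddings",
    "task_type": "text_embedding",
    "service_settings": {"similarity": "dot_product", "dimensions": 1536},
}


def dense_vector_mapping(endpoint, field="content_embedding"):
    """Build an index mapping whose vector field matches the endpoint.

    `field` is a hypothetical field name; use whatever your index needs.
    """
    settings = endpoint["service_settings"]
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": settings["dimensions"],
                    "index": True,
                    "similarity": settings["similarity"],
                }
            }
        }
    }


mapping = dense_vector_mapping(endpoint)
print(json.dumps(mapping, indent=2))
```

Note that Elasticsearch expects unit-length vectors when a `dense_vector` field uses `dot_product` similarity, so it is worth confirming that the model's output is normalized before relying on this mapping.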

Test the endpoint with a simple POST request:

POST _inference/text_embedding/ali_ai_embeddings
{
  "input": "What is Elastic?"
}

The API returns a dense vector for the input text.

{
    "text_embedding": [
        {
            "embedding": [0.048400473, 0.051464397, … , -0.008986305]
        }
    ]
}
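Outside the Kibana console, the same call is an ordinary HTTP POST. The sketch below builds that request with Python's standard library and pulls the vector out of a response of the shape shown above. The cluster URL and Elasticsearch API key are hypothetical placeholders, and the sample response is made up and shortened so the snippet runs without a cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"        # hypothetical cluster address
ES_API_KEY = "<elasticsearch_api_key>"  # hypothetical placeholder credential


def build_inference_request(task_type, inference_id, payload):
    """Build (but do not send) a POST to an inference endpoint."""
    url = f"{ES_URL}/_inference/{task_type}/{inference_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {ES_API_KEY}",
        },
        method="POST",
    )


def first_embedding(response):
    """Return the first dense vector from a text_embedding response."""
    return response["text_embedding"][0]["embedding"]


request = build_inference_request(
    "text_embedding", "ali_ai_embeddings", {"input": "What is Elastic?"}
)

# To send the request against a real cluster:
#   with urllib.request.urlopen(request) as resp:
#       response = json.load(resp)

# Made-up, shortened response used only to exercise the parser.
sample_response = {"text_embedding": [{"embedding": [0.25, -0.5, 0.125]}]}
vector = first_embedding(sample_response)
print(request.get_method(), request.full_url, len(vector))
```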

2. Conversation Generation (Chat Completion)

Configure a chat completion service using the same Alibaba Cloud AI Search backend.

PUT _inference/completion/ali-chat
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
    "api_key": "xxxxxxxxxxxxxxxxxx",
    "service_id": "ops-qwen-turbo",
    "workspace": "default"
  }
}

Send a POST request with an input array to generate a response.

POST _inference/completion/ali-chat
{
  "input": ["What is the capital of Henan?"]
}

The response contains the generated answer. History can be included in the input array for multi‑turn conversations.
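The source does not spell out how history should be laid out in the array. The sketch below assumes that earlier user and assistant messages alternate in order and that the new question comes last. It also assumes the answer is returned under `completion[0].result`. Treat both as assumptions to verify against your cluster, and note that the helper names and the sample assistant reply are illustrative.

```python
def build_chat_input(history, question):
    """Flatten (user, assistant) turns and append the new question.

    Assumed layout: [user_1, assistant_1, ..., user_n, assistant_n, question].
    """
    messages = []
    for user_msg, assistant_msg in history:
        messages.extend([user_msg, assistant_msg])
    messages.append(question)
    return {"input": messages}


def extract_answer(response):
    """Read the generated text, assuming {"completion": [{"result": ...}]}."""
    return response["completion"][0]["result"]


# Illustrative earlier turn; the assistant reply here is written by hand.
history = [
    ("What is the capital of Henan?", "The capital of Henan is Zhengzhou."),
]
payload = build_chat_input(history, "What is the city known for?")
print(payload)
```

The resulting `payload` is what would be sent as the body of `POST _inference/completion/ali-chat`.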

3. Semantic Rerank

Configure a rerank service to reorder search results based on semantic relevance.

PUT _inference/rerank/ali-rank
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "xxxxxxxxxxxxxxxxxx",
    "service_id": "ops-bge-reranker-larger",
    "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}

Submit a POST request with a query and an array of candidate texts.

POST _inference/rerank/ali-rank
{
  "query": "What is the capital of the USA?",
  "input": [
    "Carson City is the capital city of Nevada...",
    "Capital punishment ...",
    "The Commonwealth of the Northern Mariana Islands ...",
    "Washington, D.C. is the capital of the United States.",
    "Charlotte Amalie is the capital of the US Virgin Islands.",
    "North Dakota ... Bismarck."
  ]
}

The API returns relevance scores and the index of each input, with the most relevant result first.
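Because each result carries the index of the original input, mapping scores back to texts takes only a few lines. The sketch below assumes a response of the form `{"rerank": [{"index": ..., "relevance_score": ...}]}`. The scores are made up for illustration and the helper name is hypothetical.

```python
def ranked_texts(inputs, response):
    """Pair each candidate text with its score, best match first.

    Assumes every hit has an integer "index" into `inputs` and a float
    "relevance_score"; sorting again is harmless if already ordered.
    """
    hits = sorted(
        response["rerank"],
        key=lambda hit: hit["relevance_score"],
        reverse=True,
    )
    return [(inputs[hit["index"]], hit["relevance_score"]) for hit in hits]


inputs = [
    "Carson City is the capital city of Nevada...",
    "Capital punishment ...",
    "Washington, D.C. is the capital of the United States.",
]

# Made-up scores, used only to show the mapping from index to text.
sample_response = {
    "rerank": [
        {"index": 2, "relevance_score": 0.9},
        {"index": 0, "relevance_score": 0.4},
        {"index": 1, "relevance_score": 0.1},
    ]
}

ranked = ranked_texts(inputs, sample_response)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```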

4. Sparse Vector Generation

Set up a sparse embedding service using the service ID ops-text-sparse-embedding-001.

PUT _inference/sparse_embedding/ali-sparse-embedding
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "xxxxxxxxxxxxxxxxxx",
    "service_id": "ops-text-sparse-embedding-001",
    "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}

Example request:

POST _inference/sparse_embedding/ali-sparse-embedding
{
  "input": "Hello world",
  "task_settings": {
    "input_type": "search",
    "return_token": true
  }
}

The response includes a token‑level sparse embedding, in which each token is paired with a weight.
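To illustrate how token-level sparse embeddings are typically compared, the sketch below scores documents against a query by summing the products of weights for the tokens they share. It assumes each embedding can be read as a mapping from token to weight. The vectors are made up and the function name is hypothetical.

```python
def sparse_dot(query_vec, doc_vec):
    """Dot product of two sparse vectors stored as {token: weight} dicts."""
    # Iterate over the smaller dict so the work scales with the shorter vector.
    if len(query_vec) > len(doc_vec):
        query_vec, doc_vec = doc_vec, query_vec
    return sum(w * doc_vec[tok] for tok, w in query_vec.items() if tok in doc_vec)


# Made-up weights for illustration only.
query = {"hello": 0.5, "world": 0.25}
doc_a = {"hello": 1.0, "there": 2.0}
doc_b = {"world": 2.0, "hello": 1.0}

score_a = sparse_dot(query, doc_a)  # only "hello" overlaps: 0.5
score_b = sparse_dot(query, doc_b)  # both tokens overlap: 1.0
print(score_a, score_b)
```

Only tokens present in both vectors contribute, which is why sparse embeddings behave like a learned form of term matching.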

5. Text Embedding (Dense Vector)

Configure a dense text embedding endpoint in the same way. This mirrors step 1 and differs only in the inference ID (ali-embeddings).

PUT _inference/text_embedding/ali-embeddings
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "xxxxxxxxxxxxxxxxxx",
    "service_id": "ops-text-embedding-001",
    "host": "xxxxx.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}

A POST request to this endpoint returns a dense vector for the given text.
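Once documents are indexed with vectors from this endpoint, a semantic search can let Elasticsearch embed the query text at search time. The following is a sketch of a kNN search body that does this through `query_vector_builder`. The field name `content_embedding`, the helper name, and the `k` and `num_candidates` values are assumptions for illustration.

```python
import json


def knn_search_body(query_text, field="content_embedding",
                    inference_id="ali-embeddings", k=10, num_candidates=100):
    """Build a kNN search body that embeds `query_text` at query time.

    `field` is a hypothetical dense_vector field; `inference_id` is the
    endpoint created above.
    """
    return {
        "knn": {
            "field": field,
            "query_vector_builder": {
                "text_embedding": {
                    "model_id": inference_id,
                    "model_text": query_text,
                }
            },
            "k": k,
            "num_candidates": num_candidates,
        }
    }


search_body = knn_search_body("What is Elastic?")
print(json.dumps(search_body, indent=2))
```

The body would be sent to the `_search` API of an index whose vector field was populated by the same embedding model, so that query and document vectors are comparable.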

Conclusion

By linking Elasticsearch with Alibaba Cloud AI Search, developers can call hosted models for dense and sparse embeddings, semantic reranking, and text generation directly through the inference API. Together these cover the main building blocks of hybrid search, semantic reranking, and RAG applications.
