Boost Elasticsearch Semantic Search with Alibaba Cloud AI: Step‑by‑Step Guide
This tutorial walks through configuring Alibaba Cloud AI services, creating sparse embedding and rerank endpoints, setting up Elasticsearch mappings, indexing Agatha Christie data, and combining semantic search, reranking, and completion APIs to achieve more relevant search results and a RAG‑style answer generation pipeline.
Author: Tomás Murúa (Elastic)
This article explains how to integrate Alibaba Cloud AI capabilities with Elasticsearch to improve semantic search relevance.
Steps
Configure Alibaba Cloud AI
Create Elasticsearch mapping
Index data into Elasticsearch
Query data
Bonus: answer questions with a completion endpoint
Configure Alibaba Cloud AI
Alibaba Cloud AI Search provides sparse embedding, semantic reranking, and completion endpoints; the completion endpoint can use models such as Qwen and DeepSeek‑R1.
Semantic reranking reorders retrieved results by how closely each document matches the query, while sparse embeddings represent text as weighted terms so that the most relevant terms count more during matching.
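To make the idea of a sparse embedding concrete, here is a toy Python sketch. It treats an embedding as a map from token to weight and scores a document by the dot product over shared tokens. The tokens and weights are invented for illustration; the Alibaba model produces its own vocabulary and weights, and Elasticsearch's internal scoring may differ.

```python
def sparse_score(query_vec: dict[str, float], doc_vec: dict[str, float]) -> float:
    """Dot product over the tokens that both sparse vectors share."""
    return sum(w * doc_vec[t] for t, w in query_vec.items() if t in doc_vec)


# Hypothetical token weights, chosen only to illustrate the mechanism.
query = {"novel": 1.0, "christie": 0.5}
docs = {
    "play": {"play": 1.5, "christie": 1.0},
    "novel": {"novel": 2.0, "christie": 1.0, "marple": 0.5},
}

# Rank document names by descending score against the query.
ranked = sorted(docs, key=lambda name: sparse_score(query, docs[name]), reverse=True)
print(ranked)
```

In this toy data the document that shares the heavily weighted token "novel" with the query ranks first, which is the behavior the semantic query later in this guide relies on.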
Obtain Alibaba Cloud API Key
Generate a valid API key from the Alibaba Cloud portal:
Access the Service Marketplace.
In the left‑hand menu, select API Keys.
Create a new API key.
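The key is later passed as service_settings.api_key when the inference endpoints are created. As a minimal sketch, the helper below builds that request body in Python and reads the key from an environment variable instead of hard-coding it. The variable name ALIBABA_API_KEY and the function name endpoint_body are assumptions made for this example; the host, workspace, and service ID values are the ones used in this guide.

```python
import os

# Host used throughout this guide; replace it with the host of your own workspace.
HOST = "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com"


def endpoint_body(service_id: str, api_key: str,
                  host: str = HOST, workspace: str = "default") -> dict:
    """Build the JSON body for PUT _inference/<task_type>/<endpoint_name>."""
    if not api_key:
        raise ValueError("API key is empty; set ALIBABA_API_KEY first")
    return {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": api_key,
            "service_id": service_id,
            "host": host,
            "workspace": workspace,
        },
    }


# ALIBABA_API_KEY is an assumed variable name; the fallback is the guide's placeholder.
api_key = os.environ.get("ALIBABA_API_KEY", "<api_key>")
sparse_body = endpoint_body("ops-text-sparse-embedding-001", api_key)

# Print only the setting names so the key itself never reaches the console.
print(sorted(sparse_body["service_settings"]))
```

The same helper can produce the rerank and completion bodies by passing a different service ID, since the three requests in this guide differ only in that value and in the request path.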
Configure Alibaba Endpoints
Set up the sparse embedding endpoint:
PUT _inference/sparse_embedding/alibabacloud_ai_search_sparse
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "<api_key>",
    "service_id": "ops-text-sparse-embedding-001",
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}
Set up the rerank endpoint:
PUT _inference/rerank/alibabacloud_ai_search_rerank
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "<api_key>",
    "service_id": "ops-bge-reranker-larger",
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}
Create Elasticsearch Mapping
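Define a semantic_text field that stores the sparse embeddings and a text field for the descriptions. The copy_to setting copies each description into the semantic field, so the same content is available for both keyword and semantic queries (hybrid search):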
PUT arts
{
  "mappings": {
    "properties": {
      "semantic_description": {
        "type": "semantic_text",
        "inference_id": "alibabacloud_ai_search_sparse"
      },
      "description": {
        "type": "text",
        "copy_to": "semantic_description"
      }
    }
  }
}
Index Data
Bulk‑index a set of Agatha Christie novel and play descriptions:
POST arts/_bulk
{ "index": {} }
{ "description": "Black Coffee is a play by the British crime‑fiction author Agatha Christie..." }
{ "index": {} }
{ "description": "The Mousetrap is a murder mystery play by Agatha Christie..." }
{ "index": {} }
{ "description": "The Body in the Library is a Miss Marple mystery novel published by Agatha Christie in 1942..." }
{ "index": {} }
{ "description": "Curtain: Poirot's Last Case is Agatha Christie's last published novel..." }
{ "index": {} }
{ "description": "Death on the Nile is Agatha Christie's most daring travel mystery novel..." }
{ "index": {} }
{ "description": "The Murder of Roger Ackroyd was Agatha Christie’s first book published by William Collins..." }
Semantic Search
Query the semantic_description field:
GET /arts/_search
{
  "_source": { "includes": ["description"] },
  "query": {
    "semantic": {
      "field": "semantic_description",
      "query": "Which novel was written by Agatha Christie?"
    }
  }
}
The response returns novel documents first, with plays appearing at the bottom.
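For readers calling Elasticsearch from application code, the following Python sketch builds the same semantic query body and pulls the ranked descriptions out of a search response. The helper names and the trimmed sample response are hypothetical; a real response contains additional fields (scores, IDs, shard information) and its ordering depends on the model.

```python
def semantic_query(field: str, question: str) -> dict:
    """Build the body of the semantic search request used in this guide."""
    return {
        "_source": {"includes": ["description"]},
        "query": {"semantic": {"field": field, "query": question}},
    }


def descriptions(response: dict) -> list[str]:
    """Return the description of each hit, preserving the ranked order."""
    return [hit["_source"]["description"] for hit in response["hits"]["hits"]]


# Hypothetical, heavily trimmed response used only to exercise the helper.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"description": "Curtain: Poirot's Last Case ..."}},
            {"_source": {"description": "The Mousetrap is a murder mystery play ..."}},
        ]
    }
}

body = semantic_query("semantic_description", "Which novel was written by Agatha Christie?")
top = descriptions(sample_response)
print(top)
```

The list returned by descriptions has the shape the rerank endpoint expects in its input array, so it can be passed on unchanged.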
Rerank Optimization
Use the rerank endpoint to improve ordering:
POST _inference/rerank/alibabacloud_ai_search_rerank
{
  "query": "Which novel was written by Agatha Christie?",
  "input": [
    "Black Coffee is a play ...",
    "The Mousetrap is a murder mystery play ...",
    "The Body in the Library is a Miss Marple mystery novel ...",
    "Curtain: Poirot's Last Case ...",
    "Death on the Nile ...",
    "The Murder of Roger Ackroyd ..."
  ]
}
Combined Retrieval and Rerank
Execute semantic search and reranking in a single request:
POST /arts/_search
{
  "_source": { "includes": ["description"] },
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "standard": {
          "query": {
            "semantic": {
              "field": "semantic_description",
              "query": "Which novel was written by Agatha Christie?"
            }
          }
        }
      },
      "field": "description",
      "rank_window_size": 10,
      "inference_id": "alibabacloud_ai_search_rerank",
      "inference_text": "Which novel was written by Agatha Christie?"
    }
  }
}
Completion Endpoint for RAG
Create a completion endpoint using Alibaba Qwen or DeepSeek‑R1:
PUT _inference/completion/alibabacloud_ai_search_completion
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "api_key": "<api_key>",
    "service_id": "ops-qwen-turbo",
    "workspace": "default"
  }
}
Send the retrieved documents and the question to obtain a concise answer from the LLM.
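The article does not show this final request, so the following Python sketch is one possible way to assemble it. It joins the retrieved descriptions into a numbered context, appends the question, and wraps the result in the input field that the Elasticsearch completion inference API accepts. The prompt wording and the helper names are assumptions, not the author's exact prompt.

```python
def build_prompt(question: str, documents: list[str]) -> str:
    """Combine retrieved documents and the question into a single prompt."""
    context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


def completion_request(endpoint_id: str, question: str,
                       documents: list[str]) -> tuple[str, dict]:
    """Return the request path and JSON body for the completion endpoint."""
    path = f"_inference/completion/{endpoint_id}"
    return path, {"input": build_prompt(question, documents)}


path, body = completion_request(
    "alibabacloud_ai_search_completion",
    "Which novel was written by Agatha Christie?",
    ["Curtain: Poirot's Last Case ...", "Death on the Nile ..."],
)
print("POST", path)
print(body["input"])
```

Numbering the context entries is a design choice that lets the model's answer refer back to specific documents; passing only the top reranked results keeps the prompt short.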
Conclusion
Integrating Alibaba Cloud AI with Elasticsearch enables seamless use of embedding, rerank, and completion models, enhancing search pipelines and moving toward a full RAG solution.
