Elasticsearch Overview: Lifecycle Management, Vector Search, NLP, and Deployment on the 360 Zhihui Cloud Platform

This article introduces Elasticsearch, explains its hot‑warm‑cold lifecycle management, demonstrates vector search and built‑in NLP capabilities, and describes how the 360 Zhihui Cloud Platform integrates these features with practical test cases and new visualization tools.

360 Smart Cloud

Nov 16, 2023

1. Elasticsearch Introduction

Elasticsearch is a distributed search and analytics engine designed to handle a wide range of application scenarios. As a core component of the Elastic Stack (Elasticsearch, Kibana, Logstash, Beats), it provides centralized data storage, fast search, relevance tuning, powerful analytics, and easy horizontal scaling.

Typical use cases include search engines, log and infrastructure metric analysis, enterprise search, e‑commerce, and many more.

2. Popular Elasticsearch Features

2.1 Lifecycle‑Based Hot‑Warm‑Cold Separation

Index Lifecycle Management (ILM) was introduced in Elasticsearch 6.6 (beta) and officially released in 6.7. ILM helps manage indices through four phases:

Hot: Index is actively written and queried; placed on high‑performance nodes.

Warm: Index is read‑only but still queried; placed on moderate‑performance nodes.

Cold: Index is rarely queried; stored on nodes with large disk capacity and lower performance.

Delete: Index data is no longer needed and can be removed.

Test case: An ILM policy defining hot, cold, and delete phases.

Step 1: Set data‑node roles.

Step 2: Create an index template matching testlog-*.

Step 3: Define the testlog_policy lifecycle policy (e.g., move to cold after 30 days or 50 GB, delete after 90 days).

Step 4: Create the testlog-000001 index and apply the policy.
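The four steps can be sketched as request bodies, written here as Python dictionaries so they can be printed as JSON and pasted into Kibana Dev Tools. This is a minimal sketch, not the platform's actual configuration: the node names, the template name testlog_template, and the alias testlog are assumptions, and the policy is one reading of the example thresholds (roll over at 30 days or 50 GB, move to cold right after rollover, delete 90 days after rollover).

```python
import json

# Step 1 (assumption): each data node declares its tier in elasticsearch.yml
# through node.roles. The node names below are hypothetical.
node_roles = {
    "es-hot-1": ["data_hot", "data_content"],
    "es-cold-1": ["data_cold"],
}

# Step 2: body for PUT _index_template/testlog_template (template name assumed).
testlog_template = {
    "index_patterns": ["testlog-*"],
    "template": {
        "settings": {
            "index.lifecycle.name": "testlog_policy",
            "index.lifecycle.rollover_alias": "testlog",  # assumed alias
        }
    },
}

# Step 3: body for PUT _ilm/policy/testlog_policy.
# When rollover is used, min_age of later phases is counted from rollover time.
testlog_policy = {
    "policy": {
        "phases": {
            "hot": {
                "min_age": "0ms",
                "actions": {"rollover": {"max_age": "30d", "max_size": "50gb"}},
            },
            "cold": {
                "min_age": "0d",
                "actions": {"set_priority": {"priority": 0}},
            },
            "delete": {"min_age": "90d", "actions": {"delete": {}}},
        }
    }
}

# Step 4: body for PUT testlog-000001, the first index behind the write alias.
bootstrap_index = {"aliases": {"testlog": {"is_write_index": True}}}


def as_request(method, path, body):
    """Render a request in the METHOD path + JSON body form used by Dev Tools."""
    return f"{method} {path}\n{json.dumps(body, indent=2)}"


requests = [
    as_request("PUT", "_index_template/testlog_template", testlog_template),
    as_request("PUT", "_ilm/policy/testlog_policy", testlog_policy),
    as_request("PUT", "testlog-000001", bootstrap_index),
]
for request in requests:
    print(request, end="\n\n")
```

The policy name is attached through the template rather than on the index itself, so every index later created by rollover (testlog-000002 and so on) picks up the same lifecycle.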

2.2 Vector Search

Vector search uses machine learning to embed unstructured data (text, images) into numeric vectors, enabling semantic search via Approximate Nearest Neighbor (ANN) algorithms. Compared with keyword search, it can return more relevant results when a query's wording differs from the documents, and ANN keeps lookups fast by avoiding an exhaustive comparison against every stored vector.

How it works: Traditional search relies on term frequency and lexical similarity, while vector search computes distances in an embedding space to find the nearest neighbors of a query vector.

Convert raw entities (e.g., songs, images, text) into vector embeddings.

Use distance metrics to measure similarity between vectors.

Apply ANN algorithms to retrieve the most relevant documents.
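The three steps can be illustrated with a small, self-contained sketch. It uses hand-written three-dimensional vectors in place of real model embeddings and an exact brute-force scan in place of an ANN index, so it shows what ANN approximates rather than how Elasticsearch implements it. The document IDs and vectors are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, vectors, k):
    """Exact k-nearest-neighbor search: score every vector, keep the top k."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in vectors.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Step 1: entities already converted to (toy) embeddings.
embeddings = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

# Steps 2 and 3: measure similarity to the query vector and keep the closest two.
top = nearest_neighbors([1.0, 0.05, 0.0], embeddings, k=2)
print(top)  # ['doc-a', 'doc-b']
```

An exact scan like this costs one comparison per stored vector. ANN structures such as HNSW graphs visit only a fraction of the vectors, trading a small amount of recall for much lower latency on large indices.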

Test case: Create an index with a vector field, index test data, and run a knn query to retrieve the most similar documents.
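A sketch of what such a test case might look like, again as Python dictionaries that serialize to request bodies. The index name vector-test, the field names, and the three-dimensional vectors are assumptions chosen for readability; real embeddings typically have hundreds of dimensions. The top-level knn option in the search body assumes Elasticsearch 8.4 or later, since earlier 8.x releases exposed the same search through a separate _knn_search endpoint.

```python
import json

# Body for PUT vector-test (hypothetical index name).
vector_index = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "title_vector": {
                "type": "dense_vector",
                "dims": 3,
                "index": True,          # build an ANN structure for this field
                "similarity": "cosine",
            },
        }
    }
}

# Sample documents, each a body for POST vector-test/_doc.
documents = [
    {"title": "first test document", "title_vector": [1.0, 0.0, 0.0]},
    {"title": "second test document", "title_vector": [0.9, 0.1, 0.0]},
    {"title": "unrelated document", "title_vector": [0.0, 1.0, 0.0]},
]

# Body for POST vector-test/_search.
knn_search = {
    "knn": {
        "field": "title_vector",
        "query_vector": [1.0, 0.05, 0.0],
        "k": 2,                # number of nearest neighbors to return
        "num_candidates": 10,  # candidates examined per shard before picking k
    },
    "_source": ["title"],
}


def is_consistent(index_body, docs, search_body):
    """Check that vector lengths match the mapping and that k fits num_candidates."""
    field = search_body["knn"]["field"]
    dims = index_body["mappings"]["properties"][field]["dims"]
    vectors = [doc[field] for doc in docs] + [search_body["knn"]["query_vector"]]
    same_dims = all(len(vec) == dims for vec in vectors)
    return same_dims and search_body["knn"]["k"] <= search_body["knn"]["num_candidates"]


print(is_consistent(vector_index, documents, knn_search))
print(json.dumps(knn_search, indent=2))
```

Raising num_candidates generally improves recall at the cost of latency, which makes it the main knob to adjust when the returned neighbors look wrong.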

2.3 Using NLP in Elasticsearch

Natural Language Processing (NLP) is a branch of Artificial Intelligence that enables computers to understand and respond to human language. Since Elasticsearch 8.0, NLP tasks such as named‑entity recognition, sentiment analysis, and text classification can run natively inside the cluster, using trained models deployed to machine learning nodes, without external plugins.
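One common way to apply such a model is an ingest pipeline with an inference processor, so that every incoming document is enriched at index time. The sketch below assumes a named‑entity recognition model has already been imported and its deployment started. The model ID my-ner-model, the pipeline name, and the message field are hypothetical placeholders, not names used by the platform.

```python
import json

# Body for PUT _ingest/pipeline/ner-pipeline (pipeline name assumed).
ner_pipeline = {
    "description": "Run a deployed NER model on each incoming document",
    "processors": [
        {
            "inference": {
                "model_id": "my-ner-model",  # hypothetical; must already be deployed
                "target_field": "ml.ner",    # where the model output is stored
                # Map the document field to the input name the model expects.
                "field_map": {"message": "text_field"},
            }
        }
    ],
}

# Body for POST _ingest/pipeline/ner-pipeline/_simulate, to try the pipeline
# on a sample document without indexing anything.
simulate_request = {
    "docs": [
        {"_source": {"message": "Elasticsearch is developed by Elastic."}}
    ]
}

print(json.dumps(ner_pipeline, indent=2))
print(json.dumps(simulate_request, indent=2))
```

Running the model at ingest time means search requests can filter or aggregate on the extracted entities like any other field, at the cost of slower indexing.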

Test case: Build an index with a custom analyzer and NLP filters, ingest sample documents, and perform a search that includes NLP analysis.

Note: For advanced NLP tasks, integration with external frameworks like spaCy or NLTK may be required.

3. Elasticsearch on the 360 Zhihui Cloud Platform

Platform URL: https://hulk.qihoo.net/user/es/online/list

3.1 New Version Support

The platform now offers Elasticsearch 8.x, which improves vector search, adds native support for modern NLP models, and simplifies data ingestion.

Benchmarking against version 7.15.2 shows storage and memory improvements, as well as 7‑16% lower 99th‑percentile latency for default, term, and phrase queries.

3.2 New Visualization Packages

The platform adds entry points for cluster capacity usage, slow logs, and monitoring to make clusters easier to manage.

3.3 Rich Monitoring Dashboards

Integrated cluster‑level dashboards allow multi‑dimensional monitoring of key metrics.

3.4 Slow‑Log Display

Users can view slow‑log entries directly from the platform.
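What shows up in a slow log depends on the thresholds set on each index. The sketch below shows the standard Elasticsearch index settings that control them; the threshold values and the index name testlog-000001 are illustrative assumptions, and the platform may manage these settings on the user's behalf.

```python
import json

# Body for PUT testlog-000001/_settings (index name and values are examples).
slowlog_settings = {
    # Search slow log: query phase and fetch phase are tracked separately.
    "index.search.slowlog.threshold.query.warn": "10s",
    "index.search.slowlog.threshold.query.info": "5s",
    "index.search.slowlog.threshold.fetch.warn": "1s",
    # Indexing slow log: time spent indexing a single document.
    "index.indexing.slowlog.threshold.index.warn": "10s",
}

print(json.dumps(slowlog_settings, indent=2))
```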

3.5 Custom Alert Rules

New alert rules let users subscribe to notifications for important cluster health metrics.

3.6 Enhanced Kibana Visual Controls

Kibana 8.x provides vector map and basemap support, field statistics, and various visualization types such as mosaic and waffle charts.

4. Future Outlook

4.1 High‑Capacity Low‑Cost Data Nodes

The platform plans to combine Polefs storage with ILM to offer low‑cost nodes for warm and cold data.

4.2 Backup and Restore

Future integration with S3‑based snapshot services will enable backup and restore capabilities.

4.3 Differentiated Initialization Scenarios

The platform will provide preset configurations for write‑heavy, read‑light workloads, time‑series data, log storage, and high‑performance search use cases.
