Elasticsearch Overview: Lifecycle Management, Vector Search, NLP, and Deployment on the 360 Zhihui Cloud Platform
This article introduces Elasticsearch, explains its hot‑warm‑cold lifecycle management, demonstrates vector search and built‑in NLP capabilities, and describes how the 360 Zhihui Cloud Platform integrates these features with practical test cases and new visualization tools.
1. Elasticsearch Introduction
Elasticsearch is a distributed search and analytics engine designed to handle a wide range of application scenarios. As a core component of the Elastic Stack (Elasticsearch, Kibana, Logstash, Beats), it provides centralized data storage, fast search, relevance tuning, powerful analytics, and easy horizontal scaling.
Typical use cases include search engines, log and infrastructure metric analysis, enterprise search, e‑commerce, and many more.
2. Popular Elasticsearch Features
2.1 Lifecycle‑Based Hot‑Warm‑Cold Separation
Index Lifecycle Management (ILM) was introduced as a beta feature in Elasticsearch 6.6 and became generally available in 6.7. ILM moves an index through a sequence of phases as it ages; the four covered in this article are:
Hot : Index is actively written and queried; placed on high‑performance nodes.
Warm : Index is read‑only but still queried; placed on moderate‑performance nodes.
Cold : Index is rarely queried; stored on nodes with large disk capacity and lower performance.
Delete : Index data is no longer needed and can be removed.
Test case : An ILM policy defining hot, cold, and delete phases.
Step 1 : Set data‑node roles.
Step 2 : Create an index template matching testlog-* .
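Step 3 : Define the testlog_policy lifecycle policy (e.g., roll over the write index at 30 days or 50 GB, move rolled-over indices to cold after 30 days, and delete them after 90 days). Size thresholds apply to rollover in the hot phase; transitions between phases are driven by index age.
Step 4 : Create the testlog-000001 index and apply the policy.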
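The steps above can be sketched as request bodies built in Python. This is a minimal illustration, not the platform's actual configuration: the template name testlog_template and the write alias testlog are assumptions, and the cold phase assumes nodes were given data-tier roles in Step 1 so that ILM can migrate indices between tiers without an explicit allocate action. The thresholds mirror the example values in Step 3.

```python
import json


def build_ilm_policy(rollover_age="30d", rollover_size="50gb",
                     cold_after="30d", delete_after="90d"):
    """Body for PUT _ilm/policy/testlog_policy (hot, cold, delete phases)."""
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        # Roll over when either threshold is reached.
                        "rollover": {"max_age": rollover_age,
                                     "max_size": rollover_size}
                    }
                },
                "cold": {
                    # min_age is measured from the rollover time.
                    "min_age": cold_after,
                    "actions": {"set_priority": {"priority": 0}},
                },
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


def build_index_template(policy_name="testlog_policy", alias="testlog"):
    """Body for PUT _index_template/testlog_template (name is an assumption)."""
    return {
        "index_patterns": ["testlog-*"],
        "template": {
            "settings": {
                "index.lifecycle.name": policy_name,
                "index.lifecycle.rollover_alias": alias,
            }
        },
    }


def build_bootstrap_index(alias="testlog"):
    """Body for PUT testlog-000001: the first index behind the write alias."""
    return {"aliases": {alias: {"is_write_index": True}}}


policy = build_ilm_policy()
template = build_index_template()
bootstrap = build_bootstrap_index()

for name, body in [("policy", policy), ("template", template),
                   ("bootstrap", bootstrap)]:
    print(name, json.dumps(body, indent=2))
```

Each dictionary can be sent with any HTTP client or pasted into the Kibana Dev Tools console after the corresponding PUT line.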
2.2 Vector Search
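Vector search uses machine learning models to embed unstructured data (text, images) into numeric vectors, enabling semantic search via Approximate Nearest Neighbor (ANN) algorithms. Compared with keyword search, it can return more relevant results when the query and the documents share meaning but not exact terms, and ANN keeps lookups fast on large collections by trading a small amount of accuracy for speed.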
How it works : Traditional search relies on term frequency and lexical similarity, while vector search computes distances in an embedding space to find the nearest neighbors of a query vector.
Convert raw entities (e.g., songs, images, text) into vector embeddings.
Use distance metrics to measure similarity between vectors.
Apply ANN algorithms to retrieve the most relevant documents.
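To make the three steps concrete, here is a small pure-Python sketch. It uses hand-written three-dimensional vectors in place of real embeddings, cosine similarity as the distance metric, and an exact brute-force scan rather than an ANN index, so it shows the idea only, not how Elasticsearch implements it. The document IDs and vectors are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, vectors, k=2):
    """Exact k-nearest-neighbor search by scoring every stored vector."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings"; a real system would obtain these from a model.
embeddings = {
    "song_a": [1.0, 0.0, 0.0],
    "song_b": [0.9, 0.1, 0.0],
    "song_c": [0.0, 1.0, 0.0],
}

top = nearest_neighbors([1.0, 0.0, 0.0], embeddings, k=2)
print([doc_id for doc_id, _ in top])  # prints ['song_a', 'song_b']
```

An ANN algorithm replaces the full scan in nearest_neighbors with an index structure that inspects only a subset of candidates, which is what makes the approach practical for millions of vectors.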
Test case : Create an index with a vector field, index test data, and run a knn query to retrieve the most similar documents.
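A sketch of what this test case might look like as request bodies, assuming an 8.x release in which the _search API accepts a top-level knn section. The field names title and embedding, the three-dimensional vectors, and the sample documents are illustrative assumptions, not the data used in the original test.

```python
import json

DIMS = 3  # illustrative; real embeddings usually have hundreds of dimensions


def build_vector_mapping(dims=DIMS):
    """Body for PUT <index>: a text field plus an indexed dense_vector field."""
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "embedding": {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        }
    }


def build_knn_search(query_vector, k=2, num_candidates=10):
    """Body for POST <index>/_search using approximate kNN."""
    if len(query_vector) != DIMS:
        raise ValueError(f"query vector must have {DIMS} dimensions")
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
        },
        "_source": ["title"],
    }


# Hypothetical test documents to index before searching.
docs = [
    {"title": "first document", "embedding": [1.0, 0.0, 0.0]},
    {"title": "second document", "embedding": [0.9, 0.1, 0.0]},
    {"title": "third document", "embedding": [0.0, 1.0, 0.0]},
]

mapping = build_vector_mapping()
search = build_knn_search([1.0, 0.0, 0.0])
print(json.dumps(mapping, indent=2))
print(json.dumps(search, indent=2))
```

Here num_candidates controls how many candidates each shard considers before the top k are returned: larger values tend to improve recall at the cost of latency.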
2.3 Using NLP in Elasticsearch
Natural Language Processing (NLP) is a branch of Artificial Intelligence that enables computers to understand and respond to human language. Since Elasticsearch 8.0, trained NLP models can be deployed and run inside the cluster, so tasks such as named‑entity recognition, sentiment analysis, and text classification no longer require an external plugin or a separate inference service.
Test case : Build an index with a custom analyzer and NLP filters, ingest sample documents, and perform a search that includes NLP analysis.
Note: For advanced NLP tasks, integration with external frameworks like spaCy or NLTK may be required.
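The article does not list the analyzer or filters used in its test case, so the following is only a plausible sketch. It assumes a custom analyzer named review_analyzer built from the standard tokenizer and the lowercase, stop, and stemmer token filters, applied to a hypothetical content field. These are conventional text-analysis filters rather than model-based NLP, and the sample documents are invented.

```python
import json


def build_analyzer_index():
    """Body for PUT <index>: custom analyzer plus a text field that uses it."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "english_stop": {"type": "stop",
                                     "stopwords": "_english_"},
                    "english_stemmer": {"type": "stemmer",
                                        "language": "english"},
                },
                "analyzer": {
                    "review_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Filters run in order: lowercase, drop stop words, stem.
                        "filter": ["lowercase", "english_stop",
                                   "english_stemmer"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "content": {"type": "text", "analyzer": "review_analyzer"}
            }
        },
    }


def build_match_query(text):
    """Body for POST <index>/_search; the query text is analyzed the same way."""
    return {"query": {"match": {"content": text}}}


# Hypothetical sample documents.
samples = [
    {"content": "The cluster is running smoothly"},
    {"content": "Nodes were restarted after the upgrade"},
]

index_body = build_analyzer_index()
query_body = build_match_query("running clusters")
print(json.dumps(index_body, indent=2))
print(json.dumps(query_body, indent=2))
```

Because the same analyzer is applied at index time and at query time, a query for "running clusters" should be able to match a document containing "cluster is running" once both sides have been lowercased and stemmed.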
3. Elasticsearch on the 360 Zhihui Cloud Platform
Platform URL : https://hulk.qihoo.net/user/es/online/list
3.1 New Version Support
The platform now offers Elasticsearch 8.x, which improves vector search, adds native support for modern NLP models, and simplifies data ingestion.
Benchmarking against version 7.15.2 shows storage and memory improvements, as well as 7‑16% lower 99th‑percentile latency for default, term, and phrase queries.
3.2 New Visualization Packages
The platform adds entry points for cluster capacity usage, slow logs, and monitoring, making clusters easier to manage.
3.3 Rich Monitoring Dashboards
Integrated cluster‑level dashboards allow multi‑dimensional monitoring of key metrics.
3.4 Slow‑Log Display
Users can view slow‑log entries directly from the platform.
3.5 Custom Alert Rules
New alert rules let users subscribe to notifications for important cluster health metrics.
3.6 Enhanced Kibana Visual Controls
Kibana 8.x provides vector map and basemap support, field statistics, and various visualization types such as mosaic and waffle charts.
4. Future Outlook
4.1 High‑Capacity Low‑Cost Data Nodes
The platform plans to combine Polefs storage with ILM to offer inexpensive nodes for warm and cold data.
4.2 Backup and Restore
Future integration with S3‑based snapshot services will enable backup and restore capabilities.
4.3 Differentiated Initialization Scenarios
The platform will provide preset configurations for write‑heavy/read‑light workloads, time‑series data, log storage, and high‑performance search use cases.
360 Smart Cloud
Official service account of 360 Smart Cloud, dedicated to building a high-quality, secure, highly available, convenient, and stable one‑stop cloud service platform.