Deep Inspection Checklist for Elasticsearch, Filebeat, Logstash, and Kibana
This article presents a comprehensive step‑by‑step guide for deeply inspecting the health, performance, and configuration of Elasticsearch, Filebeat, Logstash, and Kibana, including API calls, key metrics, DSL queries, Python automation scripts, and cron scheduling to ensure stable EFLK clusters.
Ensuring the stability of an Elasticsearch‑Filebeat‑Logstash‑Kibana (EFLK) stack is critical for both operations and development teams.
Elasticsearch Deep Inspection
1. Cluster health check
Use the GET _cluster/health API to retrieve overall health. Important fields include status (green/yellow/red), number_of_nodes, active_primary_shards, active_shards, and unassigned_shards.
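The fields above can be turned into a simple pass/fail check. The following is a minimal sketch, assuming the response of GET _cluster/health has already been parsed into a dict; the function name evaluate_cluster_health and the expected_nodes parameter are hypothetical helpers for illustration, not part of any Elasticsearch client.

```python
def evaluate_cluster_health(health, expected_nodes=None):
    """Return a list of findings for a parsed GET _cluster/health body."""
    findings = []
    status = health.get("status")
    if status != "green":
        findings.append(f"cluster status is {status}")
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        findings.append(f"{unassigned} unassigned shard(s)")
    nodes = health.get("number_of_nodes")
    # expected_nodes is an assumption: the node count you believe the cluster should have.
    if expected_nodes is not None and nodes != expected_nodes:
        findings.append(f"expected {expected_nodes} nodes, found {nodes}")
    return findings


# Hand-written sample shaped like a _cluster/health response (not real cluster output).
sample_health = {
    "status": "yellow",
    "number_of_nodes": 3,
    "active_primary_shards": 10,
    "active_shards": 18,
    "unassigned_shards": 2,
}

for finding in evaluate_cluster_health(sample_health, expected_nodes=3):
    print(finding)
```

An empty list means nothing in the checked fields needs attention; any other result is a prompt to look at shard allocation or node membership.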
2. Node performance monitoring
Query GET _nodes/stats and focus on indices.docs.count, indices.store.size_in_bytes, jvm.mem.heap_used_percent, os.cpu.percent, and fs.total.available_in_bytes.
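As an illustration of how these node metrics might be screened, here is a rough sketch that works on an already parsed GET _nodes/stats body. The function name flag_hot_nodes and the threshold defaults (85% heap, 90% CPU, 10 GiB free disk) are assumptions chosen for the example, not recommended values.

```python
GIB = 1024 ** 3


def flag_hot_nodes(node_stats, heap_limit=85, cpu_limit=90, min_free_gib=10.0):
    """Map node name -> list of threshold violations from a _nodes/stats body."""
    problems = {}
    for node in node_stats.get("nodes", {}).values():
        issues = []
        heap = node["jvm"]["mem"]["heap_used_percent"]
        cpu = node["os"]["cpu"]["percent"]
        free_gib = node["fs"]["total"]["available_in_bytes"] / GIB
        if heap > heap_limit:
            issues.append(f"heap {heap}% > {heap_limit}%")
        if cpu > cpu_limit:
            issues.append(f"cpu {cpu}% > {cpu_limit}%")
        if free_gib < min_free_gib:
            issues.append(f"free disk {free_gib:.1f} GiB < {min_free_gib:.1f} GiB")
        if issues:
            problems[node["name"]] = issues
    return problems


# Hand-written sample containing only the fields the function reads.
sample_stats = {
    "nodes": {
        "id-1": {
            "name": "node-1",
            "jvm": {"mem": {"heap_used_percent": 91}},
            "os": {"cpu": {"percent": 35}},
            "fs": {"total": {"available_in_bytes": 5 * GIB}},
        },
        "id-2": {
            "name": "node-2",
            "jvm": {"mem": {"heap_used_percent": 40}},
            "os": {"cpu": {"percent": 20}},
            "fs": {"total": {"available_in_bytes": 200 * GIB}},
        },
    }
}

print(flag_hot_nodes(sample_stats))
```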
3. Shard status monitoring
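Run
GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason
and verify that unassigned.reason is empty. If shards are unassigned, first diagnose the cause with GET _cluster/allocation/explain. As a last resort, when no good copy of a primary remains, force allocation of a stale copy. This can lose data, which is why accept_data_loss must be set explicitly: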
POST /_cluster/reroute
{
"commands": [
{
"allocate_stale_primary": {
"index": "your-index",
"shard": 0,
"node": "node-name",
"accept_data_loss": true
}
}
]
}
4. Index status inspection
Execute
GET _cat/indices?v&h=index,health,status,pri,rep,docs.count,store.size
and review fields such as health, status, pri, rep, docs.count, and store.size.
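When the same request is made with format=json, the cat API returns one JSON object per index, which is easier to check in code than the text table. The sketch below assumes those rows have already been parsed into a list of dicts; find_problem_indices is a hypothetical helper name.

```python
def find_problem_indices(rows):
    """Return (index, reason) pairs for rows from GET _cat/indices?format=json."""
    problems = []
    for row in rows:
        if row.get("status") == "close":
            problems.append((row["index"], "closed"))
        elif row.get("health") != "green":
            problems.append((row["index"], f"health={row.get('health')}"))
    return problems


# Hand-written sample rows; cat APIs return every value as a string.
sample_rows = [
    {"index": "filebeat-2024.01.01", "health": "green", "status": "open",
     "pri": "1", "rep": "1", "docs.count": "1200", "store.size": "2.1mb"},
    {"index": "filebeat-2024.01.02", "health": "yellow", "status": "open",
     "pri": "1", "rep": "1", "docs.count": "900", "store.size": "1.4mb"},
    {"index": "archive-2023", "health": "green", "status": "close",
     "pri": "1", "rep": "0", "docs.count": None, "store.size": None},
]

for index, reason in find_problem_indices(sample_rows):
    print(f"{index}: {reason}")
```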
5. Cluster performance analysis with profile queries
Add the profile: true parameter to a search request to obtain execution time per query phase, helping locate bottlenecks.
GET /your-index/_search?pretty
{
"profile": true,
"query": {
"match": {"field": "value"}
}
}
The response includes timing for each shard-level operation.
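Profile output is verbose, so it can help to reduce it to a short ranking. The following sketch assumes a parsed search response whose profile section follows the usual shards, searches, query layout, and it only looks at top-level query components (children are ignored). The helper name slowest_query_components is hypothetical.

```python
def slowest_query_components(response, top=3):
    """Rank top-level profiled query components by time, slowest first.

    Returns (shard_id, query_type, milliseconds) tuples.
    """
    rows = []
    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for component in search.get("query", []):
                millis = component["time_in_nanos"] / 1e6
                rows.append((shard["id"], component["type"], millis))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows[:top]


# Hand-written sample with only the fields the function reads.
sample_response = {
    "profile": {
        "shards": [
            {"id": "[n1][your-index][0]",
             "searches": [{"query": [{"type": "TermQuery", "time_in_nanos": 750000}]}]},
            {"id": "[n1][your-index][1]",
             "searches": [{"query": [{"type": "TermQuery", "time_in_nanos": 2500000}]}]},
        ]
    }
}

for shard_id, query_type, millis in slowest_query_components(sample_response):
    print(f"{shard_id} {query_type} {millis:.2f} ms")
```

A shard that is consistently slower than its peers for the same query often points at uneven data distribution or a busy node rather than at the query itself.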
Filebeat Inspection
1. Configuration check
Verify installation with systemctl status filebeat and inspect /etc/filebeat/filebeat.yml for correct input sources and output destinations.
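For unattended checks, the service status can be read from an exit code instead of from the human-oriented status output. This is a minimal sketch, assuming a systemd host; it relies on systemctl is-active --quiet returning exit status 0 only when the unit is active. The helper name service_is_active and its run parameter (there so the command runner can be replaced in tests) are hypothetical.

```python
import subprocess


def service_is_active(unit, run=subprocess.run):
    """Return True if `systemctl is-active --quiet <unit>` exits with status 0."""
    try:
        result = run(["systemctl", "is-active", "--quiet", unit], check=False)
    except FileNotFoundError:
        # systemctl is not installed on this host, so the unit cannot be active.
        return False
    return result.returncode == 0
```

Calling service_is_active("filebeat") covers this step; the same helper applies unchanged to the logstash and kibana units.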
2. Log inspection
Tail the log file (/var/log/filebeat/filebeat) and confirm it shows no errors such as connection failures or permission issues.
tail -f /var/log/filebeat/filebeat
3. Configuration testing
Run filebeat test config -e to validate the configuration; the -e flag sends log output to stderr instead of the log file.
filebeat test config -e
Logstash Inspection
1. Process check
Confirm Logstash is running via systemctl status logstash.
systemctl status logstash
2. Pipeline configuration check
Review files under /etc/logstash/conf.d/ and ensure correct input, filter, and output sections.
cat /etc/logstash/conf.d/*.conf
3. Log inspection
Tail /var/log/logstash/logstash-plain.log and look for connection failures or Grok parsing errors.
tail -f /var/log/logstash/logstash-plain.log
Kibana Inspection
1. Process status
Check Kibana service with systemctl status kibana.
systemctl status kibana
2. Configuration check
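Inspect config/kibana.yml (relative to the Kibana installation directory; package installs use /etc/kibana/kibana.yml) for correct elasticsearch.hosts and server.host settings.
cat config/kibana.yml
3. Log inspection
Tail logs/kibana.log (also relative to the Kibana installation directory) and verify that Kibana started successfully and can reach Elasticsearch.
tail -f logs/kibana.log
4. UI verification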
Discover: search and view log data.
Dashboards: ensure proper rendering.
Visualizations: confirm charts load correctly.
DSL Query Examples
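1. Slow query logs
This query assumes slow-log documents are indexed with a numeric took field; adjust the field name and index pattern to match your mapping.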
GET /_search
{
"query": {"range": {"@timestamp": {"gte": "now-1d/d", "lt": "now/d"}}},
"sort": [{"took": {"order": "desc"}}],
"size": 10
}
2. Error logs
GET /filebeat-*/_search
{
"query": {"match": {"message": "error"}}
}
3. Node-specific performance logs
GET /_search
{
"query": {"term": {"host.name": {"value": "node-1"}}}
}
Enterprise-Level Automation: Monitoring Elasticsearch with Python
1. Metric collection script
The script uses the official elasticsearch Python client to fetch cluster health and node statistics, then appends them to a log file as JSON.
import configparser
import json
from datetime import datetime

from elasticsearch import Elasticsearch

LOG_FILE = 'elasticsearch_metrics.log'
GIB = 1024 ** 3


def init_es_client(config_path='./conf/config.ini'):
    """Initialize and return an Elasticsearch client."""
    config = configparser.ConfigParser()
    config.read(config_path)
    es_host = config.get('elasticsearch', 'ES_HOST')
    es_user = config.get('elasticsearch', 'ES_USER')
    es_password = config.get('elasticsearch', 'ES_PASSWORD')
    return Elasticsearch(
        hosts=[es_host],
        basic_auth=(es_user, es_password),
        # Verify the server certificate against the cluster CA.
        verify_certs=True,
        ca_certs='conf/http_ca.crt',
    )


def get_cluster_health(es):
    """Return the parsed GET _cluster/health response."""
    return es.cluster.health().body


def get_node_stats(es):
    """Return the parsed GET _nodes/stats response."""
    return es.nodes.stats().body


def get_cluster_metrics(es):
    """Collect cluster health plus per-node CPU, memory, heap, and disk figures."""
    metrics = {'cluster_health': get_cluster_health(es), 'nodes': {}}
    nodes = get_node_stats(es).get('nodes', {})
    for node_info in nodes.values():
        fs_total = node_info['fs']['total']
        available = fs_total['available_in_bytes']
        total = fs_total['total_in_bytes']
        metrics['nodes'][node_info.get('name')] = {
            'cpu_usage': node_info['os']['cpu']['percent'],
            'load_average': node_info['os']['cpu'].get('load_average', {}).get('1m'),
            'memory_used': node_info['os']['mem']['used_percent'],
            'heap_used': node_info['jvm']['mem']['heap_used_percent'],
            # Disk sizes are reported in GiB.
            'disk_available': available / GIB,
            'disk_total': total / GIB,
            'disk_usage_percent': 100 - (available * 100 / total),
        }
    return metrics


def log_metrics(es):
    """Append a timestamped JSON snapshot of the metrics to LOG_FILE."""
    metrics = get_cluster_metrics(es)
    timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    with open(LOG_FILE, 'a') as f:
        f.write(f"Timestamp: {timestamp}\n")
        f.write(json.dumps(metrics, indent=4))
        f.write('\n')


if __name__ == "__main__":
    log_metrics(init_es_client())
    print("Elasticsearch cluster metrics logged successfully.")
2. Scheduling with cron
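Place the script at /home/user/scripts/es_metrics.py and add the following line to crontab -e to run it daily at 06:00. The cd is needed because the script reads conf/config.ini and writes its log file relative to the working directory (this assumes the conf directory sits beside the script):
0 6 * * * cd /home/user/scripts && /usr/bin/python3 es_metrics.py >> /home/user/scripts/es_metrics_cron.log 2>&1
Conclusion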
This deep-inspection checklist covers every EFLK component: cluster health, node performance, shard allocation, and index status for Elasticsearch, plus service health, configuration, and logs for Filebeat, Logstash, and Kibana. Regular manual checks, combined with automated Python-driven metrics collection scheduled through cron, help detect issues early and keep the stack running smoothly.
Mingyi World Elasticsearch
The leading WeChat public account for Elasticsearch fundamentals, advanced topics, and hands‑on practice. Join us to dive deep into the ELK Stack (Elasticsearch, Logstash, Kibana, Beats).
