Master ELK: Build a Scalable Log Management System with Elasticsearch, Logstash, Kibana
This guide introduces the ELK stack (Elasticsearch, Logstash, and Kibana, plus Filebeat), explains why centralized log management matters, compares the architecture options, and gives step-by-step installation and configuration instructions, including a Kafka-backed pipeline, for deploying a production-grade logging solution.
1.1 ELK Overview
ELK is the acronym for the three open‑source frameworks Elasticsearch, Logstash, and Kibana (together often called the Elastic Stack). Filebeat, a lightweight Beats component, can replace Logstash for data collection.
Filebeat forwards and centralizes log data. It monitors specified log files, reads new entries, and ships events to Elasticsearch or Logstash for indexing.
Logstash is a free, open‑source server‑side data‑processing pipeline that can ingest data from multiple sources, transform it, and forward it to your chosen storage.
Elasticsearch is the distributed search and analytics engine at the core of the Elastic Stack. Built on Lucene, it provides near‑real‑time search and analysis for structured, unstructured, numeric, and geospatial data.
Kibana is an open‑source analytics and visualization platform for Elasticsearch. It offers dashboards, charts, and a web UI for exploring and visualizing indexed data.
1.2 Why Use ELK
Logs (system, application, security) give operators insight into server health, configuration errors, and performance. Centralized log management simplifies collection, storage, and analysis across dozens or hundreds of machines, improving troubleshooting efficiency.
1.3 Core Features of a Complete Log System
Collection: gather logs from diverse sources.
Transport: reliably parse, filter, and forward logs to storage.
Storage: persist log data.
Analysis: provide UI‑based analytics.
Alerting: generate error reports and monitoring alerts.
2 ELK Architecture Analysis
2.1 Beats + Elasticsearch + Kibana (Simple)
This basic stack uses Beats (typically Filebeat) to ship logs, Elasticsearch to store and search them, and Kibana to visualize them. It suits simple log data and testing; production environments should add Logstash.
2.2 Beats + Logstash + Elasticsearch + Kibana
Adding Logstash brings:
Disk‑based adaptive buffering to absorb bursts.
Ability to ingest from databases, S3, message queues, etc.
Multi‑destination output (e.g., S3, HDFS, files).
Conditional pipeline logic for complex processing.
Combining Filebeat with Logstash adds horizontal scalability, high availability, at-least-once delivery, and secured end-to-end transport: TLS for encryption, with basic auth, LDAP, and similar mechanisms for authentication.
2.3 Beats + Cache/MQ + Logstash + Elasticsearch + Kibana
Introducing a middleware layer (Redis, Kafka, RabbitMQ) between Beats and Logstash reduces load on log‑generating servers, buffers data to protect Elasticsearch from write spikes, and centralizes formatting and processing.
3 ELK Deployment
3.1 Installing Filebeat
3.1.1 Principle
Filebeat starts one or more inputs that watch specified log locations. For each discovered log, a harvester reads new lines and forwards events to libbeat, which then ships them to the configured output.
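As a rough illustration of the harvester idea (not Filebeat's actual implementation), the shell sketch below remembers a byte offset for a file and emits only what was appended since the previous call. The function name read_new_lines and the offset-file layout are hypothetical; Filebeat keeps comparable per-file state in its registry.

```shell
#!/bin/sh
# Toy harvester: print only the bytes appended to a log since the last call.
# read_new_lines and the offset file are illustrative names, not Filebeat APIs.
# usage: read_new_lines <log_file> <state_file>
read_new_lines() {
    log_file=$1
    state_file=$2
    offset=0
    # Load the offset saved by the previous call, if there is one.
    if [ -f "$state_file" ]; then
        offset=$(cat "$state_file")
    fi
    size=$(wc -c < "$log_file" | tr -d ' ')
    # A file smaller than the saved offset was truncated or rotated: start over.
    if [ "$size" -lt "$offset" ]; then
        offset=0
    fi
    # tail -c +N starts at byte N (1-based), so this skips "offset" bytes.
    tail -c "+$((offset + 1))" "$log_file"
    # Remember how far we have read for the next call.
    echo "$size" > "$state_file"
}
```

Calling the function twice on the same file prints the existing content the first time and only newly appended lines afterwards. A real harvester also has to handle writes that arrive between the size check and the read, which this sketch ignores.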
3.1.2 Simple Installation
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.7.0-linux-x86_64.tar.gz
tar -xzvf filebeat-7.7.0-linux-x86_64.tar.gz
The extracted directory includes filebeat.reference.yml, a configuration example that contains all non-deprecated options. Put your own settings in filebeat.yml and start Filebeat with ./filebeat -e.
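Before wiring Filebeat to a real output, a console-only configuration can serve as a smoke test. The sketch below writes such a file; the helper name write_smoke_config and the log path /var/log/app.log are placeholders chosen for illustration.

```shell
#!/bin/sh
# Write a minimal Filebeat config that prints events to the console.
# usage: write_smoke_config <target_file>
write_smoke_config() {
    cat > "$1" <<'EOF'
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /var/log/app.log   # hypothetical path, point it at a real file

# Print each event as formatted JSON instead of shipping it anywhere.
output.console:
  pretty: true
EOF
}

# Usage, from the extracted Filebeat directory:
#   write_smoke_config filebeat.yml
#   ./filebeat test config -c filebeat.yml   # validate the file
#   ./filebeat -e -c filebeat.yml            # run in the foreground, log to stderr
```

Once events show up on the console, the output section can be swapped for Logstash, Elasticsearch, or Kafka without touching the input section.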
3.2 Installing Logstash
3.2.1 Basic Principle
Logstash pipelines consist of mandatory inputs, optional filters, and mandatory outputs. Each input runs in its own thread, feeding events into an internal queue; filters process the events; outputs write them to the destination.
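To make the three stages concrete, here is a minimal sketch of a pipeline file with one plugin per stage. The helper name write_demo_pipeline, the file name demo.conf, and the added field are assumptions for illustration; the plugins themselves (stdin, mutate, stdout) ship with Logstash.

```shell
#!/bin/sh
# Write a three-stage Logstash pipeline: stdin -> mutate -> stdout.
# usage: write_demo_pipeline <target_file>
write_demo_pipeline() {
    cat > "$1" <<'EOF'
input {
  stdin { }                                  # each input runs in its own thread
}
filter {
  mutate {
    add_field => { "pipeline" => "demo" }    # illustrative field name and value
  }
}
output {
  stdout { codec => rubydebug }              # print the full event structure
}
EOF
}

# Usage, once Logstash is extracted, from its directory:
#   write_demo_pipeline demo.conf
#   ./bin/logstash -f demo.conf --config.test_and_exit   # syntax check only
#   echo "hello" | ./bin/logstash -f demo.conf
```

Removing the filter block still leaves a valid pipeline, which matches the statement that only inputs and outputs are mandatory.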
3.2.2 Simple Installation
Download Logstash from the Logstash download page (or the Chinese mirror). Make sure a Java runtime is available: Logstash 7.7.0 needs Java 8 or Java 11, and a bundled JDK ships only from Logstash 7.10 onward.
tar -zxvf logstash-7.7.0.tar.gz
Test with a HelloWorld pipeline:
./bin/logstash -e 'input { stdin { } } output { stdout {} }'
3.3 Installing Elasticsearch
3.3.1 Overview
Elasticsearch is a distributed, RESTful search and analytics engine built on Lucene. It supports horizontal scaling, full‑text search, near‑real‑time analytics, high availability, dynamic mapping, and a JSON‑over‑HTTP API.
3.3.2 Linux System Settings
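Before changing anything, it can help to see which limits already meet the values used in this section. The following is a small sketch of such a check; the function names check_min and preflight are made up for this example, and reading /proc/sys/vm/max_map_count assumes a Linux host.

```shell
#!/bin/sh
# Compare a current limit with the minimum this section configures.
# usage: check_min <name> <current> <required>
check_min() {
    case $2 in
        unlimited) echo "OK   $1=unlimited"; return 0 ;;
    esac
    if [ "$2" -ge "$3" ]; then
        echo "OK   $1=$2 (>= $3)"
    else
        echo "LOW  $1=$2 (< $3)"
        return 1
    fi
}

# Check the two settings Elasticsearch is strict about.
preflight() {
    status=0
    check_min nofile "$(ulimit -n)" 65535 || status=1
    check_min vm.max_map_count "$(cat /proc/sys/vm/max_map_count)" 262144 || status=1
    return "$status"
}

# Usage: run as the user that will start Elasticsearch.
#   preflight
```

Note that ulimit -n and sysctl -w only last for the current session or until reboot; adding the sysctl values to /etc/sysctl.conf keeps them across restarts.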
ulimit -n 65535 # temporary file descriptor limit
echo "* soft nofile 65535" >> /etc/security/limits.conf # permanent
sysctl -w vm.max_map_count=262144 # required for ES
sysctl -w vm.swappiness=1 # optional swap tuning
3.3.3 Elasticsearch Installation
groupadd elastic
useradd elk -d /data/hd05/elk -g elastic
echo "2edseoir@" | passwd elk --stdin
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.7.0-linux-x86_64.tar.gz
tar -zxvf elasticsearch-7.7.0-linux-x86_64.tar.gz
ln -s elasticsearch-7.7.0 es
Configure elasticsearch.yml (cluster name, node roles, paths, network host, discovery hosts, security settings, etc.). Start with ./bin/elasticsearch -d (daemon) or without -d for foreground.
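The settings listed above could look roughly like the following for one node of the three-host cluster used later in this guide. This is a sketch under assumptions: the cluster name, node names (es-130 and so on), and data/log paths are invented for illustration, and the helper write_es_config is hypothetical.

```shell
#!/bin/sh
# Write a starting-point elasticsearch.yml for one node of a three-node cluster.
# usage: write_es_config <target_file> <node_ip> <node_name>
write_es_config() {
    cat > "$1" <<EOF
cluster.name: elk-demo                 # hypothetical cluster name
node.name: $3
node.master: true                      # master-eligible (7.x role setting)
node.data: true                        # holds index data
path.data: /data/hd05/elk/es-data      # hypothetical path
path.logs: /data/hd05/elk/es-logs      # hypothetical path
network.host: $2
http.port: 9200
discovery.seed_hosts: ["192.168.110.130", "192.168.110.131", "192.168.110.132"]
cluster.initial_master_nodes: ["es-130", "es-131", "es-132"]
xpack.security.enabled: true
# Transport TLS also needs certificate settings, omitted in this sketch.
xpack.security.transport.ssl.enabled: true
EOF
}

# Usage, from the Elasticsearch directory on host 192.168.110.130:
#   write_es_config config/elasticsearch.yml 192.168.110.130 es-130
```

The names in cluster.initial_master_nodes must match each node's node.name, and that setting is only needed the first time the cluster is bootstrapped.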
3.3.4 Setting Up Passwords
./bin/elasticsearch-setup-passwords interactive
Follow the prompts to set passwords for the built-in users (elastic, kibana, logstash_system, etc.). The tool requires a running node with xpack.security.enabled: true in elasticsearch.yml.
3.4 Installing Kibana
Download from the Elastic website, extract, and edit kibana.yml:
server.port: 5601
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://192.168.110.130:9200", "http://192.168.110.131:9200", "http://192.168.110.132:9200"]
elasticsearch.username: "elastic"
elasticsearch.password: "password"
Start with ./bin/kibana and access http://192.168.110.130:5601/ using the credentials set earlier.
4 Example Pipeline
We build a pipeline: Beats → Kafka (as buffer) → Logstash → Elasticsearch → Kibana.
4.1 Filebeat Configuration (Kafka output)
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /data/elk/logstash-tutorial.log
output.kafka:
  hosts: ["192.168.110.130:9092"]
  topic: 'filebeat_test'
  compression: gzip
  required_acks: 1
Start Filebeat in the background:
cd filebeat-7.7.0-linux-x86_64 && nohup ./filebeat -e &
4.2 Logstash Configuration (Apache log parsing)
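Filebeat's Kafka output publishes each event as a JSON document with the raw log line in its message field, which is why the filter stage in this section decodes JSON first and then applies grok to message. As a rough stand-in for what %{COMBINEDAPACHELOG} extracts, the sketch below pulls a subset of the same fields out of one access-log line with sed. The function name parse_apache_line and the sample line are invented for illustration; the real grok pattern is more thorough.

```shell
#!/bin/sh
# Approximate a few COMBINEDAPACHELOG captures with a POSIX extended regex.
# Capture groups: 1=clientip 2=timestamp 3=verb 4=request 5=httpversion
#                 6=response 7=bytes
# usage: parse_apache_line '<one access-log line>'
parse_apache_line() {
    printf '%s\n' "$1" | sed -E -n \
        's/^([^ ]+) [^ ]+ [^ ]+ \[([^]]+)\] "([A-Z]+) ([^ ]+) HTTP\/([0-9.]+)" ([0-9]{3}) ([0-9]+|-) .*$/clientip=\1 verb=\3 request=\4 response=\6 bytes=\7/p'
}

# Usage:
#   parse_apache_line '192.0.2.10 - - [10/Oct/2020:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "http://example.com/" "curl/7.68.0"'
```

Lines that do not match print nothing here, whereas grok tags unmatched events with _grokparsefailure so they can be inspected later.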
input {
  kafka {
    bootstrap_servers => "192.168.110.130:9092"
    topics => ["filebeat_test"]
    group_id => "test123"
    auto_offset_reset => "earliest"
  }
}
filter {
  json { source => "message" }
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
    remove_field => ["message"]
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["192.168.110.130:9200", "192.168.110.131:9200", "192.168.110.132:9200"]
    index => "test_kafka"
    user => "elastic"
    password => "${ES_PWD}"
  }
}
Run Logstash (${ES_PWD} is resolved from the environment or the Logstash keystore, so set it first):
cd logstash-7.7.0 && nohup ./bin/logstash -f conf.d/apache.conf &
4.3 Verify in Elasticsearch and Kibana
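Query the Elasticsearch API, for example curl -u elastic 'http://192.168.110.130:9200/_cat/indices?v', to confirm that the test_kafka index exists. Quote the URL so the shell does not interpret the ?, and pass credentials because security is enabled. Then create an index pattern for test_kafka in Kibana and explore the data in Discover or on dashboards.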
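A few more read-only requests can confirm that documents are actually arriving. The helpers below are a sketch: es_url and es_get are made-up names, and the default host and user simply reuse the values from this guide.

```shell
#!/bin/sh
# Defaults taken from this guide; override through the environment if needed.
ES_HOST=${ES_HOST:-192.168.110.130:9200}
ES_USER=${ES_USER:-elastic}

# Build the full URL for an API path such as "test_kafka/_count?pretty".
es_url() {
    printf 'http://%s/%s' "$ES_HOST" "$1"
}

# Send an authenticated GET; with only a user name, curl prompts for the password.
es_get() {
    curl -s -u "$ES_USER" "$(es_url "$1")"
}

# Usage:
#   es_get '_cat/indices/test_kafka?v'          # index exists, with size and doc count
#   es_get 'test_kafka/_count?pretty'           # number of indexed events
#   es_get 'test_kafka/_search?size=1&pretty'   # one sample document with parsed fields
```

If the count stays at zero, the stdout output of the Logstash pipeline is the next place to look, since it prints every event that reaches the output stage.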
Original source: https://www.cnblogs.com/zsql/p/13164414.html
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
