Master ELK Stack: From Basics to Full‑Scale Log Management
This article introduces the ELK stack components, explains why centralized logging is essential, outlines core log‑system features, compares three ELK architectures, provides step‑by‑step installation and configuration for Filebeat, Logstash, Elasticsearch and Kibana, and demonstrates a complete pipeline using Kafka with code examples and diagrams.
ELK Overview
ELK is the abbreviation for Elasticsearch, Logstash, and Kibana, the three core open‑source components of the Elastic Stack. Filebeat, a lightweight shipper from the Beats family, can replace Logstash for simple data collection.
Filebeat forwards and centralizes log data by monitoring specified log files, reading new content, and sending events to Logstash or Elasticsearch.
Logstash is a free, open‑source server‑side data processing pipeline that can ingest data from multiple sources, transform it (e.g., using Grok), and output it to a chosen repository.
Elasticsearch is a distributed search and analytics engine built on Lucene, providing near‑real‑time search and analysis for structured, unstructured, numeric, and geospatial data.
Kibana is an open‑source analytics and visualization platform for Elasticsearch, offering dashboards, charts, and a web UI for exploring log data.
Why Use ELK
System, application, and security logs help operators understand hardware status, detect configuration errors, and monitor performance and security. Centralized log management becomes essential when dealing with dozens or hundreds of servers, as traditional tools like grep or awk become inefficient.
A distributed architecture with multiple services across servers benefits from a centralized logging system to quickly locate issues.
Basic Features of a Complete Log System
Collection: ability to gather logs from various sources.
Transport: stable parsing, filtering, and forwarding to storage.
Storage: persisting log data.
Analysis: UI‑based analysis support.
Alerting: error reporting and monitoring mechanisms.
ELK Architecture Analysis
Beats + Elasticsearch + Kibana
This simple entry-level architecture uses Filebeat (or other Beats) to ship logs directly to Elasticsearch, with Kibana for visualization. It suits small log volumes; production environments usually add Logstash.
Beats + Logstash + Elasticsearch + Kibana
Introducing Logstash adds disk‑based buffering, multi‑source ingestion, flexible output destinations (e.g., S3, HDFS), and conditional pipelines, improving scalability and reliability.
Filebeat + Logstash advantages:
Horizontal scalability, high availability, and load balancing.
Message durability with at‑least‑once delivery guarantees.
End‑to‑end encrypted transport with TLS and authentication.
Additional input methods such as TCP, UDP, and HTTP can be configured for Logstash.
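The at-least-once guarantee listed above can be illustrated with a small sketch. It is a toy model, not Filebeat's or Logstash's actual protocol: the names send_at_least_once and FlakyReceiver are hypothetical, and the lost acknowledgement is simulated. The point is that a sender which retries until it gets an acknowledgement never loses an event, but may deliver one twice.

```python
class FlakyReceiver:
    """Hypothetical receiver that stores an event but loses the first ACK for some events."""

    def __init__(self, lose_first_ack_for):
        self.received = []
        self._pending_loss = set(lose_first_ack_for)

    def deliver(self, event):
        self.received.append(event)        # the event is stored...
        if event in self._pending_loss:    # ...but its acknowledgement is lost once
            self._pending_loss.discard(event)
            return False
        return True


def send_at_least_once(events, deliver, max_attempts=5):
    """Resend each event until it is acknowledged or attempts run out."""
    for event in events:
        for _ in range(max_attempts):
            if deliver(event):
                break
        else:
            raise RuntimeError(f"no acknowledgement for {event!r}")


receiver = FlakyReceiver(lose_first_ack_for=["b"])
send_at_least_once(["a", "b", "c"], receiver.deliver)
print(receiver.received)  # ['a', 'b', 'b', 'c']
```

Nothing is lost, but "b" arrives twice. Downstream consumers therefore need to tolerate duplicates, for example by indexing with a deterministic document ID.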
Beats + Cache/Message Queue + Logstash + Elasticsearch + Kibana
Adding middleware such as Redis, Kafka, or RabbitMQ between Beats and Logstash reduces load on log-generating machines and buffers data to protect Elasticsearch from write spikes.
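The buffering effect can be sketched with simple arithmetic. This is a hypothetical simulation, not a model of any specific broker: simulate_buffer, the arrival counts, and the drain rate are all made-up illustration values. Events arrive in bursts, the queue holds the backlog, and the consumer forwards at most a fixed number per tick.

```python
def simulate_buffer(arrivals, drain_per_tick):
    """Return (events forwarded per tick, backlog left in the queue per tick)."""
    backlog = 0
    delivered, backlog_history = [], []
    for arrived in arrivals:
        backlog += arrived                     # producers append to the queue
        sent = min(backlog, drain_per_tick)    # consumer drains at a capped rate
        backlog -= sent
        delivered.append(sent)
        backlog_history.append(backlog)
    return delivered, backlog_history


delivered, backlog = simulate_buffer([10, 500, 20, 0, 0], drain_per_tick=100)
print(delivered)  # [10, 100, 100, 100, 100]
print(backlog)    # [0, 400, 320, 220, 120]
```

The burst of 500 events never reaches the downstream side as a single spike; it is spread over later ticks while the queue absorbs the backlog.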
ELK Deployment
Filebeat Installation
Principle
Filebeat starts one or more inputs, each scanning configured log paths. For each discovered file, a harvester reads new lines and forwards events to the configured output.
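The harvester idea can be sketched in a few lines of Python. This is a simplified stand-in, not Filebeat's implementation: the Harvester class and its in-memory offset are assumptions for illustration (Filebeat itself persists read state in a registry on disk). The sketch remembers how far it has read and returns only complete lines added since the previous call.

```python
import os
import tempfile


class Harvester:
    """Toy harvester: reads only complete lines appended since the last call."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte offset of the next unread line

    def read_new_lines(self):
        events = []
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            while True:
                line = f.readline()
                # Stop at EOF, or at a partial line that is still being written.
                if not line.endswith(b"\n"):
                    break
                self.offset = f.tell()
                text = line.rstrip(b"\r\n").decode("utf-8", "replace")
                events.append({"source": self.path, "message": text})
        return events


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.log")
        with open(path, "w") as f:
            f.write("first\nsecond\n")
        h = Harvester(path)
        batch1 = [e["message"] for e in h.read_new_lines()]
        with open(path, "a") as f:
            f.write("third\npartial")   # last line has no newline yet
        batch2 = [e["message"] for e in h.read_new_lines()]
        return batch1, batch2


print(demo())  # (['first', 'second'], ['third'])
```

The unfinished line "partial" is left for a later read, so an event is only emitted once the writer has completed the line.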
Simple Installation
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.7.0-linux-x86_64.tar.gz
tar -xzvf filebeat-7.7.0-linux-x86_64.tar.gz
The configuration file is filebeat.yml in the extracted directory. Start Filebeat in the foreground with ./filebeat -e, which writes its own log output to stderr.
Logstash Installation
Basic Principle
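Logstash pipelines consist of inputs → filters → outputs. Each input runs in its own thread and writes events to an internal queue (in memory by default, optionally persisted to disk). Pipeline worker threads take batches of events from that queue, run them through the filters, and send them to the outputs.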
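A toy model of the inputs → filters → outputs flow may help make the queueing concrete. It is a sketch under stated assumptions, not Logstash's implementation: run_pipeline, drop_debug, and add_level are hypothetical names, and the sketch uses one input thread plus one worker thread that applies the filters and writes to a list standing in for an output.

```python
import queue
import threading


def drop_debug(event):
    """Filter: drop DEBUG lines by returning None."""
    return None if event["message"].startswith("DEBUG") else event


def add_level(event):
    """Filter: copy the first word of the message into a 'level' field."""
    event["level"] = event["message"].split(" ", 1)[0]
    return event


def run_pipeline(lines, filters):
    q = queue.Queue(maxsize=8)   # bounded queue applies back-pressure to the input
    sink = []                    # stands in for an output such as Elasticsearch
    done = object()              # sentinel marking the end of input

    def input_stage():
        for line in lines:
            q.put({"message": line})
        q.put(done)

    def worker():
        while True:
            event = q.get()
            if event is done:
                break
            for f in filters:
                event = f(event)
                if event is None:    # a filter dropped the event
                    break
            if event is not None:
                sink.append(event)

    threads = [threading.Thread(target=input_stage), threading.Thread(target=worker)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sink


print(run_pipeline(["INFO a", "DEBUG b", "WARN c"], [drop_debug, add_level]))
# [{'message': 'INFO a', 'level': 'INFO'}, {'message': 'WARN c', 'level': 'WARN'}]
```

Because the queue is bounded, a slow worker makes the input block on put, which is the same back-pressure idea that keeps a pipeline from reading faster than it can write.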
Simple Installation
curl -L -O https://artifacts.elastic.co/downloads/logstash/logstash-7.7.0.tar.gz
tar -zxvf logstash-7.7.0.tar.gz
Example HelloWorld command: ./bin/logstash -e 'input { stdin { } } output { stdout {} }'.
Elasticsearch Installation
Basic Overview
Elasticsearch is a distributed document store and search engine built on Lucene, supporting near‑real‑time search, horizontal scaling, and RESTful APIs.
Linux System Settings
ulimit -n 65535
swapoff -a
sysctl -w vm.max_map_count=262144
# additional ulimit and sysctl tweaks...
Installation Steps
groupadd elastic
useradd elk -d /data/hd05/elk -g elastic
echo 'password' | passwd elk --stdin
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.7.0-linux-x86_64.tar.gz
tar -zxvf elasticsearch-7.7.0-linux-x86_64.tar.gz
ln -s elasticsearch-7.7.0 es
Key directories under $ES_HOME: bin, config, data, logs, jdk, plugins, lib, modules.
Configure JVM options in config/jvm.options and enable security, TLS, and memory lock in elasticsearch.yml. Start with ./bin/elasticsearch -d and set built‑in passwords via ./bin/elasticsearch-setup-passwords interactive.
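Once the node is running and the built-in passwords are set, a quick health check confirms that authentication works. The sketch below is one possible way to do that from Python's standard library. The helper names build_health_request and cluster_status are hypothetical, the host and password are placeholders, and port 9200 is assumed as the HTTP port. It queries the _cluster/health endpoint with HTTP basic authentication.

```python
import base64
import json
import urllib.request


def build_health_request(host, user, password, scheme="http", port=9200):
    """Build (but do not send) a GET request for the cluster health endpoint."""
    url = f"{scheme}://{host}:{port}/_cluster/health"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def cluster_status(host, user, password, **kwargs):
    """Send the request and return the 'status' field: green, yellow, or red."""
    req = build_health_request(host, user, password, **kwargs)
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)["status"]


if __name__ == "__main__":
    # Placeholder credentials; replace with the values chosen during setup.
    print(cluster_status("localhost", "elastic", "changeme"))
```

If TLS is enabled on the HTTP layer, pass scheme="https"; a self-signed certificate would additionally need its CA to be trusted by the client.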
Kibana Installation
wget https://artifacts.elastic.co/downloads/kibana/kibana-7.7.0-linux-x86_64.tar.gz
tar -zxvf kibana-7.7.0-linux-x86_64.tar.gz
Configure kibana.yml (server.port, server.host, elasticsearch.hosts, and the elasticsearch.username and elasticsearch.password credentials), then start with ./bin/kibana. Access via http://<host>:5601.
Instance Analysis
Example pipeline: Beats → Kafka → Logstash → Elasticsearch → Kibana.
Filebeat configuration (excerpt) sends logs to Kafka:
output.kafka:
  hosts: ["192.168.110.130:9092"]
  topic: "filebeat_test"
  compression: gzip
The Logstash pipeline (apache.conf) consumes from Kafka, parses Apache logs with the COMBINEDAPACHELOG pattern, and indexes into Elasticsearch:
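input {
  kafka {
    bootstrap_servers => "192.168.110.130:9092"
    topics => ["filebeat_test"]
    group_id => "test123"
    # Start from the oldest available offset when the group has no committed offset.
    auto_offset_reset => "earliest"
  }
}
filter {
  # Filebeat publishes each event as JSON; unpack it so "message" holds the raw log line.
  json { source => "message" }
  # Parse the Apache combined log line; "message" is removed only when the match succeeds.
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
    remove_field => ["message"]
  }
}
output {
  elasticsearch {
    hosts => ["192.168.110.130:9200","192.168.110.131:9200"]
    index => "test_kafka"
    user => "elastic"
    # Resolved from the ES_PWD environment variable (or the Logstash keystore).
    password => "${ES_PWD}"
  }
  # Also print each event to the console for debugging.
  stdout { codec => rubydebug }
}
After starting Filebeat and Logstash, the indexed data can be viewed in Kibana dashboards.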
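To see what the grok step in the pipeline above extracts, the combined log format can be approximated with a plain regular expression. This is a simplified sketch, not the COMBINEDAPACHELOG pattern itself: the regex is stricter than grok's, parse_combined is a hypothetical helper, and the group names follow grok's legacy field names (they differ when ECS compatibility is enabled). The sample line is the well-known example from the Apache documentation.

```python
import re

# Simplified approximation of the Apache combined log format.
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

SAMPLE = (
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
    '"GET /apache_pb.gif HTTP/1.0" 200 2326 '
    '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
)


def parse_combined(line):
    """Return a dict of named fields, or None when the line does not match."""
    m = COMBINED.match(line)
    return m.groupdict() if m else None


fields = parse_combined(SAMPLE)
print(fields["clientip"], fields["verb"], fields["request"], fields["response"])
# 127.0.0.1 GET /apache_pb.gif 200
```

Lines that do not match return None here; in Logstash, a failed grok match instead tags the event with _grokparsefailure and leaves the message field in place.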
MaGe Linux Operations
