Step-by-Step Guide to Building an ELK Stack with Kafka, Zookeeper, Logstash, and Filebeat for Log Collection
This tutorial provides a comprehensive, step-by-step procedure for setting up a log‑collection pipeline using Filebeat, Kafka, Zookeeper, Logstash, Elasticsearch, and Kibana across multiple servers, covering hardware preparation, system tuning, software installation, configuration files, and verification commands.
Workflow: Filebeat collects log files and forwards them to a Kafka cluster; Logstash consumes the Kafka messages, formats them, and stores them in Elasticsearch; Kibana visualizes the logs.
Hardware Requirements: Four servers are used; each must have a JDK installed and its environment variables configured.
System Tuning:
Set the Java environment variables in /etc/profile (replace the placeholder with the JDK installation path):
    sudo vi /etc/profile
    export JAVA_HOME=<JDK installation path>
    export PATH=$JAVA_HOME/bin:$PATH
Raise the kernel limits in /etc/sysctl.conf:
    vim /etc/sysctl.conf
    fs.file-max=65536
    vm.max_map_count=262144
Raise the per-user file and process limits in /etc/security/limits.conf:
    vim /etc/security/limits.conf
    * soft nofile 65535
    * hard nofile 131072
    * soft nproc 2048
    * hard nproc 4096
Software Versions: (image omitted)
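The tuning values above only take effect after the kernel settings are reloaded and a new login session is opened, so it can help to confirm them before installing anything. The following is a minimal sketch, not part of the original procedure; `check_min` is a hypothetical helper name, and the thresholds are the ones listed in this section.

```shell
# check_min NAME ACTUAL REQUIRED
# Prints OK or LOW for a numeric setting and returns non-zero when ACTUAL < REQUIRED.
# ACTUAL must be a number; a value such as "unlimited" is not handled by this sketch.
check_min() {
    if [ "$2" -ge "$3" ]; then
        printf 'OK   %s=%s (required >= %s)\n' "$1" "$2" "$3"
    else
        printf 'LOW  %s=%s (required >= %s)\n' "$1" "$2" "$3"
        return 1
    fi
}

# Example usage on a tuned host (open a new login shell first so limits.conf applies):
#   sudo sysctl -p    # reload /etc/sysctl.conf
#   check_min vm.max_map_count "$(sysctl -n vm.max_map_count)" 262144
#   check_min fs.file-max      "$(sysctl -n fs.file-max)"      65536
#   check_min nofile           "$(ulimit -n)"                  65535
```

Elasticsearch in particular checks `vm.max_map_count` and the open-file limit at startup, so a LOW line here is worth fixing before moving on.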
Kafka & Zookeeper Installation
On servers 10.16.10.113, 10.16.10.114, and 10.16.8.187, install Kafka and disable the firewall:
    systemctl stop firewalld
    systemctl status firewalld
Zookeeper (using Kafka's bundled Zookeeper):
    vim config/zookeeper.properties
    clientPort=2181
    maxClientCnxns=100
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/usr/local/kafka/zookeeper/data
    dataLogDir=/usr/local/kafka/zookeeper/log
    server.1=10.16.10.113:12888:13888
    server.2=10.16.10.114:12888:13888
    server.3=10.16.8.187:12888:13888
Create a myid file in each server's dataDir containing that server's number (1, 2, or 3, matching its server.N entry).
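Because a wrong myid value stops a node from joining the ensemble, one option is to derive the number from the properties file instead of typing it by hand. This is a sketch under assumptions: `zk_myid` is a hypothetical helper, the `server.N=` lines are assumed to have no spaces around `=`, and the properties path in the usage comment assumes Kafka is installed under /usr/local/kafka.

```shell
# zk_myid PROPERTIES_FILE HOST_IP
# Prints N from the "server.N=HOST_IP:..." line of a zookeeper.properties file.
# Exits non-zero when HOST_IP is not listed.
zk_myid() {
    awk -F= -v ip="$2" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, addr, ":")            # addr[1] is the host part
            if (addr[1] == ip) {
                sub(/^server\./, "", $1)    # keep only the numeric id
                print $1
                found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Example usage on 10.16.10.113 (paths are assumptions, adjust to your layout):
#   props=/usr/local/kafka/config/zookeeper.properties
#   id=$(zk_myid "$props" 10.16.10.113) && echo "$id" > /usr/local/kafka/zookeeper/data/myid
```

Assigning the result to a variable first avoids writing an empty myid file when the host is missing from the configuration.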
Kafka Broker configuration:
    vim config/server.properties
    broker.id=1
    port=9092
    host.name=10.16.10.113
    num.network.threads=3
    num.io.threads=8
    socket.send.buffer.bytes=102400
    socket.receive.buffer.bytes=102400
    socket.request.max.bytes=104857600
    log.dirs=/usr/local/kafka-logs
    num.partitions=16
    num.recovery.threads.per.data.dir=1
    offsets.topic.replication.factor=1
    transaction.state.log.replication.factor=1
    transaction.state.log.min.isr=1
    log.retention.hours=168
    log.segment.bytes=1073741824
    log.retention.check.interval.ms=300000
    zookeeper.connect=10.16.10.113:2181,10.16.10.114:2181,10.16.8.187:2181
    zookeeper.connection.timeout.ms=6000
    group.initial.rebalance.delay.ms=0
Give each broker a unique broker.id and set host.name to that server's own address.
Start Zookeeper and Kafka (from the Kafka bin directory, Zookeeper first):
    nohup sh zookeeper-server-start.sh ../config/zookeeper.properties &
    nohup sh kafka-server-start.sh ../config/server.properties &
Create and test a topic:
    /usr/local/kafka/bin/kafka-topics.sh --create --zookeeper 10.16.10.113:2181,10.16.10.114:2181,10.16.8.187:2181 --replication-factor 1 --partitions 2 --topic testtopic
    /usr/local/kafka/bin/kafka-topics.sh --zookeeper 10.16.10.113:2181,10.16.10.114:2181,10.16.8.187:2181 --list
    /usr/local/kafka/bin/kafka-console-producer.sh --broker-list 10.16.10.113:9092 --topic testtopic
    /usr/local/kafka/bin/kafka-console-consumer.sh --bootstrap-server 10.16.10.113:9092 --from-beginning --topic testtopic
ELK Stack Installation
Elasticsearch on 10.16.10.113, 10.16.10.114, 10.16.3.165 (master node 10.16.3.165). Edit elasticsearch.yml:
    cluster.name: elkmaster
    node.name: 10.16.3.165
    node.master: true
    path.logs: /usr/local/data/log/
    network.host: 10.16.3.165
    http.port: 9200
    discovery.zen.ping.unicast.hosts: ["10.16.10.113","10.16.10.114"]
    cluster.initial_master_nodes: ["10.16.3.165"]
On the other nodes, set node.master: false and adjust node.name and network.host to the node's own address. Keep cluster.name identical on every node; nodes with different cluster names will not join the same cluster.
Kibana on 10.16.3.165. Edit kibana.yml:
    server.port: 5601
    server.host: "10.16.3.165"
    elasticsearch.hosts: ["http://10.16.3.165:9200"]
    i18n.locale: "zh-CN"
Start the services as a non-root user (Elasticsearch refuses to run as root). Either of the first two commands starts Elasticsearch; the second runs it as a daemon from the Elasticsearch installation directory:
    nohup sh elasticsearch &
    ./bin/elasticsearch -d
    nohup sh kibana &
Verify by accessing http://10.16.3.165:9200 (Elasticsearch) and http://10.16.3.165:5601 (Kibana).
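Opening port 9200 in a browser only shows that one node answers. To confirm that the three nodes actually formed a cluster, the `_cluster/health` and `_cat/nodes` endpoints can be queried from the command line. The sketch below assumes `curl` is installed; `es_status` is a hypothetical helper that pulls the status field out of the health response without needing a JSON parser.

```shell
# es_status
# Reads a _cluster/health JSON document on stdin and prints its "status" value
# (green, yellow, or red). Prints nothing if no status field is present.
es_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Example usage against the master node from this guide:
#   curl -s "http://10.16.3.165:9200/_cluster/health" | es_status
#   curl -s "http://10.16.3.165:9200/_cat/nodes?v"    # should list all three nodes
```

Green means all shards are allocated, yellow means every primary is allocated but some replicas are not, and red means at least one primary shard is unallocated.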
Filebeat Installation
On server 10.16.3.166, edit filebeat.yml to read logs and output to Kafka:
    filebeat.inputs:
    - type: log
      enabled: true
      paths:
        - /data/home/app/domains/cpay_domain/logs/cpay-tms-gate.log
    output.kafka:
      enabled: true
      hosts: ["10.16.8.187:9092"]
      topic: es-tmslogs
      compression: gzip
      max_message_bytes: 100000
Start Filebeat:
    ./filebeat -e -c filebeat.yml
Logstash Installation
On 10.16.3.165, create logstashfortms.conf:
    input {
      kafka {
        bootstrap_servers => "10.16.10.113:9092,10.16.10.114:9092,10.16.8.187:9092"
        topics => ["es-tmslogs"]
        codec => json
      }
    }
    output {
      elasticsearch {
        hosts => ["10.16.3.165:9200"]
        index => "logstash-%{+YYYY.MM.dd}"
      }
    }
Start Logstash, pointing -f at the file created above:
    nohup sh logstash -f ../config/logstashfortms.conf &
Kibana Page Operations
After Kibana is running, open http://10.16.3.165:5601, create an index pattern, and explore the visualized logs.
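If no documents appear in Kibana, it helps to push one recognizable line through the whole pipeline and look for it directly in Elasticsearch. The following is a rough sketch for a test environment only, since it appends a line to the application's real log file; `emit_marker` and `today_index` are hypothetical helper names, and the 30-second wait is an assumed allowance for Filebeat, Kafka, and Logstash to pass the event along.

```shell
# emit_marker LOG_FILE
# Appends one uniquely tagged line to LOG_FILE and prints the tag to search for.
# Runs in a subshell so the temporary variable does not leak into the caller.
emit_marker() (
    tag="pipeline-check-$(date -u +%Y%m%dT%H%M%SZ)-$$"
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$tag" >> "$1" &&
        printf '%s\n' "$tag"
)

# today_index
# Prints the index name that the "logstash-%{+YYYY.MM.dd}" setting resolves to today.
# Logstash expands the date from the event's @timestamp, which is kept in UTC.
today_index() {
    date -u +logstash-%Y.%m.%d
}

# Example usage (run emit_marker on the Filebeat host, 10.16.3.166):
#   tag=$(emit_marker /data/home/app/domains/cpay_domain/logs/cpay-tms-gate.log)
#   sleep 30
#   curl -s "http://10.16.3.165:9200/$(today_index)/_search?q=message:%22$tag%22&pretty"
```

A hit in the search response shows that every stage is working; in Kibana, an index pattern of `logstash-*` matches the daily indices this produces.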
Big Data Technology & Architecture
Wang Zhiwu, a big data expert, dedicated to sharing big data technology.
