
Mastering ELK on Kubernetes: Step‑by‑Step Helm3 Deployment and Log Management

This guide walks through the ELK stack components—Elasticsearch, Logstash, Kibana, and Filebeat—explains how to collect logs with Beats, integrate Kafka for buffering, and provides detailed Helm3 installation procedures for each service on Kubernetes, plus backup and restore strategies.

MaGe Linux Operations

Sep 17, 2022

Overview

ELK stands for Elasticsearch, Logstash, and Kibana, all of which are open source. Filebeat is a lightweight log-shipping agent that uses few resources. It typically forwards logs to Logstash; in this guide it ships them to Kafka first.

The typical workflow includes Elasticsearch for storage, Filebeat for log collection, Kafka for buffering, Logstash for filtering, and Kibana for visualization.

Elasticsearch Storage

Elasticsearch is a distributed search and analytics engine that stores, indexes, and analyzes data. Its features include near-zero configuration, automatic sharding, replica shards, and a RESTful API.
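
To make the sharding and replica settings concrete, here is a minimal Python sketch that builds the REST request for creating an index. The index name `app-logs`, the shard and replica counts, and the address `http://localhost:9200` are illustrative assumptions, and `build_create_index_request` is a hypothetical helper. Nothing is sent until `urlopen` is called.

```python
import json
from urllib import request


def build_create_index_request(base_url, index, shards=3, replicas=1):
    """Build (but do not send) a PUT request that creates an index."""
    body = {
        "settings": {
            "number_of_shards": shards,      # primary shards, set at creation
            "number_of_replicas": replicas,  # copies of each primary shard
        }
    }
    return request.Request(
        url=f"{base_url.rstrip('/')}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed endpoint and index name, for illustration only.
req = build_create_index_request("http://localhost:9200", "app-logs")
print(req.get_method(), req.full_url)
# To actually send it: request.urlopen(req)
```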

Filebeat Log Collection

Filebeat belongs to the Beats family, a set of lightweight data shippers. Compared with Logstash, Beats consume far less CPU and memory. The family includes:

Packetbeat: network traffic

Metricbeat: system metrics

Filebeat: file logs

Winlogbeat: Windows event logs

Auditbeat: audit data

Heartbeat: uptime monitoring

Kafka Integration

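Kafka sits between Filebeat and Logstash as a buffer, absorbing traffic spikes so that bursts of log events do not overwhelm Logstash. It is generally preferred over Redis in ELK pipelines because it persists messages to disk and lets consumers resume from a committed offset.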
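
The buffering effect can be illustrated with a toy simulation. This is a sketch of the general idea only, not of Kafka itself: an in-memory queue stands in for the topic, and the arrival counts and drain rate are made-up numbers.

```python
from collections import deque


def simulate_buffer(arrivals, drain_rate):
    """Return the queue depth after each tick.

    arrivals:   number of log events produced in each tick
    drain_rate: maximum events the consumer can process per tick
    """
    queue = deque()
    depths = []
    for tick, produced in enumerate(arrivals):
        queue.extend((tick, i) for i in range(produced))  # producer side
        for _ in range(min(drain_rate, len(queue))):      # consumer side
            queue.popleft()
        depths.append(len(queue))
    return depths


# A burst of 50 events in tick 1; the consumer handles at most 20 per tick.
print(simulate_buffer([10, 50, 10, 10, 0, 0], drain_rate=20))
# [0, 30, 20, 10, 0, 0]
```

The backlog peaks right after the burst and then drains at the consumer's own pace, which is the behavior the buffer is there to provide.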

Logstash Filtering

Logstash collects, parses, and filters logs. It supports a client‑server model in which the client runs on the hosts that generate logs and the server forwards the processed events to Elasticsearch. Its main strengths are:

Scalability

Elasticity

Filtering capabilities

Kibana Visualization

Kibana provides a web UI for searching, analyzing, and visualizing logs stored in Elasticsearch and processed by Logstash.

Helm3 Installation of ELK Components

1. Prerequisites

helm repo add elastic https://helm.elastic.co

2. Install Elasticsearch

# my-values.yaml
clusterName: "elasticsearch"
esConfig:
  elasticsearch.yml: |
    network.host: 0.0.0.0
    cluster.name: "elasticsearch"
    xpack.security.enabled: false
resources:
  requests:
    memory: 1Gi
volumeClaimTemplate:
  storageClassName: "bigdata-nfs-storage"
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 3Gi
service:
  type: NodePort
  # The HTTP port stays at 9200, the port Kibana and Logstash connect to below.
  nodePort: 31311

helm install es elastic/elasticsearch -f my-values.yaml --namespace bigdata
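
Once the release is up, a quick health check confirms that the cluster answers over HTTP. The sketch below is one way to do that from Python, assuming the NodePort 31311 from the values file is reachable on some node address. `fetch_health` and `cluster_is_usable` are illustrative helper names, not part of any Elastic tooling.

```python
import json
from urllib import request


def cluster_is_usable(health):
    """Treat green and yellow as usable; red means a primary shard is unassigned."""
    return health.get("status") in ("green", "yellow")


def fetch_health(base_url, timeout=5):
    """GET /_cluster/health and decode the JSON body."""
    url = f"{base_url.rstrip('/')}/_cluster/health"
    with request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Example call (the node address is a placeholder to fill in):
#   health = fetch_health("http://NODE_IP:31311")
#   print(health["status"], cluster_is_usable(health))
```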

3. Install Kibana

# my-values.yaml
kibanaConfig:
  kibana.yml: |
    server.port: 5601
    server.host: "0.0.0.0"
    elasticsearch.hosts: ["http://elasticsearch-master-headless.bigdata.svc.cluster.local:9200"]
resources:
  requests:
    cpu: "1000m"
    memory: "256Mi"
  limits:
    cpu: "1000m"
    memory: "1Gi"
service:
  type: NodePort
  port: 5601
  nodePort: "30026"

helm install kibana elastic/kibana -f my-values.yaml --namespace bigdata

4. Install Filebeat

# my-values.yaml
daemonset:
  filebeatConfig:
    filebeat.yml: |
      filebeat.inputs:
      - type: container
        paths:
        - /var/log/containers/*.log
      output.elasticsearch:
        enabled: false
      output.kafka:
        enabled: true
        hosts: ["kafka-headless.bigdata.svc.cluster.local:9092"]
        topic: test

helm install filebeat elastic/filebeat -f my-values.yaml --namespace bigdata
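
The `container` input reads the files under /var/log/containers/, whose names encode where each log line came from. As a rough illustration, assuming the usual kubelet naming scheme `<pod>_<namespace>_<container>-<container id>.log` (an assumption about the cluster, not something this guide configures), the parts can be recovered like this. `parse_container_log_path` is a hypothetical helper.

```python
import re

# Assumed kubelet naming: <pod>_<namespace>_<container>-<64-hex container id>.log
_CONTAINER_LOG = re.compile(
    r"^(?P<pod>[^_]+)_(?P<namespace>[^_]+)_(?P<container>.+)-(?P<id>[0-9a-f]{64})\.log$"
)


def parse_container_log_path(path):
    """Split a /var/log/containers/ file name into its Kubernetes parts.

    Returns a dict with pod, namespace, container, and id,
    or None when the name does not follow the assumed scheme.
    """
    name = path.rsplit("/", 1)[-1]
    match = _CONTAINER_LOG.match(name)
    return match.groupdict() if match else None


# Made-up pod name and container id, for illustration only.
example = "/var/log/containers/kibana-kibana-6f7d_bigdata_kibana-" + "0" * 64 + ".log"
print(parse_container_log_path(example))
```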

5. Install Logstash

# my-values.yaml
logstashConfig:
  logstash.yml: |
    xpack.monitoring.enabled: false
logstashPipeline:
  logstash.conf: |
    input {
      kafka {
        bootstrap_servers => "kafka-headless.bigdata.svc.cluster.local:9092"
        topics => ["test"]
        group_id => "mygroup"
        consumer_threads => 1
        decorate_events => true
        auto_offset_reset => "earliest"
      }
    }
    filter {
      mutate {
        split => ["[@metadata][kafka][key]", ","]
        add_field => {"index" => "%{[@metadata][kafka][key][0]}"}
      }
    }
    output {
      elasticsearch {
        hosts => ["elasticsearch-master-headless.bigdata.svc.cluster.local:9200"]
        index => "test-%{+YYYY.MM.dd}"
      }
    }
resources:
  requests:
    cpu: "100m"
    memory: "256Mi"
  limits:
    cpu: "1000m"
    memory: "1Gi"
volumeClaimTemplate:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 3Gi

helm install logstash elastic/logstash -f my-values.yaml --namespace bigdata
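
Two parts of the pipeline above are easy to misread: the `mutate` filter splits the Kafka message key on commas and copies its first element into an `index` field, while the output writes to one `test-` index per day. The Python sketch below mimics both steps so the resulting values are visible. The sample key `nginx,prod` is an assumption, and the date is computed in UTC because Logstash formats `%{+YYYY.MM.dd}` from the event's `@timestamp`.

```python
from datetime import datetime, timezone


def split_kafka_key(key):
    """Mimic `mutate { split => [..., ","] }` on the Kafka message key."""
    return key.split(",") if key else []


def daily_index(prefix, timestamp):
    """Mimic `index => "test-%{+YYYY.MM.dd}"`: one index per UTC day.

    `timestamp` is expected to be a timezone-aware datetime.
    """
    return f"{prefix}-{timestamp.astimezone(timezone.utc):%Y.%m.%d}"


event_time = datetime(2022, 9, 17, 23, 30, tzinfo=timezone.utc)
print(split_kafka_key("nginx,prod"))    # ['nginx', 'prod']
print(daily_index("test", event_time))  # test-2022.09.17
```

Note that the pipeline as written stores the first key element in the `index` field but does not use that field in the output, so every event lands in the `test-` series.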

ELK Backup and Restore

1. Elasticsearch Snapshot

# elasticsearch.yml
path.repo: ["/mount/backups", "/mount/longterm_backups"]

PUT /_snapshot/my_backup
{
  "type": "fs",
  "settings": {"location": "/mount/backups/my_backup"}
}

PUT /_snapshot/my_backup/snapshot_1?wait_for_completion=true

PUT /_snapshot/my_backup/snapshot_2?wait_for_completion=true

POST /_snapshot/my_backup/snapshot_1/_restore?wait_for_completion=true
{
  "indices": "index_1",
  "rename_pattern": "index_1",
  "rename_replacement": "restored_index_1"
}
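
During a restore, Elasticsearch renames indices by applying a regular expression (`rename_pattern`) together with `rename_replacement` to each restored index name. The sketch below approximates that substitution in Python so the outcome can be checked before running a restore. `restored_name` is a hypothetical helper, and the capture-group pattern `index_(.+)` is an illustrative value rather than one used in this guide.

```python
import re


def restored_name(index, rename_pattern, rename_replacement):
    """Approximate how a restore renames one index.

    Elasticsearch uses Java regex syntax, where groups are referenced as $1.
    Python's re module expects \\g<1>, so the reference style is translated.
    """
    py_replacement = re.sub(r"\$(\d+)", r"\\g<\1>", rename_replacement)
    return re.sub(rename_pattern, py_replacement, index)


print(restored_name("index_1", "index_(.+)", "restored_index_$1"))  # restored_index_1
print(restored_name("other", "index_(.+)", "restored_index_$1"))    # other
```

Indices that do not match the pattern keep their original names, so a restore can still collide with an existing open index of the same name.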

2. elasticdump

# Export mapping
elasticdump \
  --input=http://es_ip:9200/index_name/index_type \
  --output=/data/my_index_mapping.json \
  --type=mapping
# Export data
elasticdump \
  --input=http://es_ip:9200/index_name/index_type \
  --output=/data/my_index.json \
  --type=data

# Import mapping
elasticdump \
  --output=http://es_ip:9200/index_name \
  --input=/home/indexdata/roll_vote_mapping.json \
  --type=mapping
# Import data
elasticdump \
  --output=http://es_ip:9200/index_name \
  --input=/home/indexdata/roll_vote.json \
  --type=data
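
When several indices need exporting, the commands can be generated rather than typed by hand. The following is a small sketch, assuming `elasticdump` is on the PATH and using only the `--input`, `--output`, and `--type` options shown above. `elasticdump_cmd` and `export_index` are hypothetical helpers, and `es_ip`, `my_index`, and `/data` are placeholders.

```python
import shlex


def elasticdump_cmd(source, target, dump_type):
    """Build an elasticdump argument list for the two types used in this article."""
    if dump_type not in ("mapping", "data"):
        raise ValueError(f"unsupported type: {dump_type}")
    return [
        "elasticdump",
        f"--input={source}",
        f"--output={target}",
        f"--type={dump_type}",
    ]


def export_index(es_url, index, out_dir):
    """Return the export commands for one index: mapping first, then data."""
    src = f"{es_url.rstrip('/')}/{index}"
    return [
        elasticdump_cmd(src, f"{out_dir}/{index}_mapping.json", "mapping"),
        elasticdump_cmd(src, f"{out_dir}/{index}.json", "data"),
    ]


for cmd in export_index("http://es_ip:9200", "my_index", "/data"):
    # Printed only; pass cmd to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Building the commands as argument lists avoids shell quoting problems if an index name contains unusual characters.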

3. esm

# Backup
esm -s http://10.33.8.103:9201 -x "petition_data" -b 5 --count=5000 --sliced_scroll_size=10 --refresh -o=./es_backup.bin
# Restore (reads the file written by the backup command)
esm -d http://172.16.20.20:9201 -y "petition_data6" -c 5000 -b 5 --refresh -i=./es_backup.bin

Additional Note

A similar log architecture replaces Filebeat with Flume and Logstash with Flink; a future article will cover that design.
