How to Build a Real‑Time ELK Log Analysis Platform for Scalable Operations
This article explains why centralized logging is essential for modern micro‑service systems, outlines an ELK‑based architecture with Filebeat, Kafka, Logstash, Elasticsearch and Kibana, and provides detailed configuration examples for both file‑based and Kubernetes‑based log collection, plus visualization techniques.
Logs are a crucial reference for production incident handling, performance tuning, and business analysis. As micro‑service architectures grow, logs become scattered across ever more hosts and containers and increasingly difficult to manage, making a unified log center essential.
1. Architecture Design
ELK comprises three core components: Elasticsearch, a distributed search and analytics engine that indexes and stores the log data; Logstash, which collects, parses, and transforms logs (powerful but comparatively resource‑heavy); and Kibana, the web UI for searching and visualizing them. The Beats family, especially Filebeat, is commonly used as a lightweight shipper on the collection side.
This architecture suits high‑concurrency production log collection.
Collection side: lightweight Filebeat agents gather logs from servers, containers, and applications in real time.
Message queue: Kafka buffers the stream, absorbing I/O spikes and improving scalability under high‑concurrency scenarios.
Processing side: Logstash consumes from Kafka, filters and parses the logs, then forwards them to the Elasticsearch cluster.
Storage: Elasticsearch stores the processed logs, using per‑type index templates with sharding and replication, and exposes APIs for queries.
Visualization side: Kibana queries Elasticsearch and presents logs via dashboards, tables, maps, and more.
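The processing side described above can be sketched as a Logstash pipeline. This is a minimal illustration, not the article's exact setup: the Kafka broker port, topic name, consumer group, and grok pattern are all assumptions to adapt to your environment.
<code>input {
  kafka {
    # Assumed brokers on the standard Kafka port; topic/group are hypothetical
    bootstrap_servers => "192.168.0.1:9092,192.168.0.2:9092,192.168.0.3:9092"
    topics => ["app-logs"]
    group_id => "logstash-consumers"
    codec => "json"
  }
}
filter {
  # Parse a conventional "timestamp level message" line; adjust to your log format
  grok {
    match => { "message" => "%{TIMESTAMP_ISO8601:ts} %{LOGLEVEL:level} %{GREEDYDATA:msg}" }
  }
  date {
    match => ["ts", "ISO8601"]
  }
}
output {
  elasticsearch {
    hosts => ["192.168.0.1:9200", "192.168.0.2:9200", "192.168.0.3:9200"]
    index => "app-logs-%{+yyyy.MM.dd}"
  }
}</code>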
2. Log Collection
Collected logs fall into three categories:
System logs: messages, secure, and other OS‑level logs.
Service logs: database operation logs, error logs, slow query logs.
Business logs: core application logs, in Java commonly produced by Log4j.
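Since business logs are typically produced by Log4j, emitting them in a fixed, parseable pattern pays off downstream. A minimal Log4j2 sketch follows; the file paths and pattern layout are assumptions, not part of the original setup.
<code><!-- Hypothetical Log4j2 config: a fixed PatternLayout keeps grok parsing predictable -->
<Configuration>
  <Appenders>
    <RollingFile name="AppLog" fileName="/var/log/app/app.log"
                 filePattern="/var/log/app/app-%d{yyyy-MM-dd}.log">
      <PatternLayout pattern="%d{ISO8601} %-5level [%t] %logger{36} - %msg%n"/>
      <Policies>
        <TimeBasedTriggeringPolicy/>
      </Policies>
    </RollingFile>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="AppLog"/>
    </Root>
  </Loggers>
</Configuration></code>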
Collection methods:
1) File‑based collection
Core configuration example for filebeat.yml:
<code>filebeat.inputs:
- type: log
  enabled: false
  paths:
    - /tmp/*.log
  tags: ["sit", "uat"]
  fields:
    role: "cloud-native ops"
    date: "202308"
- type: log
  enabled: true
  paths:
    - /var/log/*.log
  tags: ["SRE", "team"]

output.elasticsearch:
  enabled: true
  hosts: ["192.168.0.1:9200", "192.168.0.2:9200", "192.168.0.3:9200"]
  index: "cmdi-linux-sys-%{+yyyy.MM.dd}"

setup.ilm.enabled: false
setup.template.name: "dev-linux"
setup.template.pattern: "dev-linux*"
setup.template.overwrite: false
setup.template.settings:
  index.number_of_shards: 3
  index.number_of_replicas: 2</code>Configuration notes:
type sets the input type, enabled toggles collection per input, paths lists the file glob patterns to watch, tags attaches labels for later filtering, and output.elasticsearch configures the target cluster, index naming, and template settings (shards and replicas).
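Note that this example ships directly to Elasticsearch. To interpose Kafka as in the architecture of section 1, the output section can be swapped for Filebeat's Kafka output; the broker addresses and topic name below are assumptions.
<code># Hypothetical variant: ship to Kafka instead of Elasticsearch,
# matching the buffered architecture in section 1
output.kafka:
  enabled: true
  hosts: ["192.168.0.1:9092", "192.168.0.2:9092", "192.168.0.3:9092"]
  topic: "app-logs"        # assumed topic; Logstash consumes from it
  required_acks: 1
  compression: gzip</code>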
2) Kubernetes‑based collection
To handle dynamic Pods, a container‑aware Filebeat setup is required.
Step 1 – Create ServiceAccount:
<code>apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: kube-system
  labels:
    app: filebeat</code>Step 2 – RBAC role binding:
<code># Assumes a ClusterRole and Role named "filebeat" (get/list/watch on pods,
# namespaces, and nodes) have been created separately
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: filebeat
  namespace: kube-system
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: kube-system
roleRef:
  kind: Role
  name: filebeat
  apiGroup: rbac.authorization.k8s.io</code>Step 3 – ConfigMap for Filebeat:
<code>apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: kube-system
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      processors:
        - add_kubernetes_metadata:
            in_cluster: true
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"
        # Keep only events from the sit-dev namespace
        - drop_event:
            when:
              not:
                or:
                  - equals:
                      kubernetes.namespace: sit-dev
    output.elasticsearch:
      hosts: ['192.168.0.1:9200', '192.168.0.2:9200', '192.168.0.3:9200']
      index: "sit-%{[kubernetes.container.name]:default}-%{+yyyy.MM.dd}"
    setup.ilm.enabled: false
    setup.template.name: "sit"
    setup.template.pattern: "sit-*"</code>Step 4 – Deploy DaemonSet:
<code>containers:
- name: filebeat
  image: elastic/filebeat:8.6.2
  args: ["-c", "/etc/filebeat.yml", "-e"]
  env:
  - name: ELASTICSEARCH_HOST
    value: "192.168.0.1"
  - name: ELASTICSEARCH_PORT
    value: "9200"
  securityContext:
    runAsUser: 0
  resources:
    limits:
      memory: 200Mi
    requests:
      cpu: 100m
      memory: 100Mi</code>Once the DaemonSet is deployed, Filebeat runs on every node and container logs are collected and indexed.
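As a quick sanity check, a filtered search in Kibana Dev Tools against the sit-* indices created above might look like the following; the kubernetes.* field names are the ones added by the add_kubernetes_metadata processor.
<code>GET /sit-*/_search
{
  "size": 5,
  "sort": [{ "@timestamp": "desc" }],
  "query": {
    "bool": {
      "filter": [
        { "term": { "kubernetes.namespace": "sit-dev" } },
        { "range": { "@timestamp": { "gte": "now-15m" } } }
      ]
    }
  }
}</code>
If this returns recent hits, the collection pipeline is working end to end.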
3. Visualization
Once the collection service is running, Kibana connects to Elasticsearch to query log indices, separating logs by time‑based indices.
Create data views for each index type; the view name is user‑defined.
Index patterns use wildcards (e.g. sit-*) to match specific indices, enabling focused data views.
Visualizations support flexible field combinations, KQL queries, historical log retrieval, and custom refresh intervals.
Multiple dashboard templates and custom dashboards are available.
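For instance, a couple of illustrative KQL queries; the field names here are assumptions that depend on your index mappings.
<code>kubernetes.namespace : "sit-dev" and log.level : "error"
message : "timeout" and not kubernetes.container.name : "filebeat"</code>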
4. Summary
ELK provides comprehensive log collection, storage, analysis, and visualization capabilities; Elasticsearch’s full‑text indexing enables fast searches over billions of records, and its scalable clustering suits production‑grade centralized logging.
However, ELK’s native log format handling is limited, requiring auxiliary components for preprocessing in some scenarios, and it has constraints in alerting, permission management, and correlation analysis, necessitating ongoing optimization.
With continued open‑source community development, the ELK stack is expected to mature further and address more use cases.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.