Mastering ELK: Compare Three Log Collection Architectures and Solve Common Issues
This article introduces the ELK stack’s core components, compares three typical deployment architectures—including Logstash‑only, Filebeat‑assisted, and Kafka‑backed designs—highlights their trade‑offs, and provides practical solutions for multiline log merging, timestamp correction, and module‑specific filtering using Logstash and Filebeat configurations.
Overview
ELK is one of the most widely used centralized logging solutions. It combines Beats, Logstash, Elasticsearch, and Kibana to provide real‑time log collection, storage, search, and visualization.
Filebeat: a lightweight data collector that can replace Logstash on application servers and output to Kafka, Redis, etc.
Logstash: a heavier data collection engine with many plugins, supporting rich data sources and providing filtering, analysis, and formatting.
Elasticsearch: a distributed search engine built on Apache Lucene, offering centralized storage, analysis, and powerful search/aggregation.
Kibana: a web‑based visualization platform for real‑time viewing of Elasticsearch data with rich charting.
ELK Common Deployment Architectures
2.1 Logstash as Log Collector
The original architecture deploys a Logstash instance on each application server to collect, filter, and format logs, then sends them to Elasticsearch for storage and to Kibana for visualization. The drawback is that Logstash consumes considerable resources on the application servers.
2.2 Filebeat as Log Collector
This architecture replaces Logstash with Filebeat on the application side. Filebeat is lightweight and typically forwards logs to a central Logstash instance for filtering and formatting. This is the most commonly used deployment.
2.3 Adding a Caching Queue
Building on the second architecture, a message queue such as Kafka is introduced. Filebeat sends data to Kafka, and Logstash reads from Kafka. The queue absorbs high log volumes, protects data while downstream components are unavailable, and evens out the load on Logstash and Elasticsearch.
2.4 Summary of the Three Architectures
The first architecture is rarely used due to resource consumption. The second is the most popular. The third, involving a message queue, is only needed for very large data volumes where back‑pressure handling is required.
Problems and Solutions
Multiline Log Merging
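Log entries that span multiple lines, such as stack traces, need to be merged into a single event. This can be done with the multiline settings in Filebeat or the multiline plugin in Logstash. When Filebeat ships the logs, merging in Filebeat is generally preferable, because events from different sources are already interleaved by the time they reach Logstash. Note also that recent Logstash releases do not bundle the multiline filter, so it may need to be installed separately.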
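Configuration in Filebeat (Filebeat 5.x syntax; later releases rename filebeat.prospectors to filebeat.inputs and input_type to type):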
filebeat.prospectors:
  -
    paths:
      - /home/project/elk/logs/test.log
    input_type: log
    multiline:
      pattern: '^\['
      negate: true
      match: after
output:
  logstash:
    hosts: ["localhost:5044"]

Key parameters:
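pattern: the regular expression each line is tested against (here, a line that starts with "[").
negate: true treats lines that do NOT match the pattern as continuation lines.
match: after appends those continuation lines to the end of the preceding line that did match.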
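To make the interaction of these three parameters concrete, the following Python sketch mimics the grouping behavior. It is an illustration only, not Filebeat's implementation: the function name merge_multiline and the sample log lines are hypothetical, and the sample format is merely assumed from the patterns used in this article.

```python
import re

def merge_multiline(lines, pattern=r"^\[", negate=True):
    """Group raw lines into events, imitating negate plus match: after.

    With negate=True, a line that does NOT match the pattern is a
    continuation and is appended to the event that precedes it.
    """
    regex = re.compile(pattern)
    events = []
    for line in lines:
        matches = regex.search(line) is not None
        is_continuation = matches != negate  # negate flips the test
        if is_continuation and events:
            events[-1] += "\n" + line  # "after": append to the previous event
        else:
            events.append(line)  # start a new event
    return events

# Hypothetical sample: one stack trace followed by an ordinary entry.
raw = [
    "[ERROR][20170912 10:12:13,123] request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Demo.run(Demo.java:42)",
    "[INFO ][20170912 10:12:14,001] retrying",
]
events = merge_multiline(raw)
```

Here the two stack-trace lines do not start with "[", so they are folded into the ERROR entry and the four raw lines become two events.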
Configuration in Logstash:
input {
  beats {
    port => 5044
  }
}
filter {
  multiline {
    pattern => "%{LOGLEVEL}\s*]"
    negate => true
    what => "previous"
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
  }
}

(1) The Logstash what value previous corresponds to match: after in Filebeat.
(2) %{LOGLEVEL} is a predefined grok pattern; many others are available in the official patterns repository.
Replacing Kibana’s @timestamp with Log Time
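By default, @timestamp records when the event was read by the collector, not when the log entry was written, so Kibana orders events by collection time. Use the grok filter to extract the timestamp from the log message and the date filter to write it into @timestamp.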
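As a rough illustration of what the two filters do together, the Python sketch below pulls the timestamp out of a message and parses it. The regular expression is a simplified stand-in for the grok patterns, the function name extract_event_time is hypothetical, and the sample line assumes a log format such as "[LEVEL][yyyyMMdd HH:mm:ss,SSS] text".

```python
import re
from datetime import datetime

# Simplified stand-in for %{YEAR}%{MONTHNUM}%{MONTHDAY}\s+%{TIME} (assumption:
# the time always carries a comma and three millisecond digits).
TIME_RE = re.compile(r"(?P<customer_time>\d{8}\s+\d{2}:\d{2}:\d{2},\d{3})")

def extract_event_time(message):
    """Return the time embedded in the message, or None if there is none."""
    found = TIME_RE.search(message)
    if found is None:
        return None  # the caller would keep the collection time instead
    # %f accepts the three millisecond digits and scales them to microseconds.
    return datetime.strptime(found.group("customer_time"), "%Y%m%d %H:%M:%S,%f")

sample = "[ERROR][20170912 10:12:13,123] request failed"
event_time = extract_event_time(sample)
```

The returned value is a naive datetime with no time zone attached. In Logstash the date filter also has to decide which time zone the parsed value is in, so it is worth checking that setting if the times shown in Kibana look shifted. The Logstash configuration that performs the same extraction follows.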
filter {
  multiline {
    pattern => "%{LOGLEVEL}\s*]\[%{YEAR}%{MONTHNUM}%{MONTHDAY}\s+%{TIME}]"
    negate => true
    what => "previous"
  }
  grok {
    match => ["message", "(?<customer_time>%{YEAR}%{MONTHNUM}%{MONTHDAY}\s+%{TIME})"]
  }
  date {
    match => ["customer_time", "yyyyMMdd HH:mm:ss,SSS"]
    target => "@timestamp"
  }
}

Viewing Logs by System Module in Kibana
Add a field to identify the module or create separate Elasticsearch indices per module.
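The two options can be compared with a small Python sketch. It is only a model of the idea: the event dictionaries, filter_by_module, and resolve_index are hypothetical names, the nesting of log_from under fields assumes Filebeat's default placement of custom fields, and leaving an unknown field reference unexpanded is an assumption about how Logstash field references generally behave.

```python
import re

FIELD_REF = re.compile(r"%\{(\w+)\}")

def filter_by_module(events, module):
    """Option 1: keep one index and select events by their log_from field."""
    return [e for e in events if e.get("fields", {}).get("log_from") == module]

def resolve_index(template, event):
    """Option 2: derive the index name from the event, e.g. "%{type}".

    A reference to a field the event lacks is left unexpanded (assumption).
    """
    return FIELD_REF.sub(
        lambda ref: str(event.get(ref.group(1), ref.group(0))), template
    )

# Hypothetical events as they might look after Filebeat has tagged them.
events = [
    {"message": "login ok", "type": "account", "fields": {"log_from": "account"}},
    {"message": "profile updated", "type": "customer", "fields": {"log_from": "customer"}},
]
account_events = filter_by_module(events, "account")
customer_index = resolve_index("%{type}", events[1])
```

With the first option, all modules share one index and a Kibana query on the field narrows the view. With the second, each module gets its own index, which keeps modules separate at the storage level but means an event with no type value would end up in an oddly named index.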
Example adding a log_from field in Filebeat:
filebeat.prospectors:
  -
    paths:
      - /home/project/elk/logs/account.log
    input_type: log
    multiline:
      pattern: '^\['
      negate: true
      match: after
    fields:
      log_from: account
  -
    paths:
      - /home/project/elk/logs/customer.log
    input_type: log
    multiline:
      pattern: '^\['
      negate: true
      match: after
    fields:
      log_from: customer
output:
  logstash:
    hosts: ["localhost:5044"]

Alternatively, set document_type on each prospector so that every event carries a type field, and use that field as the index name in the Logstash output:
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "%{type}"
  }
}

Conclusion
The article presented three ELK deployment architectures, highlighted their advantages and drawbacks, and offered practical solutions for common logging challenges such as multiline merging, timestamp correction, and module‑specific filtering. The Filebeat‑centric architecture is currently the most widely adopted, while adding a message queue is optional for very high‑volume scenarios.
MaGe Linux Operations
