Mastering ELK: Compare Three Log Collection Architectures and Solve Common Issues

This article introduces the ELK stack’s core components, compares three typical deployment architectures—including Logstash‑only, Filebeat‑assisted, and Kafka‑backed designs—highlights their trade‑offs, and provides practical solutions for multiline log merging, timestamp correction, and module‑specific filtering using Logstash and Filebeat configurations.

MaGe Linux Operations

Jan 19, 2023

Overview

ELK is one of the most widely used centralized logging solutions. The stack combines Elasticsearch, Logstash, and Kibana, with Beats added later as lightweight shippers, to provide real‑time log collection, storage, and visualization. The components are:

Filebeat: a lightweight data collector that can replace Logstash on application servers and output to Kafka, Redis, etc.

Logstash: a heavier data collection engine with many plugins, supporting rich data sources and providing filtering, analysis, and formatting.

Elasticsearch: a distributed search engine built on Apache Lucene, offering centralized storage, analysis, and powerful search/aggregation.

Kibana: a web‑based visualization platform for real‑time viewing of Elasticsearch data with rich charting.

ELK Common Deployment Architectures

2.1 Logstash as Log Collector

This original architecture deploys a Logstash instance on each application server to collect, filter, and format logs before sending them to Elasticsearch for storage and Kibana for visualization. The drawback is high resource consumption on the application servers.

2.2 Filebeat as Log Collector

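This architecture replaces Logstash with Filebeat on the application side. Filebeat is lightweight, so it consumes far fewer resources on the application servers; it typically ships logs to a separate Logstash instance that handles filtering and formatting. This is the most commonly used deployment.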

2.3 Adding a Caching Queue

Based on the second architecture, a message queue such as Kafka is introduced. Filebeat sends data to Kafka, and Logstash reads from Kafka. The queue buffers bursts during high‑volume log collection, protects against data loss if a downstream component is temporarily unavailable, and evens out the load on Logstash and Elasticsearch.
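The buffering effect can be illustrated with a toy simulation. This is only a sketch, not Kafka: an in-memory deque stands in for the queue, and the function name simulate, the arrival counts, and the consumer rate are all hypothetical values chosen for illustration.

```python
from collections import deque


def simulate(arrivals_per_tick, consumer_rate):
    """Return the queue depth after each tick and the total events delivered."""
    queue = deque()
    depths = []
    delivered = 0
    next_id = 0
    for arrivals in arrivals_per_tick:
        # Producer side (Filebeat in the article): enqueue everything that arrived.
        for _ in range(arrivals):
            queue.append(next_id)
            next_id += 1
        # Consumer side (Logstash in the article): drain at a fixed maximum rate.
        for _ in range(min(consumer_rate, len(queue))):
            queue.popleft()
            delivered += 1
        depths.append(len(queue))
    return depths, delivered


# A burst of 50 events arrives in the second tick; the consumer handles 20 per tick.
depths, delivered = simulate([5, 50, 5, 0, 0, 0], consumer_rate=20)
print(depths)     # [0, 30, 15, 0, 0, 0]
print(delivered)  # 60
```

The backlog peaks during the burst and then drains, and every event is eventually delivered even though the consumer never processes more than 20 events per tick.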

2.4 Summary of the Three Architectures

The first architecture is rarely used due to resource consumption. The second is the most popular. The third, involving a message queue, is only needed for very large data volumes where back‑pressure handling is required.

Problems and Solutions

Multiline Log Merging

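Log entries that span multiple lines, such as stack traces, need to be merged into a single event. Use the multiline support of whichever component collects the logs: Filebeat in the Filebeat‑based architectures, Logstash in the Logstash‑only architecture.

Configuration in Filebeat (legacy syntax; from Filebeat 6.3 onward the top‑level key is filebeat.inputs and input_type is written as type):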

filebeat.prospectors:
  -
    paths:
      - /home/project/elk/logs/test.log
    input_type: log
    multiline:
      pattern: '^\['
      negate: true
      match: after
output:
  logstash:
    hosts: ["localhost:5044"]

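Key parameters:

pattern: the regular expression tested against each line; here '^\[' matches lines that begin with an opening bracket.

negate: when true, lines that do NOT match the pattern are treated as continuation lines; the default is false, where matching lines are the continuations.

match: after appends each continuation line to the end of the preceding line; before attaches it to the front of the following line.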
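The interaction of these three parameters can be sketched in a few lines of Python. This is an illustrative model, not Filebeat's implementation: the function merge_multiline and the sample log lines are hypothetical, and only match: after is modeled.

```python
import re


def merge_multiline(lines, pattern, negate=True, match="after"):
    """Group raw lines into events, modeling pattern/negate with match: after."""
    if match != "after":
        raise ValueError("this sketch only models match: after")
    regex = re.compile(pattern)
    events = []
    for line in lines:
        matched = regex.search(line) is not None
        # negate=True: a line that does NOT match the pattern is a continuation.
        # negate=False: a line that DOES match the pattern is a continuation.
        is_continuation = (not matched) if negate else matched
        if is_continuation and events:
            # match: after -> append to the end of the previous event.
            events[-1] += "\n" + line
        else:
            events.append(line)
    return events


# Hypothetical log lines: each new entry starts with "[", the stack trace does not.
raw = [
    "[ERROR] 20230119 10:15:30,123 request failed",
    "java.lang.IllegalStateException: boom",
    "    at com.example.Demo.run(Demo.java:42)",
    "[INFO] 20230119 10:15:31,456 retry succeeded",
]
events = merge_multiline(raw, r"^\[")
print(len(events))  # 2
```

With pattern '^\[' and negate: true, the two stack‑trace lines are folded into the ERROR event, and the INFO line starts a new event.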

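Configuration in Logstash (recent Logstash releases do not bundle the multiline filter by default, so it may need to be installed as a separate plugin):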

input {
  beats {
    port => 5044
  }
}
filter {
  multiline {
    pattern => "%{LOGLEVEL}\s*]"
    negate => true
    what => "previous"
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
  }
}

(1) Logstash’s what => "previous" corresponds to Filebeat’s match: after, and what => "next" corresponds to match: before. (2) %{LOGLEVEL} is a predefined grok pattern; many others are available in the official patterns repository.

Replacing Kibana’s @timestamp with Log Time

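By default, @timestamp records when the collector read the event, not when the application wrote the log line, so Kibana orders events by ingestion time. Use the grok and date filters to extract the time from the log message and write it to @timestamp.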

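filter {
  # Lines without a level and timestamp are joined to the previous event.
  multiline {
    pattern => "%{LOGLEVEL}\s*]\[%{YEAR}%{MONTHNUM}%{MONTHDAY}\s+%{TIME}]"
    negate => true
    what => "previous"
  }
  # Capture the time written in the log line into the field customer_time.
  grok {
    match => ["message", "(?<customer_time>%{YEAR}%{MONTHNUM}%{MONTHDAY}\s+%{TIME})"]
  }
  # Parse customer_time and store the result in @timestamp.
  date {
    match => ["customer_time", "yyyyMMdd HH:mm:ss,SSS"]
    target => "@timestamp"
  }
}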
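To check what the grok capture and date format are expected to extract, the same two steps can be sketched in Python. This is an approximation under stated assumptions: the regular expression is a simplified stand‑in for the grok patterns, the function extract_event_time and the sample message are hypothetical, and the result is a naive datetime with no time zone handling, unlike the Logstash date filter.

```python
import re
from datetime import datetime

# Simplified stand-in for %{YEAR}%{MONTHNUM}%{MONTHDAY}\s+%{TIME} with milliseconds.
TIME_RE = re.compile(r"(?P<customer_time>\d{8}\s+\d{2}:\d{2}:\d{2},\d{3})")


def extract_event_time(message):
    """Return the time embedded in a log message, or None if none is found."""
    found = TIME_RE.search(message)
    if found is None:
        return None
    # "%Y%m%d %H:%M:%S,%f" plays the role of "yyyyMMdd HH:mm:ss,SSS".
    return datetime.strptime(found.group("customer_time"), "%Y%m%d %H:%M:%S,%f")


# Hypothetical message in the "[LEVEL][yyyyMMdd HH:mm:ss,SSS]" layout.
parsed = extract_event_time("[ERROR][20230119 10:15:30,123] request failed")
print(parsed.isoformat())  # 2023-01-19T10:15:30.123000
```

A message without a recognizable time returns None, which mirrors the case where grok fails to match and @timestamp keeps its ingestion‑time value.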

Viewing Logs by System Module in Kibana

Add a field to identify the module or create separate Elasticsearch indices per module.

Example adding a log_from field in Filebeat:

filebeat.prospectors:
  -
    paths:
      - /home/project/elk/logs/account.log
    input_type: log
    multiline:
      pattern: '^\['
      negate: true
      match: after
    fields:
      log_from: account
  -
    paths:
      - /home/project/elk/logs/customer.log
    input_type: log
    multiline:
      pattern: '^\['
      negate: true
      match: after
    fields:
      log_from: customer
output:
  logstash:
    hosts: ["localhost:5044"]

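Alternatively, set document_type on each Filebeat prospector, which populates the event’s type field, and have Logstash write each type to its own index. Note that document_type was removed in Filebeat 6.0; on later versions a custom field such as log_from can serve the same purpose, referenced from Logstash as %{[fields][log_from]} unless fields_under_root is enabled.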

output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "%{type}"
  }
}
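The %{type} reference in the index setting is replaced per event with the value of that field. The following sketch models that substitution for top‑level fields only; the function resolve_index and the sample events are hypothetical, and the behavior for a missing field (the reference is left in place literally) reflects my understanding of Logstash rather than anything stated in the article.

```python
import re

# Matches field references of the form %{name}.
FIELD_REF = re.compile(r"%\{([^}]+)\}")


def resolve_index(template, event):
    """Replace each %{field} in template with the matching top-level event field."""
    def substitute(found):
        name = found.group(1)
        # Assumption: an unknown field leaves the reference text untouched.
        return str(event[name]) if name in event else found.group(0)
    return FIELD_REF.sub(substitute, template)


print(resolve_index("%{type}", {"type": "account"}))        # account
print(resolve_index("logs-%{type}", {"type": "customer"}))  # logs-customer
print(resolve_index("%{type}", {"message": "no type"}))     # %{type}
```

The last case shows why it is worth confirming that every event actually carries the field: an unresolved reference does not produce a usable per‑module index name.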

Conclusion

The article presented three ELK deployment architectures, highlighted their advantages and drawbacks, and offered practical solutions for common logging challenges such as multiline merging, timestamp correction, and module‑specific filtering. The Filebeat‑centric architecture is currently the most widely adopted, while adding a message queue is optional for very high‑volume scenarios.
