
Why Replace Logstash with Flink? Boost Log Processing Performance and Reliability

This article examines the shortcomings of Logstash in log collection—data loss, poor performance, high troubleshooting cost, and lack of dynamic scaling—and demonstrates how migrating to Flink can provide at‑least‑once semantics, flexible error handling, high‑throughput low‑latency processing, automatic resource scaling, and advanced analytics within the ELK ecosystem.

Volcano Engine Developer Services

Case Overview

In a client's log migration to Volcano Engine using the ELK stack, Logstash suffered frequent data loss and poor collection performance. The team replaced Logstash with Flink for log parsing, transformation, and writing to Elasticsearch, reaching a throughput of more than 1,000k records per second during peak periods.

Logstash Introduction

ELK (Elasticsearch, Logstash, Kibana) provides a complete solution for log collection, parsing, querying, analysis, and visualization. Beats collect data and can write directly to Elasticsearch or pass through Logstash for further processing.

Logstash consists of three main parts:

Input plugins: read data from sources such as files, Beats, Kafka, etc.

Filter plugins: modify and process data; for example, grok extracts fields and drop discards unwanted logs.

Output plugins: write processed data to destinations like Elasticsearch.
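
The three stages can be pictured as a chain of functions. The following is a minimal Python sketch of that flow, not Logstash itself: the log format, the regular expression (standing in for a grok pattern), and all function names are assumptions made for illustration.

```python
import re

# Hypothetical log format: "<timestamp> <LEVEL> <message>"
LINE_PATTERN = re.compile(r"^(?P<timestamp>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def read_input(lines):
    """Input stage: yield raw events from a source (here, an in-memory list)."""
    for line in lines:
        yield {"raw": line.rstrip("\n")}


def extract_fields(event):
    """Filter stage, grok-like: pull named fields out of the raw line."""
    match = LINE_PATTERN.match(event["raw"])
    if match is None:
        # Tag lines that do not fit the pattern instead of failing.
        return {**event, "tags": ["parse_failure"]}
    return {**event, **match.groupdict()}


def keep_event(event):
    """Filter stage, drop-like: discard DEBUG logs."""
    return event.get("level") != "DEBUG"


def run_pipeline(lines, sink):
    """Wire input -> filters -> output; the sink list stands in for Elasticsearch."""
    for event in read_input(lines):
        event = extract_fields(event)
        if keep_event(event):
            sink.append(event)
    return sink


if __name__ == "__main__":
    sample = [
        "2024-01-01T00:00:00Z INFO service started",
        "2024-01-01T00:00:01Z DEBUG heartbeat",
        "not a structured line",
    ]
    for doc in run_pipeline(sample, []):
        print(doc)
```

In this sketch the DEBUG line is dropped, the INFO line is split into fields, and the unstructured line is passed through with a failure tag so it can be inspected later.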

Logstash Pain Points

Data loss – Logstash uses an in‑memory buffer by default, so buffered data is lost on restart or crash. Persistent queues write events to disk, but data can still be lost if the node itself fails, and they add performance overhead.

High troubleshooting cost – Non‑standard JSON or other malformed logs require full‑chain investigation (collection, parsing, ES write). Even with dead‑letter queues, locating the root cause remains costly.

Poor collection and parsing performance – Plugins are implemented in Ruby and run on the JVM via JRuby, resulting in lower efficiency compared to native Java solutions. Enabling persistent queues further degrades performance due to frequent disk writes.

Lack of dynamic scaling – Logstash cannot automatically scale resources. In the case study, peak traffic was about 24 times the off‑peak volume (more than 1,000,000 QPS versus roughly 50,000 QPS), so a deployment sized for the peak wastes significant resources during off‑peak periods.

Flink Advantages

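At‑least‑once semantics – Flink takes distributed checkpoints and persists state to reliable storage (e.g., HDFS), so after a failure it resumes from the last checkpoint and guarantees at‑least‑once delivery. Elasticsearch itself offers no exactly‑once guarantee, but writes keyed by a primary key (a deterministic document ID) are idempotent: replayed records overwrite existing documents rather than duplicating them, so the pipeline achieves effectively exactly‑once results.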
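
The role of the primary key can be shown with a small simulation. At-least-once delivery means records written after the last checkpoint may be replayed on recovery, so the sink has to tolerate duplicates. The plain-Python sketch below assumes the document ID is derived from the Kafka topic, partition, and offset; that key choice, the dictionary standing in for an Elasticsearch index, and the function names are illustrative assumptions, not the team's actual implementation.

```python
import hashlib


def document_id(topic, partition, offset):
    """Derive a stable ID from a record's Kafka coordinates (assumed key choice)."""
    key = f"{topic}:{partition}:{offset}".encode("utf-8")
    return hashlib.sha256(key).hexdigest()


def write_batch(index, records):
    """Upsert by ID: writing the same record twice leaves a single document."""
    for record in records:
        doc_id = document_id(record["topic"], record["partition"], record["offset"])
        index[doc_id] = record["value"]


if __name__ == "__main__":
    index = {}  # stand-in for an Elasticsearch index keyed by document ID
    records = [
        {"topic": "logs", "partition": 0, "offset": i, "value": f"line {i}"}
        for i in range(5)
    ]

    write_batch(index, records)      # first attempt writes all five records
    # Simulate a failure after the write but before the next checkpoint:
    # on recovery, the records from offset 2 onward are replayed.
    write_batch(index, records[2:])

    print(len(index))  # the replayed records overwrite, so no duplicates appear
```

If the ID were generated randomly per write instead, the replay would add three extra documents, which is the duplicate behavior at-least-once delivery otherwise allows.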

Flexible error handling – Invalid Kafka messages (e.g., non‑JSON) are routed to a separate Elasticsearch index with a reason tag. Failed ES writes are also captured with detailed error information, enabling easier governance of problematic logs.
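
The routing rule described here can be sketched in a few lines. This is a plain-Python illustration of the decision only; the index names, the reason strings, and the `route` function are hypothetical. In a Flink job this kind of split is commonly expressed with side outputs, but the source does not say how the team implemented it.

```python
import json

MAIN_INDEX = "logs"           # hypothetical index names
ERROR_INDEX = "logs-errors"


def route(raw_message):
    """Return (index_name, document) for one Kafka message value."""
    try:
        doc = json.loads(raw_message)
    except json.JSONDecodeError as exc:
        # Keep the raw payload and record why it was rejected.
        return ERROR_INDEX, {"raw": raw_message, "reason": f"invalid_json: {exc.msg}"}
    if not isinstance(doc, dict):
        # Valid JSON, but not an object that can be indexed as a document.
        return ERROR_INDEX, {"raw": raw_message, "reason": "not_a_json_object"}
    return MAIN_INDEX, doc


if __name__ == "__main__":
    for message in ['{"level": "INFO", "msg": "ok"}', "plain text line", "[1, 2]"]:
        index_name, document = route(message)
        print(index_name, document)
```

Because the rejected message is stored with its reason, problematic logs can be searched by reason in the error index instead of being traced through the whole chain.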

High throughput, low latency – Flink is a widely used stream processing engine. It delivers latency on the order of seconds and has been validated in complex Kafka processing scenarios.

Automatic resource scaling – Serverless Flink dynamically adjusts resources based on QPS, reducing costs during low‑traffic periods.
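
To make the cost argument concrete, the sketch below sizes a job for the case study's traffic. The per-slot capacity of 25,000 QPS is an assumed figure chosen only for the arithmetic, and `required_parallelism` is a hypothetical helper, not a Flink or Volcano Engine API.

```python
import math


def required_parallelism(qps, per_slot_qps, min_slots=1):
    """Slots needed to sustain `qps`, given an assumed per-slot capacity."""
    return max(min_slots, math.ceil(qps / per_slot_qps))


if __name__ == "__main__":
    PER_SLOT_QPS = 25_000          # assumption: capacity of one parallel task
    off_peak_qps = 50_000          # off-peak volume from the case study
    peak_qps = 24 * off_peak_qps   # peak is about 24 times the off-peak volume

    peak_slots = required_parallelism(peak_qps, PER_SLOT_QPS)
    off_peak_slots = required_parallelism(off_peak_qps, PER_SLOT_QPS)
    print(peak_slots, off_peak_slots)

    # Share of a peak-sized static deployment that sits idle off-peak.
    idle_share = 1 - off_peak_slots / peak_slots
    print(f"{idle_share:.1%}")
```

Whatever the real per-slot capacity is, the ratio is what matters: a statically provisioned cluster sized for a 24x peak leaves roughly 23 of every 24 units of capacity unused at the off-peak rate.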

Advanced analytics – Beyond simple pattern matching, Flink supports event‑time processing, windowing, aggregation, deduplication, and can feed enriched data back to Elasticsearch for OLAP queries.
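
As an illustration of event-time windowing combined with deduplication, the sketch below counts ERROR events per one-minute tumbling window, bucketing by the timestamp carried in each event rather than by arrival order. It is plain Python, not Flink code: the field names, the 60-second window, and the functions are assumptions, and it omits the watermarks and managed state a real Flink job would use to handle late data.

```python
from collections import Counter

WINDOW_SECONDS = 60  # assumed tumbling window size


def window_start(event_time, size=WINDOW_SECONDS):
    """Align an event timestamp (epoch seconds) to the start of its window."""
    return event_time - (event_time % size)


def count_errors_per_window(events):
    """Deduplicate by event id, then count ERROR events per event-time window."""
    seen_ids = set()
    counts = Counter()
    for event in events:
        if event["id"] in seen_ids:
            continue  # duplicate delivery, e.g. after a replay
        seen_ids.add(event["id"])
        if event["level"] == "ERROR":
            counts[window_start(event["ts"])] += 1
    return dict(counts)


if __name__ == "__main__":
    events = [
        {"id": "c", "ts": 61, "level": "ERROR"},  # arrives first, belongs to window 60
        {"id": "a", "ts": 5, "level": "ERROR"},
        {"id": "b", "ts": 59, "level": "ERROR"},
        {"id": "a", "ts": 5, "level": "ERROR"},   # duplicate of "a"
        {"id": "d", "ts": 70, "level": "INFO"},
    ]
    print(count_errors_per_window(events))
```

The per-window counts produced this way are the kind of enriched, aggregated records that could be written back to Elasticsearch for querying.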

Flink vs. Logstash Summary

Taken together, the points above show Flink’s advantages over Logstash in reliability, performance, scalability, and analytical capability for high‑volume log processing within the ELK ecosystem.
