
Practical Thoughts on Applying ELK for Log Monitoring

This article shares the author's experience and lessons learned while building a log‑monitoring framework with the ELK stack, discussing performance issues, configuration of Logstash filters using Grok, and practical tips for deploying ElasticSearch, Logstash, and Kibana in production environments.

360 Quality & Efficiency

May 13, 2016


I meant to write up my thoughts on putting ELK into practice before the May Day holiday, but it kept getting postponed. I recently went back through my ELK material and revisited the parts I had only vaguely understood. The ideas are clearer now, and I can trace in my head how the whole framework fits together, which I would count as a fairly shallow introduction to the stack.

Although the title promises thoughts on real-world application, I have only deployed this framework in demos, and I hit many pitfalls along the way. What follows is a simple overview of what I learned about ELK.

Some readers may not know what ELK is: it stands for ElasticSearch + Logstash + Kibana. Downloading and installing these components is not the focus of this article. The Elastic official website describes the installation clearly, it takes about three steps, and I won't elaborate. If you still need hand-holding to install software on Linux, you may want to reflect on your own searching skills.

Before discussing specific applications, let me describe the log-monitoring framework I built previously. I wrote every part of it myself, without using any open-source technology. That was not because open-source tools failed to meet the business requirements. I simply did not know what the industry was doing or which open-source log-monitoring solutions existed, so I never considered how to apply them.

In daily learning, many people, like me, reinvent the wheel repeatedly. But is reinventing the wheel really necessary? I think the necessity can be divided into three demand levels:

First level: Check whether an existing wheel, used well, already solves the problem.

Second level: If existing wheels cannot meet the requirements, consider extending one so that it does solve the problem.

Third level: Only when the industry has no mature solution for the problem should we build a new wheel that fits the current business needs, based on prior research.

In my previous log-monitoring framework I did not think through these levels, which led to two issues: performance, and whether its validation approach is suitable for the online (production) environment.

1. Performance issue: Rule validation currently runs separately from data collection and is deployed on an Nginx server. Under high concurrency the Nginx server cannot deliver the required QPS, so the framework is only suitable for test environments.

2. Online validation issue: In test environments we may focus on specific field values (e.g., flag can be 1, 2, or 3). In production, dirty data makes monitoring specific field values less meaningful: even when illegal values are detected, there is no effective way to act on them. Developers care more about year-over-year and month-over-month metrics, which the current framework does not cover. For these reasons a new log-monitoring framework is needed for the online environment, which is why I am considering the ELK stack.

The remaining question is how to use the existing ELK stack to meet current business needs. My earlier research focused on exactly this, especially on Logstash configuration.

Logstash: its role is to collect the relevant logs. Its configuration file consists of three sections: input, filter, and output. Input plugins specify the data sources, filter plugins process the raw data, and output plugins send the processed data to destinations such as files, email, or ElasticSearch.
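To make the three-part layout concrete, here is a minimal Python sketch of the same input, filter, output flow. It is not Logstash code: the function names (`read_input`, `drop_blank`, `add_length`, `run_pipeline`) and the sample lines are invented for illustration, and an in-memory list stands in for a real output destination.

```python
from typing import Callable, Iterable, Iterator, List, Optional

Event = dict  # one log event: field name -> value


def read_input(lines: Iterable[str]) -> Iterator[Event]:
    """Input stage: wrap each raw line in an event with a "message" field."""
    for line in lines:
        yield {"message": line.rstrip("\n")}


def drop_blank(event: Event) -> Optional[Event]:
    """Filter: discard events whose message is empty or whitespace."""
    return event if event["message"].strip() else None


def add_length(event: Event) -> Optional[Event]:
    """Filter: add a derived field, the way a filter can enrich an event."""
    event["length"] = len(event["message"])
    return event


def run_pipeline(
    lines: Iterable[str],
    filters: List[Callable[[Event], Optional[Event]]],
    output: Callable[[Event], None],
) -> None:
    """Pass every event through the filters in order, then hand it to the output."""
    for event in read_input(lines):
        current: Optional[Event] = event
        for apply_filter in filters:
            current = apply_filter(current)
            if current is None:  # a filter dropped the event
                break
        if current is not None:
            output(current)


# Hypothetical sample lines; a list stands in for ElasticSearch, a file, or email.
collected: List[Event] = []
run_pipeline(
    ["GET /index 200\n", "\n", "POST /login 500\n"],
    [drop_blank, add_length],
    collected.append,
)
```

The point of the sketch is only the separation of concerns: sources, processing rules, and destinations can each be swapped without touching the other two, which is what the three configuration sections give you.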

The difficulty lies in the filter section, because all filtering, adding, and deleting of raw data happens there, and in my case specifically in Grok. Since my previous framework filtered with regular expressions, I use the Grok plugin here as well. The rules follow two steps:

1. Collect and dissect data

For those not familiar with regex matching, you can debug patterns at Grok Debug (https://grokdebug.herokuapp.com/).
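Grok patterns are regular expressions with named captures (written `%{SYNTAX:SEMANTIC}` in Grok), so the idea behind step 1 can be tried out in plain Python. The sketch below is an approximation, not the configuration used in the demo: the log format, the field names, and the `dissect` helper are all assumptions made for illustration. The one detail borrowed from Grok is `_grokparsefailure`, the tag it adds by default when a line does not match.

```python
import re

# Hypothetical log format, assumed for illustration:
#   "<YYYY-MM-DD HH:MM:SS> <LEVEL> flag=<digits> <free text>"
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"flag=(?P<flag>\d+) "
    r"(?P<text>.*)"
)


def dissect(line: str) -> dict:
    """Split one raw line into named fields, keeping the raw line as "message"."""
    event = {"message": line, "tags": []}
    match = LINE_PATTERN.match(line)
    if match is None:
        # Mirror Grok's default behaviour: tag the event instead of dropping it.
        event["tags"].append("_grokparsefailure")
        return event
    event.update(match.groupdict())  # captured values are strings
    return event


parsed = dissect("2016-05-13 10:00:00 INFO flag=2 user login ok")
unparsed = dissect("not a log line")
```

Here `parsed` carries `timestamp`, `level`, `flag`, and `text` fields, while `unparsed` keeps only the raw message plus the failure tag, so malformed lines stay visible downstream.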

2. Based on the captured data, perform validation and add tags that flag the result.

tag_on_failure and add_tag control how match results are tagged; refer to the official guide for what each field means. The filtered data can now be output; if an alarm is needed, output to Email. In the demo, the output is set to ElasticSearch.
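As a rough illustration of the validate-and-tag step, the Python sketch below tags an event one way when a check passes (the role add_tag plays) and another way when it fails (the role tag_on_failure plays), then picks a destination from the tags. The allowed values come from the flag example earlier in this article; the tag names, the `validate` and `route` helpers, and the routing rule are assumptions, not the demo's actual configuration.

```python
# From the article's example: flag can be 1, 2, or 3 (kept as strings,
# because values captured by a regex are strings).
ALLOWED_FLAGS = {"1", "2", "3"}

# Hypothetical tag names chosen for this sketch.
OK_TAG = "flag_ok"
FAIL_TAG = "flag_invalid"


def validate(event: dict) -> dict:
    """Return a copy of the event with a tag recording the validation result."""
    tags = list(event.get("tags", []))
    if event.get("flag") in ALLOWED_FLAGS:
        tags.append(OK_TAG)    # success path, comparable to add_tag
    else:
        tags.append(FAIL_TAG)  # failure path, comparable to tag_on_failure
    return {**event, "tags": tags}


def route(event: dict) -> str:
    """Pick an output name: alarm by email on failure, otherwise index the event."""
    return "email" if FAIL_TAG in event.get("tags", []) else "elasticsearch"


good = validate({"flag": "2"})
bad = validate({"flag": "9"})
missing = validate({"message": "no flag field here"})
```

Tagging rather than dropping invalid events keeps them searchable later, and lets the output stage decide whether a failure deserves an alarm.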

After the data is sent to ElasticSearch, it can be viewed through the RESTful API or Kibana; further details are omitted. One point to note: if the ElasticSearch index is named by date from @timestamp (the logstash-YYYY.MM.dd pattern), you must set a time range in Kibana's Discover that actually covers the data, otherwise no data will appear.
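The "no data" symptom is easier to reason about once the two moving parts are written down. The Python sketch below is an illustration only: `daily_index` and `visible_in_window` are hypothetical helpers, the first mimicking how a date-based index name is derived from an event's UTC timestamp and the second mimicking a relative time filter such as "last 15 minutes".

```python
from datetime import datetime, timedelta, timezone


def daily_index(event_time: datetime, prefix: str = "logstash-") -> str:
    """Name of the daily index for an event, computed from its UTC date."""
    utc_time = event_time.astimezone(timezone.utc)
    return prefix + utc_time.strftime("%Y.%m.%d")


def visible_in_window(event_time: datetime, now: datetime, window: timedelta) -> bool:
    """True if the event falls inside a relative time filter ending at `now`."""
    return now - window <= event_time <= now


# An event logged at 07:30 local time in UTC+8 belongs to the previous UTC day.
beijing = timezone(timedelta(hours=8))
event_time = datetime(2016, 5, 13, 7, 30, tzinfo=beijing)
index_name = daily_index(event_time)

# Looking at the data two hours later with a narrow and a wide window.
now = event_time + timedelta(hours=2)
seen_narrow = visible_in_window(event_time, now, timedelta(minutes=15))
seen_wide = visible_in_window(event_time, now, timedelta(days=1))
```

Two things follow: events near midnight local time can land in a different day's index than expected, and older or replayed logs stay hidden until the time range is widened to include their timestamps.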

The demo then reports what percentage of events are info and what percentage are error.
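A breakdown like this is a simple count-and-divide over a level field. The Python sketch below shows the arithmetic under the assumption that each event carries a `level` field; the `level_percentages` helper and the sample events are made up for illustration and are not the demo's data.

```python
from collections import Counter
from typing import Dict, Iterable


def level_percentages(events: Iterable[dict]) -> Dict[str, float]:
    """Share of events per level, as percentages of the total."""
    counts = Counter(event["level"] for event in events)
    total = sum(counts.values())
    if total == 0:
        return {}  # avoid dividing by zero on an empty input
    return {level: 100.0 * count / total for level, count in counts.items()}


# Made-up sample: three info events and one error event.
sample = [
    {"level": "info"},
    {"level": "info"},
    {"level": "error"},
    {"level": "info"},
]
shares = level_percentages(sample)
```

For this made-up sample, `shares` maps `info` to 75.0 and `error` to 25.0.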

ElasticSearch performance optimization will be covered later.


