
Master ELK Log Processing: Encoding, Multiline, Grok, and Performance Tuning

This article compiles practical ELK knowledge, covering character‑set conversion, removing unwanted log lines, Grok pattern handling for multi‑line logs, multiline plugin usage in Filebeat and Logstash, date filtering, log type classification, performance optimization, Redis buffering, and Elasticsearch node tuning.


1. ELK Practical Knowledge

1.1 Encoding Conversion

Problem: Chinese characters appear garbled because the log files are encoded in GB2312 rather than UTF‑8. Convert them with a Logstash codec:

<code>codec => plain {
  charset => "GB2312"
}</code>
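The codec fragment above sits inside an input plugin. A minimal sketch with a file input (the path is illustrative):

<code>input {
  file {
    path => ["/var/log/app/app.log"]
    start_position => "beginning"
    # Decode GB2312 bytes into UTF-8 events
    codec => plain {
      charset => "GB2312"
    }
  }
}</code>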

Filebeat can also perform the conversion:

<code>filebeat.prospectors:
- input_type: log
  paths:
  - C:/Users/Administrator/Desktop/performanceTrace.txt
  encoding: GB2312</code>

1.2 Deleting Unnecessary Log Lines

Use a Logstash drop filter to remove lines that match a pattern:

<code>if [message] =~ "^20.*- task request,.*,start time.*" {
  drop {}
}</code>

1.3 Grok Handling for Multiple Log Lines

Grok match patterns for the timestamp line and the request/response sections of such an entry:

<code>match => {"message" => "^20.*- task request,.*,start time:%{TIMESTAMP_ISO8601:RequestTime}"}
match => {"message" => "^-- Request String : {\"UserName\":%{NUMBER:UserName:int},...}"}
match => {"message" => "^-- Response String : {\"ErrorCode\":%{NUMBER:ErrorCode:int},...}"}</code>
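The three patterns above can also live in a single grok filter: grok tries each entry in the match array in order until one succeeds (break_on_match defaults to true). A sketch, with the patterns abbreviated as in the original:

<code>grok {
  match => {
    "message" => [
      "^20.*- task request,.*,start time:%{TIMESTAMP_ISO8601:RequestTime}",
      "^-- Request String : {\"UserName\":%{NUMBER:UserName:int},...}",
      "^-- Response String : {\"ErrorCode\":%{NUMBER:ErrorCode:int},...}"
    ]
  }
}</code>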

1.4 Multiline Log Merging (Key Point)

Filebeat multiline configuration (recommended):

<code>filebeat.prospectors:
- input_type: log
  paths:
  - /root/performanceTrace*
  multiline.pattern: '.*"WaitInterval":.*-- End'
  multiline.negate: true
  multiline.match: before</code>

Older Filebeat versions (using after):

<code>filebeat.prospectors:
- input_type: log
  paths:
  - /root/performanceTrace*
  multiline.pattern: '^20.*'
  multiline.negate: true
  multiline.match: after</code>

Logstash input multiline (when Filebeat is not used):

<code>input {
  file {
    path => ["/root/logs/log2"]
    start_position => "beginning"
    codec => multiline {
      pattern => "^20.*"
      negate => true
      what => "previous"
    }
  }
}</code>

Logstash filter multiline (not recommended because it forces pipeline workers to 1):

<code>filter {
  multiline {
    pattern => "^20.*"
    negate => true
    what => "previous"
  }
}</code>

1.5 Date Filter Usage

Convert log timestamps to @timestamp:

<code>date {
  match => ["InsertTime", "YYYY-MM-dd HH:mm:ss"]
  remove_field => "InsertTime"
}</code>
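In context, the field is usually extracted by grok first and then parsed. A sketch assuming the timestamp was captured into InsertTime:

<code>filter {
  # Capture the timestamp into a named field first
  grok {
    match => {"message" => "%{TIMESTAMP_ISO8601:InsertTime}"}
  }
  # Parse it into @timestamp, then drop the intermediate field
  date {
    match => ["InsertTime", "YYYY-MM-dd HH:mm:ss"]
    target => "@timestamp"
    remove_field => "InsertTime"
  }
}</code>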

2. Multi‑Type Log Classification

Define type fields in Filebeat to separate logs:

<code>filebeat.prospectors:
- paths: [/mnt/data_total/WebApiDebugLog.txt*]
  fields:
    type: WebApiDebugLog_total
- paths: [/mnt/data_request/WebApiDebugLog.txt*]
  fields:
    type: WebApiDebugLog_request
- paths: [/mnt/data_report/WebApiDebugLog.txt*]
  fields:
    type: WebApiDebugLog_report</code>

Use Logstash if statements to apply different filters or outputs based on [fields][type]:

<code>filter {
  if [fields][type] == "WebApiDebugLog_request" {
    # request‑specific processing
    if [message] =~ "^20.*- task report,.*,start time.*" {
      drop {}
    }
    grok { match => {"message" => "..."} }
  }
}</code>
<code>output {
  if [fields][type] == "WebApiDebugLog_total" {
    elasticsearch {
      hosts => ["6.6.6.6:9200"]
      index => "logstash-WebApiDebugLog_total-%{+YYYY.MM.dd}"
      document_type => "WebApiDebugLog_total_logs"
    }
  }
}</code>

3. Overall ELK Performance Optimization

Key observations on a 1 CPU / 4 GB RAM server:

Logstash processes ~500 logs/s; removing Ruby scripts raises it to ~660 logs/s; removing Grok can reach ~1000 logs/s.

Filebeat can ingest 2500‑3500 logs/s, handling ~64 GB per day per node.

Logstash becomes the bottleneck when pulling from Redis; one instance handles ~6000 logs/s, two instances ~10000 logs/s (CPU saturated).

Recommendations:

Increase pipeline.workers to match CPU cores.

Adjust pipeline.output.workers and pipeline.batch.size (e.g., 1000) for higher throughput.

Set an appropriate pipeline.batch.delay (e.g., 10).
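These settings go in logstash.yml. A sketch for an 8‑core machine (the values are starting points to tune under load, not definitive numbers):

<code># logstash.yml
pipeline.workers: 8
pipeline.output.workers: 8
pipeline.batch.size: 1000
pipeline.batch.delay: 10</code>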

4. Introducing Redis as a Buffer

Use Redis list or pub/sub to decouple Filebeat from Logstash, preventing data loss on Logstash failure. Recommended Redis settings for a pure queue:

<code>bind 0.0.0.0
requirepass ilinux.io
save ""
appendonly no
maxmemory 0</code>
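With Redis in the middle, Filebeat pushes events onto a list and Logstash consumes them. A sketch of both ends (the host and key are illustrative; the password matches the requirepass above):

<code># Filebeat side (filebeat.yml)
output.redis:
  hosts: ["5.5.5.5:6379"]
  password: "ilinux.io"
  key: "filebeat"
  datatype: "list"</code>

<code># Logstash side
input {
  redis {
    host => "5.5.5.5"
    port => 6379
    password => "ilinux.io"
    key => "filebeat"
    data_type => "list"
  }
}</code>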

5. Elasticsearch Node Tuning

System parameters (e.g., in /etc/sysctl.conf; apply with sysctl -p):

<code>vm.swappiness = 1
net.core.somaxconn = 65535
vm.max_map_count = 262144
fs.file-max = 518144</code>

Limits (/etc/security/limits.conf) for the elasticsearch user:

<code>elasticsearch soft nofile 65535
elasticsearch hard nofile 65535
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited</code>

JVM heap: set -Xms and -Xmx to the same value, not exceeding 50% of physical RAM and staying below 32 GB.
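For example, on a node with 64 GB of RAM the heap would be capped under the 32 GB compressed‑oops threshold, set in jvm.options:

<code># jvm.options — equal min/max heap, below 32 GB
-Xms30g
-Xmx30g</code>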

Elasticsearch elasticsearch.yml optimizations (memory lock, TCP compression, cache sizes, thread‑pool settings) improve stability and query performance.
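A sketch of such elasticsearch.yml settings (values are illustrative and setting names are version‑dependent; verify against the docs for your ES version):

<code># elasticsearch.yml
bootstrap.memory_lock: true          # pin heap in RAM (requires memlock limits above)
transport.tcp.compress: true         # compress inter-node traffic
indices.queries.cache.size: 10%      # query cache
indices.fielddata.cache.size: 20%    # fielddata cache
thread_pool.search.queue_size: 1000  # larger search queue</code>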

6. Monitoring and Health Checks

Regularly check CPU, memory, disk I/O, network I/O, and JVM heap usage. Use tools such as top, iostat, dstat, and iftop. Ensure Logstash workers are sized appropriately to avoid high CPU caused by garbage collection.

Written by Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
