Fluentd Installation, Configuration, and Usage Guide

This guide explains how to install Fluentd via Docker or manual methods, configure the environment (NTP, file descriptors, kernel parameters), use the td-agent package, manage services with systemd, send test logs, and understand Fluentd's event lifecycle, filters, labels, and configuration directives.

DevOps Cloud Academy

Jun 17, 2022

Installation

Fluentd can be installed in several ways, such as using Docker containers or manual installation. Before a manual installation, ensure the host environment is properly configured to avoid inconsistencies.

Environment Configuration

Follow these recommendations:

Set up NTP (e.g., chrony or ntpd) to keep accurate timestamps.

Increase the maximum number of file descriptors.

Optimize kernel network parameters.

Set NTP

It is strongly advised to run an NTP daemon such as chrony or ntpd on each node.

Increase File Descriptor Limit

Check the current limit with:

$ ulimit -n
65535

If the command prints 1024, the limit is too low for Fluentd. Add the following lines to /etc/security/limits.conf and reboot:

root soft nofile 65536
root hard nofile 65536
* soft nofile 65536
* hard nofile 65536

When running Fluentd under systemd, you can also set LimitNOFILE=65536. The td-agent package sets this value by default.
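The same check can be scripted. The following is a minimal Python sketch, not part of Fluentd, that reads the open-file limits of the current process with the standard resource module (Unix only). The function names and the 65536 threshold constant are illustrative assumptions taken from the recommendation above.

```python
import resource

# Threshold taken from the limits.conf example above (an assumption, tune as needed).
RECOMMENDED_NOFILE = 65536


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def is_sufficient(soft_limit, recommended=RECOMMENDED_NOFILE):
    """Return True when the soft limit meets the recommended value."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # "unlimited" always satisfies the recommendation
    return soft_limit >= recommended


soft, hard = nofile_limits()
print(f"soft={soft} hard={hard} sufficient={is_sufficient(soft)}")
```

Note that the limits reported are those of the process running the script; a service started by systemd may have different limits than an interactive shell.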

Optimize Network Kernel Parameters

Add the following settings to /etc/sysctl.conf for high‑load environments with many Fluentd instances:

net.core.somaxconn = 1024
net.core.netdev_max_backlog = 5000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_max_syn_backlog = 8096
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240 65535

Apply the changes with sysctl -p or by restarting the node.
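To confirm that the settings took effect, the desired values can be compared with what the running kernel reports. This is a rough Python sketch under the assumption that sysctl keys map to files under /proc/sys by replacing dots with slashes, which holds for the keys listed above. The helper names are hypothetical.

```python
from pathlib import Path


def sysctl_path(key, root="/proc/sys"):
    """Map a sysctl key such as net.core.somaxconn to its /proc/sys file."""
    return Path(root, *key.split("."))


def parse_sysctl_conf(text):
    """Parse 'key = value' lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, _, value = line.partition("=")
        # Normalize runs of whitespace so multi-value settings compare cleanly.
        settings[key.strip()] = " ".join(value.split())
    return settings


def read_live(key):
    """Return the running kernel's value for key, or None if it cannot be read."""
    try:
        return " ".join(sysctl_path(key).read_text().split())
    except OSError:
        return None


wanted = parse_sysctl_conf("net.core.somaxconn = 1024\nnet.ipv4.tcp_tw_reuse = 1\n")
for key, value in wanted.items():
    print(f"{key}: wanted={value} live={read_live(key)}")
```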

td-agent Package

td-agent is the stable distribution package of Fluentd, maintained by Treasure Data, Inc. and Calyptia, Inc. Fluentd itself is written in Ruby, with performance-critical parts in C.

For Ubuntu Focal, install it with the one-line install script:

curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-focal-td-agent4.sh | sh

The script creates a systemd service. The generated /lib/systemd/system/td-agent.service looks like:

[Unit]
Description=td-agent: Fluentd based data collector for Treasure Data
Documentation=https://docs.treasuredata.com/display/public/PD/About+Treasure+Data%27s+Server-Side+Agent
After=network-online.target
Wants=network-online.target

[Service]
User=td-agent
Group=td-agent
LimitNOFILE=65536
Environment=LD_PRELOAD=/opt/td-agent/lib/libjemalloc.so
Environment=GEM_HOME=/opt/td-agent/lib/ruby/gems/2.7.0/
Environment=GEM_PATH=/opt/td-agent/lib/ruby/gems/2.7.0/
Environment=FLUENT_CONF=/etc/td-agent/td-agent.conf
Environment=FLUENT_PLUGIN=/etc/td-agent/plugin
Environment=FLUENT_SOCKET=/var/run/td-agent/td-agent.sock
Environment=TD_AGENT_LOG_FILE=/var/log/td-agent/td-agent.log
Environment=TD_AGENT_OPTIONS=
EnvironmentFile=-/etc/default/td-agent
PIDFile=/var/run/td-agent/td-agent.pid
RuntimeDirectory=td-agent
Type=forking
ExecStart=/opt/td-agent/bin/fluentd --log $TD_AGENT_LOG_FILE --daemon /var/run/td-agent/td-agent.pid $TD_AGENT_OPTIONS
ExecStop=/bin/kill -TERM ${MAINPID}
ExecReload=/bin/kill -HUP ${MAINPID}
Restart=always
TimeoutStopSec=120

[Install]
WantedBy=multi-user.target

Manage the service with systemctl:

sudo systemctl start td-agent
sudo systemctl status td-agent

Send a test log:

curl -X POST -d 'json={"json":"message"}' http://localhost:8888/debug.test

Verify the log:

tail -n 1 /var/log/td-agent/td-agent.log
2022-06-11 11:09:02.377608475 +0800 debug.test: {"json":"message"}
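The curl call above can also be issued from application code. Below is a small Python sketch using only the standard library. It assumes, as in the curl example, that the HTTP input listens on localhost:8888, takes the tag from the URL path, and reads the record from a form field named json. The function names build_request and send are illustrative.

```python
import json
import urllib.parse
import urllib.request


def build_request(tag, record, host="localhost", port=8888):
    """Build a POST request for Fluentd's HTTP input; the URL path is the tag."""
    body = urllib.parse.urlencode({"json": json.dumps(record)}).encode("ascii")
    url = f"http://{host}:{port}/{tag}"
    return urllib.request.Request(url, data=body, method="POST")


def send(request, timeout=5):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


request = build_request("debug.test", {"json": "message"})
print(request.get_method(), request.full_url)
```

Calling send(request) with td-agent running should produce the same log line as the curl command; the sketch only builds the request so that it can be inspected without a running agent.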

Docker Method

Using Docker is often more convenient. Create a simple configuration directory:

mkdir fluentd && cd fluentd
mkdir -p etc logs

Add a basic config file etc/fluentd_basic.conf:

<source>
@type http
port 8888
bind 0.0.0.0
</source>
<match test.logs>
@type stdout
</match>

Run Fluentd in a container, mounting the config and log directories:

docker run -p 8888:8888 --rm -v $(pwd)/etc:/fluentd/etc -v $(pwd)/logs:/fluentd/logs fluent/fluentd:v1.14-1 -c /fluentd/etc/fluentd_basic.conf -v

Send a log entry:

curl -X POST -d 'json={"action":"login","user":100}' http://localhost:8888/test.logs

The container outputs the received event:

2022-06-11 07:34:29.925695338 +0000 test.logs: {"action":"login","user":100}

Event Lifecycle

Each log message is treated as an Event consisting of three parts:

Tag – identifies the source of the event.

Time – Unix timestamp of when the event occurred.

Record – the event payload in JSON format.

For example, an Apache access log parsed by the in_tail plugin becomes:

tag: apache.access
time: 1362020400
record: {"user":"-","method":"GET","code":200,"size":777,"host":"192.168.0.1","path":"/"}

Fluentd then processes the event through a pipeline of field modifications, filters, and routing.
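As a mental model, the three-part event can be written down as a small data structure. The Python sketch below is purely illustrative; Fluentd itself is implemented in Ruby and the Event class and its methods are hypothetical names, not a Fluentd API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    tag: str      # dot-separated origin identifier, used for routing
    time: int     # Unix timestamp in seconds
    record: dict  # key-value payload

    def tag_parts(self):
        """Split the tag into the parts that match patterns operate on."""
        return self.tag.split(".")

    def utc_time(self):
        """Return the event time as a timezone-aware UTC datetime."""
        return datetime.fromtimestamp(self.time, tz=timezone.utc)


event = Event(
    tag="apache.access",
    time=1362020400,
    record={"user": "-", "method": "GET", "code": 200, "size": 777,
            "host": "192.168.0.1", "path": "/"},
)
print(event.tag_parts(), event.utc_time().isoformat())
```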

Filter Example

A filter can accept or reject events. Create etc/fluentd_filter.conf to exclude logs where action equals logout:

<source>
@type http
port 8888
bind 0.0.0.0
</source>
<filter test.logs>
@type grep
<exclude>
key action
pattern ^logout$
</exclude>
</filter>
<match test.logs>
@type stdout
</match>

Restart Fluentd with the new config and send two events, one with action set to login and one with logout; only the login event reaches the output.
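The effect of the exclude rule can be reproduced in a few lines of Python. This is a simplified sketch of the behavior described above, not the grep plugin's implementation; grep_exclude is a hypothetical helper name.

```python
import re


def grep_exclude(records, key, pattern):
    """Drop records whose value for key matches pattern; keep all others."""
    regex = re.compile(pattern)
    kept = []
    for record in records:
        value = record.get(key)
        if value is not None and regex.search(str(value)):
            continue  # matched the exclude rule, so the event is discarded
        kept.append(record)
    return kept


incoming = [
    {"action": "login", "user": 100},
    {"action": "logout", "user": 100},
]
print(grep_exclude(incoming, "action", r"^logout$"))
```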

Labels

Labels (@label) allow separate processing pipelines for different inputs. Example fluentd_labels.conf:

<source>
@type http
port 8888
bind 0.0.0.0
@label @TEST
</source>

<filter test.logs>
@type grep
<exclude>
key action
pattern ^login$
</exclude>
</filter>

<label @TEST>
<filter test.logs>
@type grep
<exclude>
key action
pattern ^logout$
</exclude>
</filter>
<match test.logs>
@type stdout
</match>
</label>

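Because the source carries @label @TEST, its events skip the top-level filter and are handled only inside the label @TEST block. Sending both login and logout events therefore results in only the login event being output.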
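The routing decision in the label example can be modeled as choosing one of two independent pipelines. The Python sketch below is a toy model under that assumption; the PIPELINES table and the route function are hypothetical and only mirror the two grep filters from the config.

```python
import re

# Exclude patterns per pipeline, mirroring the config above.
# None stands for the top-level (unlabeled) pipeline.
PIPELINES = {
    None: [r"^login$"],
    "@TEST": [r"^logout$"],
}


def route(record, label=None):
    """Return the record if it survives the pipeline selected by label, else None."""
    for pattern in PIPELINES[label]:
        if re.search(pattern, str(record.get("action", ""))):
            return None
    return record


for action in ("login", "logout"):
    print(action, "->", route({"action": action}, label="@TEST"))
```

Only the filters inside the selected pipeline are consulted, which is why the top-level login filter never sees events from the labeled source.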

Configuration File Directives

Fluentd configuration uses several directives:

source – defines input plugins.

match – defines output destinations.

filter – defines event processing pipelines.

system – sets system-wide options.

label – groups internal routing and filters.

@include – includes external configuration files.

parse – specifies how raw logs are parsed.

Example of a tail source reading Apache logs:

<source>
@type tail
path /var/log/httpd-access.log
pos_file /var/log/td-agent/httpd-access.log.pos
tag apache.access
<parse>
@type apache2
</parse>
</source>

Example of a match that writes events to files with buffering:

<match pattern>
@type file
path /var/log/fluent/myapp
compress gzip
<buffer>
timekey 1d
timekey_use_utc true
timekey_wait 10m
</buffer>
</match>

Output plugins include out_file, out_stdout, out_s3, out_elasticsearch, and others. An output plugin runs in one of three buffering modes: non-buffered, synchronous buffered, or asynchronous buffered.
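The buffer settings in the file output example above determine when a chunk is written. The following Python sketch works through the arithmetic, assuming that with timekey 1d and timekey_use_utc true chunks cover whole UTC days, and that a chunk becomes eligible for writing once its window has ended and timekey_wait has elapsed. The function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def chunk_window(event_time, timekey=timedelta(days=1)):
    """Return (start, end) of the UTC-aligned timekey window containing event_time."""
    seconds = int(timekey.total_seconds())
    ts = int(event_time.timestamp())
    start = datetime.fromtimestamp(ts - ts % seconds, tz=timezone.utc)
    return start, start + timekey


def earliest_flush(event_time, timekey=timedelta(days=1), wait=timedelta(minutes=10)):
    """Return the earliest time the chunk holding event_time may be written."""
    _, end = chunk_window(event_time, timekey)
    return end + wait


when = datetime(2022, 6, 11, 11, 9, 2, tzinfo=timezone.utc)
print(chunk_window(when))
print(earliest_flush(when))
```

The wait period gives late-arriving events a chance to land in the correct chunk before it is written.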

Record Transformer Example

<source>
@type http
port 9880
</source>

<filter myapp.access>
@type record_transformer
<record>
host_param "#{Socket.gethostname}"
</record>
</filter>

<match myapp.access>
@type file
path /var/log/fluent/access
</match>

The transformer adds a host_param field to each event before it is written to a file.
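In plain code, the transformation amounts to merging one extra key into every record. The Python sketch below illustrates this, assuming the hostname is looked up once when the transformer is created, in the same way the embedded Ruby expression in the config is evaluated when the configuration is loaded. make_transformer is a hypothetical name.

```python
import socket


def make_transformer(field="host_param"):
    """Build a function that adds the local hostname to each record."""
    hostname = socket.gethostname()  # resolved once, at setup time

    def transform(record):
        # Return a new dict so the incoming record is left untouched.
        return {**record, field: hostname}

    return transform


transform = make_transformer()
print(transform({"method": "GET", "path": "/"}))
```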

System Directive Example

<system>
log_level error
without_source
</system>

Setting process_name changes the process name shown by ps:

<system>
process_name fluentd1
</system>

Label Example with @SYSTEM

<source>
@type forward
</source>

<source>
@type tail
@label @SYSTEM
</source>

<filter access.**>
@type record_transformer
<record>
# ...
</record>
</filter>

<match **>
@type elasticsearch
</match>

<label @SYSTEM>
<filter var.log.middleware.**>
@type grep
# ...
</filter>
<match **>
@type s3
# ...
</match>
</label>

The built-in labels @ERROR and @ROOT serve special purposes: @ERROR receives events that plugins emit as errors, and @ROOT routes events back to the default (top-level) pipeline.

Parse Directive

Parsing can be done with built‑in parsers such as json, nginx, multiline, etc. Example multiline parser for a Rails request log:

<parse>
@type multiline
format_firstline /^Started/
format1 /Started (?<method>[^ ]+) "(?<path>[^"]+)" for (?<host>[^ ]+) at (?<time>[^ ]+ [^ ]+ [^ ]+)\n/
format2 /Processing by (?<controller>[^#]+)#(?<controller_method>[^ ]+) as (?<format>[^ ]+)\n/
format3 /(  Parameters: (?<parameters>[^ ]+)\n)?/
format4 /  Rendered (?<template>[^ ]+) within (?<layout>.+) \([\d\.]+ms\)\n/
format5 /Completed (?<code>[^ ]+) [^ ]+ in (?<runtime>[\d\.]+)ms \(Views: (?<view_runtime>[\d\.]+)ms \| ActiveRecord: (?<ar_runtime>[\d\.]+)ms\)/
</parse>

Another example parses Java stack traces:

<parse>
@type multiline
format_firstline /\d{4}-\d{1,2}-\d{1,2}/
format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
</parse>
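To see how the Java example groups and parses lines, the two expressions can be tried out in Python. This is a rough sketch, not the multiline parser itself: the named groups are rewritten in Python syntax, the dot is assumed to match newlines, and the sample log lines are made up for illustration.

```python
import re

# Python equivalents of format_firstline and format1 from the config above.
FIRSTLINE = re.compile(r"\d{4}-\d{1,2}-\d{1,2}")
FORMAT1 = re.compile(
    r"^(?P<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) "
    r"\[(?P<thread>.*)\] (?P<level>[^\s]+)(?P<message>.*)",
    re.DOTALL,  # assumption: '.' also matches newlines inside one event
)

# Hypothetical sample input, not taken from a real application.
SAMPLE = [
    "2022-06-11 11:09:02 [main] INFO Main - Start",
    "2022-06-11 11:09:03 [main] ERROR Main - Exception",
    "java.lang.IllegalStateException: boom",
    "    at Main.main(Main.java:16)",
    "2022-06-11 11:09:04 [main] INFO Main - End",
]


def group_lines(lines):
    """Start a new event at each line matching FIRSTLINE; append other lines."""
    events, current = [], []
    for line in lines:
        if FIRSTLINE.search(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events


def parse(text):
    """Return the named groups for one grouped event, or None if it does not match."""
    match = FORMAT1.match(text)
    return match.groupdict() if match else None


for text in group_lines(SAMPLE):
    print(parse(text))
```

The stack trace lines do not match the first-line pattern, so they are folded into the message of the preceding ERROR event.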

Pattern Matching

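Fluentd routes events by matching their tags against patterns. A pattern can use wildcards (* matches a single tag part, ** matches zero or more tag parts), sets ({a,b}), and embedded Ruby expressions (#{...}). Match directives are evaluated in the order they appear, so specific patterns should be placed before generic ones.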
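The wildcard rules and the importance of ordering can be illustrated with a small matcher. The Python sketch below is a simplified model, not Fluentd's own matcher: it handles *, ** and flat {a,b} sets only, ignores embedded Ruby expressions, and the names compile_pattern and first_match are hypothetical.

```python
import re


def compile_pattern(pattern):
    """Translate one tag pattern into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith(".**", i):
            parts.append(r"(?:\.[^.]+)*")   # zero or more trailing tag parts
            i += 3
        elif i == 0 and pattern.startswith("**.", i):
            parts.append(r"(?:[^.]+\.)*")   # zero or more leading tag parts
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".*")             # anything, including dots
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^.]+")          # exactly one tag part
            i += 1
        elif pattern[i] == "{":
            end = pattern.index("}", i)
            choices = pattern[i + 1:end].split(",")
            parts.append("(?:" + "|".join(re.escape(c) for c in choices) + ")")
            i = end + 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def first_match(tag, patterns):
    """Return the first pattern, in order, that matches the whole tag."""
    for pattern in patterns:
        if compile_pattern(pattern).fullmatch(tag):
            return pattern
    return None


print(first_match("test.logs", ["**", "test.logs"]))
print(first_match("test.logs", ["test.logs", "**"]))
```

With the catch-all pattern listed first, the specific pattern is never reached, which is the reason for placing specific patterns before generic ones.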

Reference Documentation

For more details, consult the official Fluentd documentation at https://docs.fluentd.org.
