Fluentd Installation, Configuration, and Usage Guide
This guide explains how to install Fluentd via Docker or manual methods, configure the environment (NTP, file descriptors, kernel parameters), use the td-agent package, manage services with systemd, send test logs, and understand Fluentd's event lifecycle, filters, labels, and configuration directives.
Installation
Fluentd can be installed in several ways, such as using Docker containers or manual installation. Before a manual installation, ensure the host environment is properly configured to avoid inconsistencies.
Environment Configuration
Follow these recommendations:
Set up NTP (e.g., chrony or ntpd) to keep accurate timestamps.
Increase the maximum number of file descriptors.
Optimize kernel network parameters.
Set NTP
It is strongly advised to run an NTP daemon such as chrony or ntpd on each node.
Increase File Descriptor Limit
Check the current limit with:
$ ulimit -n
65535
If the console shows 1024, add the following lines to /etc/security/limits.conf and reboot:
root soft nofile 65536
root hard nofile 65536
* soft nofile 65536
* hard nofile 65536
When running Fluentd under systemd, you can also set LimitNOFILE=65536 in the service unit. The td-agent package sets this value by default.
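As a quick check of the limit discussed above, the following sketch compares the soft open-file limit of the current shell with a target value. The function name check_nofile_limit is hypothetical, and 65536 is simply the value used in this guide. It inspects the invoking shell only, not a Fluentd process that is already running under systemd.

```shell
# Sketch: report whether the current shell's open-file limit meets a target.
# check_nofile_limit is a hypothetical helper; 65536 mirrors this guide.
check_nofile_limit() {
  required="${1:-65536}"
  soft="$(ulimit -S -n)"
  # "unlimited" always satisfies the requirement.
  if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$required" ]; then
    echo "ok: soft nofile limit is $soft (required $required)"
    return 0
  fi
  echo "low: soft nofile limit is $soft (required $required)"
  return 1
}

check_nofile_limit 65536 || echo "raise the limit in /etc/security/limits.conf or via LimitNOFILE"
```

Run it as the user that will start Fluentd, since limits.conf entries apply per user.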
Optimize Network Kernel Parameters
Add the following settings to /etc/sysctl.conf for high‑load environments with many Fluentd instances:
net.core.somaxconn = 1024
net.core.netdev_max_backlog = 5000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_wmem = 4096 12582912 16777216
net.ipv4.tcp_rmem = 4096 12582912 16777216
net.ipv4.tcp_max_syn_backlog = 8096
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240 65535
Apply the changes with sysctl -p or by restarting the node.
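To see how far a host is from the recommended values, a small sketch like the one below can read the live settings from /proc/sys and print them next to the targets. The helper names sysctl_proc_path and report_sysctl are hypothetical, only a few of the keys above are listed, and on hosts without /proc/sys the current value is reported as unavailable.

```shell
# Sketch: compare live kernel settings with some recommended values.
# sysctl_proc_path and report_sysctl are hypothetical helper names.

# Convert a sysctl key such as net.core.somaxconn to its /proc/sys path.
sysctl_proc_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print the current value next to the recommended one for each listed key.
report_sysctl() {
  while read -r key want; do
    path="$(sysctl_proc_path "$key")"
    if [ -r "$path" ]; then
      # Collapse tabs/newlines so multi-value settings print on one line.
      have="$(tr -s '[:space:]' ' ' < "$path" | sed 's/ $//')"
    else
      have="(not available on this host)"
    fi
    printf '%s: current=%s recommended=%s\n' "$key" "$have" "$want"
  done <<'EOF'
net.core.somaxconn 1024
net.core.netdev_max_backlog 5000
net.ipv4.tcp_max_syn_backlog 8096
net.ipv4.tcp_tw_reuse 1
EOF
}

report_sysctl
```

The script only reads values; applying them still goes through /etc/sysctl.conf and sysctl -p.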
td-agent Package
The stable Fluentd distribution td-agent is maintained by Treasure Data, Inc. and Calyptia, Inc. It is written in Ruby with performance‑critical parts in C.
For Ubuntu Focal, install it with a one‑click script:
curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-focal-td-agent4.sh | sh
The script creates a systemd service. The generated /lib/systemd/system/td-agent.service looks like:
[Unit]
Description=td-agent: Fluentd based data collector for Treasure Data
Documentation=https://docs.treasuredata.com/display/public/PD/About+Treasure+Data%27s+Server-Side+Agent
After=network-online.target
Wants=network-online.target
[Service]
User=td-agent
Group=td-agent
LimitNOFILE=65536
Environment=LD_PRELOAD=/opt/td-agent/lib/libjemalloc.so
Environment=GEM_HOME=/opt/td-agent/lib/ruby/gems/2.7.0/
Environment=GEM_PATH=/opt/td-agent/lib/ruby/gems/2.7.0/
Environment=FLUENT_CONF=/etc/td-agent/td-agent.conf
Environment=FLUENT_PLUGIN=/etc/td-agent/plugin
Environment=FLUENT_SOCKET=/var/run/td-agent/td-agent.sock
Environment=TD_AGENT_LOG_FILE=/var/log/td-agent/td-agent.log
Environment=TD_AGENT_OPTIONS=
EnvironmentFile=-/etc/default/td-agent
PIDFile=/var/run/td-agent/td-agent.pid
RuntimeDirectory=td-agent
Type=forking
ExecStart=/opt/td-agent/bin/fluentd --log $TD_AGENT_LOG_FILE --daemon /var/run/td-agent/td-agent.pid $TD_AGENT_OPTIONS
ExecStop=/bin/kill -TERM ${MAINPID}
ExecReload=/bin/kill -HUP ${MAINPID}
Restart=always
TimeoutStopSec=120
[Install]
WantedBy=multi-user.target
Manage the service with systemctl:
sudo systemctl start td-agent
sudo systemctl status td-agent
Send a test log:
curl -X POST -d 'json={"json":"message"}' http://localhost:8888/debug.test
Verify the log:
tail -n 1 /var/log/td-agent/td-agent.log
2022-06-11 11:09:02.377608475 +0800 debug.test: {"json":"message"}
Docker Method
Using Docker is often more convenient. Create a simple configuration directory:
mkdir fluentd && cd fluentd
mkdir -p etc logs
Add a basic config file etc/fluentd_basic.conf:
<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>
<match **>
  @type stdout
</match>
Run Fluentd in a container, mounting the config and log directories:
docker run -p 8888:8888 --rm -v $(pwd)/etc:/fluentd/etc -v $(pwd)/logs:/fluentd/logs fluent/fluentd:v1.14-1 -c /fluentd/etc/fluentd_basic.conf -v
Send a log entry:
curl -X POST -d 'json={"action":"login","user":100}' http://localhost:8888/test.logs
The container outputs the received event:
2022-06-11 07:34:29.925695338 +0000 test.logs: {"action":"login","user":100}
Event Lifecycle
Each log message is treated as an Event consisting of three parts:
Tag – identifies the source of the event.
Time – Unix timestamp of when the event occurred.
Record – the event payload in JSON format.
For example, an Apache access log parsed by the in_tail plugin becomes:
tag: apache.access
time: 1362020400
record: {"user":"-","method":"GET","code":200,"size":777,"host":"192.168.0.1","path":"/"}
Fluentd then processes the event through a pipeline of field modifications, filters, and routing.
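The three parts of an event map directly onto a request to the http input used in this guide: the URL path carries the tag, an optional time query parameter carries the timestamp, and the json form field carries the record. The sketch below illustrates that mapping. The helper names event_url and send_event and the FLUENTD_URL variable are hypothetical, and the default endpoint assumes the port 8888 source shown earlier.

```shell
# Sketch: map tag, time, and record onto an in_http request.
# event_url, send_event, and FLUENTD_URL are illustrative names only.

# Build the request URL: the path is the tag, ?time= is the Unix timestamp.
event_url() {
  tag="$1"
  ts="$2"
  base="${FLUENTD_URL:-http://localhost:8888}"
  if [ -n "$ts" ]; then
    printf '%s/%s?time=%s\n' "$base" "$tag" "$ts"
  else
    printf '%s/%s\n' "$base" "$tag"
  fi
}

# Send one event; the record is URL-encoded into the json form field.
send_event() {
  curl -sS -X POST --data-urlencode "json=$3" "$(event_url "$1" "$2")"
}

# Show the URL that would be used for the Apache example event.
event_url apache.access 1362020400
```

With a Fluentd http source listening, a call such as send_event apache.access 1362020400 '{"method":"GET","code":200,"path":"/"}' would submit the event. Leaving the time argument empty lets Fluentd assign the receive time.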
Filter Example
A filter can accept or reject events. Create etc/fluentd_filter.conf to exclude logs where action equals logout:
<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>
<filter **>
  @type grep
  <exclude>
    key action
    pattern ^logout$
  </exclude>
</filter>
<match **>
  @type stdout
</match>
Restart Fluentd with the new config and send two logs, one with action login and one with action logout; only the login event is kept.
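The two test events can be sent with a short loop such as the one below. This is a sketch that assumes Fluentd is listening on http://localhost:8888 as in the Docker example; FLUENTD_URL and make_payload are illustrative names, not part of Fluentd. If nothing is listening, the loop only reports that the events were not sent.

```shell
# Sketch: post one login and one logout event to the http source.
# FLUENTD_URL and make_payload are illustrative names, not part of Fluentd.
FLUENTD_URL="${FLUENTD_URL:-http://localhost:8888}"

# Build the form body expected by the http input.
make_payload() {
  printf 'json={"action":"%s","user":%s}' "$1" "$2"
}

for action in login logout; do
  payload="$(make_payload "$action" 100)"
  if curl -sS --max-time 2 -X POST -d "$payload" "$FLUENTD_URL/test.logs"; then
    echo "sent: $payload"
  else
    echo "not sent (is Fluentd listening on $FLUENTD_URL?): $payload"
  fi
done
```

With the filter configuration loaded, the Fluentd output should show the login event only, while both requests are still accepted by the http input.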
Labels
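Labels allow separate processing pipelines for different inputs: the @label parameter on a source sends its events directly to the matching label section. Example fluentd_labels.conf:
<source>
  @type http
  port 8888
  bind 0.0.0.0
  @label @TEST
</source>
<filter **>
  @type grep
  <exclude>
    key action
    pattern ^login$
  </exclude>
</filter>
<label @TEST>
  <filter **>
    @type grep
    <exclude>
      key action
      pattern ^logout$
    </exclude>
  </filter>
  <match **>
    @type stdout
  </match>
</label>
Running Fluentd with this config and sending both login and logout events results in only the login event being output. Events from the source go straight to the @TEST label, so the top-level filter that excludes login is skipped, and the filter inside the label excludes logout.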
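Before restarting Fluentd with a new configuration, it can help to validate the file first. The sketch below wraps Fluentd's --dry-run option in the same Docker image used earlier. The function name check_config is hypothetical, and it assumes the config lives under ./etc as in the Docker example and that the image accepts Fluentd options as arguments, as the earlier docker run command does.

```shell
# Sketch: validate a config file with fluentd --dry-run inside the container.
# check_config is a hypothetical helper; paths follow the Docker example.
check_config() {
  conf="$1"
  if ! [ -f "etc/$conf" ]; then
    echo "skip: etc/$conf not found"
    return 2
  fi
  if ! command -v docker >/dev/null 2>&1; then
    echo "skip: docker not installed"
    return 2
  fi
  # --dry-run parses the configuration and exits without processing events.
  docker run --rm -v "$(pwd)/etc:/fluentd/etc" fluent/fluentd:v1.14-1 \
    --dry-run -c "/fluentd/etc/$conf"
}

check_config fluentd_labels.conf || echo "config not validated"
```

A non-zero exit status from the container suggests a configuration problem worth fixing before the real run.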
Configuration File Directives
Fluentd configuration uses several directives:
source – defines input plugins.
match – defines output destinations.
filter – defines event processing pipelines.
system – sets system‑wide options.
label – groups internal routing and filters.
@include – includes external configuration files.
parse – specifies how raw logs are parsed; it is written as a <parse> section nested inside a source, match, or filter directive.
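As an illustration of how @include can split a configuration across files, the sketch below writes a small layout into a temporary directory. The file and directory names (fluentd.conf, conf.d/10-source.conf, conf.d/20-output.conf) are assumptions chosen for the example, not names that Fluentd requires.

```shell
# Sketch: a main config that pulls in fragments with @include.
# All file and directory names here are hypothetical.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/conf.d"

# Main file: system-wide options plus an include of every fragment.
cat > "$demo_dir/fluentd.conf" <<'EOF'
<system>
  log_level info
</system>
@include conf.d/*.conf
EOF

# Fragment defining the input.
cat > "$demo_dir/conf.d/10-source.conf" <<'EOF'
<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>
EOF

# Fragment defining the output.
cat > "$demo_dir/conf.d/20-output.conf" <<'EOF'
<match **>
  @type stdout
</match>
EOF

# Show the resulting layout.
find "$demo_dir" -type f | sort
```

Numeric prefixes on the fragment names are one way to keep the include order predictable, which matters because directive order affects routing.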
Example of a tail source reading Apache logs:
<source>
  @type tail
  path /var/log/httpd-access.log
  pos_file /var/log/td-agent/httpd-access.log.pos
  tag apache.access
  <parse>
    @type apache2
  </parse>
</source>
Example of a match that writes events to files with buffering:
<match pattern>
  @type file
  path /var/log/fluent/myapp
  compress gzip
  <buffer>
    timekey 1d
    timekey_use_utc true
    timekey_wait 10m
  </buffer>
</match>
Output plugins include out_file, out_stdout, out_s3, out_elasticsearch, etc. An output plugin works in one of three buffering modes: non-buffered, synchronous buffered, or asynchronous buffered.
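To make the buffer timing concrete, the following sketch computes when the chunk containing a given event becomes eligible for flushing, assuming chunks are cut on UTC timekey boundaries and flushed timekey_wait seconds after the window closes. The function name chunk_flush_time is hypothetical, and the actual write may happen slightly later depending on Fluentd's flush scheduling.

```shell
# Sketch: earliest flush time (epoch seconds) for the chunk holding an event.
# chunk_flush_time is a hypothetical helper; arguments are all in seconds.
chunk_flush_time() {
  ts="$1"        # event time
  timekey="$2"   # chunk window length, e.g. 86400 for "1d"
  wait_s="$3"    # timekey_wait, e.g. 600 for "10m"
  # Window start + window length + wait.
  echo $(( ts - ts % timekey + timekey + wait_s ))
}

# Event at 1362020400 (03:00 UTC) with timekey 1d and timekey_wait 10m.
chunk_flush_time 1362020400 86400 600
```

For this input the result is 1362096600, which is 00:10 UTC on the following day: the daily window closes at midnight and the chunk waits ten more minutes for late events.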
Record Transformer Example
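<source>
  @type http
  port 9880
</source>
<filter myapp.access>
  @type record_transformer
  <record>
    host_param "#{Socket.gethostname}"
  </record>
</filter>
<match myapp.access>
  @type file
  path /var/log/fluent/access
</match>
The transformer adds a host_param field, set to the host name of the machine running Fluentd, to each event before it is written to a file.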
System Directive Example
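<system>
  log_level error
  without_source
</system>
Setting process_name changes the process name shown by ps:
<system>
  process_name fluentd1
</system>
Label Example with @SYSTEM
<source>
  @type forward
</source>
<source>
  @type tail
  @label @SYSTEM
</source>
<filter access.**>
  @type record_transformer
  <record>
    # ...
  </record>
</filter>
<match **>
  @type elasticsearch
  # ...
</match>
<label @SYSTEM>
  <filter var.log.middleware.**>
    @type grep
    # ...
  </filter>
  <match **>
    @type s3
    # ...
  </match>
</label>
Two labels are built in: @ERROR receives events that a plugin emits as errors, and @ROOT routes events back to the default (unlabeled) pipeline.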
Parse Directive
Parsing can be done with built-in parsers such as json, nginx, multiline, etc. Example multiline parser for a Rails request log:
<parse>
  @type multiline
  format_firstline /^Started/
  format1 /Started (?<method>[^ ]+) "(?<path>[^"]+)" for (?<host>[^ ]+) at (?<time>[^ ]+ [^ ]+ [^ ]+)\n/
  format2 /Processing by (?<controller>[^#]+)#(?<controller_method>[^ ]+) as (?<format>[^ ]+)\n/
  format3 /(  Parameters: (?<parameters>[^ ]+)\n)?/
  format4 /  Rendered (?<template>[^ ]+) within (?<layout>.+) \([\d\.]+ms\)\n/
</parse>
Another example parses Java stack traces:
<parse>
  @type multiline
  format_firstline /\d{4}-\d{1,2}-\d{1,2}/
  format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
</parse>
Pattern Matching
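Fluentd routes events by matching their tags against the patterns in match and filter directives. * matches a single tag part, ** matches zero or more tag parts, {a,b} matches either a or b, and #{...} embeds a Ruby expression. Order matters: Fluentd checks match directives in the order they appear and uses the first one that matches, so specific patterns should be placed before generic ones.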
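The wildcard rules can be approximated with a regular expression, which makes it easy to experiment with patterns before putting them in a config. The sketch below is an approximation written for illustration, not Fluentd's own matcher: tag_matches is a hypothetical helper, it covers *, **, and {a,b} only, and it does not handle Ruby interpolation or several space-separated patterns in one directive.

```shell
# Sketch: approximate Fluentd tag patterns with an extended regex.
# tag_matches is a hypothetical helper, not Fluentd's implementation.
tag_matches() {
  pattern="$1"
  tag="$2"
  # Escape dots, protect the ** forms with placeholders, expand * to one
  # tag part, then expand the placeholders and turn {a,b} into (a|b).
  regex="$(printf '%s\n' "$pattern" | sed -E \
    -e 's/\./\\./g' \
    -e 's/\\\.\*\*/@DS@/g' \
    -e 's/\*\*\\\./@SD@/g' \
    -e 's/\*\*/@SS@/g' \
    -e 's/\*/[^.]+/g' \
    -e 's/@DS@/(\\..*)?/g' \
    -e 's/@SD@/(.*\\.)?/g' \
    -e 's/@SS@/.*/g' \
    -e 's/[{]/(/g' \
    -e 's/[}]/)/g' \
    -e 's/,/|/g')"
  printf '%s\n' "$tag" | grep -Eq "^${regex}\$"
}

# First matching pattern wins, so the specific pattern is tested first.
for t in app.access app.db.query other; do
  if tag_matches 'app.*' "$t"; then
    echo "$t -> app.*"
  elif tag_matches '**' "$t"; then
    echo "$t -> **"
  fi
done
```

Swapping the two branches would send every tag to the ** branch, which is the same effect as placing a catch-all match directive before a specific one.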
Reference Documentation
For more details, consult the official Fluentd documentation at https://docs.fluentd.org.
DevOps Cloud Academy
Exploring industry DevOps practices and technical expertise.