Centralized Log Collection with Filebeat and Graylog
This article explains how to use Filebeat as a lightweight log shipper together with Graylog, Elasticsearch, and MongoDB to collect, process, and visualize logs from multiple environments. It includes configuration examples, deployment scripts, and integration with Docker and Spring Boot.
When a company runs many services across test and production environments, centralized log collection becomes essential. The article compares exposing logs through Nginx with dedicated log collection stacks such as ELK, and recommends Graylog as a simpler alternative that uses Elasticsearch for storage and MongoDB for configuration.
Filebeat Overview
① Filebeat Log Shipping Service – Filebeat monitors specified log directories or files, reads new entries, and forwards them to Elasticsearch, Logstash, or Graylog.
② Filebeat Workflow – After installation, Filebeat starts one or more prospectors (renamed "inputs" in recent versions) to locate the configured log files; each file gets a harvester that reads newly appended content and sends events to an internal spooler, which batches and forwards them to the configured destination (e.g., Graylog).
③ Filebeat Diagram – Filebeat is lighter than Logstash and is recommended for environments with limited resources.
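The prospector/harvester/spooler flow described above can be sketched in a few lines of Python. This is a simplified illustration of the data flow only, not Filebeat's actual implementation; `harvest` and `Spooler` are made-up names:

```python
def harvest(path, offset=0):
    """Simplified harvester: read lines appended to a log file since `offset`."""
    with open(path, "r") as f:
        f.seek(offset)
        lines = f.readlines()
        return lines, f.tell()  # new events plus the updated read offset


class Spooler:
    """Simplified spooler: buffer events and flush them to a sink in batches."""

    def __init__(self, flush_size, sink):
        self.flush_size = flush_size
        self.sink = sink  # e.g., a function that ships a batch to Graylog
        self.buffer = []

    def add(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.flush_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []
```

Keeping the read offset per file (as `harvest` returns here) is what lets a shipper resume where it left off instead of re-reading whole files; Filebeat persists this state in its registry.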
Filebeat configuration files are typically located at /etc/filebeat/filebeat.yml for RPM/DEB installations, with input definitions placed in the inputs.d directory.
# Configure input sources
filebeat.config.inputs:
  enabled: true
  path: ${path.config}/inputs.d/*.yml

# Enable modules
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

output.logstash:
  hosts: ["11.22.33.44:5500"]

processors:
  - add_host_metadata: ~
  - rename:
      fields:
        - from: "log"
          to: "message"
  - add_fields:
      target: ""
      fields:
        token: "0uxxxxaM-1111-2222-3333-VQZJxxxxxwgX"

A sample inputs.d YAML file shows how to collect specific log files, filter lines, add tags, and handle multiline patterns.
# Collect log type
- type: log
  enabled: true
  paths:
    - /var/log/supervisor/app_escape_worker-stderr.log
    - /var/log/supervisor/app_escape_prod-stderr.log
  symlinks: true
  include_lines: ["WARNING", "ERROR"]
  tags: ["app", "escape", "test"]
  multiline.pattern: '^\[?[0-9]...{3}'
  multiline.negate: true
  multiline.match: after

Graylog provides a web UI for log ingestion, extraction, stream routing, and indexing. It uses Inputs to receive logs, Extractors to parse fields, Streams to route logs, and Index Sets for storage. Pipelines can further filter logs, e.g., discarding messages with level > 6:
rule "discard debug messages"
when
  to_long($message.level) > 6
then
  drop_message();
end

Deployment instructions cover installing Filebeat via DEB/RPM packages or Docker, and deploying Graylog with Docker Compose, including generating a password secret and root password hash.
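The two Graylog secrets are straightforward to produce: `GRAYLOG_ROOT_PASSWORD_SHA2` is just the SHA-256 hex digest of the admin password, and `GRAYLOG_PASSWORD_SECRET` is a long random string. A Python sketch of generating both (the Graylog docs typically suggest shell one-liners with `pwgen` and `sha256sum` instead; "admin" below is only a placeholder password):

```python
import hashlib
import secrets

# GRAYLOG_PASSWORD_SECRET: a long random string (at least 16 characters)
# used by Graylog to encrypt and salt stored secrets.
password_secret = secrets.token_urlsafe(64)

# GRAYLOG_ROOT_PASSWORD_SHA2: SHA-256 hex digest of the web UI admin password.
# "admin" is a placeholder; substitute your real password.
root_password_sha2 = hashlib.sha256(b"admin").hexdigest()
print(root_password_sha2)
```

Paste the two resulting values into the `environment` section of the Graylog service in the Compose file below.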
# Ubuntu (deb)
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.8.1-amd64.deb
sudo dpkg -i filebeat-7.8.1-amd64.deb
sudo systemctl enable filebeat
sudo service filebeat start
# Docker run
docker run -d --name=filebeat --user=root \
--volume "./filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro" \
--volume "/var/lib/docker/containers:/var/lib/docker/containers:ro" \
--volume "/var/run/docker.sock:/var/run/docker.sock:ro" \
docker.elastic.co/beats/filebeat:7.8.1 filebeat -e -strict.perms=false \
-E output.elasticsearch.hosts=["elasticsearch:9200"]

The Graylog Docker Compose file defines services for MongoDB, Elasticsearch, and Graylog, exposing ports for the web UI (9000), Filebeat input (5044), GELF TCP/UDP (12201), and Syslog (1514).
version: "3"
services:
  mongo:
    restart: on-failure
    container_name: graylog_mongo
    image: "mongo:3"
    volumes:
      - "./mongodb:/data/db"
    networks:
      - graylog_network
  elasticsearch:
    restart: on-failure
    container_name: graylog_es
    image: "elasticsearch:6.8.5"
    volumes:
      - "./es_data:/usr/share/elasticsearch/data"
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx5120m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 12g
    networks:
      - graylog_network
  graylog:
    restart: on-failure
    container_name: graylog_web
    image: "graylog/graylog:3.3"
    ports:
      - "9000:9000"
      - "5044:5044"
      - "12201:12201"
      - "12201:12201/udp"
      - "1514:1514"
      - "1514:1514/udp"
    volumes:
      - "./graylog_journal:/usr/share/graylog/data/journal"
    environment:
      - GRAYLOG_PASSWORD_SECRET=zscMb65...FxR9ag
      - GRAYLOG_ROOT_PASSWORD_SHA2=77e29e0f...557515f
      - GRAYLOG_HTTP_EXTERNAL_URI=http://11.22.33.44:9000/
      - GRAYLOG_TIMEZONE=Asia/Shanghai
      - GRAYLOG_ROOT_TIMEZONE=Asia/Shanghai
    networks:
      - graylog_network
    depends_on:
      - mongo
      - elasticsearch
networks:
  graylog_network:
    driver: bridge

Docker containers can send logs directly to Graylog using the GELF driver, e.g.,
# Docker run with GELF driver
docker run --rm=true \
--log-driver=gelf \
--log-opt gelf-address=udp://11.22.33.44:12201 \
--log-opt tag=myapp \
myapp:0.0.1

Integration with Spring Boot is demonstrated by adding the logback-gelf dependency and configuring a logback.xml appender that sends logs to Graylog via UDP.
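Whichever client sends them, GELF messages are just (optionally compressed) JSON datagrams. A minimal Python sketch of the wire format, useful for smoke-testing a GELF UDP input; field names follow the GELF 1.1 payload spec, while `build_gelf_message` and `send_gelf_udp` are illustrative helper names, not a real library API:

```python
import json
import socket
import zlib


def build_gelf_message(short_message, host=None, level=6, **extra):
    """Build a zlib-compressed GELF 1.1 payload; extra fields get a '_' prefix."""
    msg = {
        "version": "1.1",
        "host": host or socket.gethostname(),
        "short_message": short_message,
        "level": level,  # syslog severity, 6 = informational
    }
    # GELF requires custom fields to be prefixed with an underscore.
    msg.update({f"_{k}": v for k, v in extra.items()})
    return zlib.compress(json.dumps(msg).encode("utf-8"))


def send_gelf_udp(addr, port, short_message, **extra):
    """Fire-and-forget a GELF datagram at a Graylog GELF UDP input."""
    payload = build_gelf_message(short_message, **extra)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (addr, port))
```

For example, `send_gelf_udp("11.22.33.44", 12201, "deploy finished", app="myapp")` should show up in Graylog with an `_app` field, assuming a GELF UDP input is running on port 12201.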
<appender name="GELF" class="de.siegmar.logbackgelf.GelfUdpAppender">
  <graylogHost>ip</graylogHost>
  <graylogPort>12201</graylogPort>
  <maxChunkSize>508</maxChunkSize>
  <useCompression>true</useCompression>
  <encoder class="de.siegmar.logbackgelf.GelfEncoder">
    <includeRawMessage>false</includeRawMessage>
    <includeMarker>true</includeMarker>
    <includeMdcData>true</includeMdcData>
    <includeLevelName>true</includeLevelName>
    <shortPatternLayout class="ch.qos.logback.classic.PatternLayout">
      <pattern>%m%nopex</pattern>
    </shortPatternLayout>
    <fullPatternLayout class="ch.qos.logback.classic.PatternLayout">
      <pattern>%d - [%thread] %-5level %logger{35} - %msg%n</pattern>
    </fullPatternLayout>
    <staticField>app_name:austin</staticField>
  </encoder>
</appender>

Finally, the article provides examples of log search queries in Graylog, such as fuzzy search, exact match, field-specific queries, multi-field queries, and combined condition queries, enabling users to efficiently locate relevant log entries.
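For reference, Graylog's search bar accepts Lucene-style query syntax. A few representative examples of the query types mentioned above; field names such as `source` and `http_response_code` are illustrative and depend on your extractors:

```text
ssh                                   # fuzzy: messages containing the term ssh
"ssh login"                           # exact phrase match
source:example.org                    # field-specific query
source:(example.org OR example.com)   # multi-value field query
source:example.org AND type:ssh       # combined conditions
http_response_code:[500 TO 504]       # numeric range query
```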
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large-scale distributed, and high-availability architectures, as well as adapting architectures with internet technologies. Idea-driven, sharing-minded architects are welcome to exchange and learn together.