Comprehensive Guide to Deploying Filebeat and Graylog for Centralized Log Collection
This article explains how to use Filebeat and Graylog together for centralized log collection, covering Filebeat’s role, configuration files, input modules, Graylog’s architecture, pipeline rules, and step‑by‑step deployment using Docker and docker‑compose, providing practical commands and examples for operational environments.
When an organization runs many services across test and production environments, centralized log collection becomes essential; the article compares using Nginx versus a dedicated ELK stack and introduces Graylog as a simpler, extensible alternative that leverages Elasticsearch for storage and MongoDB for configuration.
Filebeat is presented as a lightweight log shipper that monitors specified directories or files, spawns prospectors to detect log files, harvesters to read new entries, and a spooler to batch events before sending them to a destination such as Graylog.
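The division of labour described above can be sketched in a few lines of Python. This is only an illustrative model, not Filebeat's actual implementation: the names `prospect`, `Harvester`, `Spooler`, and `demo` are hypothetical, and the batch size of 3 is an arbitrary assumption chosen to keep the example small.

```python
import glob
import os
import tempfile


def prospect(pattern):
    """Prospector role: find the log files that match a glob pattern."""
    return sorted(glob.glob(pattern))


class Harvester:
    """Harvester role: return lines appended to one file since the last poll."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte offset of the next unread line

    def poll(self):
        with open(self.path, "rb") as fh:
            fh.seek(self.offset)
            data = fh.read()
        # Consume complete lines only; a partial trailing line waits for the next poll.
        end = data.rfind(b"\n") + 1
        self.offset += end
        return [line.decode("utf-8", "replace") for line in data[:end].splitlines()]


class Spooler:
    """Spooler role: buffer events and hand them to a sink in batches."""

    def __init__(self, sink, batch_size=3):
        self.sink = sink
        self.batch_size = batch_size
        self.buffer = []

    def add(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(list(self.buffer))
            self.buffer.clear()


def demo():
    """Write four lines in two steps and return the batches the sink received."""
    batches = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.log")
        with open(path, "w") as fh:
            fh.write("line 1\nline 2\n")
        harvester = Harvester(prospect(os.path.join(tmp, "*.log"))[0])
        spooler = Spooler(batches.append, batch_size=3)
        for line in harvester.poll():
            spooler.add(line)
        with open(path, "a") as fh:
            fh.write("line 3\nline 4\n")
        for line in harvester.poll():
            spooler.add(line)
        spooler.flush()  # send whatever is left in the buffer
    return batches
```

In this model the harvester remembers a byte offset so that each poll returns only new entries, and the spooler decouples reading from sending, which is the reason events reach the destination in groups rather than one at a time.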
Key parts of the Filebeat configuration are shown, including the main filebeat.yml file that defines input paths, module loading, index settings, and output to Logstash or Graylog.
# Configure input sources
filebeat.config.inputs:
  enabled: true
  path: ${path.config}/inputs.d/*.yml

# Load modules
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

# Output to Graylog's Beats input (Filebeat uses its Logstash output for this)
output.logstash:
  hosts: ["11.22.33.44:5500"]

processors:
  - add_host_metadata: ~
  - rename:
      fields:
        - from: "log"
          to: "message"
  - add_fields:
      target: ""
      fields:
        token: "0uxxxxaM-1111-2222-3333-VQZJxxxxxwgX"

An example of an inputs.d YAML file demonstrates how to collect logs from specific paths, filter by keywords, tag data, and handle multiline stack traces.
# Collect log type
- type: log
  enabled: true
  paths:
    - /var/log/supervisor/app_escape_worker-stderr.log
    - /var/log/supervisor/app_escape_prod-stderr.log
  symlinks: true
  include_lines: ["WARNING", "ERROR"]
  tags: ["app", "escape", "test"]
  multiline.pattern: '^\[?[0-9]...{3}'
  multiline.negate: true
  multiline.match: after

Graylog’s architecture consists of three core components: Elasticsearch for persisting and searching log data, MongoDB for storing Graylog configuration, and the Graylog server itself providing a web UI and APIs. Both single-node and clustered deployments are illustrated.
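Returning briefly to the inputs.d example: with multiline.negate: true and multiline.match: after, a line that does not match the pattern is appended to the preceding line that did, and Filebeat combines multiline events before include_lines is applied. The Python sketch below only emulates that behaviour for illustration. The pattern in the source is elided, so `START` here is a hypothetical stand-in (an optional `[` followed by a four-digit year), and the sample log lines are invented.

```python
import re


def group_multiline(lines, pattern):
    """Emulate multiline.negate: true with multiline.match: after.

    A line matching `pattern` starts a new event; any other line is
    appended to the event before it.
    """
    start = re.compile(pattern)
    events = []
    for line in lines:
        if start.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


def include_lines(events, patterns):
    """Keep only events that match at least one of the regex patterns."""
    compiled = [re.compile(p) for p in patterns]
    return [e for e in events if any(c.search(e) for c in compiled)]


# Hypothetical pattern, NOT the elided one from the configuration above.
START = r"^\[?[0-9]{4}"

# Invented sample input: one INFO line, one ERROR with a stack trace, one WARNING.
raw = [
    "[2020-08-01 10:00:00] INFO worker started",
    "[2020-08-01 10:00:01] ERROR job failed",
    "Traceback (most recent call last):",
    '  File "worker.py", line 10, in run',
    "[2020-08-01 10:00:02] WARNING retrying",
]

# Group first, then filter, mirroring the order Filebeat uses.
events = include_lines(group_multiline(raw, START), ["WARNING", "ERROR"])
```

Because grouping happens first, the stack trace travels with its ERROR line instead of being filtered out for lacking a keyword of its own.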
Graylog processes logs through Inputs, Extractors, Streams, and optional Pipelines. A sample pipeline rule is provided that drops messages whose level is greater than 6, which on the syslog severity scale means level 7 (debug).
rule "discard debug messages"
when
  to_long($message.level) > 6
then
  drop_message();
end

Deployment instructions for Filebeat cover installation via DEB/RPM packages, Docker container execution, and the command-line options that set the output destination.
# Ubuntu (deb)
$ curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.8.1-amd64.deb
$ sudo dpkg -i filebeat-7.8.1-amd64.deb
$ sudo systemctl enable filebeat
$ sudo service filebeat start

# Docker run
# Note: the -E flag below points the output at Elasticsearch; to ship to Graylog,
# set output.logstash.hosts to the address of the Graylog Beats input instead.
docker run -d --name=filebeat --user=root \
    --volume="$(pwd)/filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro" \
    --volume="/var/lib/docker/containers:/var/lib/docker/containers:ro" \
    --volume="/var/run/docker.sock:/var/run/docker.sock:ro" \
    docker.elastic.co/beats/filebeat:7.8.1 -e -strict.perms=false \
    -E output.elasticsearch.hosts=["elasticsearch:9200"]

Graylog is deployed with Docker-Compose. The article shows how to generate a 16-character password secret and a SHA-256 hash of the root password, then provides a complete docker-compose.yml that defines MongoDB, Elasticsearch, and Graylog services with appropriate ports and environment variables.
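The two secrets mentioned above could be produced along the following lines before filling in the compose file. This is a minimal sketch, not the article's own method: the helper names `make_password_secret` and `sha256_hex` are hypothetical, and "changeme" is a placeholder password.

```python
import hashlib
import secrets
import string


def make_password_secret(length=16):
    """Random alphanumeric string; 16 characters is the length the text mentions."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def sha256_hex(password):
    """Hex-encoded SHA-256 digest of the admin password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print("GRAYLOG_PASSWORD_SECRET=" + make_password_secret())
    # "changeme" is a placeholder; substitute the real admin password.
    print("GRAYLOG_ROOT_PASSWORD_SHA2=" + sha256_hex("changeme"))
```

The printed values correspond to the GRAYLOG_PASSWORD_SECRET and GRAYLOG_ROOT_PASSWORD_SHA2 environment variables of the Graylog service; only the hash, never the plain password, belongs in the compose file.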
version: "3"
services:
  mongo:
    restart: on-failure
    container_name: graylog_mongo
    image: "mongo:3"
    volumes:
      - "./mongodb:/data/db"
    networks:
      - graylog_network

  elasticsearch:
    restart: on-failure
    container_name: graylog_es
    image: "elasticsearch:6.8.5"
    volumes:
      - "./es_data:/usr/share/elasticsearch/data"
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx5120m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 12g
    networks:
      - graylog_network

  graylog:
    restart: on-failure
    container_name: graylog_web
    image: "graylog/graylog:3.3"
    ports:
      - "9000:9000"       # Web UI
      - "5044:5044"       # Filebeat input
      - "12201:12201"     # GELF TCP
      - "12201:12201/udp" # GELF UDP
      - "1514:1514"       # Syslog TCP
      - "1514:1514/udp"   # Syslog UDP
    volumes:
      - "./graylog_journal:/usr/share/graylog/data/journal"
    environment:
      - GRAYLOG_PASSWORD_SECRET=zscMb65...FxR9ag
      - GRAYLOG_ROOT_PASSWORD_SHA2=77e29e0f...557515f
      - GRAYLOG_HTTP_EXTERNAL_URI=http://11.22.33.44:9000/
      - GRAYLOG_TIMEZONE=Asia/Shanghai
      - GRAYLOG_ROOT_TIMEZONE=Asia/Shanghai
    networks:
      - graylog_network   # must match the network defined at the bottom of the file
    depends_on:
      - mongo
      - elasticsearch

networks:
  graylog_network:
    driver: bridge

The Sidecar component is described as a lightweight log collector that can run on Linux or Windows, fetches its configuration from Graylog via REST API, and supports Beats, CEF, GELF, JSON, and NetFlow outputs. Using the GELF driver, Docker containers can forward logs directly to Graylog.
# Docker run with GELF driver
docker run --rm=true \
    --log-driver=gelf \
    --log-opt gelf-address=udp://11.22.33.44:12201 \
    --log-opt tag=myapp \
    myapp:0.0.1

Finally, the article briefly showcases the Graylog web UI, highlighting its search, stream, and dashboard capabilities, and provides links to additional resources and community groups.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.