
Centralized Log Collection with Filebeat and Graylog

This article explains how to use Filebeat together with Graylog to collect, process, and visualize logs from multiple services and environments, covering tool introductions, configuration files, component details, deployment methods, and practical code examples.

Top Architect

When a company runs many services across test and production environments, centralized log collection becomes essential. This article compares exposing log files externally through Nginx with running a dedicated log collection stack such as ELK, and recommends Graylog as a lightweight, extensible alternative that takes less effort to deploy and operate.

1. Filebeat Tool Introduction

Log collection solution: Filebeat + Graylog!

[1] Filebeat – Log file shipping service

Filebeat is a log shipper that monitors specified log directories or files, reads new entries continuously, and forwards them to Elasticsearch, Logstash, or Graylog.

2. Filebeat Configuration File

The core of configuring Filebeat is writing its configuration file.

For RPM/DEB installations, the default configuration file is /etc/filebeat/filebeat.yml. On macOS or Windows, look in the directory where the archive was extracted. The main configuration loads every .yml file under the inputs.d directory, and each of those files defines log sources.

# Configure input sources
# All files under inputs.d are loaded
filebeat.config.inputs:
  enabled: true
  path: ${path.config}/inputs.d/*.yml
  # Uncomment for JSON logs
  # json.keys_under_root: true

# Load modules
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

# Output over the Logstash (Beats) protocol to a Graylog Beats input
output.logstash:
  # Must match the address and port of the Graylog Beats input
  hosts: ["11.22.33.44:5500"]

processors:
  - add_host_metadata: ~
  - rename:
      fields:
        - from: "log"
          to: "message"
  - add_fields:
      target: ""
      fields:
        # Token to prevent unauthenticated data submission
        token: "0uxxxxaM-1111-2222-3333-VQZJxxxxxwgX"

A simple inputs.d example shows how to collect logs from specific files, filter by keywords, and add tags.

# Log type definition
- type: log
  enabled: true
  paths:
    - /var/log/supervisor/app_escape_worker-stderr.log
    - /var/log/supervisor/app_escape_prod-stderr.log
  symlinks: true
  include_lines: ["WARNING", "ERROR"]
  tags: ["app", "escape", "test"]
  multiline.pattern: '^\[?[0-9]...{3}'
  multiline.negate: true
  multiline.match: after
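
The multiline and include_lines options above interact in a way that is easy to misread. With negate: true and match: after, every line that does not match the pattern is appended to the preceding line that did, and Filebeat's documentation states that multiline events are assembled before include_lines is applied. The Python sketch below only emulates that behaviour for illustration; it is not Filebeat code. Because the pattern in the example above is abbreviated, START_PATTERN is an assumed stand-in, and the sample log lines are invented.

```python
import re

# Hypothetical stand-in for multiline.pattern: a line that starts with an
# optional "[" followed by a YYYY-MM-DD date begins a new event.
START_PATTERN = r"^\[?[0-9]{4}-[0-9]{2}-[0-9]{2}"


def group_multiline(lines, pattern=START_PATTERN):
    """Emulate multiline.negate: true with multiline.match: after."""
    starts_event = re.compile(pattern)
    events, current = [], []
    for line in lines:
        # A matching line closes the event collected so far.
        if starts_event.search(line) and current:
            events.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        events.append("\n".join(current))
    return events


def filter_events(events, include_lines):
    """Keep events matching at least one include_lines regular expression."""
    patterns = [re.compile(p) for p in include_lines]
    return [e for e in events if any(p.search(e) for p in patterns)]


# Invented sample input: one error with a traceback, one info, one warning.
SAMPLE_LOG = [
    "2024-01-01 10:00:00 ERROR job failed",
    "Traceback (most recent call last):",
    '  File "worker.py", line 3, in run',
    "2024-01-01 10:00:01 INFO job retried",
    "2024-01-01 10:00:02 WARNING slow response",
]

events = group_multiline(SAMPLE_LOG)
kept = filter_events(events, ["WARNING", "ERROR"])
```

In this sketch the traceback lines stay attached to the ERROR event, and the INFO event is discarded only after grouping, which is why a stack trace is not lost even though its own lines contain neither keyword.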

Filebeat also provides modules for common services such as PostgreSQL, Redis, and iptables.

# iptables module
- module: iptables
  log:
    enabled: true
    var.paths: ["/var/log/iptables.log"]
    var.input: "file"

# postgresql module
- module: postgresql
  log:
    enabled: true
    var.paths: ["/path/to/log/postgres/*.log*"]

# nginx module
- module: nginx
  access:
    enabled: true
    var.paths: ["/path/to/log/nginx/access.log*"]
  error:
    enabled: true
    var.paths: ["/path/to/log/nginx/error.log*"]

3. Graylog Service Introduction


[1] Graylog – Log monitoring system

Graylog is an open‑source log aggregation, analysis, and alerting platform. Compared with ELK, it is simpler to deploy but less extensible; a commercial version is also available.

In a typical deployment, Graylog consists of three components: Elasticsearch for storing and searching logs, MongoDB for Graylog configuration, and the Graylog server itself for the web UI and APIs.

| No. | Component | Function | Key Features |
|-----|-----------|----------|--------------|
| 1 | Dashboards | Fixed data panels | Save specific search‑based panels |
| 2 | Searching | Conditional log search | Keyword, time, saved searches, panels, grouping, export, highlighting, custom time |
| 3 | Alert | Alert configuration | Email, HTTP callback, custom script |
| 4 | Inputs | Log ingestion | Sidecar active collection or passive reporting |
| 5 | Extractors | Log field conversion | JSON, KV, timestamp, regex parsing |
| 6 | Streams | Log classification | Route logs to different indices |
| 7 | Indices | Persistent storage | Configure storage performance |
| 8 | Outputs | Log forwarding | Send streams to other Graylog clusters or services |
| 9 | Pipelines | Log filtering | Define cleaning rules, field add/remove, conditional filters, custom functions |
| 10 | Sidecar | Lightweight collector | Client‑server mode for large scale |

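Graylog processes logs through Inputs → Extractors → Streams → Pipelines, so messages can be parsed, routed, and filtered end to end without extra post‑processing. For example, the following pipeline rule drops messages whose level is above 6 (debug on the syslog severity scale):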

rule "discard debug messages"
when
  to_long($message.level) > 6
then
  drop_message();
end
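
The rule relies on the syslog severity scale, where 0 is the most severe (emergency) and 7 is debug, so a level greater than 6 selects only debug messages. As a rough illustration of the same decision outside Graylog, the Python sketch below mirrors the rule's condition. The should_drop helper and the sample messages are hypothetical, and treating a missing or non-numeric level as 0 is an assumption made for this sketch rather than a statement about Graylog's behaviour.

```python
# Syslog severities (RFC 5424): lower numbers are more severe.
SYSLOG_SEVERITIES = {
    0: "emergency", 1: "alert", 2: "critical", 3: "error",
    4: "warning", 5: "notice", 6: "informational", 7: "debug",
}


def should_drop(message, threshold=6):
    """Return True when the message's level is above the threshold."""
    try:
        level = int(message.get("level", 0))
    except (TypeError, ValueError):
        level = 0  # assumption for this sketch: unparseable levels are kept
    return level > threshold


# Invented sample messages for illustration.
messages = [
    {"level": 3, "message": "disk full"},
    {"level": 7, "message": "cache lookup took 2 ms"},
    {"level": "6", "message": "service started"},
]
kept = [m for m in messages if not should_drop(m)]
```

Only the level 7 message is filtered out here; informational messages at level 6 pass, because the comparison is strictly greater than 6.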

4. Service Installation and Deployment

Main steps to deploy Filebeat + Graylog

Filebeat can be installed via RPM/DEB packages, source compilation, Docker, or Kubernetes. Example for Ubuntu (DEB):

# Ubuntu (deb)
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.8.1-amd64.deb
sudo dpkg -i filebeat-7.8.1-amd64.deb
sudo systemctl enable filebeat
sudo service filebeat start

Docker example:

# Run Filebeat container (the config bind mount uses an absolute host path)
docker run -d --name=filebeat --user=root \
  --volume "$(pwd)/filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro" \
  --volume "/var/lib/docker/containers:/var/lib/docker/containers:ro" \
  --volume "/var/run/docker.sock:/var/run/docker.sock:ro" \
  docker.elastic.co/beats/filebeat:7.8.1 filebeat -e -strict.perms=false \
  -E 'output.elasticsearch.hosts=["elasticsearch:9200"]'

Graylog can be deployed with Docker Compose. First generate a password_secret of at least 16 characters and a SHA‑256 hash of the admin password, then place them in docker-compose.yml:

# Generate password_secret (at least 16 chars)
sudo apt install -y pwgen
pwgen -N 1 -s 16
# Generate SHA‑256 of admin password
echo -n "Enter Password: " && head -1 /dev/stdin | tr -d '\n' | sha256sum | cut -d " " -f1
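
Where pwgen or sha256sum are not available, the same two values can be produced with Python's standard library. This is a minimal sketch and not part of Graylog's tooling: the function names and the 96-character default length are choices made here, and the sample password is a placeholder.

```python
import hashlib
import secrets
import string


def generate_password_secret(length=96):
    """Return a random alphanumeric string for GRAYLOG_PASSWORD_SECRET."""
    if length < 16:
        raise ValueError("password_secret must be at least 16 characters")
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def sha256_hex(password):
    """Return the SHA-256 hex digest, as sha256sum prints for the same bytes."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


password_secret = generate_password_secret()
root_password_sha2 = sha256_hex("change-me")  # placeholder password
```

The two results correspond to GRAYLOG_PASSWORD_SECRET and GRAYLOG_ROOT_PASSWORD_SHA2 respectively. The secrets module is used instead of random because the value protects stored credentials.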

docker-compose.yml:

version: "3"
services:
  mongo:
    restart: on-failure
    container_name: graylog_mongo
    image: "mongo:3"
    volumes:
      - "./mongodb:/data/db"
    networks:
      - graylog_network

  elasticsearch:
    restart: on-failure
    container_name: graylog_es
    image: "elasticsearch:6.8.5"
    volumes:
      - "./es_data:/usr/share/elasticsearch/data"
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx5120m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 12g
    networks:
      - graylog_network

  graylog:
    restart: on-failure
    container_name: graylog_web
    image: "graylog/graylog:3.3"
    ports:
      - "9000:9000"   # Web UI
      - "5044:5044"   # Filebeat input
      - "12201:12201"   # GELF TCP
      - "12201:12201/udp"   # GELF UDP
      - "1514:1514"   # Syslog TCP
      - "1514:1514/udp"   # Syslog UDP
    volumes:
      - "./graylog_journal:/usr/share/graylog/data/journal"
    environment:
      - GRAYLOG_PASSWORD_SECRET=zscMb65...FxR9ag
      - GRAYLOG_ROOT_PASSWORD_SHA2=77e29e0f...557515f
      - GRAYLOG_HTTP_EXTERNAL_URI=http://11.22.33.44:9000/
      - GRAYLOG_TIMEZONE=Asia/Shanghai
      - GRAYLOG_ROOT_TIMEZONE=Asia/Shanghai
    networks:
      - graylog_network
    depends_on:
      - mongo
      - elasticsearch

networks:
  graylog_network:
    driver: bridge

When using Docker containers, the GELF log driver can send logs directly to Graylog:

# Run a container with GELF driver
docker run --rm=true \
    --log-driver=gelf \
    --log-opt gelf-address=udp://11.22.33.44:12201 \
    --log-opt tag=myapp \
    myapp:0.0.1
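
The gelf log driver delivers each container log line to Graylog as a GELF message, which is a JSON document. To make that format concrete, the Python sketch below builds a GELF 1.1 payload and can send it as a single uncompressed UDP datagram. It is an illustration only: build_gelf and send_gelf_udp are hypothetical helpers, the host name and message are invented, the address is the placeholder used elsewhere in this article, and chunking of large messages is not handled.

```python
import json
import socket
import time


def build_gelf(host, short_message, level=6, **extra):
    """Build a GELF 1.1 payload; extra fields get the required "_" prefix."""
    payload = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity, 6 = informational
    }
    for key, value in extra.items():
        if key == "id":
            raise ValueError('"_id" is not allowed as an additional GELF field')
        payload["_" + key] = value
    return json.dumps(payload).encode("utf-8")


def send_gelf_udp(payload, address):
    """Send one payload as a single UDP datagram (no chunking)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, address)


payload = build_gelf("myapp-host", "user logged in", tag="myapp")
# Placeholder address from this article; uncomment to actually send:
# send_gelf_udp(payload, ("11.22.33.44", 12201))
```

A message sent this way should show up on a GELF UDP input with tag as a searchable field, which is a quick way to check the input before switching containers over to the gelf driver.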

Docker‑Compose example for a service using the GELF driver:

version: "3"
services:
  redis:
    restart: always
    image: redis
    container_name: "redis"
    logging:
      driver: gelf
      options:
        gelf-address: udp://11.22.33.44:12201
        tag: "redis"
    # ... other services

5. Graylog Web Interface Features

Overview of Graylog UI functions and characteristics

The UI provides dashboards, search, alerts, streams, inputs, extractors, pipelines, sidecar management, and more, allowing users to visualize, query, and manage logs efficiently.
