Designing a Scalable Kubernetes Log Collection System Using S6 and Filebeat

This article explains the limitations of Docker‑based logging, compares logging drivers, and presents a Kubernetes‑wide log collection architecture that uses an S6‑based base image, Filebeat, logrotate, Kafka, and Elasticsearch to achieve reliable, scalable log aggregation.

Liangxu Linux

Background and Problems with Docker Logging

Docker generates two kinds of logs: engine logs, which are handled by the host’s system logger, and container logs, which the default json-file driver writes in JSON format to

/var/lib/docker/containers/<container_id>/<container_id>-json.log

In production this approach has three major drawbacks: log files grow without limit, the Docker daemon becomes a bottleneck for high‑volume logs, and docker logs -f and other CLI commands can block.
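To make the on-disk format concrete, here is a minimal sketch of decoding one line of a json-file log. Each line is a JSON object with log, stream, and time keys; the sample line and the helper name parse_json_file_line are made up for illustration.

```python
import json


def parse_json_file_line(line: str) -> dict:
    """Decode one line written by Docker's json-file logging driver."""
    record = json.loads(line)
    return {
        # The driver stores the trailing newline of each message; drop it.
        "message": record["log"].rstrip("\n"),
        "stream": record["stream"],  # "stdout" or "stderr"
        "time": record["time"],
    }


# Made-up sample line in the json-file layout.
sample = (
    '{"log":"GET /healthz 200\\n","stream":"stdout",'
    '"time":"2024-01-01T00:00:00.000000000Z"}'
)
entry = parse_json_file_line(sample)
print(entry["stream"], entry["message"])
```

Every container line passes through the daemon and is wrapped this way, which is one reason high‑volume logging puts load on the daemon.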

Docker’s logging drivers also differ in throughput, for example:

log-driver   speed
syslog       14.9 MB/s
json-file    37.9 MB/s

To avoid the daemon bottleneck, the article proposes bypassing the Docker logging driver: a custom base image uses the s6-log utility to write each container’s stdout/stderr directly to files on the host.

Kubernetes Log Collection Levels

Kubernetes log collection can be organized at three levels:

Pod (application) level

Node level

Cluster level

Pod Level

Containers in a pod write logs to stdout/stderr, which can be read with kubectl logs <pod-name> -n <namespace>.

Node Level

Node‑level logging configures a Docker log-driver together with logrotate to automatically rotate large log files.
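As a rough illustration of the driver side of this setup, the sketch below builds a Docker daemon.json that selects the json-file driver and caps its files with the driver's own max-size and max-file options. This is the driver's built-in rotation, shown here as a complement to logrotate rather than as the article's exact configuration; the limits are hypothetical.

```python
import json

# Hypothetical limits chosen for illustration; tune them for your nodes.
daemon_config = {
    "log-driver": "json-file",
    "log-opts": {
        "max-size": "100m",  # rotate a container's log once it reaches this size
        "max-file": "3",     # keep at most this many files per container
    },
}

rendered = json.dumps(daemon_config, indent=2)
print(rendered)
```

The output would typically be saved as /etc/docker/daemon.json. Docker applies changed logging defaults only to containers created after the daemon restarts.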

Cluster Level

Three common cluster‑wide approaches are described:

Node‑agent (DaemonSet): Deploy a log‑agent DaemonSet on every node; low resource usage and non‑intrusive to applications, but requires all containers to log to stdout.

Sidecar container: Run a logging sidecar in each pod. Two variants exist:

Streaming sidecar that forwards the application’s stdout/stderr, resulting in duplicate log files on the host.

Dedicated log‑collector sidecar (e.g., Logstash or Fluent Bit) that writes logs to a backend, consuming more CPU/memory and hiding logs from kubectl logs.

Application‑direct logging: Applications push logs directly to a storage backend (e.g., Elasticsearch, Loki) without using stdout.

Proposed Unified Log Architecture

The recommended architecture combines the node‑agent approach with a custom log‑agent container built from an S6 base image. The flow is:

All application containers use the S6 base image; logs are redirected to host directories such as /data/logs/namespace/appname/podname/log/xxxx.log.

The log‑agent runs Filebeat and logrotate; Filebeat watches the log files and ships them to a Kafka topic.

Kafka buffers the log stream; Logstash consumes the Kafka messages, creates the indices, and writes the logs to Elasticsearch, where Kibana visualizes them.
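The directory convention in the first step lets the log‑agent recover metadata from the path alone. A minimal sketch, assuming the /data/logs/namespace/appname/podname/log/ layout shown above; the LogSource type, the parse_log_path helper, and the sample names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

LOG_ROOT = PurePosixPath("/data/logs")  # root taken from the example path above


class LogSource(NamedTuple):
    namespace: str
    app: str
    pod: str
    filename: str


def parse_log_path(path: str) -> Optional[LogSource]:
    """Split <root>/<namespace>/<app>/<pod>/log/<file> into its parts."""
    try:
        parts = PurePosixPath(path).relative_to(LOG_ROOT).parts
    except ValueError:
        return None  # the path is not under the log root
    if len(parts) != 5 or parts[3] != "log":
        return None  # the path does not follow the expected layout
    namespace, app, pod, _, filename = parts
    return LogSource(namespace, app, pod, filename)


# Hypothetical namespace, application, and pod names.
src = parse_log_path("/data/logs/shop/cart/cart-7d9f/log/app.log")
print(src)
```

Fields recovered this way can be attached to each event before it is shipped, so downstream consumers can route or index by namespace and application without parsing the log body.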

Implementation Challenges

Automatically updating Filebeat configuration when new applications are deployed.

Ensuring each log file is rotated correctly.

Extending Filebeat with custom modules for additional functionality.

Practical Solutions

To address the challenges, the article suggests building a log‑agent DaemonSet that includes:

Use of github.com/fsnotify/fsnotify to watch log directories for create/delete events and regenerate Filebeat config via templating.

Use of github.com/robfig/cron to schedule periodic logrotate jobs. Example logrotate snippet:

/var/log/xxxx/xxxxx.log {
    su www-data www-data
    missingok
    notifempty
    size 1G
    copytruncate
}
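The first item, regenerating the Filebeat config when log directories appear or disappear, can be sketched as follows. The article's agent uses fsnotify events and Go templating; this Python sketch substitutes a plain comparison of two directory snapshots and string.Template. It assumes Filebeat's log input type (check which input type your Filebeat version expects), and the Kafka host, topic, and field names are hypothetical.

```python
from pathlib import PurePosixPath
from string import Template

# One Filebeat input per application log directory.
INPUT_TEMPLATE = Template(
    "  - type: log\n"
    "    paths:\n"
    "      - ${log_dir}/*.log\n"
    "    fields:\n"
    "      namespace: ${namespace}\n"
    "      app: ${app}\n"
)


def render_filebeat_config(log_dirs, kafka_hosts, topic):
    """Render a Filebeat config with one input per discovered log directory."""
    inputs = []
    for log_dir in sorted(log_dirs):
        # Expected layout: /data/logs/<namespace>/<app>/<pod>/log
        namespace, app = PurePosixPath(log_dir).parts[-4:-2]
        inputs.append(
            INPUT_TEMPLATE.substitute(log_dir=log_dir, namespace=namespace, app=app)
        )
    hosts = ", ".join(f'"{h}"' for h in kafka_hosts)
    return (
        "filebeat.inputs:\n"
        + "".join(inputs)
        + "output.kafka:\n"
        + f"  hosts: [{hosts}]\n"
        + f'  topic: "{topic}"\n'
    )


def diff_dirs(previous, current):
    """Return (created, deleted) directories between two snapshots."""
    created = sorted(set(current) - set(previous))
    deleted = sorted(set(previous) - set(current))
    return created, deleted


# Hypothetical snapshots: one new pod directory has appeared.
before = []
after = ["/data/logs/shop/cart/cart-7d9f/log"]
created, deleted = diff_dirs(before, after)
if created or deleted:
    config = render_filebeat_config(after, ["kafka-0:9092"], "app-logs")
    print(config)
```

In a real agent the rendered text would be written to Filebeat's configuration and Filebeat told to pick it up. How that reload happens is left open here because the article does not describe it.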

Conclusion

The article provides a practical blueprint for Kubernetes log collection, emphasizing a node‑agent architecture with S6‑based containers, Filebeat, Kafka, and Elasticsearch. Organizations can adapt the design to their specific requirements and extend it as needed.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
