How to Build a Scalable Container Log Architecture with S6 and Filebeat

This guide explains Docker container logging pitfalls, compares Kubernetes log collection levels, and presents a unified log‑aggregation architecture using S6‑based images, Filebeat, logrotate, Kafka, and Logstash, with practical steps for dynamic configuration and rotation.

dbaplus Community

Container Log Basics

Docker stores each container's stdout/stderr in a JSON file at /var/lib/docker/containers/<container_id>/<container_id>-json.log. The default json-file driver does not limit file size, so logs can grow without bound and exhaust disk space. The Docker daemon also sits on the path of every log line: it captures container output, writes these files, and reads them back for docker logs. When log volume is high the daemon becomes a bottleneck, and commands such as docker logs -f can block it, making docker ps unresponsive.
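
To make the file format concrete, here is a minimal Python sketch that parses one line of a json-file log. Each line is a JSON object with log, stream, and time keys; the function name parse_json_file_line and the sample line are illustrative assumptions, not part of the original setup.

```python
import json


def parse_json_file_line(line: str) -> dict:
    """Parse one line of a Docker json-file log into its fields."""
    record = json.loads(line)
    return {
        "stream": record["stream"],             # "stdout" or "stderr"
        "time": record["time"],                 # timestamp string written by Docker
        "message": record["log"].rstrip("\n"),  # payload without the trailing newline
    }


# Hypothetical sample line in the json-file layout.
sample = '{"log":"server started\\n","stream":"stdout","time":"2024-01-01T00:00:00.000000000Z"}'
entry = parse_json_file_line(sample)
print(entry["stream"], entry["message"])  # prints: stdout server started
```

Every container line passes through this JSON wrapping on its way to disk, which is part of the per-line cost the S6 approach avoids.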

Benchmark results show syslog driver throughput of ~14.9 MB/s versus ~37.9 MB/s for json-file. To keep log collection off the daemon entirely, the architecture described here uses an S6‑based image in which s6-log redirects the stdout of the container's CMD to a file on the host, bypassing the daemon.

Kubernetes Log Levels

Log collection in Kubernetes can be organized into three levels:

Pod (application) level: Applications write to stdout/stderr; logs are accessed with kubectl logs.

Node level: Configure a container log driver (e.g., json-file) together with logrotate to rotate files when they exceed a size limit.

Cluster level: Three approaches are common.

Node‑side DaemonSet: Runs a lightweight collector on every node and handles only stdout logs.

Sidecar container: Runs in each pod and either streams logs to stdout (creating duplicate log files) or runs a dedicated collector such as Logstash or Fluentd inside the pod (higher CPU and memory usage, and the logs are not visible through kubectl logs).

Application‑side push: The application sends logs directly to a backend storage service.

Unified Log Architecture

The proposed architecture uses a node‑level DaemonSet (log‑agent) to collect logs from containers built on the S6 base image. The data flow is:

Application containers write logs to host directories, e.g. /data/logs/<namespace>/<appname>/<podname>/log/xxxx.log.

The log‑agent pod on each node runs filebeat and logrotate. filebeat tails the log files and forwards them to a Kafka topic.

Logstash consumes the logs from Kafka, processes them, and writes them to Elasticsearch indices, where Kibana visualizes them.
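
Because the host path encodes namespace, application, and pod, a collector can recover that metadata from the path alone. The Python sketch below shows one way to do it, assuming the /data/logs layout from the first step; the names LogSource and parse_log_path and the sample values are hypothetical.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

# Root directory taken from the path layout in the data flow above.
LOG_ROOT = PurePosixPath("/data/logs")


class LogSource(NamedTuple):
    namespace: str
    app: str
    pod: str
    filename: str


def parse_log_path(path: str) -> Optional[LogSource]:
    """Return the metadata encoded in a log path, or None if the layout does not match."""
    try:
        parts = PurePosixPath(path).relative_to(LOG_ROOT).parts
    except ValueError:
        return None  # path is not under LOG_ROOT
    # Expected layout: <namespace>/<appname>/<podname>/log/<file>
    if len(parts) != 5 or parts[3] != "log":
        return None
    namespace, app, pod, _, filename = parts
    return LogSource(namespace, app, pod, filename)


# Hypothetical namespace, app, and pod names.
source = parse_log_path("/data/logs/prod/orders/orders-7d9f/log/app.log")
print(source)
```

Fields recovered this way could be attached to each event before it is sent to Kafka, so downstream consumers do not need to know the directory convention.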

Key challenges addressed:

Dynamic update of filebeat configuration when new applications are deployed.

Ensuring every log file is rotated according to policy.

Extending filebeat with custom plugins for additional functionality.

Practical Implementation

To solve the challenges, the article recommends:

Use the fsnotify library (https://github.com/fsnotify/fsnotify) to watch the log directory for create/delete events and render new filebeat configuration files from templates.

Schedule log rotation with a cron job using the cron library (https://github.com/robfig/cron). The cron job runs logrotate with a configuration such as:

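/var/log/xxxx/xxxxx.log {
    # Run the rotation as this user and group
    su www-data www-data
    # Do not report an error if the log file is missing
    missingok
    # Skip rotation when the file is empty
    notifempty
    # Rotate only once the file has grown beyond 1 GB
    size 1G
    # Copy the file, then truncate the original in place so the writer keeps its open handle
    copytruncate
}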
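
The effect of size and copytruncate can be illustrated with a short Python sketch. This is not how logrotate is implemented; the function name copytruncate_if_large is hypothetical, and keeping a single .1 copy is a simplification.

```python
import os
import shutil
import tempfile


def copytruncate_if_large(path: str, max_bytes: int) -> bool:
    """Copy path to path + '.1' and truncate the original if it exceeds max_bytes."""
    if not os.path.exists(path):             # comparable to missingok
        return False
    if os.path.getsize(path) <= max_bytes:   # comparable to the size threshold
        return False
    shutil.copy2(path, path + ".1")
    # Truncate in place: the file keeps its inode, so a process that already
    # has it open continues writing to the same file.
    with open(path, "r+b") as f:
        f.truncate(0)
    return True


with tempfile.TemporaryDirectory() as d:
    log = os.path.join(d, "app.log")
    with open(log, "wb") as f:
        f.write(b"x" * 2048)
    rotated = copytruncate_if_large(log, max_bytes=1024)
    sizes = (os.path.getsize(log), os.path.getsize(log + ".1"))

print(rotated, sizes)  # prints: True (0, 2048)
```

Anything written between the copy and the truncate is lost, which is the known trade-off of copytruncate compared with renaming the file and signalling the writer to reopen it.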

For custom filebeat development, refer to the official Filebeat repository: https://github.com/elastic/beats/tree/master/filebeat and the S6 project page: http://skarnet.org/software/s6/.
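
The first recommendation above, rendering filebeat configuration when applications appear or disappear, can be sketched as follows. The approach described uses the Go fsnotify library for create/delete events; this Python sketch swaps that for a plain comparison of two directory scans, which is simpler but reacts only at the polling interval. The template keys follow Filebeat's log input as commonly documented and should be checked against the Filebeat version in use; the function names and the namespace and app values are hypothetical.

```python
from pathlib import Path
from string import Template

# One input entry per application; the path pattern mirrors the host layout
# /data/logs/<namespace>/<appname>/<podname>/log/.
INPUT_TEMPLATE = Template("""\
- type: log
  paths:
    - /data/logs/$namespace/$app/*/log/*.log
  fields:
    namespace: $namespace
    app: $app
""")


def render_input(namespace: str, app: str) -> str:
    """Render the input configuration text for one application."""
    return INPUT_TEMPLATE.substitute(namespace=namespace, app=app)


def scan_apps(root: str) -> set:
    """List (namespace, app) pairs that currently have a directory under root."""
    base = Path(root)
    return {
        (ns.name, app.name)
        for ns in base.iterdir() if ns.is_dir()
        for app in ns.iterdir() if app.is_dir()
    }


def diff_apps(previous: set, current: set) -> tuple:
    """Return (created, removed) between two scans."""
    return current - previous, previous - current


# Two hypothetical scans: a new app "billing" appears in namespace "prod".
previous = {("prod", "orders")}
current = {("prod", "orders"), ("prod", "billing")}
created, removed = diff_apps(previous, current)
for namespace, app in sorted(created):
    print(render_input(namespace, app))
```

In a real agent, each rendered entry would be written to its own file in a directory that filebeat is configured to load inputs from, and the file for a removed application would be deleted.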

Summary

The solution provides a lightweight, scalable log collection pipeline for Kubernetes:

S6‑based containers redirect stdout to host files, eliminating daemon bottlenecks.

A node‑level DaemonSet runs filebeat + logrotate to collect and rotate logs.

Logs are streamed to Kafka, then to Elasticsearch via Logstash, and visualized in Kibana.

Reference links:

Kubernetes logging documentation: https://kubernetes.io/docs/concepts/cluster-administration/logging/

Understanding logrotate: https://support.rackspace.com/how-to/understanding-logrotate-utility/

Figures: Container log storage path; Kubernetes log levels diagram; Node agent architecture; Sidecar container logging; Application log push; Overall log pipeline.