
How to Build a Scalable Container Log Collection System with S6 and Filebeat

This article explains Docker and Kubernetes container logging fundamentals, highlights the limitations of default json-file logging, and presents a unified log-collection architecture using S6-based images, Filebeat, logrotate, Kafka, and Elasticsearch, with practical steps for dynamic configuration and log rotation in a Kubernetes cluster.

Efficient Ops

Preparation

About container logs

Docker logs are divided into engine logs and container logs. Engine logs are handled by the system logger, while container logs are the output of applications running inside the container.

Container logs can be viewed with docker logs, which shows STDOUT and STDERR. By default they are stored as JSON files at /var/lib/docker/containers/<container_id>/<container_id>-json.log, a format unsuitable for production.

Without size limits, json-file logs grow indefinitely and can fill the disk; the json-file log driver does support rotation options, but they must be configured explicitly.
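For example, json-file rotation can be enabled daemon-wide in /etc/docker/daemon.json; the size and file-count values below are illustrative, not recommendations:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "5"
  }
}
```

With this in place, each container keeps at most five 100 MB log files before the oldest is discarded.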

Heavy log volume can make the Docker daemon a bottleneck for log collection.

Using docker logs -f to follow logs can block the daemon, causing commands like docker ps to become unresponsive.

Docker provides logging drivers that can be configured, but they still rely on the daemon, so collection speed remains a bottleneck.

Measured log-driver throughput: syslog ≈ 14.9 MB/s; json-file ≈ 37.9 MB/s.

To avoid daemon‑based collection, an S6‑based image can redirect logs directly to files with automatic rotation.

s6-log redirects the CMD's stdout to a file (e.g., /…/default/current) instead of the Docker daemon, eliminating the daemon's performance bottleneck.
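As a sketch, an s6-style service directory pairs the application's run script with an s6-log logger; the service name, binary, and log directory below are illustrative:

```
/etc/services.d/app/run:
    #!/bin/sh
    # start the application; stdout/stderr flow to the paired log service
    exec myapp 2>&1

/etc/services.d/app/log/run:
    #!/bin/sh
    # s6-log rotates automatically: s1000000 rotates at ~1 MB,
    # n20 keeps 20 archives; the live log is <dir>/current
    exec s6-log -b n20 s1000000 /data/logs/app
```

Because s6-log writes and rotates the files itself, neither the Docker daemon nor an external rotation job is involved in the hot path.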

About k8s logs

Kubernetes log collection can be divided into three levels: application (Pod) level, node level, and cluster level.

Application (Pod) level – logs are written to stdout/stderr and can be viewed with kubectl logs pod-name -n namespace.

Node level – managed by configuring the container’s log‑driver, often combined with logrotate.

Cluster level – implemented via node agents, sidecar containers, or direct log shipping from the application.

Node‑level agents (DaemonSet) run on each node, consuming few resources and requiring only standard output from containers.

Sidecar containers can act as log agents. One approach runs a streaming sidecar that tails the application's log files to its own stdout; logs remain visible to kubectl logs, but each log is effectively stored twice on the host.

Another runs a full log-collection agent (e.g., Logstash or Fluentd) inside each pod, which consumes more resources and hides logs from kubectl logs.
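A minimal sketch of the streaming-sidecar pattern; the image names, file paths, and container names are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  containers:
  - name: app
    image: myapp:latest            # hypothetical application image writing to a file
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  - name: log-streamer             # sidecar: streams the file to its own stdout
    image: busybox
    args: [/bin/sh, -c, 'tail -n+1 -F /var/log/app/app.log']
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  volumes:
  - name: logs
    emptyDir: {}
```

The shared emptyDir volume is what makes the duplication visible: the file exists on the volume, and its contents exist again in the sidecar's container log.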

Directly pushing logs from the application to a backend storage service is also possible.

Log Architecture

A unified log-collection system can use a node-agent approach. All application containers are built from an S6-based image, redirecting logs to host directories such as /data/logs/namespace/appname/podname/log/xxxx.log. The log-agent includes Filebeat and logrotate; Filebeat forwards logs to Kafka, which then feeds Elasticsearch/Kibana via Logstash.

Application containers write logs to host files.

Log‑agent (Filebeat, logrotate) collects the files.

Filebeat sends logs to Kafka.

Logstash consumes messages from Kafka, creates indices, and writes logs to Elasticsearch.

Elasticsearch stores the logs; Kibana provides search and visualization.
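The Filebeat-to-Kafka leg of this pipeline might look like the following filebeat.yml fragment; the glob pattern follows the directory layout above, while the broker addresses and topic name are placeholders:

```yaml
filebeat.inputs:
- type: log
  paths:
    # matches /data/logs/<namespace>/<appname>/<podname>/log/*.log
    - /data/logs/*/*/*/log/*.log

output.kafka:
  hosts: ["kafka-0:9092", "kafka-1:9092"]   # placeholder broker addresses
  topic: "container-logs"                   # placeholder topic name
  required_acks: 1
  compression: gzip
```

Kafka here acts as a buffer between collection and indexing, so a slow Elasticsearch cluster does not back-pressure the nodes.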

Key challenges include dynamically updating Filebeat configuration for new applications, ensuring proper log rotation, and extending Filebeat with custom features.

Practical Implementation

Deploy a log‑agent DaemonSet on each Kubernetes node containing Filebeat, logrotate, and custom components.
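Such a DaemonSet might be declared as follows; the image name and namespace are illustrative, and the hostPath mount exposes the directory the S6-based containers write into:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
  namespace: kube-system
spec:
  selector:
    matchLabels: {app: log-agent}
  template:
    metadata:
      labels: {app: log-agent}
    spec:
      containers:
      - name: log-agent
        image: log-agent:latest      # hypothetical image bundling Filebeat + logrotate
        volumeMounts:
        - name: app-logs
          mountPath: /data/logs      # host directory the S6 images write into
      volumes:
      - name: app-logs
        hostPath:
          path: /data/logs
```

Because it is a DaemonSet, exactly one agent pod runs on each node and sees only that node's log files.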

To update the Filebeat configuration dynamically, watch the log directory with github.com/fsnotify/fsnotify and render the config files from templates.
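The article's watcher is written in Go with fsnotify; the template-rendering half can be sketched in Python, with the directory layout taken from the architecture above. The function name render_inputs and the template fields are assumptions for illustration — in the real agent, an fsnotify event would trigger a re-render and a Filebeat reload:

```python
import os
from string import Template

# Hypothetical per-input template; fields mirror the article's
# /data/logs/<namespace>/<appname>/<podname>/log/ layout.
INPUT_TMPL = Template("""\
- type: log
  paths:
    - ${log_dir}/*.log
  fields:
    namespace: ${namespace}
    app: ${app}
""")

def render_inputs(base_dir):
    """Walk base_dir and emit one Filebeat input per namespace/app/pod log dir."""
    blocks = []
    for namespace in sorted(os.listdir(base_dir)):
        ns_path = os.path.join(base_dir, namespace)
        if not os.path.isdir(ns_path):
            continue
        for app in sorted(os.listdir(ns_path)):
            app_path = os.path.join(ns_path, app)
            for pod in sorted(os.listdir(app_path)):
                log_dir = os.path.join(app_path, pod, "log")
                blocks.append(INPUT_TMPL.substitute(
                    log_dir=log_dir, namespace=namespace, app=app))
    return "filebeat.inputs:\n" + "".join(blocks)
```

When a new application directory appears, re-running the render produces an updated config that the agent can write out before signalling Filebeat to reload.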

For periodic log rotation, use github.com/robfig/cron to schedule a job that runs logrotate with appropriate permissions.

<code>/var/log/xxxx/xxxxx.log {
    su www-data www-data
    missingok
    notifempty
    size 1G
    copytruncate
}</code>
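Equivalently, outside a Go agent the schedule can be a plain cron entry in /etc/cron.d (which includes a user field); the interval and paths here are illustrative:

```
# /etc/cron.d/container-logs — run logrotate every 10 minutes
*/10 * * * * root /usr/sbin/logrotate /etc/logrotate.d/container-logs
```

Note that copytruncate in the config above lets rotation happen without signalling the writing process, at the cost of possibly losing a few lines written during the truncate.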

For extending Filebeat, refer to community guides and source code.

Conclusion

This article provides a simple approach to Kubernetes log collection; implementations can be adapted to specific company requirements.

References

Kubernetes logging documentation

Understanding logrotate utility

Filebeat repository

S6 project

Tags: docker, kubernetes, logging, Filebeat, logrotate, S6
Written by Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
