Building a Scalable Container Log Collection System with S6 and Filebeat

This article explains how to design and implement a unified log collection architecture for Docker containers and Kubernetes clusters using S6‑based images, Filebeat, logrotate, Kafka, Logstash, and Elasticsearch, addressing common challenges such as log rotation, daemon bottlenecks, and dynamic configuration.

Efficient Ops

Mar 28, 2021

Preparation

About Container Logs

Docker logs are divided into engine logs and container logs. Engine logs are handled by the system logger and stored in OS‑specific locations.

This article focuses on container logs, which are the output of applications running inside containers. By default, the docker logs command shows the STDOUT and STDERR of a running container, and the logs are stored in JSON‑file format at:

/var/lib/docker/containers/<container_id>/<container_id>-json.log

This default approach is unsuitable for production for the following reasons:

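Without size limits, container logs can fill the disk and destabilize the host. (The json‑file log driver supports rotation through its max-size and max-file options, but they are not set by default.)

The Docker daemon collects each container's stdout, so a high log volume turns the daemon into a bottleneck.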

Using docker logs -f to follow logs can block the daemon, causing commands like docker ps to become unresponsive.

Docker provides configurable logging drivers; however, they still rely on the daemon for collection, so the speed bottleneck remains.

log‑driver    collection speed
syslog        14.9 MB/s
json‑file     37.9 MB/s
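For setups that stay on the json-file driver, rotation can be switched on in the daemon configuration. The following is a minimal sketch that renders such a fragment for /etc/docker/daemon.json, assuming the driver's standard max-size and max-file options. The helper name and the default values (100m, 5 files) are illustrative assumptions, not taken from the original article.

```python
import json


def render_daemon_json(max_size: str = "100m", max_file: int = 5) -> str:
    """Render a daemon.json fragment that caps json-file logs per container.

    The default values are illustrative assumptions; tune them to your disk budget.
    """
    config = {
        "log-driver": "json-file",
        "log-opts": {
            # Docker expects log-opts values as strings, numbers included.
            "max-size": max_size,
            "max-file": str(max_file),
        },
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_daemon_json())
```

Rotation of this kind bounds disk usage only. The daemon still sits in the collection path, so the throughput limit shown in the table is unaffected.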

To avoid daemon collection, one can redirect logs directly to files with automatic rotation using an S6‑based image.

S6‑log redirects the CMD's stdout to /…/default/current instead of sending it to the Docker daemon, eliminating the daemon bottleneck.
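The idea behind this redirection can be imitated in a few lines: read the application's output stream, append it to a file named current, and archive that file once it passes a size limit. This is an illustrative stand-in, not s6-log itself. The class name, the archive naming scheme, and the limits are all assumptions made for the sketch.

```python
import os
import tempfile
import time
from typing import Iterable, List


class CurrentFileLogger:
    """Append lines to <log_dir>/current and archive it when it grows too large."""

    def __init__(self, log_dir: str, max_bytes: int = 1_000_000, keep: int = 10):
        self.log_dir = log_dir
        self.max_bytes = max_bytes
        self.keep = keep  # number of archived files to retain
        self._seq = 0
        os.makedirs(log_dir, exist_ok=True)
        self.current = os.path.join(log_dir, "current")

    def _archives(self) -> List[str]:
        # Names sort chronologically: fixed-width timestamp, then a counter.
        return sorted(n for n in os.listdir(self.log_dir) if n.startswith("archive-"))

    def _rotate(self) -> None:
        # Rename "current" to a uniquely named archive, then prune the oldest.
        self._seq += 1
        name = "archive-%d-%06d" % (time.time_ns(), self._seq)
        os.rename(self.current, os.path.join(self.log_dir, name))
        archives = self._archives()
        for old in archives[:max(0, len(archives) - self.keep)]:
            os.remove(os.path.join(self.log_dir, old))

    def write(self, line: str) -> None:
        data = line.rstrip("\n") + "\n"
        if (os.path.exists(self.current)
                and os.path.getsize(self.current) + len(data.encode()) > self.max_bytes):
            self._rotate()
        with open(self.current, "a", encoding="utf-8") as fh:
            fh.write(data)

    def consume(self, stream: Iterable[str]) -> None:
        """Drain a line-oriented stream, such as a process's stdout."""
        for line in stream:
            self.write(line)


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    CurrentFileLogger(demo_dir, max_bytes=64, keep=3).consume(
        f"line {i}" for i in range(50))
    print(sorted(os.listdir(demo_dir)))
```

Because the file is written by a process inside the container rather than by the Docker daemon, the write path no longer depends on daemon throughput.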

About k8s Logs

Kubernetes log collection is divided into three levels:

Application (Pod) level

Node level

Cluster level

At the Pod level, logs are output to stdout/stderr and can be viewed with kubectl logs pod-name -n namespace.

At the Node level, logs are managed via container log‑drivers, often combined with logrotate for automatic rotation.

At the Cluster level, three approaches are common:

Node‑agent method: Deploy a DaemonSet on each node that runs a log‑agent (e.g., Filebeat) to collect logs locally. This method uses few resources and is non‑intrusive but only works for standard‑output logs.

Sidecar container as a log proxy: Each pod runs an additional container that forwards logs. One variant streams logs to stdout, creating duplicate log files on the host; the other runs a full log collector (e.g., Logstash, Fluentd) inside the pod, which consumes more resources and hides logs from kubectl logs.

Application‑direct logging: The application itself pushes logs to a backend service.

Log Architecture

A unified log collection system can be built using the node‑agent approach. The overall architecture is:

All application containers are built from an S6‑based image; their logs are redirected to host files such as /data/logs/namespace/appname/podname/log/xxxx.log.

A log‑agent container on each node runs Filebeat, logrotate, etc., to collect these files.

Filebeat forwards the logs to Kafka.

Logstash consumes the messages from Kafka and writes them to indices in Elasticsearch.

Elasticsearch stores the logs, and Kibana is used to query them.
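Because the host path encodes the namespace, application, and pod, a collector can recover that metadata from the path alone. Here is a minimal sketch, assuming the /data/logs/<namespace>/<appname>/<podname>/log/<file> layout described above. The function and field names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

LOG_ROOT = PurePosixPath("/data/logs")  # host directory from the layout above


class LogSource(NamedTuple):
    namespace: str
    app: str
    pod: str
    filename: str


def parse_log_path(path: str, root: PurePosixPath = LOG_ROOT) -> Optional[LogSource]:
    """Return metadata for <root>/<namespace>/<app>/<pod>/log/<file>, else None."""
    try:
        parts = PurePosixPath(path).relative_to(root).parts
    except ValueError:
        return None  # the path lies outside the log root
    if len(parts) != 5 or parts[3] != "log":
        return None  # the path does not follow the expected layout
    namespace, app, pod, _, filename = parts
    return LogSource(namespace, app, pod, filename)


if __name__ == "__main__":
    print(parse_log_path("/data/logs/prod/shop/shop-7d9f-abcde/log/access.log"))
```

Fields recovered this way could, for example, be attached to each event by the agent or used downstream to choose an index name. The article does not specify either choice.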

Key challenges to address:

Dynamically updating Filebeat configuration for newly deployed applications.

Ensuring every log file is properly rotated.

Extending Filebeat with custom features when needed.

Implementation

To solve the challenges, develop a log‑agent DaemonSet that runs on every node and includes Filebeat, logrotate, and custom components.

For dynamic Filebeat configuration, use the fsnotify library to watch the log directory for create/delete events and render configuration templates accordingly.
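The watch-and-render loop can be sketched without a notification library by diffing directory snapshots. This polling approach is a stand-in for fsnotify, not the article's implementation. The template uses Filebeat's log input with its paths and fields settings; the three-level directory layout, the function names, and the field names are assumptions.

```python
import os
import tempfile
from string import Template
from typing import List, Set, Tuple

INPUT_TEMPLATE = Template("""\
- type: log
  paths:
    - ${pod_dir}/log/*.log
  fields:
    namespace: ${namespace}
    app: ${app}
    pod: ${pod}
""")


def _subdirs(path: str) -> List[str]:
    try:
        with os.scandir(path) as it:
            return sorted(e.path for e in it if e.is_dir())
    except FileNotFoundError:
        return []  # directory vanished between listing and scanning


def snapshot(root: str) -> Set[str]:
    """Return every <root>/<namespace>/<app>/<pod> directory that currently exists."""
    found: Set[str] = set()
    for ns_dir in _subdirs(root):
        for app_dir in _subdirs(ns_dir):
            found.update(_subdirs(app_dir))
    return found


def diff(old: Set[str], new: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (created, deleted) pod directories between two snapshots."""
    return new - old, old - new


def render_inputs(root: str, pod_dirs: Set[str]) -> str:
    """Render one Filebeat input per pod directory, in a stable order."""
    blocks = []
    for pod_dir in sorted(pod_dirs):
        namespace, app, pod = os.path.relpath(pod_dir, root).split(os.sep)
        blocks.append(INPUT_TEMPLATE.substitute(
            pod_dir=pod_dir, namespace=namespace, app=app, pod=pod))
    return "".join(blocks)


if __name__ == "__main__":
    root = tempfile.mkdtemp()
    before = snapshot(root)
    os.makedirs(os.path.join(root, "prod", "shop", "shop-1", "log"))
    after = snapshot(root)
    print(diff(before, after))
    print(render_inputs(root, after))
```

In a running agent, the rendered text would be rewritten whenever diff reports a change. One option is to place it in a directory that Filebeat loads through its filebeat.config.inputs setting with reloading enabled, so that Filebeat picks up the change without a restart.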

For log rotation, use the cron library to schedule periodic logrotate runs, and make sure rotation runs as the user that owns the log files (for example, a non‑root user such as www-data). A sample logrotate configuration:

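/var/log/xxxx/xxxxx.log {
    # Rotate as the owner of the log file instead of root.
    su www-data www-data
    # Do not report an error if the log file is missing.
    missingok
    # Skip rotation when the log file is empty.
    notifempty
    # Rotate only once the file has grown beyond 1 GB.
    size 1G
    # Copy the file, then truncate it in place, so the writer keeps its open handle.
    copytruncate
}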
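A stanza like the one above can be generated per log file, with the su line derived from the file's owner, and then applied on a schedule. The sketch below uses a plain sleep loop as a stand-in for a cron library and invokes the logrotate command with its -s option to keep a private state file. The function names, paths, and interval are hypothetical.

```python
import grp
import os
import pwd
import subprocess
import time
from typing import Tuple


def owner_of(path: str) -> Tuple[str, str]:
    """Return the (user, group) names owning path; raises KeyError for unknown ids."""
    st = os.stat(path)
    return pwd.getpwuid(st.st_uid).pw_name, grp.getgrgid(st.st_gid).gr_name


def render_stanza(path: str, user: str, group: str, size: str = "1G") -> str:
    """Render a logrotate stanza with the same directives as the sample above."""
    return (
        f"{path} {{\n"
        f"    su {user} {group}\n"
        "    missingok\n"
        "    notifempty\n"
        f"    size {size}\n"
        "    copytruncate\n"
        "}\n"
    )


def rotate_once(conf_path: str, state_path: str) -> int:
    """Run logrotate once and return its exit code (logrotate must be on PATH)."""
    return subprocess.run(["logrotate", "-s", state_path, conf_path]).returncode


def run_forever(conf_path: str, state_path: str, interval_s: int = 300) -> None:
    """Stand-in for a cron schedule: rotate, sleep, repeat."""
    while True:
        rotate_once(conf_path, state_path)
        time.sleep(interval_s)


if __name__ == "__main__":
    print(render_stanza("/var/log/app/app.log", "www-data", "www-data"))
```

copytruncate suits this setup because the application keeps writing to the same open file and never has to be signalled. The trade-off is that lines written between the copy and the truncate can be lost.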

For extending Filebeat, refer to community guides and source code.

Conclusion

This article provides a simple reference design for Kubernetes log collection; actual implementations should be tailored to specific organizational requirements.

References

Kubernetes Logging Documentation

Understanding logrotate Utility

Filebeat Repository

S6 Process Supervision Suite
