
Building a Scalable Container Log Collection System with S6 and Filebeat

This article explains how to design and implement a unified log collection architecture for Docker containers and Kubernetes clusters using S6‑based images, Filebeat, logrotate, Kafka, Logstash, and Elasticsearch, addressing common challenges such as log rotation, daemon bottlenecks, and dynamic configuration.

Efficient Ops

Preparation

About Container Logs

Docker logs are divided into engine logs and container logs. Engine logs are handled by the system logger and stored in OS‑specific locations.

This article focuses on container logs: the output of the applications running inside containers. By default, `docker logs` shows the STDOUT and STDERR of a running container, and the logs are stored in JSON format at `/var/lib/docker/containers/<container_id>/<container_id>-json.log`. This approach is unsuitable for production for several reasons:

- Without size limits, container logs can fill the disk and destabilize the host (the json-file log driver does support rotation via its options).

- The Docker daemon collects every container's stdout, so at high log volume the daemon itself becomes a bottleneck.

- Following logs with `docker logs -f` can block the daemon, making commands such as `docker ps` unresponsive.

Docker provides configurable logging drivers; however, they still rely on the daemon for collection, so the speed bottleneck remains.

| log-driver | collection speed |
| --- | --- |
| syslog | 14.9 MB/s |
| json-file | 37.9 MB/s |

To bypass daemon-based collection entirely, logs can be redirected directly to files with automatic rotation using an S6-based image: s6-log redirects the CMD's stdout to `/…/default/current` instead of sending it to the Docker daemon, eliminating the daemon bottleneck.
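The exact layout depends on the image, but with s6-overlay-style supervision the idea can be sketched as a service script whose stdout s6 pipes into a companion log script (the service name and paths are hypothetical):

```sh
#!/bin/sh
# Hypothetical s6 log script (/etc/services.d/myapp/log/run).
# s6 pipes the service's stdout into this script; s6-log appends it to
# /data/logs/default/current, keeping up to 20 rotated files of ~1 MB each.
exec s6-log -b n20 s1000000 /data/logs/default
```

The matching `run` script simply `exec`s the application with `exec 2>&1`, so stderr is folded into the same stream.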

About k8s Logs

Kubernetes log collection is divided into three levels:

Application (Pod) level

Node level

Cluster level

At the Pod level, logs are written to stdout/stderr and can be viewed with `kubectl logs pod-name -n namespace`.

At the Node level, logs are managed via container log-drivers, often combined with logrotate for automatic rotation.

At the Cluster level, three approaches are common:

Node-agent method: deploy a DaemonSet on each node that runs a log agent (e.g., Filebeat) to collect logs locally. This method uses few resources and is non-intrusive, but it can only collect logs that applications write to standard output.

Sidecar container as a log proxy: each Pod runs an additional container that forwards logs. One variant streams log files to stdout, duplicating the data on the host; the other runs a full log collector (e.g., Logstash, Fluentd) inside the Pod, which consumes more resources and hides the logs from `kubectl logs`.

Application‑direct logging: The application itself pushes logs to a backend service.
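For the node-agent method, a minimal DaemonSet sketch might look like the following (image tag, namespace, label names, and host paths are assumptions, not the article's actual manifest):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
        - name: filebeat
          image: docker.elastic.co/beats/filebeat:7.17.0
          volumeMounts:
            - name: app-logs
              mountPath: /data/logs
              readOnly: true
      volumes:
        - name: app-logs
          hostPath:
            path: /data/logs
```

Mounting the host log directory read-only keeps the agent non-intrusive while giving it access to every container's files on that node.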

Log Architecture

A unified log collection system can be built using the node‑agent approach. The overall architecture is:

All application containers are built from an S6-based image; their logs are redirected to host files such as `/data/logs/namespace/appname/podname/log/xxxx.log`.

A log‑agent container on each node runs Filebeat, logrotate, etc., to collect these files.

Filebeat forwards the logs to Kafka.

Kafka buffers the log stream.

Logstash consumes the Kafka messages and creates indices in Elasticsearch, where the logs are stored and queried via Kibana.
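To connect Filebeat to Kafka, a hedged configuration sketch (the Kafka hosts, topic name, and glob pattern are assumptions; the glob follows the `/data/logs/namespace/appname/podname/log/` layout above):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /data/logs/*/*/*/log/*.log
output.kafka:
  hosts: ["kafka:9092"]
  topic: "container-logs"
  required_acks: 1
```

On the other side, a Logstash pipeline with a kafka input and an elasticsearch output completes the chain.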

Key challenges to address:

Dynamically updating Filebeat configuration for newly deployed applications.

Ensuring every log file is properly rotated.

Extending Filebeat with custom features when needed.

Implementation

To solve these challenges, develop a log-agent DaemonSet that runs on every node and bundles Filebeat, logrotate, and custom components.

For dynamic Filebeat configuration, use the fsnotify library to watch the log directory for create/delete events and re-render the configuration templates accordingly.

For log rotation, use the cron library to schedule periodic rotation jobs, taking care of file ownership (e.g., logs written by non-root users). A typical logrotate stanza:

```
/var/log/xxxx/xxxxx.log {
    su www-data www-data
    missingok
    notifempty
    size 1G
    copytruncate
}
```

For extending Filebeat, refer to community guides and source code.

Conclusion

This article provides a simple reference design for Kubernetes log collection; actual implementations should be tailored to specific organizational requirements.

References

Kubernetes Logging Documentation

Understanding logrotate Utility

Filebeat Repository

S6 Process Supervision Suite

Tags: docker, kubernetes, logging, Filebeat, logrotate, S6
Written by Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
