Building a Scalable Container Log System with S6, Filebeat, and Kafka

This article explains Docker and Kubernetes logging challenges, compares engine and container logs, shows why the Docker daemon becomes a bottleneck, and demonstrates a scalable solution using S6‑based images, Filebeat, logrotate, and a node‑agent architecture to collect, rotate, and forward logs to Kafka and Elasticsearch.

MaGe Linux Operations

Apr 16, 2021

Preparation

About container logs

Docker logs are divided into engine logs and container logs. Engine logs go to the system log; container logs are the output of applications inside containers. With the default json-file logging driver, the daemon captures each container's STDOUT and STDERR and stores them in a per-container file:

/var/lib/docker/containers/<container-id>/<container-id>-json.log

The docker logs command reads from this file. This default setup is unsuitable for production, for the reasons that follow.
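To make the file format concrete, here is a minimal Python sketch that builds the path above and decodes one line of a json-file log. Each line is a JSON object with the keys "log", "stream", and "time"; the helper names and the sample line are made up for illustration.

```python
import json
from pathlib import PurePosixPath


def json_log_path(container_id: str, root: str = "/var/lib/docker/containers") -> PurePosixPath:
    """Build the json-file log path for a full container ID."""
    return PurePosixPath(root) / container_id / f"{container_id}-json.log"


def parse_json_log_line(line: str) -> dict:
    """Decode one line written by the json-file driver.

    Each line is a JSON object with the keys "log", "stream", and "time".
    """
    entry = json.loads(line)
    return {
        "message": entry["log"].rstrip("\n"),  # the driver keeps the trailing newline
        "stream": entry["stream"],             # "stdout" or "stderr"
        "time": entry["time"],
    }


# Sample line invented for illustration, not taken from a real container.
sample = '{"log":"server started\\n","stream":"stdout","time":"2021-04-16T08:00:00.000000000Z"}'
record = parse_json_log_line(sample)
print(record["stream"], record["message"])
```

Because every message is wrapped in JSON and written by the daemon, each log line costs the daemon serialization work on top of the disk write.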

By default container logs have no size limit, so they can exhaust the disk; rotation can be configured through the logging driver's options (for json-file, max-size and max-file).
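As a sketch of such a rotation setup, the Python snippet below generates the relevant part of /etc/docker/daemon.json for the json-file driver. The log-driver and log-opts keys with max-size and max-file are standard Docker settings; the limits chosen here (100m, 3 files) are arbitrary example values.

```python
import json


def daemon_log_config(max_size: str = "100m", max_file: int = 3) -> dict:
    """Return daemon.json settings that cap json-file logs per container."""
    return {
        "log-driver": "json-file",
        "log-opts": {
            "max-size": max_size,       # rotate once a log file reaches this size
            "max-file": str(max_file),  # keep at most this many files; values must be strings
        },
    }


# Print the fragment that would go into /etc/docker/daemon.json.
print(json.dumps(daemon_log_config(), indent=2))
```

These defaults only apply to containers created after the daemon picks up the new configuration, and they limit disk usage without removing the daemon from the log path.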

Docker daemon becomes a bottleneck when collecting large volumes of logs.

Using docker logs -f can block the daemon, making commands like docker ps unresponsive.

Docker provides logging drivers that can be configured, but they still rely on the daemon, so collection speed remains a bottleneck.

log-driver    collection speed
syslog        14.9 MB/s
json-file     37.9 MB/s

To avoid daemon collection, an S6‑based image can redirect logs directly to files with automatic rotation.

S6‑log redirects the CMD’s stdout to /…/default/current instead of the daemon, eliminating the performance bottleneck.

About k8s logs

Kubernetes log collection has three levels: application (Pod), node, and cluster.

Application (Pod) level – logs go to stdout/stderr and can be viewed with kubectl logs pod-name -n namespace.

Node level – managed by configuring a container log-driver together with logrotate.

Cluster level – typically uses a node‑agent (DaemonSet) or sidecar containers.

The node‑agent approach runs one agent per node through a DaemonSet; it consumes few resources and suits clusters where every container writes its logs to stdout.

Sidecar containers can either stream the application’s stdout (creating duplicate log files) or run a dedicated log‑collector agent such as Logstash or Filebeat, though the latter consumes more resources and hides logs from kubectl logs.

Log architecture

A unified log collection system can use the node‑agent method: each container runs on an S6 base image and redirects its logs to host directories such as /data/logs/namespace/appname/podname/log/xxxx.log. A log‑agent containing Filebeat and logrotate collects these files and sends them to Kafka; Logstash then consumes them from Kafka and forwards them to Elasticsearch for viewing in Kibana.
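One benefit of this directory convention is that the agent can recover metadata from the path alone. The following Python sketch assumes the layout is exactly /data/logs/<namespace>/<appname>/<podname>/log/<file>; the LogSource type and the example names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class LogSource(NamedTuple):
    namespace: str
    app: str
    pod: str
    filename: str


def parse_log_path(path: str, root: str = "/data/logs") -> LogSource:
    """Split <root>/<namespace>/<app>/<pod>/log/<file> into its parts."""
    # relative_to raises ValueError when the path is not under root.
    parts = PurePosixPath(path).relative_to(root).parts
    if len(parts) != 5 or parts[3] != "log":
        raise ValueError(f"unexpected log path layout: {path}")
    namespace, app, pod, _, filename = parts
    return LogSource(namespace, app, pod, filename)


# Example path invented for illustration.
source = parse_log_path("/data/logs/prod/orders/orders-7d9f/log/app.log")
print(source.namespace, source.app, source.pod, source.filename)
```

Fields recovered this way could be attached to each event before it is sent to Kafka, so that downstream consumers can filter by namespace or application.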

Key challenges:

Dynamic update of Filebeat configuration for newly deployed applications.

Ensuring each log file is properly rotated.

Extending Filebeat with custom features if needed.
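The first challenge, keeping Filebeat's configuration in step with deployed applications, can be sketched as template rendering. The Python example below is only an illustration under stated assumptions: the template contents, application names, broker address, and topic are invented. It relies on Filebeat's `log` input type (newer Filebeat releases favor `filestream`) and the `output.kafka` section with `hosts` and `topic`.

```python
import json
from string import Template

# Hypothetical per-application input template; the directory layout follows
# the /data/logs/<namespace>/<appname>/<podname>/log/ convention from the text.
INPUT_TEMPLATE = Template(
    "- type: log\n"
    "  paths:\n"
    "    - /data/logs/$namespace/$app/*/log/*.log\n"
    "  fields:\n"
    "    namespace: $namespace\n"
    "    app: $app\n"
)


def render_filebeat_config(apps, kafka_hosts, topic):
    """Render a Filebeat config with one input per (namespace, app) pair."""
    inputs = "".join(
        INPUT_TEMPLATE.substitute(namespace=namespace, app=app)
        for namespace, app in sorted(set(apps))
    )
    # An empty list keeps the YAML valid when no application is deployed yet.
    header = ("filebeat.inputs:\n" + inputs) if inputs else "filebeat.inputs: []\n"
    return (
        header
        + "output.kafka:\n"
        + f"  hosts: {json.dumps(list(kafka_hosts))}\n"
        + f"  topic: {json.dumps(topic)}\n"
    )


# Application names, broker address, and topic are made up for illustration.
config = render_filebeat_config(
    apps=[("prod", "orders"), ("prod", "billing")],
    kafka_hosts=["kafka-1:9092"],
    topic="container-logs",
)
print(config)
```

An agent would re-render this whenever an application directory appears or disappears and then have Filebeat pick up the change, for example through Filebeat's reloadable external input files (`filebeat.config.inputs` with `reload.enabled`).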

Practical implementation

Deploy a log‑agent DaemonSet on every node. Use fsnotify to watch log directories and update Filebeat configs via templates. Use robfig/cron to schedule logrotate jobs, handling file ownership. Refer to external resources for Filebeat extension.

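/var/log/xxxx/xxxxx.log {
    # rotate as the user and group that own the log file
    su www-data www-data
    # do not report an error if the log file is missing
    missingok
    # skip rotation when the log file is empty
    notifempty
    # rotate once the file grows beyond 1 GB
    size 1G
    # copy the log, then truncate the original in place so the writer keeps its open file
    copytruncate
}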
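Since every application log needs its own stanza, the agent can generate these entries rather than maintain them by hand. The Python sketch below reproduces the stanza shown above for an arbitrary file; the function name and example path are hypothetical, and a real agent would look up the owner of each file (for example with os.stat) instead of passing it in.

```python
def logrotate_stanza(log_path: str, user: str, group: str, size: str = "1G") -> str:
    """Return a logrotate stanza matching the directives used in this article."""
    return (
        f"{log_path} {{\n"
        f"    su {user} {group}\n"   # rotate with the file owner's identity
        "    missingok\n"
        "    notifempty\n"
        f"    size {size}\n"
        "    copytruncate\n"          # the writer keeps its open file handle
        "}\n"
    )


# Example path and owner invented for illustration.
stanza = logrotate_stanza(
    "/data/logs/prod/orders/orders-7d9f/log/app.log", "www-data", "www-data"
)
print(stanza)
```

A scheduled job could write the generated stanzas to a file and run logrotate against it at a fixed interval.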

Conclusion

The article presents a simple approach to Kubernetes log collection; implementations should be adapted to specific company requirements.

References

https://kubernetes.io/docs/concepts/cluster-administration/logging/

https://support.rackspace.com/how-to/understanding-logrotate-utility/

https://github.com/elastic/beats/tree/master/filebeat

http://skarnet.org/software/s6/
