
Building a Scalable Container Log System with S6, Filebeat, and Kafka

This article explains Docker and Kubernetes logging challenges, compares engine and container logs, shows why the Docker daemon becomes a bottleneck, and demonstrates a scalable solution using S6‑based images, Filebeat, logrotate, and a node‑agent architecture to collect, rotate, and forward logs to Kafka and Elasticsearch.

MaGe Linux Operations

Preparation

About container logs

Docker logs are divided into engine logs and container logs. Engine logs go to the system log; container logs are the output of applications inside containers. The docker logs command shows a container's STDOUT and STDERR, which the default json-file driver stores in /var/lib/docker/containers/<container-id>/<container-id>-json.log. This arrangement is unsuitable for production, for the reasons below.

By default container logs have no size limit, which can exhaust the disk; rotation can be configured through the log driver's options (max-size and max-file for the json-file driver).
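As a rough illustration of driver-level rotation, the sketch below builds a daemon.json fragment for the json-file driver. The max-size and max-file keys are the driver's rotation options; the 100m and 3 values and the build_daemon_config helper name are assumptions chosen only for this example.

```python
import json


def build_daemon_config(max_size: str = "100m", max_file: int = 3) -> dict:
    """Return a Docker daemon.json fragment that caps json-file log growth.

    The default values are illustrative assumptions, not recommendations.
    """
    return {
        "log-driver": "json-file",
        "log-opts": {
            # Rotate a container's log file once it reaches this size.
            "max-size": max_size,
            # Keep at most this many log files per container.
            # Docker expects log-opts values to be strings.
            "max-file": str(max_file),
        },
    }


if __name__ == "__main__":
    # The printed JSON would typically be merged into /etc/docker/daemon.json.
    print(json.dumps(build_daemon_config(), indent=2))
```

This only bounds disk usage. The logs still pass through the daemon, so it does not address the throughput problem discussed next.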

The Docker daemon becomes a bottleneck when collecting large volumes of logs.

Using docker logs -f can block the daemon, making commands like docker ps unresponsive.

Docker provides logging drivers that can be configured, but they still rely on the daemon, so collection speed remains a bottleneck.

log-driver   collection speed
syslog       14.9 MB/s
json-file    37.9 MB/s

To avoid daemon collection, an S6‑based image can redirect logs directly to files with automatic rotation.

S6‑log redirects the CMD’s stdout to /…/default/current instead of the daemon, eliminating the performance bottleneck.

About k8s logs

Kubernetes log collection has three levels: application (Pod), node, and cluster.

Application (Pod) level – logs go to stdout/stderr and can be viewed with kubectl logs pod-name -n namespace.

Node level – managed by configuring a container log-driver together with logrotate.

Cluster level – typically uses a node‑agent (DaemonSet) or sidecar containers.

Node‑agent approach deploys a DaemonSet on each node; it consumes few resources and works when all container logs are stdout.

Sidecar containers can either stream the application’s stdout (creating duplicate log files) or run a dedicated log‑collector agent such as Logstash or Filebeat, though the latter consumes more resources and hides logs from kubectl logs.
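The node‑agent approach can be sketched as a DaemonSet manifest that mounts a host log directory into the agent container. The example below builds the manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The log-agent name, the labels, and the image reference are hypothetical; the /data/logs mount follows the host directory used in the log architecture section.

```python
import json


def build_log_agent_daemonset(image: str, host_log_dir: str = "/data/logs") -> dict:
    """Build a DaemonSet manifest that runs one log agent pod per node."""
    labels = {"app": "log-agent"}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "log-agent", "labels": dict(labels)},
        "spec": {
            # The selector must match the pod template labels.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": "log-agent",
                            "image": image,
                            # Expose the node's log directory to the agent.
                            "volumeMounts": [
                                {"name": "logs", "mountPath": host_log_dir}
                            ],
                        }
                    ],
                    "volumes": [
                        {"name": "logs", "hostPath": {"path": host_log_dir}}
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    # The image reference below is a placeholder, not a published image.
    manifest = build_log_agent_daemonset("registry.example.com/log-agent:latest")
    print(json.dumps(manifest, indent=2))
```

Because a DaemonSet schedules one pod per node, a single agent can serve every application container on that node, which is why this approach uses fewer resources than one sidecar per pod.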

Log architecture

A unified log collection system can use the node‑agent method: each container runs on an S6 base image and redirects its logs to a host directory such as /data/logs/namespace/appname/podname/log/xxxx.log. A log‑agent containing Filebeat and logrotate collects and rotates these files and sends them to Kafka; Logstash then moves them from Kafka into Elasticsearch, where they can be viewed in Kibana.
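The directory convention is what lets the agent work out which application a file belongs to. A minimal sketch of building and parsing such paths follows, assuming the /data/logs/namespace/appname/podname/log/ layout described above. The LogSource type and both helper names are hypothetical.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

LOG_ROOT = PurePosixPath("/data/logs")


class LogSource(NamedTuple):
    namespace: str
    app: str
    pod: str
    filename: str


def log_path(src: LogSource) -> PurePosixPath:
    """Build <root>/<namespace>/<app>/<pod>/log/<filename>."""
    return LOG_ROOT / src.namespace / src.app / src.pod / "log" / src.filename


def parse_log_path(path: str) -> Optional[LogSource]:
    """Recover namespace, app and pod from a host path; None if it does not match."""
    try:
        rel = PurePosixPath(path).relative_to(LOG_ROOT)
    except ValueError:
        return None  # not under the log root
    parts = rel.parts
    # Expect exactly: namespace / app / pod / "log" / filename
    if len(parts) != 5 or parts[3] != "log":
        return None
    namespace, app, pod, _, filename = parts
    return LogSource(namespace, app, pod, filename)


if __name__ == "__main__":
    src = LogSource("prod", "web", "web-0", "access.log")
    print(log_path(src))
    print(parse_log_path(str(log_path(src))))
```

Keeping the namespace, application, and pod in the path means the agent needs no extra lookup to label each log line before sending it on.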

Key challenges:

Dynamic update of Filebeat configuration for newly deployed applications.

Ensuring each log file is properly rotated.

Extending Filebeat with custom features if needed.

Practical implementation

Deploy a log‑agent DaemonSet on every node. The agent uses fsnotify to watch the log directories and regenerates the Filebeat configuration from templates when new applications are deployed. It uses robfig/cron to schedule logrotate jobs, taking each log file's ownership into account. For extending Filebeat, see the Filebeat source listed in the references.
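The template step could look roughly like the sketch below. It is not the original agent, which is built on the Go libraries named above; it only shows rendering one Filebeat input per application and writing it atomically into a configuration directory. The template contents, the file naming scheme, and the helper names are assumptions.

```python
import os
import tempfile
from string import Template

# One Filebeat input per application; placeholders are filled at deploy time.
INPUT_TEMPLATE = Template(
    "- type: log\n"
    "  paths:\n"
    "    - ${log_root}/${namespace}/${app}/*/log/*.log\n"
    "  fields:\n"
    "    namespace: ${namespace}\n"
    "    app: ${app}\n"
)


def render_input(namespace: str, app: str, log_root: str = "/data/logs") -> str:
    """Render the Filebeat input definition for one application."""
    return INPUT_TEMPLATE.substitute(log_root=log_root, namespace=namespace, app=app)


def write_input(config_dir: str, namespace: str, app: str) -> str:
    """Write the rendered input to <config_dir>/<namespace>_<app>.yml atomically."""
    target = os.path.join(config_dir, f"{namespace}_{app}.yml")
    # Write to a temporary file first so a reader never sees a partial config.
    fd, tmp = tempfile.mkstemp(dir=config_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        fh.write(render_input(namespace, app))
    os.replace(tmp, target)  # atomic rename within the same directory
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as config_dir:
        path = write_input(config_dir, "prod", "web")
        with open(path) as fh:
            print(fh.read())
```

If Filebeat's filebeat.config.inputs setting points at this directory with reloading enabled, newly written files should be picked up without restarting the agent. The .tmp suffix keeps half-written files out of a *.yml glob.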

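/var/log/xxxx/xxxxx.log {
    # Rotate as the user and group that own the log file.
    su www-data www-data
    # Do not report an error if the log file is missing.
    missingok
    # Skip rotation when the file is empty.
    notifempty
    # Rotate only once the file has grown beyond 1 GB.
    size 1G
    # Copy the log, then truncate it in place so the writer keeps its file handle.
    copytruncate
}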
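Since every application writes its own files, the agent has to produce one such stanza per log file, with the right owner. A minimal sketch under those assumptions: render_stanza fills in the same directives as the example above, and owner_of looks up the file's owner on the host. Both helper names are hypothetical, and the scheduling itself (robfig/cron in the original) is left out.

```python
import grp
import os
import pwd
from typing import Tuple

# Same directives as the stanza above; doubled braces are literal braces.
STANZA = """\
{path} {{
    su {user} {group}
    missingok
    notifempty
    size {size}
    copytruncate
}}
"""


def owner_of(path: str) -> Tuple[str, str]:
    """Return the (user, group) names owning path.

    Raises KeyError if the uid or gid has no entry on this host.
    """
    st = os.stat(path)
    return pwd.getpwuid(st.st_uid).pw_name, grp.getgrgid(st.st_gid).gr_name


def render_stanza(path: str, user: str, group: str, size: str = "1G") -> str:
    """Render a logrotate stanza for a single log file."""
    return STANZA.format(path=path, user=user, group=group, size=size)


if __name__ == "__main__":
    print(render_stanza("/data/logs/prod/web/web-0/log/app.log", "www-data", "www-data"))
```

Because the size directive is only checked when logrotate runs, the schedule interval determines how far a file can grow past the threshold before it is rotated.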

Conclusion

The article presents a simple approach to Kubernetes log collection; implementations should be adapted to specific company requirements.

References

https://kubernetes.io/docs/concepts/cluster-administration/logging/

https://support.rackspace.com/how-to/understanding-logrotate-utility/

https://github.com/elastic/beats/tree/master/filebeat

http://skarnet.org/software/s6/
