How to Build a Scalable Container Log Collection System with S6 and Filebeat
This article explains Docker and Kubernetes logging challenges, compares logging drivers, introduces S6‑based container logging, and presents a node‑level log‑agent architecture using Filebeat, Logrotate, Kafka, and Elasticsearch to achieve reliable, auto‑rotating log collection in production environments.
Preparation
About container logs
Docker logs are divided into engine logs and container logs. Engine logs are handled by the system logger, while container logs are the output of applications running inside containers. By default, docker logs shows the STDOUT and STDERR of a running container. With the default json-file driver, this output is stored on the host at /var/lib/docker/containers/<container_id>/<container_id>-json.log. This default setup is unsuitable for production for several reasons.
Without size limits, container logs can fill the disk, and the Docker daemon becomes a bottleneck when collecting large volumes.
Using docker logs -f can block the daemon, causing commands like docker ps to become unresponsive.
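The disk-filling problem can be reduced, though not the daemon bottleneck, by capping the json-file driver with the max-size and max-file log options. The following is a minimal Python sketch that generates such a daemon configuration. The function name and the default values of 100m and 3 files are illustrative assumptions, not settings from the original setup.

```python
import json


def daemon_log_config(max_size: str = "100m", max_file: int = 3) -> str:
    """Return daemon.json content that caps json-file logs per container."""
    config = {
        "log-driver": "json-file",
        "log-opts": {
            # Rotate a container's log once the current file reaches this size.
            "max-size": max_size,
            # Number of rotated files to keep; log-opts values must be strings.
            "max-file": str(max_file),
        },
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Illustrative values only; tune them to the node's disk budget.
    print(daemon_log_config())
```

On Linux the output would typically be saved as /etc/docker/daemon.json, and it only applies to containers created after the daemon restarts.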
Docker provides logging drivers that can be configured, but they still rely on the daemon, so collection speed remains a bottleneck.
log-driver    collection speed
syslog        14.9 MB/s
json-file     37.9 MB/s
To avoid daemon‑based collection, an S6‑based image can redirect container output directly to files with automatic rotation.
About Kubernetes logs
Kubernetes logging is organized into three levels:
Application (Pod) level – logs are written to STDOUT/STDERR and can be viewed with kubectl logs pod-name -n namespace.
Node level – managed by configuring a log‑driver and using tools like logrotate.
Cluster level – typically implemented with a node‑agent (DaemonSet) that runs on each node.
Cluster-level collection approaches include:
Node agent: a DaemonSet that collects logs from the host. It uses few resources, but it requires all container logs to be written to STDOUT.
Sidecar containers can also act as log agents:
Streaming sidecar: forwards container stdout/stderr, creating duplicate log files on the host.
Log‑collector sidecar (e.g., Logstash, Filebeat): runs a dedicated agent inside the pod, consuming more resources and hiding logs from kubectl logs.
Directly pushing logs from the application to a backend service is another simple option.
Log Architecture
A unified log collection system can use a node‑agent that runs as a DaemonSet. The flow is:
All application containers are built from an S6 base image, redirecting logs to host files such as /data/logs/namespace/appname/podname/log/xxxx.log.
The log‑agent contains Filebeat and Logrotate; Filebeat harvests the log files.
Filebeat forwards logs to Kafka.
Logstash consumes the messages from Kafka and creates the indices in Elasticsearch.
Kibana provides search and visualization on top of Elasticsearch.
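Because the host path encodes the namespace, application, and pod, an agent can recover that metadata from the file name alone. The sketch below assumes the /data/logs/namespace/appname/podname/log/ layout described in the flow. The LogSource type, the parse_log_path helper, and the sample path are hypothetical names introduced only for illustration.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

# Root of the host log tree, taken from the example path in the flow above.
LOG_ROOT = PurePosixPath("/data/logs")


class LogSource(NamedTuple):
    namespace: str
    app: str
    pod: str
    filename: str


def parse_log_path(path: str) -> Optional[LogSource]:
    """Split <root>/<namespace>/<app>/<pod>/log/<file> into its parts."""
    try:
        parts = PurePosixPath(path).relative_to(LOG_ROOT).parts
    except ValueError:
        # The path is not under the log root at all.
        return None
    # Expect exactly: namespace, app, pod, the literal "log" directory, file.
    if len(parts) != 5 or parts[3] != "log":
        return None
    namespace, app, pod, _, filename = parts
    return LogSource(namespace, app, pod, filename)


if __name__ == "__main__":
    # Hypothetical path used only to demonstrate the parsing.
    print(parse_log_path("/data/logs/prod/web/web-7d9f/log/access.log"))
```

Paths that do not match the layout return None, so the agent can skip unrelated files instead of guessing at their metadata.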
Key challenges:
Dynamically updating Filebeat configuration for newly deployed applications.
Ensuring each log file is rotated properly.
Extending Filebeat with custom features when needed.
Practical Implementation
Deploy a log-agent as a DaemonSet on every cluster node. Inside the agent, run Filebeat, Logrotate, and any custom components.
To update Filebeat configs dynamically, watch the log directory with github.com/fsnotify/fsnotify and render new configuration files from templates.
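The rendering half of that step can be sketched as follows. This example is in Python with the standard library's string.Template rather than Go, and it leaves out the directory watching that fsnotify provides. The template contents, the field names, and the render_input helper are assumptions for illustration, not the original agent's template.

```python
from string import Template

# Hypothetical template for one Filebeat "log" input per application.
INPUT_TEMPLATE = Template("""\
- type: log
  paths:
    - /data/logs/$namespace/$app/*/log/*.log
  fields:
    namespace: $namespace
    app: $app
  fields_under_root: true
""")


def render_input(namespace: str, app: str) -> str:
    """Render the Filebeat input definition for one application."""
    # substitute() raises KeyError if a placeholder is left unfilled,
    # which avoids writing a half-rendered config file.
    return INPUT_TEMPLATE.substitute(namespace=namespace, app=app)


if __name__ == "__main__":
    # Hypothetical namespace and application names.
    print(render_input("prod", "web"))
```

One way to have Filebeat pick up such files without a restart is its external input configuration (filebeat.config.inputs with reload enabled); whether the original agent used that mechanism is not stated in the source.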
To rotate logs regularly, schedule a cron job inside the agent with github.com/robfig/cron that runs Logrotate with appropriate user permissions. An example Logrotate configuration:
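/var/log/xxxx/xxxxx.log {
    # Rotate as this user and group so file ownership is preserved.
    su www-data www-data
    # Do not report an error if the log file is missing.
    missingok
    # Skip rotation when the log file is empty.
    notifempty
    # Rotate only once the file grows beyond 1 GB.
    size 1G
    # Copy the log, then truncate the original in place, so the application keeps its open file handle.
    copytruncate
}
For custom Filebeat development, refer to community blogs and the official Filebeat repository.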
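To make the copytruncate directive in the Logrotate configuration above more concrete, here is a minimal Python sketch of its semantics combined with the size, missingok, and notifempty checks. It is an illustration of the idea, not Logrotate's implementation, and the copy_truncate function and the ".1" suffix are assumptions.

```python
import shutil
from pathlib import Path


def copy_truncate(log_path: Path, max_bytes: int) -> bool:
    """Rotate log_path in place if it is larger than max_bytes.

    Returns True when a rotation happened, False otherwise.
    """
    # A missing file is not an error (missingok); a file at or below the
    # threshold, including an empty one (notifempty), is left alone.
    if not log_path.exists() or log_path.stat().st_size <= max_bytes:
        return False
    rotated = log_path.with_name(log_path.name + ".1")
    # Copy the current contents to the rotated file.
    shutil.copy2(log_path, rotated)
    # Truncate the original in place so the writer's open handle stays valid.
    with open(log_path, "r+b") as f:
        f.truncate(0)
    return True
```

The trade-off of this approach is that lines written between the copy and the truncate can be lost, which is the price of not having to signal the application to reopen its log file.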
Conclusion
The article provides a simple blueprint for Kubernetes log collection; actual implementations should be adapted to specific company requirements.
References
https://kubernetes.io/docs/concepts/cluster-administration/logging/
https://support.rackspace.com/how-to/understanding-logrotate-utility/
https://github.com/elastic/beats/tree/master/filebeat
http://skarnet.org/software/s6/
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.
