How to Build a Scalable Container Log Collection System with S6 and Filebeat
This article explains Docker and Kubernetes container logging fundamentals, highlights the limitations of default json-file logging, and presents a unified log-collection architecture using S6-based images, Filebeat, logrotate, Kafka, and Elasticsearch, with practical steps for dynamic configuration and log rotation in a Kubernetes cluster.
Preparation
About container logs
Docker logs are divided into engine logs and container logs. Engine logs are handled by the system logger, while container logs are the output of applications running inside the container.
Container logs can be viewed with docker logs, which shows STDOUT and STDERR. By default they are stored as JSON files at /var/lib/docker/containers/<container_id>/<container_id>-json.log, an arrangement that is poorly suited to production:
Without size limits, log files can grow indefinitely and fill the disk (the json-file log driver supports rotation through its max-size and max-file options, but no size limit is set by default).
Heavy log volume can make the Docker daemon a bottleneck for log collection.
Using docker logs -f to follow logs can block the daemon, causing commands like docker ps to become unresponsive.
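To make the default format concrete: each line of a json-file log is a single JSON object with log, stream, and time keys. The following is a minimal Python sketch of reading such lines. The sample lines are invented for illustration, and the helper name parse_json_file_line is hypothetical.

```python
import json


def parse_json_file_line(line: str) -> dict:
    """Parse one line of a json-file container log into its three fields."""
    record = json.loads(line)
    return {
        "stream": record["stream"],
        "time": record["time"],
        # The raw message normally carries its own trailing newline.
        "message": record["log"].rstrip("\n"),
    }


# Invented sample lines in the json-file layout (not taken from a real host).
SAMPLE = [
    '{"log":"server started\\n","stream":"stdout","time":"2024-01-01T00:00:00.000000000Z"}',
    '{"log":"disk almost full\\n","stream":"stderr","time":"2024-01-01T00:00:01.000000000Z"}',
]

if __name__ == "__main__":
    for raw in SAMPLE:
        entry = parse_json_file_line(raw)
        print(entry["time"], entry["stream"], entry["message"])
```

Every application line is wrapped this way by the daemon, which is part of why heavy log volume turns the daemon into the bottleneck described above.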
Docker provides logging drivers that can be configured, but they still rely on the daemon, so collection speed remains a bottleneck.
Log-driver throughput: syslog 14.9 MB/s; json-file 37.9 MB/s.
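If the json-file driver is kept, its rotation options can at least cap disk usage. The sketch below generates a daemon.json fragment using the max-size and max-file options of the json-file driver. The values 100m and 3 are illustrative assumptions, not recommendations from the article, and the function name is hypothetical.

```python
import json


def json_file_rotation_config(max_size: str = "100m", max_file: int = 3) -> str:
    """Return daemon.json text that caps json-file logs per container."""
    config = {
        "log-driver": "json-file",
        "log-opts": {
            "max-size": max_size,
            # Docker expects log-opts values as strings, including numbers.
            "max-file": str(max_file),
        },
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(json_file_rotation_config())
```

A change to daemon.json applies only to containers created after the daemon is restarted. It limits disk usage but does not remove the daemon from the logging path, so the throughput concern remains.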
To avoid daemon‑based collection, an S6‑based image can redirect logs directly to files with automatic rotation.
s6-log redirects the CMD's stdout to a file (e.g., /…/default/current) instead of sending it to the Docker daemon, eliminating the daemon's performance bottleneck.
About k8s logs
Kubernetes log collection can be divided into three levels: application (Pod) level, node level, and cluster level.
Application (Pod) level – logs are written to stdout/stderr and can be viewed with kubectl logs pod-name -n namespace.
Node level – managed by configuring the container’s log‑driver, often combined with logrotate.
Cluster level – implemented via node agents, sidecar containers, or direct log shipping from the application.
Node‑level agents (DaemonSet) run on each node, consuming few resources and requiring only standard output from containers.
Sidecar containers can also act as log agents. In one approach, the sidecar streams the application's log files to its own stdout so that the node agent can pick them up; the same log data then exists twice on the host. In another, a full log-collection agent (e.g., Logstash or Fluentd) runs inside each pod, which consumes more resources and hides the logs from kubectl logs.
Directly pushing logs from the application to a backend storage service is also possible.
Log Architecture
A unified log-collection system can use a node-agent approach. All application containers are built from an S6-based image, redirecting logs to host directories such as /data/logs/namespace/appname/podname/log/xxxx.log. The log-agent includes Filebeat and logrotate; Filebeat forwards logs to Kafka, which then feeds Elasticsearch/Kibana via Logstash.
Application containers write logs to host files.
Log‑agent (Filebeat, logrotate) collects the files.
Filebeat sends logs to Kafka.
Kafka buffers the logs on their way to Elasticsearch (storage) and Kibana (search).
Logstash consumes the Kafka messages and creates the Elasticsearch indices.
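Because the host path encodes namespace, application, and pod, an agent can recover that metadata from the path alone. The following Python sketch assumes the /data/logs/namespace/appname/podname/log/xxxx.log layout shown above. The names LogSource and parse_log_path, and the sample path used in the usage note, are hypothetical.

```python
from pathlib import PurePosixPath
from typing import NamedTuple, Optional

LOG_ROOT = PurePosixPath("/data/logs")  # root used in the article's example path


class LogSource(NamedTuple):
    namespace: str
    app: str
    pod: str
    filename: str


def parse_log_path(path: str) -> Optional[LogSource]:
    """Split <root>/<namespace>/<app>/<pod>/log/<file> into its parts.

    Returns None when the path does not follow that layout.
    """
    try:
        parts = PurePosixPath(path).relative_to(LOG_ROOT).parts
    except ValueError:
        # The path is not under LOG_ROOT at all.
        return None
    if len(parts) != 5 or parts[3] != "log":
        return None
    namespace, app, pod, _, filename = parts
    return LogSource(namespace, app, pod, filename)
```

For example, parse_log_path("/data/logs/prod/shop/shop-7d9f/log/access.log") yields the namespace prod, the app shop, and the pod shop-7d9f, which could then be attached to each event as fields before it is sent to Kafka.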
Key challenges include dynamically updating Filebeat configuration for new applications, ensuring proper log rotation, and extending Filebeat with custom features.
Practical Implementation
Deploy a log-agent DaemonSet so that one agent pod, containing Filebeat, logrotate, and custom components, runs on each Kubernetes node.
To update Filebeat configuration dynamically, watch the log directory with github.com/fsnotify/fsnotify and render templates for the config files.
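The watch-and-render idea can be sketched without the Go library. The Python stand-in below polls the log root instead of receiving fsnotify events and writes one Filebeat input file per newly discovered application. The template contents, the file naming scheme, and all function names are assumptions, since the article does not show its actual template.

```python
import os
from string import Template

# Assumed shape of one Filebeat input; the article does not show its template.
INPUT_TEMPLATE = Template(
    "- type: log\n"
    "  paths:\n"
    "    - ${log_glob}\n"
    "  fields:\n"
    "    namespace: ${namespace}\n"
    "    app: ${app}\n"
)


def discover_apps(log_root: str) -> set:
    """Return (namespace, app) pairs found as <log_root>/<namespace>/<app>/."""
    found = set()
    for ns in os.scandir(log_root):
        if not ns.is_dir():
            continue
        for app in os.scandir(ns.path):
            if app.is_dir():
                found.add((ns.name, app.name))
    return found


def render_input(log_root: str, namespace: str, app: str) -> str:
    """Render the input block covering every pod of one application."""
    log_glob = os.path.join(log_root, namespace, app, "*", "log", "*.log")
    return INPUT_TEMPLATE.substitute(log_glob=log_glob, namespace=namespace, app=app)


def sync_inputs(log_root: str, inputs_dir: str, known: set) -> set:
    """Write a config file for each app not seen before; return the new set."""
    current = discover_apps(log_root)
    os.makedirs(inputs_dir, exist_ok=True)
    for namespace, app in sorted(current - known):
        target = os.path.join(inputs_dir, f"{namespace}_{app}.yml")
        with open(target, "w", encoding="utf-8") as fh:
            fh.write(render_input(log_root, namespace, app))
    return current
```

Calling sync_inputs in a loop, passing back the set it returned, approximates what an event-driven watcher would do. If the generated directory is referenced from filebeat.config.inputs with reload.enabled set to true, Filebeat can pick up new files without a restart. Depending on the Filebeat version, the input type may need to be filestream rather than log.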
For periodic log rotation, use github.com/robfig/cron to schedule a cron job that runs logrotate with appropriate permissions.
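As a rough illustration of the scheduling step, the sketch below runs logrotate at a fixed interval. It is a Python stand-in for the Go cron library and supports only a fixed interval, not cron expressions. The function names and the one-hour default are assumptions; the -s flag is logrotate's option for naming its state file.

```python
import subprocess
import time


def logrotate_command(config_path: str, state_path: str) -> list:
    """Build the logrotate invocation; -s names the state file it updates."""
    return ["logrotate", "-s", state_path, config_path]


def seconds_until_next_run(now: float, interval: int) -> float:
    """Seconds from `now` to the next multiple of `interval` (epoch-aligned)."""
    return interval - (now % interval)


def run_forever(config_path: str, state_path: str, interval: int = 3600) -> None:
    """Run logrotate once per interval until the process is stopped."""
    while True:
        time.sleep(seconds_until_next_run(time.time(), interval))
        # check=False: one failed rotation should not stop the scheduler.
        subprocess.run(logrotate_command(config_path, state_path), check=False)
```

Keeping the state file on a host-mounted volume lets rotation history survive restarts of the agent pod. The agent process also needs write access to the log directories, which is the permissions concern mentioned above.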
<code>/var/log/xxxx/xxxxx.log {
    su www-data www-data
    missingok
    notifempty
    size 1G
    copytruncate
}</code>
For extending Filebeat, refer to community guides and source code.
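The copytruncate directive in the logrotate stanza above matters because the application keeps its log file open: the file is copied aside and then emptied in place, so the writer never has to reopen it. The following Python sketch imitates that behavior on a temporary file. It illustrates the idea and is not logrotate's implementation; the function names are hypothetical.

```python
import os
import shutil


def copytruncate(log_path: str, rotated_path: str) -> None:
    """Copy the log aside, then empty the original without replacing it."""
    shutil.copyfile(log_path, rotated_path)
    os.truncate(log_path, 0)


def demo(workdir: str) -> tuple:
    """Show that a writer's open handle keeps working after rotation."""
    log_path = os.path.join(workdir, "app.log")
    rotated_path = log_path + ".1"
    # Append mode makes every write land at the current end of the file.
    with open(log_path, "a", encoding="utf-8") as writer:
        writer.write("before rotation\n")
        writer.flush()
        copytruncate(log_path, rotated_path)
        writer.write("after rotation\n")
        writer.flush()
    with open(rotated_path, encoding="utf-8") as fh:
        rotated = fh.read()
    with open(log_path, encoding="utf-8") as fh:
        live = fh.read()
    return rotated, live
```

The trade-off is a small window between the copy and the truncate in which newly written lines can be lost, which the logrotate documentation also notes.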
Conclusion
This article provides a simple approach to Kubernetes log collection; implementations can be adapted to specific company requirements.