
Mastering Container Log Collection in Kubernetes: Strategies and Best Practices

This article explains how container log collection in Kubernetes differs from traditional host logging, outlines common deployment methods such as DaemonSet and Sidecar, compares log storage options, and offers practical guidance on handling stdout and file‑based logs for reliable operations.


1. From Host to Container

In traditional VM or physical-host environments, applications write logs directly to the host, and operators simply deploy a log-collection agent on each node. Kubernetes is more complex: pods are dynamically created, destroyed, and migrated; logs may land in stdout, hostPath, emptyDir, PV, and other locations; and collected logs must carry Kubernetes metadata such as namespace, pod, container, node, labels, and environment variables.

Dynamic migration: Pods frequently move, so static per-service agent configuration is impractical.

Log storage diversity: Containers may log to stdout, hostPath, emptyDir, PV, and other volumes.

Kubernetes metadata: Collected logs need injected metadata (namespace, pod, container, node, labels, env) for effective querying.

These requirements stem from the fact that traditional log collection does not integrate with Kubernetes and cannot perceive its runtime context.

2.1 Types of Logs

Cloud‑native best practices recommend outputting logs to stdout, but many workloads still write to files for reasons such as difficulty changing business‑side logging configurations or the need to separate audit, access, and other log categories.

2.2 Agent Deployment Methods

Two common ways to run a log‑collection agent in Kubernetes are:

DaemonSet – one agent per node.

Sidecar – an additional container in each pod runs the agent.
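As an illustration of the Sidecar pattern, the pod below runs the business container and an agent container side by side, sharing an emptyDir log volume. This is a minimal sketch; the image names and mount paths are hypothetical placeholders, not taken from any particular agent.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
  - name: app
    image: example/app:latest          # business container (placeholder image)
    volumeMounts:
    - name: logs
      mountPath: /app/logs             # the app writes log files here
  - name: log-agent                    # sidecar agent container
    image: example/log-agent:latest    # placeholder agent image
    volumeMounts:
    - name: logs
      mountPath: /app/logs             # agent reads the same files
      readOnly: true
  volumes:
  - name: logs
    emptyDir: {}                       # shared volume; lives and dies with the pod
```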

The trade‑offs are:

Resource usage: DaemonSet consumes one agent per node, while Sidecar adds an agent per pod, which can be costly on nodes with many pods.

Intrusiveness: Sidecar injects an agent into business pods, altering deployment semantics.

Stability: Sidecar failures (e.g., OOM) can impact the business container; many agents also increase connection load on downstream systems like Kafka.

Isolation: Sidecar isolates logs to its pod, whereas DaemonSet aggregates logs from all pods on the node.

Performance: Sidecar processes fewer logs, reducing the chance of hitting agent performance limits.

Tip: In most cases prefer DaemonSet; use Sidecar only for pods with exceptionally high log volume.
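Following the tip above, the DaemonSet variant can be sketched as one agent pod per node that mounts the node's log directory read-only. The image name and namespace are placeholders; a real agent would also need its own configuration for outputs and parsing.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-agent
  namespace: logging                   # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: log-agent
  template:
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
      - name: agent
        image: example/log-agent:latest   # placeholder agent image
        resources:
          limits:
            memory: 200Mi                 # cap the agent so it cannot starve workloads
        volumeMounts:
        - name: varlogpods
          mountPath: /var/log/pods        # node-level stdout log directory
          readOnly: true
      volumes:
      - name: varlogpods
        hostPath:
          path: /var/log/pods
```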

2.3 Collection Approaches

DaemonSet + Stdout

When Docker is the container runtime, each container's stdout is stored in /var/lib/docker/containers/{containerId}/{containerId}-json.log. Prior to Kubernetes 1.14, kubelet created symlinks under /var/log/pods pointing to these files. Example directory tree (pre-1.14):

<code>root@master0:/var/log/pods# tree .
|-- 6687e53201c01e3fad31e7d72fbb92a6
|   `-- kube-apiserver
|       |-- 865.log -> /var/lib/docker/containers/3a35ae0a1d0b26455fbd9b267cd9d6ac3fbd3f0b12ee03b4b22b80dc5a1cde03/3a35ae0a1d0b26455fbd9b267cd9d6ac3fbd3f0b12ee03b4b22b80dc5a1cde03-json.log
|       `-- 866.log -> /var/lib/docker/containers/15a6924f14fcbf15dd37d1c185c5b95154fa2c5f3de9513204b1066bbe474662/15a6924f14fcbf15dd37d1c185c5b95154fa2c5f3de9513204b1066bbe474662-json.log
|-- a1083c6d-3b12-11ea-9af1-fa163e28f309
|   `-- kube-proxy
|       |-- 3.log -> /var/lib/docker/containers/4b63b5a90a8f9ca6b6f20b49b5ab2564f92df21a5590f46de2a46b031e55c80e/4b63b5a90a8f9ca6b6f20b49b5ab2564f92df21a5590f46de2a46b031e55c80e-json.log
|       `-- 4.log -> /var/lib/docker/containers/fc7c315d33935887ca3479a38cfca4cca66fad782b8a120c548ad0b9f0ff7207/fc7c315d33935887ca3479a38cfca4cca66fad782b8a120c548ad0b9f0ff7207-json.log</code>

Since Kubernetes 1.14 the path changed to /var/log/pods/<namespace>_<pod_name>_<pod_id>/<container_name>/<num>.log. Example (post-1.14):

<code>root@master-0:/var/log/pods# tree .
|-- kube-system_kube-apiserver-kind-control-plane_bd1c21fe1f0ef615e0b5e41299f1be61
|   `-- kube-apiserver
|       `-- 0.log
|-- kube-system_kube-proxy-gcrfq_f07260b8-6055-4c19-9491-4a825579528f
|   `-- kube-proxy
|       `-- 0.log
|-- loggie_loggie-csd4g_f1cc32e9-1002-4e64-bd58-fc6094394e06
|    `-- loggie
|        `-- 0.log</code>

A simple approach is to deploy a DaemonSet agent, mount /var/log/pods, and configure the agent to watch a glob pattern such as /var/log/pods/*/*/*.log to collect all container stdout logs.

Tip: This was my initial solution, and I’m still a bit embarrassed about it!
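For the agents mentioned later in this article, a minimal Filebeat configuration along these lines watches the per-container log files and injects the Kubernetes metadata discussed in section 1. It reads /var/log/containers/*.log, where kubelet keeps flat symlinks to the /var/log/pods files; NODE_NAME is assumed to be injected via the Downward API. This is a sketch, not a complete deployment.

```yaml
filebeat.inputs:
- type: container                      # parses the container runtime's log format
  paths:
  - /var/log/containers/*.log          # symlinks into /var/log/pods
processors:
- add_kubernetes_metadata:             # enriches events with namespace/pod/labels
    host: ${NODE_NAME}                 # assumed to come from the Downward API
    matchers:
    - logs_path:
        logs_path: "/var/log/containers/"
```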

DaemonSet + Log Files

If pods write log files instead of stdout, the agent must also access those files. Common volume types for exposing log files to the node are:

emptyDir: Lives with the pod; logs disappear when the pod is deleted. Path example: /var/lib/kubelet/pods/${pod.UID}/volumes/kubernetes.io~empty-dir/${volumeName}.

hostPath: Persists beyond pod lifetime; requires careful path management to avoid disk exhaustion. Use subPathExpr (enabled by default from Kubernetes 1.15) to create per-pod sub-directories.

PersistentVolume (PV): Supports ReadWriteOnce, ReadOnlyMany, ReadWriteMany. Suitable for stateful services (StatefulSet) or shared logs across deployments, but adds operational complexity and may lack full agent support.
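The hostPath-with-subPathExpr option above can be sketched as follows: the pod's name is expanded into the mount's sub-path, so each pod writes under its own directory on the node. The image name and host path are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: example/app:latest        # placeholder image
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name   # Downward API: this pod's name
    volumeMounts:
    - name: logs
      mountPath: /app/logs           # the app writes log files here
      subPathExpr: $(POD_NAME)       # expands to a per-pod sub-directory
  volumes:
  - name: logs
    hostPath:
      path: /data/logs               # hypothetical node-level log root
      type: DirectoryOrCreate
```

On the node, this pod's files end up under /data/logs/app, which gives the DaemonSet agent stable, per-pod paths to collect from.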

Regardless of the volume type, many agents still rely on a simple “mount‑and‑glob” strategy similar to the stdout case, inheriting the same limitations.

Some advanced agents (e.g., Filebeat, Fluentd) can inject namespace/pod metadata and handle various storage back‑ends, yet they do not fully resolve all challenges.

In summary, a DaemonSet collecting stdout is simple and works for many scenarios, but it cannot inject rich metadata, handle per‑service custom parsing, or avoid unnecessary log volume. File‑based collection via emptyDir, hostPath, or PV adds flexibility but also complexity and potential resource concerns.

Tags: operations, Kubernetes, log collection, Sidecar, DaemonSet, Container Logging
Written by Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
