
Mastering Kubernetes Logging: Practical Tips for Levels, Formats, and Performance

This article provides a hands‑on guide to building a reliable Kubernetes logging system, covering log level selection, content standards, output formats, volume control, multiple targets, performance impact, library choices, storage options, and long‑term retention strategies.

Alibaba Cloud Native

Mar 3, 2020

The author shares years of experience building a logging system for Kubernetes, aiming to help readers avoid common pitfalls and establish a practical, standardized logging pipeline.

1. How to Choose Log Levels

Kubernetes applications should use six standard log levels, each indicating severity:

FATAL – critical errors requiring immediate human intervention.

ERROR – unexpected errors that may affect parts of the system but not core functionality.

WARN – potentially dangerous conditions worth attention.

INFO – detailed execution flow for each request.

DEBUG – verbose debugging information; should be disabled in production.

TRACE – the most granular trace data, often including payloads.

In practice: reserve FATAL for unrecoverable errors; treat ERROR as alert-worthy, while WARN need not trigger alerts; run production at INFO or WARN; enable DEBUG only temporarily for troubleshooting; and choose a logging library that can change levels at runtime.
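
As a minimal sketch of runtime level switching, the snippet below uses Python's standard logging module. The logger name `orders`, the helper `set_level`, and the numeric value chosen for TRACE are assumptions for illustration; Python's standard library treats FATAL as an alias of CRITICAL and has no TRACE level of its own.

```python
import logging

# The stdlib has no TRACE level; register one below DEBUG (10).
# The value 5 is an assumption, not a standard.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")

logger = logging.getLogger("orders")  # hypothetical module name


def set_level(name: str) -> None:
    """Change the logger's threshold at runtime (e.g. from an admin endpoint)."""
    level = logging.getLevelName(name.upper())
    if not isinstance(level, int):
        # getLevelName returns a string such as "Level FOO" for unknown names
        raise ValueError(f"unknown log level: {name}")
    logger.setLevel(level)


set_level("info")   # normal production setting
set_level("debug")  # temporarily, while troubleshooting
set_level("info")   # back to normal once the issue is understood
```

How `set_level` gets triggered (a signal, an HTTP endpoint, a watched config file) depends on the platform; the point is only that no restart is needed.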

2. Log Content Standards

Every log entry should contain at least Time, Level, and Location. Additional fields depend on the module or business context, such as:

TraceID when using distributed tracing.

Business identifiers like order ID or user ID.

HTTP request details: URL, Method, Status, Latency, InFlow, OutFlow, ClientIP, UserAgent.

Module name when multiple components share the same log stream.

These conventions should be enforced by the operations platform to ensure uniformity.
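
One way to guarantee the mandatory fields is to put them in the format string and attach context fields once per request. The sketch below assumes Python's standard logging module; the logger name `checkout` and the `trace_id` and `order_id` values are hypothetical.

```python
import io
import logging

# Time, Level, and Location (file:line) come from standard record attributes.
# trace_id and order_id are hypothetical context fields supplied by the caller.
FORMAT = (
    "%(asctime)s %(levelname)s %(filename)s:%(lineno)d "
    "trace_id=%(trace_id)s order_id=%(order_id)s %(message)s"
)

stream = io.StringIO()  # stands in for a log file or stdout
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(FORMAT))

base = logging.getLogger("checkout")
base.setLevel(logging.INFO)
base.propagate = False
base.addHandler(handler)

# LoggerAdapter attaches the same context to every entry it emits.
log = logging.LoggerAdapter(base, {"trace_id": "abc123", "order_id": "42"})
log.info("payment accepted")

print(stream.getvalue(), end="")
```

Because the format string references `trace_id` and `order_id`, every record sent to this handler must carry them, which is why the adapter is used rather than the bare logger.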

3. Log Representation

Key‑Value pairs are recommended for easy parsing, e.g.:

[2019-12-30 21:45:30.611992]    [WARNING]    [958] [block_writer.cpp:671] path:pangu:/localcluster/index/3/prom/7/1577711464522767696_0_1577711517    min_time:1577712000000000    max_time:1577715600000000    normal_count:27595    config:prom    start_line:57315569    end_line:57343195    latency(ms):42    type:AddBlock

JSON is also acceptable and widely supported by log collectors:

{"addr":"tcp://0.0.0.0:10010","caller":"main.go:98","err":"listen tcp: address tcp://0.0.0.0:10010: too many colons in address","level":"error","msg":"Failed to listen","ts":"2019-03-08T10:02:47.469421Z"}

Avoid binary or protobuf formats for most scenarios.
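
For the JSON variant, a small custom formatter is usually enough. The following is a sketch with Python's standard library, not a reference implementation; the class name `JsonFormatter` and the convention of passing extra keys through a `fields` dict are assumptions made for this example.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "caller": f"{record.filename}:{record.lineno}",
            "msg": record.getMessage(),
        }
        # Keys passed via extra={"fields": {...}} (a convention assumed here)
        # become top-level keys of the JSON object.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, sort_keys=True)


stream = io.StringIO()  # stands in for a log file or stdout
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("json-demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

logger.error("Failed to listen", extra={"fields": {"addr": "tcp://0.0.0.0:10010"}})
print(stream.getvalue(), end="")
```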

4. Single‑Line Log Entries

Do not split a single logical log into multiple lines; multi‑line logs increase collection, parsing, and indexing costs.
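
Stack traces are the usual source of multi-line entries. One possible approach, sketched here with Python's standard logging module, is to escape line breaks after formatting so the trace stays inside a single line; the class name `SingleLineFormatter` is hypothetical.

```python
import io
import logging


class SingleLineFormatter(logging.Formatter):
    """Escape line breaks so a stack trace stays inside one log line."""

    def format(self, record: logging.LogRecord) -> str:
        text = super().format(record)  # includes the traceback when exc_info is set
        return text.replace("\r", "\\r").replace("\n", "\\n")


stream = io.StringIO()  # stands in for a log file or stdout
handler = logging.StreamHandler(stream)
handler.setFormatter(SingleLineFormatter("%(asctime)s %(levelname)s %(message)s"))

logger = logging.getLogger("single-line-demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

try:
    1 / 0
except ZeroDivisionError:
    logger.exception("division failed")  # traceback is appended, then escaped

print(stream.getvalue(), end="")
```

The escaped `\n` sequences can be expanded again when a person reads the entry, while the collector still sees exactly one line per event.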

5. Controlling Log Output Volume

Excessive logs waste disk space and CPU. Recommendations:

Collect request/response logs for every entry point unless a special reason excludes them.

Print error logs; if they become noisy, apply sampling.

Minimize logs inside tight loops.

Limit ingress/Nginx access logs to at most 5 MB/s (about 500 B per line, so no more than 10k lines/s) and application logs to at most 200 KB/s (about 2 KB per line, so no more than 100 lines/s).
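
Sampling noisy error logs can be done with a filter on the handler. The sketch below, using Python's standard logging module, keeps the first record and then every Nth one per call site; the class name `SampleFilter` and the rate of 1 in 100 are assumptions chosen for illustration.

```python
import io
import logging
from collections import Counter


class SampleFilter(logging.Filter):
    """Pass the first record, then every Nth one, per (file, line) call site."""

    def __init__(self, every: int) -> None:
        super().__init__()
        self.every = every
        self.seen = Counter()

    def filter(self, record: logging.LogRecord) -> bool:
        key = (record.pathname, record.lineno)
        self.seen[key] += 1
        return (self.seen[key] - 1) % self.every == 0


stream = io.StringIO()  # stands in for a log file or stdout
handler = logging.StreamHandler(stream)
handler.addFilter(SampleFilter(every=100))

logger = logging.getLogger("sampling-demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

for attempt in range(1000):
    logger.error("upstream timeout (attempt %d)", attempt)

# 10 of the 1000 records are written
print(len(stream.getvalue().splitlines()))
```

Counting per call site means a rare error elsewhere in the code is still logged on its first occurrence, even while a noisy one is being thinned out.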

6. Multiple Log Output Targets

Separate different log types into distinct files to simplify collection and monitoring:

Access logs per domain.

Error logs with dedicated alerting.

External‑system call logs for audit.

Middleware logs usually managed by a unified platform.
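
A rough illustration of separate targets with Python's standard logging module follows. The file names, logger names, and the temporary directory (standing in for a mounted log directory) are all assumptions; errors go to the general application log and are additionally copied to a dedicated file that alerting can watch.

```python
import logging
import os
import tempfile

log_dir = tempfile.mkdtemp()  # stand-in for a mounted log directory


def file_handler(name: str, level: int = logging.NOTSET) -> logging.FileHandler:
    """Build a handler that writes records at or above `level` to log_dir/name."""
    handler = logging.FileHandler(os.path.join(log_dir, name))
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    return handler


# Access logs get their own logger and file.
access = logging.getLogger("access")
access.setLevel(logging.INFO)
access.propagate = False
access.addHandler(file_handler("access.log"))

# Application logs go to app.log; ERROR and above are also written to error.log.
app = logging.getLogger("app")
app.setLevel(logging.INFO)
app.propagate = False
app.addHandler(file_handler("app.log"))
app.addHandler(file_handler("error.log", logging.ERROR))

access.info("GET /orders 200 12ms")
app.info("order created")
app.error("inventory service unreachable")
```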

7. Controlling Log Performance Overhead

Logging must not degrade business performance. Benchmark the logging library so that its CPU consumption stays below 5% of the application's total CPU usage, and make sure logging is asynchronous so that it never blocks the main workflow.
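
In Python, asynchronous logging can be sketched with the standard library's `QueueHandler` and `QueueListener`: the request path only enqueues the record, and a background thread performs the I/O. The logger name and the in-memory stream standing in for a slow sink are assumptions.

```python
import io
import logging
import logging.handlers
import queue

stream = io.StringIO()
slow_handler = logging.StreamHandler(stream)  # stand-in for a slow file or network sink

# Unbounded queue for simplicity; a bounded queue would need a policy for overflow.
log_queue = queue.Queue(-1)

# The listener drains the queue on its own thread and calls the real handler.
listener = logging.handlers.QueueListener(log_queue, slow_handler)
listener.start()

logger = logging.getLogger("async-demo")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.handlers.QueueHandler(log_queue))

logger.info("request handled")  # returns after enqueueing; no I/O on this thread

listener.stop()  # processes remaining records and joins the thread; call on shutdown
print(stream.getvalue(), end="")
```

The trade-off is that records still in the queue are lost if the process is killed abruptly, so `listener.stop()` belongs in the shutdown path.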

8. Choosing a Log Library

Popular, stable libraries per language include:

Java – Log4j, Logback.

Go – go‑kit.

Python – built-in logging module (refer to the Logging Cookbook in the official documentation).

C++ – spdlog (high‑performance, cross‑platform).

9. Log Output: File vs. Stdout

Containers typically write to stdout/stderr, which Docker captures. This works for simple system components but not for complex services with multiple layers: mixing everything into stdout makes the streams hard to separate, and handling them can consume a full CPU core at 100k log lines per second.

10. Persistence and Storage Medium

Logs can be sent directly to a centralized system without persisting to disk, which reduces latency but is suitable only for very high-volume scenarios. For most cases, write to local storage (a hostPath or emptyDir volume) so that the files buffer logs during network failures and can still be inspected directly when the log system is unavailable.

11. Ensuring Log Retention

Kubernetes creates and destroys nodes and containers dynamically, and logs stored on them disappear with them. To retain logs for DevOps, audit, or compliance, centralize collection so that logs are captured within seconds and stored independently of the originating pod's lifecycle.

In summary, adopting a unified logging specification across teams ensures that downstream collection, analysis, monitoring, and visualization can operate smoothly and reliably.
