Mastering Observability Data Collection in Kubernetes with iLogtail
This article explains the types and value of observability data, the characteristics and data‑collection requirements of Kubernetes deployments, the challenges of large‑scale log, metric and trace ingestion, and practical solutions such as DaemonSet vs Sidecar, ConfigServer, CRD automation, serverless support, and eBPF integration.
Observability Data Types and Value
Observability originated in control theory for electrical systems: a system is observable if its internal state can be inferred from its external outputs. In IT, observability data fall into three categories—logs, traces, and metrics—providing insight for fault localization and performance optimization.
As software systems become more complex and distributed, teams need observability data to understand the system as a whole, improve cross-team communication, and turn black-box components into white-box ones, especially in cloud-native environments.
IT System Observability Scenarios and Evolution
Beyond IT operations, observability data are also used in mobile app analytics, retail foot-traffic monitoring, and traffic management. With the industry's shift to Kubernetes (K8s), the following sections cover deployment characteristics and data-collection needs.
K8s Business Deployment Characteristics and Data‑Collection Requirements
Automatic Scheduling and Elastic Scaling
K8s enables declarative deployment, automatic scheduling, and elastic scaling, accelerating version iteration and increasing container churn, which raises the demand for fast, reliable data collection.
Resource Abstraction and Mixed Use
K8s abstracts heterogeneous resources, allowing Linux, Windows, GPU, and virtual nodes to be managed in a single cluster, maximizing utilization and supporting hybrid‑cloud scenarios.
Storage Abstraction and Flexible Orchestration
K8s abstracts storage, letting applications declare storage type, capacity, and I/O requirements without handling low‑level details, enabling data persistence and compute‑storage separation.
Common Challenges of Observability Data Collection in K8s
Complex Deployment and Management
High container density and rapid lifecycle changes make it difficult to deploy a single collector that can handle heterogeneous workloads and avoid data loss.
Diverse Runtime Environments
Multiple container runtimes (Docker, containerd, CRI‑O) and node types (physical, VM, virtual) require collectors to adapt to different file locations and communication mechanisms.
Large Log Volume per Node
- Mixed deployments can place dozens of containers on a single node.
- Disk I/O limits differ between local HDD/SSD and cloud disks.
- Storage expansion via PVCs removes traditional capacity bottlenecks.
- Some workloads generate more than 200 MB/s of logs on a single node.
Heterogeneous Observability Data
Logs, metrics, and traces come from various sources such as application logs, MySQL binlogs, Nginx access logs, Prometheus metrics, and SkyWalking traces, requiring a collector that supports multiple data types.
Solutions and Practices
Collection Deployment
Deployment Modes
K8s offers two common deployment modes: DaemonSet (one collector per node) and Sidecar (a collector co-located with the business container in the same Pod). DaemonSet offers low coupling and cost-effectiveness, while Sidecar provides isolation and dedicated throughput for high-volume containers.
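A minimal sketch of the DaemonSet mode, assuming the open-source iLogtail community image and illustrative mount paths (adjust the image, namespace, and host paths to your environment):

```yaml
# Hypothetical DaemonSet for a node-level iLogtail collector.
# Image name, namespace, paths, and resource limits are illustrative assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ilogtail
  namespace: logging
spec:
  selector:
    matchLabels:
      app: ilogtail
  template:
    metadata:
      labels:
        app: ilogtail
    spec:
      containers:
        - name: ilogtail
          image: sls-opensource-registry.cn-shanghai.cr.aliyuncs.com/ilogtail-community-edition/ilogtail:latest
          resources:
            limits:
              cpu: "1"
              memory: 1Gi
          volumeMounts:
            # Reach the container runtime socket for container discovery.
            - name: run
              mountPath: /var/run
              readOnly: true
            # See host and overlay filesystems to tail container log files.
            - name: host-root
              mountPath: /logtail_host
              readOnly: true
      volumes:
        - name: run
          hostPath:
            path: /var/run
        - name: host-root
          hostPath:
            path: /
```

One Pod per node keeps cost proportional to node count rather than container count, which is the cost-effectiveness advantage noted above.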
Configuration Distribution
ConfigMap-based distribution is limited in size (roughly 1 MiB per ConfigMap) and flexibility. Using a centralized ConfigServer enables graphical management and grouping, and supports thousands of configurations per node.
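For comparison, ConfigMap-based distribution might look like the sketch below. The pipeline fields follow iLogtail's open-source YAML pipeline format, but the ConfigMap name, key, and paths are assumptions:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ilogtail-user-config   # hypothetical name, mounted into the collector Pod
  namespace: logging
data:
  # One pipeline per key. Total ConfigMap size is capped at ~1 MiB,
  # one reason centralized ConfigServer management scales better.
  nginx_access.yaml: |
    enable: true
    inputs:
      - Type: input_file
        FilePaths:
          - /var/log/nginx/access.log
    flushers:
      - Type: flusher_stdout
        OnlyStdout: true
```

Every config change also requires remounting or restarting collectors, whereas a ConfigServer can push updates incrementally.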
Automation via CRD
Custom Resource Definitions (CRDs) allow declarative configuration delivery, integrating seamlessly with CI/CD pipelines without extra development effort.
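As an illustration, Alibaba Cloud's log-controller accepts an `AliyunLogConfig` custom resource similar to the sketch below; exact field names vary by version, so treat this as an assumption-laden example rather than a reference:

```yaml
# Hypothetical CRD instance: declaring a collection config alongside the app
# lets CI/CD pipelines apply it with the same kubectl/GitOps flow as Deployments.
apiVersion: log.alibabacloud.com/v1alpha1
kind: AliyunLogConfig
metadata:
  name: nginx-access
spec:
  logstore: nginx-access          # destination log store
  logtailConfig:
    inputType: file
    configName: nginx-access
    inputDetail:
      logType: common_reg_log
      logPath: /var/log/nginx     # directory to watch inside the container
      filePattern: access.log
      dockerFile: true            # collect from container filesystems
```

Because the resource is declarative, deleting the app's manifests also retires its collection config, avoiding orphaned pipelines.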
Runtime Adaptation
Container Runtime Support
iLogtail discovers containers by communicating directly with the local runtime, extracting overlay and mount information to collect files without shared volumes, and adapts to Docker, containerd, and CRI‑O differences.
Serverless Support
For virtual nodes (e.g., Alibaba Cloud ECI), iLogtail runs as a hidden sidecar (“hidecar”) alongside the business container, receiving static metadata to collect logs similarly to DaemonSet.
Scaling Collection Capacity
Single Collector Optimization
iLogtail can achieve up to 440 MB/s in minimal mode; performance can be improved by increasing resources, tuning parallelism, and ensuring sufficient network bandwidth.
Multiple Collectors
Deploying additional Sidecar collectors on a node can split high‑throughput logs across multiple instances, avoiding single‑collector bottlenecks.
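A sketch of the Sidecar mode for splitting a high-throughput log stream off to a dedicated collector; the business image name and log paths are hypothetical:

```yaml
# Hypothetical Pod pairing a high-volume app with its own iLogtail sidecar.
# The two containers share an emptyDir so the sidecar can tail files directly.
apiVersion: v1
kind: Pod
metadata:
  name: high-volume-app
spec:
  containers:
    - name: app
      image: my-registry/high-volume-app:latest   # illustrative business image
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
    - name: ilogtail-sidecar
      image: sls-opensource-registry.cn-shanghai.cr.aliyuncs.com/ilogtail-community-edition/ilogtail:latest
      volumeMounts:
        - name: app-logs
          mountPath: /var/log/app
          readOnly: true
  volumes:
    - name: app-logs
      emptyDir: {}
```

Each sidecar gets its own CPU and memory budget, so one noisy workload cannot starve collection for the rest of the node.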
Supporting Heterogeneous Data
Plugin Framework
iLogtail’s plugin architecture enables extensions for various inputs (Binlog, Syslog, eBPF) and outputs (Kafka, ClickHouse, gRPC), allowing flexible data pipelines.
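The plugin pipeline composes as an input → processor → flusher chain. The sketch below uses plugin names from the open-source iLogtail config format, though exact parameters may differ across versions:

```yaml
# Hypothetical pipeline: receive syslog, parse it, ship it to Kafka.
enable: true
inputs:
  - Type: service_syslog          # listen for syslog over the network
    Address: tcp://0.0.0.0:9999
processors:
  - Type: processor_regex         # extract fields from the raw line
    SourceKey: content
    Regex: '(\S+)\s+(\S+)\s+(.*)'
    Keys: ["time", "host", "message"]
flushers:
  - Type: flusher_kafka_v2        # send parsed events downstream
    Brokers: ["kafka-0.kafka:9092"]   # illustrative broker address
    Topic: app-logs
```

Swapping the flusher for ClickHouse or gRPC changes only the last stage, which is what makes the pipelines flexible.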
eBPF‑Based Non‑Intrusive Collection
eBPF probes capture system‑call data in the kernel, which is filtered and enriched with user‑space metadata (process, container) before being sent to the backend, providing language‑agnostic trace and metric collection.
Open‑Source Future
iLogtail is now open source, with planned enhancements in ecosystem expansion (Kafka, OTLP, ClickHouse), framework improvements, deeper eBPF protocol support, and global configuration management via operators.
Efficient Ops
This public account is maintained by Xiaotianguo and friends and regularly publishes widely read original technical articles. We focus on operations transformation and will accompany you throughout your operations career as we grow together.