A Kernel‑Embedded Lens: Cloud Monitor 2.0 Enables Full‑Stack Observability Without Code Changes
OpenTelemetry eBPF Instrumentation (OBI) embeds a kernel‑level, zero‑code probe that automatically captures network traffic, RPC, database, message‑queue and GPU operations across Go, Java, Python, Node.js and .NET, generating standard OpenTelemetry traces and metrics without modifying application code.
In cloud‑native and micro‑service environments, production systems span multiple runtimes (Go, Java, Python, Node.js, .NET) and deployment forms (containers, Kubernetes, Serverless). Traditional observability requires language‑specific agents or SDKs, which means code changes, package upgrades, and redeployments for every new service. This “instrument‑then‑modify” model is increasingly unsustainable for rapid development cycles.
Zero‑Code, Cross‑Language Observability with OBI
OpenTelemetry eBPF Instrumentation (OBI) provides a kernel‑level solution that requires no application changes. By attaching sandboxed eBPF probes to the Linux kernel, OBI intercepts network traffic, library calls, and system calls for any process, automatically producing OpenTelemetry‑compatible trace and metric data. It supports more than 15 protocols, including HTTP, gRPC, MySQL, Redis, Kafka, and CUDA, and it natively recognises GenAI providers such as OpenAI, Anthropic, Google Gemini, and Qwen.
Three Observability Pillars
Application Observability: Distributed tracing plus RED metrics (rate, errors, duration), covering web, databases, message queues, GenAI and GPU. Logs are transparently injected with trace_id / span_id for trace-log correlation.
Network Observability: L3/L4 traffic monitoring, TCP/UDP statistics, GeoIP, reverse DNS, CIDR tagging, RTT measurement, connection-failure counts, and node-level global metrics.
Log Enrichment: Language-agnostic injection of trace_id and span_id into JSON logs.
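As an illustration of the CIDR tagging listed under Network Observability, here is a minimal sketch that labels an address with the most specific matching prefix. The cidrLabel type, the tagAddr helper, and the ranges and label names are hypothetical and are not taken from OBI; only the standard net/netip package is used.

```go
package main

import (
	"fmt"
	"net/netip"
)

// cidrLabel pairs a prefix with a tag. Hypothetical type for this sketch.
type cidrLabel struct {
	prefix netip.Prefix
	label  string
}

// exampleLabels returns made-up ranges; real deployments would load their own.
func exampleLabels() []cidrLabel {
	return []cidrLabel{
		{netip.MustParsePrefix("10.0.0.0/8"), "cluster"},
		{netip.MustParsePrefix("10.1.0.0/16"), "db-subnet"},
	}
}

// tagAddr returns the label of the longest prefix that contains ip,
// or "" when no prefix matches.
func tagAddr(labels []cidrLabel, ip string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", err
	}
	best, bestBits := "", -1
	for _, l := range labels {
		if l.prefix.Contains(addr) && l.prefix.Bits() > bestBits {
			best, bestBits = l.label, l.prefix.Bits()
		}
	}
	return best, nil
}

func main() {
	for _, ip := range []string{"10.1.2.3", "10.9.9.9", "8.8.8.8"} {
		label, err := tagAddr(exampleLabels(), ip)
		fmt.Printf("%s -> %q (err=%v)\n", ip, label, err)
	}
}
```

Choosing the longest matching prefix lets a narrow range such as a database subnet override a broader cluster-wide range.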
Protocol‑Aware Detection
OBI identifies protocols without decryption or port assumptions using a three‑stage matching algorithm in ReadTCPRequestIntoSpan (pkg/ebpf/common/tcp_detect_transform.go):
Kernel-assigned protocol: Fast path using constants such as MySQL=1, Postgres=2, Kafka=4, etc.
Deterministic generic match: Sequential checks like matchSQL → matchFastCGI → matchMongo → …, with special handling for HTTP/2 vs gRPC (RFC 7540 frame validation).
Heuristic fallback: Last-resort patterns (e.g., detectHeuristicProtocol) ordered to avoid false positives (for example, HTTP/2 must be checked before MQTT).
Each protocol has a dedicated TCPTo<Protocol>ToSpan constructor that creates a request span.
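The three stages above can be sketched as a small dispatcher. This is an illustration under stated assumptions, not OBI's ReadTCPRequestIntoSpan: the matcher type, the three toy predicates, and the detect function are hypothetical. Only the constants MySQL=1, Postgres=2, and Kafka=4 come from the text.

```go
package main

import (
	"bytes"
	"fmt"
)

// Kernel-assigned protocol IDs. The three non-zero values follow the article;
// protoUnknown is an assumption of this sketch.
const (
	protoUnknown  = 0
	protoMySQL    = 1
	protoPostgres = 2
	protoKafka    = 4
)

// matcher is a hypothetical (name, predicate) pair.
type matcher struct {
	name  string
	match func(payload []byte) bool
}

// looksLikeSQL is a toy check for a leading SQL keyword.
func looksLikeSQL(p []byte) bool {
	up := bytes.ToUpper(bytes.TrimSpace(p))
	for _, kw := range []string{"SELECT ", "INSERT ", "UPDATE ", "DELETE "} {
		if bytes.HasPrefix(up, []byte(kw)) {
			return true
		}
	}
	return false
}

// looksLikeHTTP2 checks for the fixed HTTP/2 client connection preface.
func looksLikeHTTP2(p []byte) bool {
	return bytes.HasPrefix(p, []byte("PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"))
}

// looksLikeRESP is a deliberately loose check for a Redis RESP array.
func looksLikeRESP(p []byte) bool {
	return len(p) > 0 && p[0] == '*' && bytes.Contains(p, []byte("\r\n"))
}

var (
	deterministic = []matcher{{"SQL", looksLikeSQL}, {"HTTP2", looksLikeHTTP2}}
	heuristic     = []matcher{{"Redis", looksLikeRESP}}
)

// detect applies the stages in order: kernel hint, deterministic, heuristic.
func detect(kernelProto int, payload []byte) string {
	switch kernelProto { // stage 1: trust the kernel-assigned ID
	case protoMySQL:
		return "MySQL"
	case protoPostgres:
		return "Postgres"
	case protoKafka:
		return "Kafka"
	}
	for _, m := range deterministic { // stage 2: fixed-order exact checks
		if m.match(payload) {
			return m.name
		}
	}
	for _, m := range heuristic { // stage 3: loose patterns, tried last
		if m.match(payload) {
			return m.name
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(detect(protoKafka, nil))
	fmt.Println(detect(protoUnknown, []byte("select 1")))
	fmt.Println(detect(protoUnknown, []byte("*1\r\n$4\r\nPING\r\n")))
}
```

Keeping the loose patterns in a separate, final list makes the ordering constraint explicit: a strict matcher always gets the first chance to claim a payload.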
Deep Language Integration
Go: Because goroutine IDs are not stored in thread-local storage, OBI reconstructs parent-child relationships in the kernel via bpf/gotracer/go_runtime.c and go_common.h. It tracks runtime.newproc1 calls to build an LRU of live goroutines, then walks the parent chain up to six levels to associate outbound calls with the originating inbound request.
Python asyncio: OBI tracks asyncio.Task objects and their context. Four uprobe hooks (task_step, _asyncio_Task___init__, PyContext_CopyCurrent, context_run) rebuild the task hierarchy and propagate trace context across thread-pool executions. Three BPF maps (python_thread_state, python_task_state, python_context_task) store task metadata and handle versioned references to avoid stale contexts.
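The bounded parent-chain walk described for Go can be sketched in user space as follows. This is a simplified model: OBI performs the lookup in eBPF with LRU maps, whereas this sketch uses ordinary Go maps. The tracker type and its method names are hypothetical; only the six-level bound comes from the text.

```go
package main

import "fmt"

// maxParentDepth mirrors the six-level limit mentioned in the article.
const maxParentDepth = 6

// tracker is a hypothetical stand-in for the kernel-side maps.
type tracker struct {
	parent  map[uint64]uint64 // child goroutine ID -> creator goroutine ID
	request map[uint64]string // goroutine ID -> trace ID of the inbound request it serves
}

func newTracker() *tracker {
	return &tracker{parent: map[uint64]uint64{}, request: map[uint64]string{}}
}

// spawn records that goroutine parent created goroutine child.
func (t *tracker) spawn(parent, child uint64) { t.parent[child] = parent }

// serve records that goroutine g is handling an inbound request.
func (t *tracker) serve(g uint64, traceID string) { t.request[g] = traceID }

// findRequest checks g itself, then follows at most maxParentDepth parent
// links, returning the first inbound request found on the way up.
func (t *tracker) findRequest(g uint64) (string, bool) {
	cur := g
	for hop := 0; hop <= maxParentDepth; hop++ {
		if id, ok := t.request[cur]; ok {
			return id, true
		}
		p, ok := t.parent[cur]
		if !ok {
			return "", false
		}
		cur = p
	}
	return "", false
}

func main() {
	t := newTracker()
	t.serve(1, "trace-abc") // goroutine 1 handles the inbound request
	t.spawn(1, 2)           // handler starts a worker
	t.spawn(2, 3)           // worker starts a helper that makes the outbound call
	id, ok := t.findRequest(3)
	fmt.Println(id, ok)
}
```

The depth limit keeps the lookup cost bounded, at the price of losing the association for very deep goroutine chains.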
Cross‑Process Propagation for Non‑Go Languages
While intra-process propagation is language-specific, OBI implements a unified kernel-mode trace-parent injector (tpinjector) for all other runtimes. It supports three mechanisms:
HTTP/1 header injection via sk_msg (adds a Traceparent header).
HTTP/2 HPACK injection using a Huffman fingerprint (traceparent encoded in HPACK).
Custom TCP option (kind = 25) that carries trace_id/span_id in the TCP header rather than in the payload (WRITE_HDR_OPT / bpf_store_hdr_opt).
The propagation mode is controlled by the OTEL_EBPF_BPF_CONTEXT_PROPAGATION environment variable (values: headers, tcp, all, disabled).
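To make the header mechanism and the mode switch concrete, here is a user-space sketch. OBI performs the injection in the kernel; this code only illustrates the resulting bytes. The mode type and the parseMode, traceparent, and injectHeader helpers are hypothetical. The header value follows the W3C Trace Context layout (version, 32-hex trace ID, 16-hex parent ID, flags), and the four mode values are the ones listed above.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"fmt"
)

// mode records which propagation mechanisms are enabled.
type mode struct{ headers, tcp bool }

// parseMode maps an OTEL_EBPF_BPF_CONTEXT_PROPAGATION value to flags.
func parseMode(v string) (mode, error) {
	switch v {
	case "headers":
		return mode{headers: true}, nil
	case "tcp":
		return mode{tcp: true}, nil
	case "all":
		return mode{headers: true, tcp: true}, nil
	case "disabled":
		return mode{}, nil
	}
	return mode{}, fmt.Errorf("unknown propagation mode %q", v)
}

// traceparent formats a W3C traceparent value (version 00).
func traceparent(traceID [16]byte, spanID [8]byte, sampled bool) string {
	flags := "00"
	if sampled {
		flags = "01"
	}
	return "00-" + hex.EncodeToString(traceID[:]) + "-" +
		hex.EncodeToString(spanID[:]) + "-" + flags
}

// injectHeader inserts a Traceparent header right after the request line of
// an HTTP/1.x request. The input is returned unchanged if no CRLF is found.
func injectHeader(req []byte, tp string) []byte {
	i := bytes.Index(req, []byte("\r\n"))
	if i < 0 {
		return req
	}
	out := make([]byte, 0, len(req)+len(tp)+16)
	out = append(out, req[:i+2]...)
	out = append(out, "Traceparent: "+tp+"\r\n"...)
	out = append(out, req[i+2:]...)
	return out
}

func main() {
	m, err := parseMode("headers")
	if err != nil {
		fmt.Println(err)
		return
	}
	req := []byte("GET /orders HTTP/1.1\r\nHost: shop\r\n\r\n")
	if m.headers {
		tp := traceparent([16]byte{15: 1}, [8]byte{7: 2}, true)
		req = injectHeader(req, tp)
	}
	fmt.Printf("%q\n", req)
}
```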
Deployment Options
OBI runs on Linux kernel 5.8 or later (or RHEL 4.18+ kernels with the relevant backports). It can be launched as a standalone process, a Docker container, or a Kubernetes DaemonSet. Operators should verify that the custom TCP option (kind = 25) survives any firewalls or load balancers on the path; if it does not, header injection is the recommended mode.
Pipeline Architecture
The data flow is expressed as a directed acyclic graph (DAG) orchestrated by the internal swarm framework. The pipeline consists of three independent agents (application tracing, network tracing, log enrichment) managed by an errgroup. If any agent fails, the entire pipeline is cancelled, ensuring graceful shutdown.
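The fail-together behaviour can be sketched with the standard library alone. This is an approximation of what an errgroup provides, not OBI's code: the agent type, runAll, and demo are hypothetical, and the error text is invented for the demonstration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// agent is a hypothetical long-running pipeline component.
type agent func(ctx context.Context) error

// runAll starts every agent and returns the first error. The first failure
// cancels the shared context so the remaining agents can shut down.
func runAll(parent context.Context, agents ...agent) error {
	ctx, cancel := context.WithCancel(parent)
	defer cancel()
	var (
		wg    sync.WaitGroup
		once  sync.Once
		first error
	)
	for _, a := range agents {
		wg.Add(1)
		go func(a agent) {
			defer wg.Done()
			if err := a(ctx); err != nil {
				once.Do(func() {
					first = err
					cancel()
				})
			}
		}(a)
	}
	wg.Wait()
	return first
}

// demo runs two well-behaved agents and one that fails immediately.
func demo() error {
	idle := func(ctx context.Context) error {
		<-ctx.Done() // keep running until the pipeline is cancelled
		return ctx.Err()
	}
	failing := func(ctx context.Context) error {
		return errors.New("network tracing: attach failed")
	}
	return runAll(context.Background(), idle, failing, idle)
}

func main() {
	fmt.Println("pipeline stopped:", demo())
}
```

Because only the first error is recorded, the caller sees the root cause rather than the context-cancelled errors returned by the agents that were stopped as a consequence.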
Inter‑agent communication uses a generic, deadlock‑detecting fan‑out queue msg.Queue[T] (pkg/pipe/msg/queue.go). Features include:
Fan-out: One queue can have multiple subscribers; every message is broadcast to all of them, and a send does not block when there are no subscribers.
Bypass : Disabled nodes are physically removed from the graph via input.Bypass(output).
Deadlock detection : SendCtx monitors send timeouts (default 1 min) and panics if a path is blocked.
Multi‑producer close handling : ClosingAttempts(n) and MarkCloseable() ensure the queue closes only after all producers signal completion.
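A much-reduced sketch of the fan-out and send-timeout behaviour is shown below. It is not the msg.Queue[T] implementation: the queue type and its methods are hypothetical, close handling and bypass are omitted, and a blocked send returns an error here instead of panicking so that the example can be run safely.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

var errBlocked = errors.New("send timed out: a subscriber is not draining")

// queue is a hypothetical fan-out queue with a per-send timeout.
type queue[T any] struct {
	mu      sync.Mutex
	subs    []chan T
	timeout time.Duration
}

func newQueue[T any](timeout time.Duration) *queue[T] {
	return &queue[T]{timeout: timeout}
}

// subscribe registers a new receiver with its own buffered channel.
func (q *queue[T]) subscribe(buf int) <-chan T {
	ch := make(chan T, buf)
	q.mu.Lock()
	q.subs = append(q.subs, ch)
	q.mu.Unlock()
	return ch
}

// sendCtx delivers v to every subscriber. With no subscribers it returns
// immediately. A subscriber that stays blocked past the timeout is reported.
func (q *queue[T]) sendCtx(ctx context.Context, v T) error {
	q.mu.Lock()
	subs := append([]chan T(nil), q.subs...)
	q.mu.Unlock()
	for _, ch := range subs {
		timer := time.NewTimer(q.timeout)
		select {
		case ch <- v:
			timer.Stop()
		case <-timer.C:
			return errBlocked
		case <-ctx.Done():
			timer.Stop()
			return ctx.Err()
		}
	}
	return nil
}

// demo shows a dropped send, a broadcast, and a detected blockage.
func demo() (first, second int, blocked bool) {
	q := newQueue[int](20 * time.Millisecond)
	ctx := context.Background()
	_ = q.sendCtx(ctx, 0) // no subscribers yet: returns at once
	s1, s2 := q.subscribe(1), q.subscribe(1)
	_ = q.sendCtx(ctx, 42)    // both buffers receive a copy
	err := q.sendCtx(ctx, 43) // buffers are full and nobody reads
	return <-s1, <-s2, errors.Is(err, errBlocked)
}

func main() {
	a, b, blocked := demo()
	fmt.Println(a, b, blocked)
}
```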
The kernel-to-user-space handoff uses a dual-goroutine ring-buffer forwarder (ringBufForwarder[T]) with an object pool sized 2 × BatchLength. One goroutine (readerLoop) reads raw records from the eBPF ring buffer via ReadInto; a second goroutine (parserLoop) parses records into spans, coordinating via freeIdx and workIdx channels. Batch flushing occurs when BatchLength (default 100) is reached or after BatchTimeout (default 1 s). A periodic flushOnAvailableBytes ensures low-traffic data does not stall.
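The reader/parser split can be sketched as follows. This is a simplified model under several assumptions: a Go channel stands in for the eBPF ring buffer, upper-casing a string stands in for parsing a record into a span, and the forward and demo functions are hypothetical. The pool size of 2 × batch length and the freeIdx/workIdx index channels follow the description above.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// batchLength is kept small for the demo; the article cites a default of 100.
const batchLength = 3

// forward starts a reader and a parser goroutine that share a fixed pool of
// slots, and emits parsed batches on out. It closes out when raw is drained.
func forward(raw <-chan string, batchTimeout time.Duration, out chan<- []string) {
	pool := make([]string, 2*batchLength)
	freeIdx := make(chan int, len(pool)) // slots available to the reader
	workIdx := make(chan int, len(pool)) // slots waiting for the parser
	for i := range pool {
		freeIdx <- i
	}
	// Reader: claim a free slot, fill it, hand its index to the parser.
	go func() {
		defer close(workIdx)
		for rec := range raw {
			i := <-freeIdx
			pool[i] = rec
			workIdx <- i
		}
	}()
	// Parser: parse the slot, return it to the pool, flush by size or time.
	go func() {
		defer close(out)
		batch := make([]string, 0, batchLength)
		ticker := time.NewTicker(batchTimeout)
		defer ticker.Stop()
		flush := func() {
			if len(batch) > 0 {
				out <- batch
				batch = make([]string, 0, batchLength)
			}
		}
		for {
			select {
			case i, ok := <-workIdx:
				if !ok {
					flush()
					return
				}
				batch = append(batch, strings.ToUpper(pool[i]))
				freeIdx <- i
				if len(batch) == batchLength {
					flush()
				}
			case <-ticker.C:
				flush()
			}
		}
	}()
}

// demo pushes four records through the forwarder and collects the batches.
func demo() [][]string {
	raw := make(chan string)
	out := make(chan []string)
	forward(raw, time.Second, out)
	go func() {
		for _, r := range []string{"a", "b", "c", "d"} {
			raw <- r
		}
		close(raw)
	}()
	var batches [][]string
	for b := range out {
		batches = append(batches, b)
	}
	return batches
}

func main() {
	fmt.Println(demo())
}
```

Passing slot indices instead of records means the reader never allocates per record, and the fixed pool size bounds how far the reader can run ahead of the parser.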
GPU / CUDA Tracing
When OTEL_EBPF_CUDA_MODE=auto is set, OBI detects the presence of libcuda.so and attaches uprobe hooks to capture kernel launches, graph launches, memory allocations, and copies. These events share the same generic ring‑buffer forwarder, so GPU spans are processed with identical batching, back‑pressure, and graceful shutdown logic as network spans.
Log Enrichment
OBI's log enricher hooks tty_write and pipe_write (bpf/logenricher.c). When a process writes to stdout/stderr or a pipe, the BPF program reads up to 8 KB of the original log line, fetches the current trace context, overwrites the user buffer with zeros via bpf_probe_write_user, and forwards the event through a dedicated ring buffer. The user-space handler (pkg/internal/ebpf/logenricher) parses JSON logs, injects trace_id and span_id (if absent), and writes the enriched line back to the original file descriptor. Non-JSON lines are passed through unchanged. The implementation caches file paths (TTY vs pipe) to avoid repeated open calls.
Limitations: bpf_probe_write_user creates a brief window in which logs could be lost if the process crashes, and kernels with lockdown enabled may block this operation.
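The user-space half of this flow, injecting IDs into JSON lines and passing everything else through, can be sketched like this. The enrich function is hypothetical and uses encoding/json for brevity; it is not the handler in pkg/internal/ebpf/logenricher.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// enrich adds trace_id and span_id to a JSON object log line when they are
// absent. Lines that are not JSON objects are returned unchanged.
func enrich(line, traceID, spanID string) string {
	var obj map[string]any
	if err := json.Unmarshal([]byte(line), &obj); err != nil || obj == nil {
		return line // plain text, JSON arrays, and null pass through
	}
	if _, ok := obj["trace_id"]; !ok {
		obj["trace_id"] = traceID
	}
	if _, ok := obj["span_id"]; !ok {
		obj["span_id"] = spanID
	}
	out, err := json.Marshal(obj)
	if err != nil {
		return line
	}
	return string(out)
}

func main() {
	fmt.Println(enrich(`{"msg":"order created"}`, "t1", "s1"))
	fmt.Println(enrich(`{"msg":"retry","trace_id":"existing"}`, "t1", "s1"))
	fmt.Println(enrich("plain text line", "t1", "s1"))
}
```

Round-tripping through a map is the simplest approach but it reorders keys alphabetically and can alter number formatting; an enricher that must preserve the original bytes would more likely splice the two fields into the existing line.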
Using OBI in Cloud Monitor 2.0
Cloud Monitor 2.0 (CMS 2.0) is a unified observability platform built on OpenTelemetry that converges Metrics, Traces, Logs, Profiles, and Events. OBI bridges the “last mile” by providing zero‑code, cross‑language instrumentation for existing workloads, making them instantly observable in CMS 2.0. Users can enable OBI via the CMS 2.0 integration center with a single click, after which they can view request counts, error rates, latency, full trace graphs, network flow charts, and GPU activity without modifying application code.
Typical use cases include:
Security audit: Detect unexpected external IPs communicating heavily with databases.
Cross‑AZ traffic spikes: Identify routing misconfigurations that cause traffic to surge between availability zones.
Service dependency discovery: Auto‑generated service‑level topology for rapid impact analysis.
Network congestion: Spot RTT spikes (e.g., 2 ms → 180 ms) and pinpoint congested switches.
GPU workload monitoring: Track kernel launches and memory transfers in AI training clusters.
For more details, see the Cloud Monitor 2.0 integration guide and the OBI GitHub issue tracker.
[per-process eBPF tracers]
  |
  v
ringBufForwarder (reader goroutine + parser goroutine, object pool 2×BatchLength)
  |
  v
tracesInput (batch=100 / 1s / 3s idle-flush)
  |
  v
ReadFromChannel → Routes → KubeDecorator → DockerDecorator → NameResolution → AttributesFilter
  |
  v
exportableSpans ===== fan-out =====
  |-- OTEL Traces Exporter
  |-- Printer (debug)
  |-- SpanNameLimiter → [OTEL Metrics | SvcGraph Metrics | Prometheus]
  `-- BPF Metrics

Key configuration knobs that affect back-pressure and queue sizes include OTEL_EBPF_OTLP_TRACES_BATCH_MAX_SIZE, OTEL_EBPF_OTLP_TRACES_QUEUE_SIZE, OTEL_EBPF_CHANNEL_BUFFER_LEN, OTEL_EBPF_CHANNEL_SEND_TIMEOUT, OTEL_EBPF_BPF_BATCH_LENGTH and OTEL_EBPF_BPF_BATCH_TIMEOUT. When any internal queue blocks, OBI reports the relevant knob so operators can tune the system.
In summary, OBI is a swarm-orchestrated DAG of agents, a deadlock-detecting fan-out queue system, and a dual-goroutine ring-buffer forwarder that together provide zero-code, full-stack observability for heterogeneous cloud-native workloads.
Alibaba Cloud Native
We publish cloud-native tech news, curate in-depth content, host regular events and live streams, and share Alibaba product and user case studies. Join us to explore and share the cloud-native insights you need.
