Unlocking Observability: A Complete Guide to OpenTelemetry Architecture and APIs
This article explains what OpenTelemetry is, its core components, key terminology, benefits, and usage steps, and then walks through its architecture (APIs, SDK pipelines, and the Collector) for developers and operators seeking a vendor‑neutral observability solution.
What is OpenTelemetry?
OpenTelemetry merges the OpenTracing and OpenCensus projects, offering a set of APIs and libraries that standardize the collection and transmission of telemetry data. It provides a secure, vendor‑neutral toolset that can send data to various back‑ends as needed.
The OpenTelemetry project consists of the following components:
A specification that keeps all projects consistent.
APIs based on the specification, including interfaces and implementations.
Language‑specific SDKs (implementations of the APIs) for languages such as Java, Python, Go, and Erlang.
Exporters that can forward data to a chosen back‑end.
Collectors – vendor‑neutral implementations for processing and exporting telemetry data.
Terminology
If you are new to OpenTelemetry, you need to understand the following terms:
Traces: Records of request activity flowing through a distributed system; a trace is a directed acyclic graph of spans.
Spans: Time‑based operations within a trace, forming a tree structure with a root span that represents end‑to‑end latency.
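Metrics: Raw runtime measurements about a service. Early drafts of the OpenTelemetry metrics specification defined the instruments Counter, UpDownCounter, ValueRecorder, SumObserver, UpDownSumObserver, and ValueObserver. The stable specification later renamed ValueRecorder to Histogram and replaced the Observer instruments with asynchronous (observable) counters, up‑down counters, and gauges.
Context: Each span carries a span context, which holds the trace ID and span ID that together identify it globally, and may optionally include a correlation context (later renamed Baggage) with user‑defined attributes.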
Context propagation: The mechanism for transmitting context information between services, typically via HTTP headers, which is essential for tracing and can also support use cases like A/B testing.
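To make context propagation concrete, here is a minimal sketch of writing and reading a W3C Trace Context traceparent header using only the Python standard library. The helper names (new_span_context, inject, extract) are hypothetical and are not the OpenTelemetry API; only the header layout (version, 32‑hex trace ID, 16‑hex parent span ID, flags) follows the W3C specification.

```python
import re
import secrets

# W3C Trace Context layout: version-traceid-parentid-flags, all lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def new_span_context(trace_id=None):
    """Create a span context; pass trace_id to stay inside the caller's trace."""
    return {
        "trace_id": trace_id or secrets.token_hex(16),  # 16 bytes -> 32 hex chars
        "span_id": secrets.token_hex(8),                # 8 bytes -> 16 hex chars
        "sampled": True,
    }


def inject(ctx, headers):
    """Write the span context into outgoing HTTP headers."""
    flags = "01" if ctx["sampled"] else "00"
    headers["traceparent"] = f"00-{ctx['trace_id']}-{ctx['span_id']}-{flags}"
    return headers


def extract(headers):
    """Read the caller's span context, or return None if absent or invalid."""
    match = _TRACEPARENT.match(headers.get("traceparent", ""))
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the specification.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "span_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),
    }
```

The receiving service would call extract on the incoming headers and then create its own span with new_span_context(trace_id=...), so that both spans end up in the same trace.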
Benefits of OpenTelemetry
By consolidating OpenTracing and OpenCensus into a single open standard, OpenTelemetry offers several advantages:
Simple choice: No need to pick between two standards; OpenTelemetry is compatible with both.
Cross‑platform: Supports many languages and back‑ends, providing a vendor‑neutral way to capture and transmit telemetry.
Simplified observability: High‑quality telemetry enables high‑quality monitoring, encouraging vendors to adopt the unified standard.
How to Use OpenTelemetry
The APIs and SDKs include quick‑start guides and documentation. For example, the Java guide shows how to obtain a tracer, create spans, add attributes, and propagate context across spans.
After instrumenting an application with OpenTelemetry trace APIs, you can use built‑in exporters to send trace data to observability platforms such as New Relic or other back‑ends.
At the time of writing, the metrics and logs specifications were still evolving; once complete, they are intended to let libraries and frameworks expose all built‑in telemetry types without additional instrumentation.
OpenTelemetry Architecture Components
The default implementation is divided into three main parts:
OpenTelemetry API
OpenTelemetry SDK, which includes a Tracer pipeline, a Meter pipeline, and a shared Context layer
Collector
OpenTelemetry API
Application developers use the OpenTelemetry API to instrument code, while library authors embed instrumentation directly in their libraries. The API does not handle data transport.
The API consists of four parts:
A Tracer API
A Metrics API
A Context API
Semantic conventions
Tracer API
The Tracer API supports creating spans. Each span belongs to a trace identified by a traceId and can optionally be given explicit start and end timestamps. A tracer is identified by a name and version, which links the spans it creates to the originating library.
Metrics API
The Metrics API provides various metric instruments such as Counters and Observers. Counters allow aggregation of measurements, while Observers capture values at discrete points (e.g., CPU load or free disk space).
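The difference between the two instrument families can be sketched as follows. These classes are hypothetical stand-ins, not the OpenTelemetry instruments, and the metric names and the pending_jobs queue are invented for the example: a counter is updated by the application as events happen, while an observer is a callback that is read only when metrics are collected.

```python
class Counter:
    """Synchronous instrument: the application reports each increment itself."""

    def __init__(self, name):
        self.name = name
        self._total = 0

    def add(self, amount):
        if amount < 0:
            raise ValueError("a counter only goes up")
        self._total += amount  # measurements are aggregated by summation

    def collect(self):
        return self._total


class Observer:
    """Asynchronous instrument: a callback sampled once per collection cycle."""

    def __init__(self, name, callback):
        self.name = name
        self._callback = callback

    def collect(self):
        return self._callback()  # only the value at collection time is kept


requests_total = Counter("http.requests")
requests_total.add(1)
requests_total.add(4)

pending_jobs = ["a", "b"]
queue_depth = Observer("queue.depth", lambda: len(pending_jobs))
```

Here requests_total.collect() yields the running sum, while queue_depth.collect() yields whatever the queue length happens to be at the moment of collection.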
Context API
The Context API adds context information to spans and traces, supporting standards like W3C Trace Context, Zipkin B3, or New Relic distributed tracing headers. It also enables propagation of context across system boundaries, and metric instruments can access the current context.
Semantic Conventions
The API includes a set of semantic conventions that define naming, attributes, and error handling for spans, ensuring consistent APM experiences across languages and vendors.
OpenTelemetry SDK
The SDK implements the API and contains three components similar to the API: a Tracer, a Meter, and a shared Context layer.
Ideally the SDK satisfies 99% of standard use cases out of the box, but it can be customized, for example by replacing the default sampling algorithm in the Tracer pipeline.
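As an illustration of what a replacement sampling algorithm might look like, the sketch below keeps a fixed fraction of traces by comparing part of the trace ID against a threshold. The function name make_ratio_sampler is hypothetical, and the approach is a simplified take on ratio-based sampling rather than the SDK's actual sampler interface.

```python
def make_ratio_sampler(ratio):
    """Return a function that decides whether a trace should be recorded."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    bound = int(ratio * (1 << 64))

    def should_sample(trace_id_hex):
        # Decide from the low 64 bits of the trace ID, so every service that
        # sees the same trace ID reaches the same decision without coordination.
        return int(trace_id_hex[-16:], 16) < bound

    return should_sample


sample_half = make_ratio_sampler(0.5)
```

Deriving the decision from the trace ID, instead of from a random number per span, keeps a trace either fully sampled or fully dropped across all the services it passes through.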
Tracer pipeline
When configuring the SDK, one or more SpanProcessors are attached to the Tracer pipeline. SpanProcessors observe span lifecycles and forward completed spans to a SpanExporter. The SDK includes a simple SpanProcessor that directly forwards spans to an exporter.
The SDK also provides a batch processor that periodically sends completed spans. Custom SpanProcessors can be implemented to add bespoke behavior, such as exporting "in‑progress" spans if the back‑end supports it.
The final component of the Tracer pipeline is the SpanExporter, which converts spans into the format required by a back‑end and forwards them. Implementing a custom SpanExporter is the easiest way for telemetry vendors to integrate with OpenTelemetry.
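The relationship between span processors and the exporter can be sketched with a few small classes. These are toy versions with hypothetical names (ListExporter, ToySimpleProcessor, ToyBatchProcessor), not the SDK's own classes, and the batch processor here flushes on size only, whereas a production one would also flush on a timer.

```python
class ListExporter:
    """Stand-in exporter: a real one would serialize spans for a back-end."""

    def __init__(self):
        self.batches = []

    def export(self, spans):
        self.batches.append(list(spans))


class ToySimpleProcessor:
    """Forwards each span to the exporter as soon as it ends."""

    def __init__(self, exporter):
        self._exporter = exporter

    def on_end(self, span):
        self._exporter.export([span])


class ToyBatchProcessor:
    """Buffers ended spans and exports them in groups."""

    def __init__(self, exporter, max_batch_size=3):
        self._exporter = exporter
        self._max_batch_size = max_batch_size
        self._buffer = []

    def on_end(self, span):
        self._buffer.append(span)
        if len(self._buffer) >= self._max_batch_size:
            self.flush()

    def flush(self):
        # Export whatever is buffered, then start a fresh buffer.
        if self._buffer:
            self._exporter.export(self._buffer)
            self._buffer = []


simple_out, batch_out = ListExporter(), ListExporter()
simple = ToySimpleProcessor(simple_out)
batch = ToyBatchProcessor(batch_out, max_batch_size=3)
for span_name in ["a", "b", "c", "d"]:
    simple.on_end(span_name)
    batch.on_end(span_name)
batch.flush()  # drain the remainder, as a shutdown hook would
```

With four spans and a batch size of three, the simple processor makes four export calls while the batch processor makes two, which is the trade-off between latency and per-call overhead.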
Meter pipeline
The Meter pipeline creates and maintains various metric instruments, including Counters and Observers. By default, Counters aggregate by summation, while Observers retain the last recorded value. All instruments have a default aggregation.
Aggregated metric data is sent to a MetricExporter. Vendors can provide exporters that translate aggregated data into the format required by their back‑ends.
OpenTelemetry supports two exporter models: push‑based exporters that send data at intervals, and pull‑based exporters that respond to back‑end requests. New Relic uses a push model, while Prometheus uses pull.
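The two exporter models differ mainly in who initiates the transfer. In the sketch below, both classes and the send callback are hypothetical, and the scrape output is only loosely modeled on a "name value" text format: the push exporter hands data to a transport whenever the SDK calls it, while the pull exporter holds the latest snapshot until the back-end asks for it.

```python
class PushExporter:
    """Push model: the SDK calls export() on a timer and data is sent out."""

    def __init__(self, send):
        self._send = send  # hypothetical transport, e.g. an HTTP POST

    def export(self, snapshot):
        self._send(dict(snapshot))


class PullExporter:
    """Pull model: keep the latest snapshot and serve it when scraped."""

    def __init__(self):
        self._latest = {}

    def export(self, snapshot):
        self._latest = dict(snapshot)

    def scrape(self):
        # One "name value" line per metric, sorted for stable output.
        return "\n".join(
            f"{name} {value}" for name, value in sorted(self._latest.items())
        )


sent = []
push = PushExporter(sent.append)
pull = PullExporter()
for exporter in (push, pull):
    exporter.export({"http_requests_total": 5, "queue_depth": 2})
```

After this runs, the push exporter has already delivered one payload to its transport, whereas the pull exporter has delivered nothing until scrape() is called.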
Shared Context layer
The shared Context layer sits between the Tracer and Meter pipelines, allowing non‑observer metrics to be recorded within the context of an executing span. Propagators can customize context propagation, and the SDK includes an implementation based on the W3C Trace Context specification, with optional support for Zipkin B3.
Collector
The following description of the collector is taken from the official documentation.
The OpenTelemetry Collector provides a vendor‑neutral implementation for receiving, processing, and exporting telemetry data. It also removes the need to run, operate, and maintain multiple agents or collectors in order to support open‑source telemetry formats (e.g., Jaeger, Prometheus) sent to different back‑ends.
Link: https://www.cnblogs.com/charlieroro/p/13862471.html