Tagged articles

OpenTelemetry

145 articles · Page 2 of 2
Architect
Feb 1, 2024 · Backend Development

Design and Optimization of Trace2.0: A High‑Performance Backend Tracing System

Trace2.0 is an OpenTelemetry‑based application monitoring system that processes petabyte‑scale trace data using multi‑channel client protocols, gRPC, load‑balancing optimizations, ZSTD compression, Kafka pipelines, ClickHouse storage, and a JDK 21 upgrade with virtual threads, achieving significant performance and cost improvements.

ClickHouse · JDK21 · OpenTelemetry
0 likes · 15 min read
Tencent Cloud Developer
Jan 9, 2024 · Operations

Tencent Cloud APM Full-Link Tracing Implementation and Best Practices

The article explains how Tencent Cloud APM implements full‑link tracing using OpenTelemetry standards, addresses challenges such as protocol compatibility, massive trace storage, and bytecode overhead with solutions like conversion gateways, tail sampling and thread profiling, and showcases best‑practice scenarios for topology analysis, front‑end/back‑end integration, and log‑trace correlation within the broader TCOP observability suite.

APM · Cloud Monitoring · Full‑Link Tracing
0 likes · 11 min read
DevOps Cloud Academy
Dec 14, 2023 · Operations

CI/CD Observability via OpenTelemetry at Grafana Labs

The article explains the importance of CI/CD observability, outlines common pipeline problems, introduces Grafana's GraCIe plugin built on OpenTelemetry, and discusses how enhanced visibility can improve reliability, decision‑making, and future standardization across CI/CD platforms.

CI/CD · Monitoring · Observability
0 likes · 13 min read
37 Interactive Technology Team
Dec 4, 2023 · Backend Development

Root Cause Analysis of Missing Trace Data in Go Services Using Prometheus Metrics and GZIP Compression

The missing trace data in two Go services was caused by the GoFrame tracing middleware recording the gzip‑compressed /metrics response body as a UTF‑8 string, which the OpenTelemetry exporter rejected as invalid UTF‑8; disabling Prometheus compression or decompressing the body before logging resolves the issue.

Observability · OpenTelemetry · Tracing
0 likes · 16 min read
DeWu Technology
Nov 15, 2023 · Backend Development

Thread Profiling: Design and Implementation of Client‑Server Performance Analysis

Thread profiling uses threshold‑triggered tasks on business threads to capture stack snapshots, which a dedicated profiler thread sends via high‑performance gRPC to a server that queues them in Kafka, enriches and stores them in ClickHouse, correlates with OpenTelemetry traces, and provides metrics that let developers quickly pinpoint latency bottlenecks and improve system stability.

Go · Java · OpenTelemetry
0 likes · 11 min read
Ops Development Stories
Oct 27, 2023 · Cloud Native

Collect Kubernetes Logs with OpenTelemetry and Loki Using Helm

This guide walks through deploying Loki via Helm, configuring the OpenTelemetry Collector to use a filelog receiver and Loki exporter, and enabling Kubernetes event collection, providing step‑by‑step commands and YAML snippets for a complete logging pipeline in a Kubernetes cluster.

Collector · Logging · OpenTelemetry
0 likes · 17 min read
Architect
Oct 26, 2023 · Big Data

Design and Optimization of Bilibili Log Service 2.0 Using ClickHouse and OpenTelemetry

This article details Bilibili's evolution of its log system from an Elastic Stack‑based solution to a ClickHouse‑backed architecture with OpenTelemetry, describing the challenges of cost, stability, and scalability, the new components such as Log‑Agent, Log‑Ingester, and a custom visualization platform, and the performance gains and future directions.

ClickHouse · Observability · OpenTelemetry
0 likes · 26 min read
Ops Development Stories
Oct 12, 2023 · Cloud Native

How to Monitor Kubernetes with OpenTelemetry Collector: Step‑by‑Step Helm Deployment

This guide walks through installing OpenTelemetry Collector on a Kubernetes cluster using Helm, configuring DaemonSet and Deployment collectors, integrating Prometheus for metrics, and customizing receivers, processors, and exporters to achieve comprehensive observability of nodes, pods, containers, and cluster resources.

Observability · OpenTelemetry · helm
0 likes · 26 min read
MaGe Linux Operations
Sep 30, 2023 · Cloud Native

How DeWu Built a Scalable Cloud‑Native Trace2.0 Observability Platform

This article details DeWu's evolution from a sneaker marketplace to a full‑stack e‑commerce platform and explains how its cloud‑native monitoring system, based on OpenTelemetry, ClickHouse, and object storage, was architected, optimized, and scaled to handle billions of spans daily.

Observability · OpenTelemetry · cloud-native
0 likes · 16 min read
ZhongAn Tech Team
Sep 1, 2023 · Backend Development

Investigation and Fix of OpenTelemetry ThreadPool Trace Propagation Bug in Non‑Capturing Lambda Scenarios

This article analyzes the sporadic loss of trace information when non‑capturing lambda tasks are submitted to a Java ThreadPoolExecutor instrumented with OpenTelemetry, explains the underlying cause related to Runnable reuse and lambda caching, and presents the community‑driven patches that correctly propagate context across threads.

BugFix · Java · Lambda
0 likes · 10 min read
MaGe Linux Operations
May 11, 2023 · Cloud Native

Master Distributed Tracing in Go with OpenTelemetry – A Practical Guide

In modern cloud‑native applications, distributed tracing is essential for pinpointing errors across microservices, and OpenTelemetry provides a standardized framework for collecting and analyzing trace data, with a hands‑on Go implementation demonstrated in an upcoming expert-led workshop.

Cloud Native · Distributed Tracing · Go
0 likes · 5 min read
Zhengcaiyun Tech (政采云技术)
Apr 29, 2023 · Cloud Native

Understanding Observability: Challenges, Principles, and OpenTelemetry Architecture

The article explains how growing system complexity drives the need for observability, outlines the three pillars of logs, traces, and metrics, compares traditional stability stacks with modern observability, and details OpenTelemetry's design, advantages, and implementation considerations for cloud‑native environments.

Microservices · Monitoring · Observability
0 likes · 16 min read
Alibaba Cloud Native
Mar 28, 2023 · Cloud Native

How RocketMQ 5.0 Enables Distributed End‑to‑End Tracing with OpenTelemetry

This article explains how Apache RocketMQ 5.0 integrates standardized distributed tracing via OpenTelemetry, detailing the underlying span model, semantic conventions for messaging, automatic and manual instrumentation options, configuration steps, a complete example workflow, and how to export traces to Alibaba Cloud SLS and ARMS for observability.

Cloud Native · Distributed Tracing · Observability
0 likes · 17 min read
Ops Development Stories
Feb 6, 2023 · Cloud Native

How to Deploy Odigos for Zero‑Code Observability on Kubernetes

This guide walks you through installing and configuring the open‑source Odigos observability control plane on a Kubernetes cluster, showing how to automatically collect traces, metrics, and logs from applications without modifying code and how to visualize the data with Grafana.

Odigos · OpenTelemetry · cloud-native
0 likes · 11 min read
Cloud Native Technology Community
Jan 30, 2023 · Cloud Native

2023 Cloud‑Native Trends and Predictions: Cloud IDEs, FinOps, SBOM, GitOps, OpenTelemetry, WebAssembly and More

The article surveys the 2023 cloud‑native landscape, highlighting the rise of cloud‑based IDEs, the mainstreaming of FinOps and GreenOps, the ubiquity of open‑source SBOMs, the maturation of GitOps and OpenTelemetry, the growing impact of WebAssembly, and several related forecasts for the industry.

Cloud Native · FinOps · GitOps
0 likes · 21 min read
dbaplus Community
Jan 26, 2023 · Operations

Unified Metrics, Tracing, and Logging: A Financial Firm’s Path to Microservice Observability

Facing the challenges of distributed microservice architectures, a financial services company implemented a unified observability platform that combines metrics, tracing, and logging via OpenTelemetry and custom agents, enabling real‑time visualization, anomaly detection, and performance analysis across seven core business middle‑platforms.

Distributed Tracing · Logging · Metrics
0 likes · 17 min read
Alibaba Terminal Technology
Jan 5, 2023 · Mobile Development

Why Mobile Trace Is Hard and How OpenTelemetry Solves It

This article explores the challenges of end‑to‑end tracing on mobile apps, explains why issues are hard to reproduce, and presents a four‑step solution using a unified OpenTelemetry standard, automated data linking, performance optimizations, and machine‑learning‑driven root‑cause analysis.

Android · Observability · OpenTelemetry
0 likes · 20 min read
DeWu Technology
Dec 5, 2022 · Operations

Evolution of Application Monitoring at DeWu: From CAT to OpenTelemetry

After rebuilding its transaction system in 2020, DeWu progressed from the basic CAT monitoring tool to OpenTracing with Prometheus, and finally adopted OpenTelemetry to unify metrics, traces, and logs via a custom vmagent‑Kafka‑Flink pipeline, dynamic sampling, and extensible javaagents, positioning the platform for a performance‑analysis‑driven future.

CAT · Microservices · Monitoring
0 likes · 18 min read
ITPUB
Nov 4, 2022 · Cloud Native

Build a Full‑Stack Observability Platform with Grafana LGTM, Go, and OpenTelemetry

This guide walks you through creating a complete observability stack—exporting metrics, traces, and logs from a Go web service, collecting them with OpenTelemetry Collector, and storing them in Grafana Mimir, Loki, and Tempo, then visualizing everything on a unified Grafana dashboard.

Docker · Go · OpenTelemetry
0 likes · 9 min read
dbaplus Community
Oct 19, 2022 · Backend Development

How Trace2.0 Cuts Tracing Costs by 66% with Tail Sampling and ClickHouse

This article details the design of Trace2.0, a next‑generation distributed tracing platform built on OpenTelemetry, covering its end‑to‑end architecture, tail sampling with hot‑cold storage, Bloom‑filter implementation, and a self‑built ClickHouse storage layer that reduces storage costs by two‑thirds while improving query performance.

ClickHouse · OpenTelemetry · backend-architecture
0 likes · 14 min read
Big Data Technology Architecture
Sep 17, 2022 · Databases

Design and Optimization of Bilibili Log Service 2.0 Using ClickHouse and OpenTelemetry

This article describes how Bilibili redesigned its log service by replacing Elasticsearch with ClickHouse, introducing OpenTelemetry‑based logging, optimizing storage, query, and alerting components, and enhancing ClickHouse features such as configuration tuning, Map types, and implicit columns to achieve higher performance, lower cost, and better observability.

ClickHouse · Observability · OpenTelemetry
0 likes · 28 min read
ITPUB
Sep 16, 2022 · Big Data

How Bilibili Re‑engineered Its Log Service with ClickHouse and OpenTelemetry for 10× Performance

Bilibili redesigned its five‑year‑old ELK‑based log platform by replacing Elasticsearch with ClickHouse, adopting OpenTelemetry for unified log ingestion, and building a custom visualization and alerting system, achieving tenfold write throughput, one‑third storage cost, and dramatically faster query response times.

ClickHouse · OpenTelemetry · log infrastructure
0 likes · 28 min read
Bilibili Tech
Sep 16, 2022 · Big Data

Design and Optimization of Bilibili Log Service 2.0 Using ClickHouse and OpenTelemetry

Bilibili’s Log Service 2.0 replaces its Elastic‑Stack pipeline with an OpenTelemetry‑driven architecture that writes logs via high‑performance Go/Java SDKs to ClickHouse, delivering ten‑fold write throughput, two‑fold query speed, one‑third storage cost, a custom query gateway, visualization UI, and advanced alerting.

ClickHouse · Observability · OpenTelemetry
0 likes · 27 min read
Tencent Cloud Developer
Sep 7, 2022 · Cloud Native

Why Build Probe Capabilities Based on OpenTelemetry for Cloud‑Native Observability

Building probe capabilities on OpenTelemetry gives cloud‑native teams a vendor‑neutral, standardized way to extend monitoring into full observability—supporting large‑scale, language‑specific instrumentation, plug‑and‑play plugins, and seamless integration with APM backends—so developers and operators can detect, debug, and predict faults across distributed containers.

APM · Cloud Native · Monitoring
0 likes · 15 min read
DeWu Technology
Sep 2, 2022 · Operations

Design and Implementation of Trace2.0 Distributed Tracing Platform

Trace2.0 is an OpenTelemetry‑based distributed tracing platform that collects billions of spans daily, routes data through a control plane, OTel Server, and Kafka to ClickHouse hot‑cold storage with tail sampling, achieving 66% cost reduction, 12× compression, sub‑second query latency, and plans to offload raw spans to object storage.

ClickHouse · Distributed Tracing · Observability
0 likes · 12 min read
Java Architecture Diary
Aug 8, 2022 · Operations

How to Integrate Jaeger Tracing with Rainbond Using OpenTelemetry

This guide explains why distributed tracing is essential for micro‑service architectures, introduces Jaeger as an open‑source APM solution, and provides step‑by‑step instructions for deploying and configuring Jaeger on Rainbond with OpenTelemetry, including environment variables, service naming, and topology generation.

APM · Distributed Tracing · Jaeger
0 likes · 11 min read
Alibaba Cloud Native
Aug 2, 2022 · Cloud Native

How F6 Engineered Cloud‑Native Observability: From ELK to eBPF and OpenTelemetry

This article examines how F6 tackled growing stability demands by evolving from traditional ELK‑based logging to a cloud‑native observability stack that combines Grafana, Prometheus, eBPF, OpenTelemetry, and ARMS, illustrating practical steps, challenges, and lessons learned for modern microservice environments.

Monitoring · OpenTelemetry · cloud-native
0 likes · 17 min read
Baidu Geek Talk
Jul 19, 2022 · Cloud Native

How OpenTelemetry and Jaeger Power Cloud‑Native Tracing

This article explains cloud‑native observability, defines its three pillars—metrics, tracing, and logging—details the OpenTelemetry tracing data model and Span structure, reviews industry implementations such as Jaeger and Alibaba Eagle Eye, and shares practical challenges and solutions from real‑world production use.

Alibaba Eagle Eye · Cloud Native · Jaeger
0 likes · 11 min read
Tencent Cloud Developer
Dec 23, 2021 · Cloud Native

An Overview of OpenTelemetry: Origins, Architecture, and Instrumentation

OpenTelemetry unifies tracing, metrics, and logs by merging OpenTracing and OpenCensus into a cross‑language specification, collector, language SDKs, and instrumentation libraries, offering vendor‑agnostic, low‑maintenance telemetry collection that separates data gathering from business logic while requiring external back‑ends for storage and analysis.

Cloud Native · Collector · Instrumentation
0 likes · 10 min read
Qingyun Technology Community
Dec 17, 2021 · Cloud Native

What’s New in Cilium 1.11? Service Mesh, BGP, XDP and More

Cilium 1.11 introduces a beta Service Mesh, Kubernetes Ingress support, OpenTelemetry integration, topology‑aware load balancing, BGP pod‑CIDR announcements, managed IPv4/IPv6 neighbor discovery, XDP multi‑device acceleration, graceful termination, scalable ID spaces, endpoint slices and several feature enhancements and deprecations.

BGP · Cilium · OpenTelemetry
0 likes · 31 min read
Tencent Cloud Middleware
Dec 9, 2021 · Cloud Native

Why Observability Is the Missing Piece for Day‑2 Success in Cloud‑Native and Serverless Systems

The article explains how observability—through logs, metrics, and traces—transforms the opaque, complex day‑2 operations of micro‑service, Kubernetes, and serverless environments into a deterministic, diagnosable system, highlighting OpenTelemetry, practical collection methods, and real‑world implementation challenges and benefits.

Observability · OpenTelemetry · Serverless
0 likes · 17 min read
DevOps
Aug 31, 2021 · Backend Development

Designing an Uber‑Like Microservice System with DDD, OpenTelemetry Observability, and Reinforced Chaos Engineering

This article describes how to model a complex Uber‑style ride‑hailing system using Domain‑Driven Design, implement it with Java Spring Boot microservices, instrument it with OpenTelemetry for full observability, and validate the observability pipeline through a gamified chaos‑engineering approach that reduces MTTR.

DDD · Java · Microservices
0 likes · 13 min read
Liulishuo Tech Team
Jun 2, 2021 · Backend Development

Understanding Distributed Tracing and Its Use at Liulishuo

This article explains what distributed tracing is, why it is needed alongside logging and metrics for observability, how it works with trace and span IDs, and describes Liulishuo's implementation using OpenTelemetry, W3C Trace Context, and tail‑based sampling to improve backend debugging.

Distributed Tracing · Microservices · Observability
0 likes · 9 min read
Alibaba Cloud Native
Mar 31, 2021 · Operations

What Is OpenTelemetry? A Deep Dive into Its Architecture, History, and Go Demo

OpenTelemetry, a CNCF observability project, standardizes telemetry data models, collection, processing, and export, offering vendor‑agnostic APIs, SDKs, and a configurable collector; the article explains its problem domain, solution components, history, future outlook, and provides a practical Go demo with code and configuration examples.

CNCF · Collector · Go
0 likes · 11 min read
AntTech
Sep 2, 2019 · Cloud Native

Exploring Observability in Cloud‑Native Architecture: Practices from Ant Financial

This article reviews Ant Financial's cloud‑native observability journey, covering its origins, the three pillars of tracing, metrics and logging, community projects like OpenTelemetry, practical implementations, sampling strategies, and future directions for unified microservice, mesh, and serverless monitoring.

Metrics · OpenTelemetry · Tracing
0 likes · 15 min read