
Evolution of Application Monitoring at 得物: From CAT to OpenTelemetry

After rebuilding its transaction system in 2020, 得物 progressed from the basic CAT monitoring tool to OpenTracing with Prometheus, and finally adopted OpenTelemetry to unify metrics, traces, and logs via a custom vmagent‑Kafka‑Flink pipeline, dynamic sampling, and extensible javaagents, positioning the platform for a performance‑analysis‑driven future.

DeWu Technology

In March 2020, 得物 completed a three-month rebuild of its transaction system (the 五彩石, or "Five-Colored Stone", project). The move to a microservices architecture made robust application monitoring a necessity.

The first monitoring stage used the open‑source CAT system, which provided basic performance reports, health checks, and real‑time alerts but lacked end‑to‑end trace visualization and customizable dashboards.

To overcome CAT's limits, the second stage adopted OpenTracing for distributed tracing and Prometheus for metrics. It introduced Endpoint-tagged metrics via Micrometer and was integrated into applications through a Spring Boot Starter; however, version skew and tight coupling hindered rapid iteration.

The third stage embraced OpenTelemetry, unifying metrics, traces, and logs. Exemplar-enabled metrics were collected through a customized vmagent → Kafka → Flink → ClickHouse pipeline, enabling 100% trace sampling. A control plane provided dynamic sampling, Arthas integration, and runtime toggles, while a custom javaagent launcher (Promise) reduced dependency conflicts. Extensions added an RPC ID, custom TraceID generation, async flags, and profiling support via Arthas and Pyroscope.
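
To illustrate what runtime-adjustable sampling can look like, here is a minimal Python sketch of a trace-ID-ratio sampler whose ratio can be changed while the process runs. It is not the article's control plane: the class name DynamicSampler, the set_ratio hook, and the choice to decide from the low 16 hex digits of the trace ID are all assumptions made for illustration.

```python
import threading


class DynamicSampler:
    """Trace-ID-ratio sampler whose ratio can be changed at runtime."""

    def __init__(self, ratio=1.0):
        self._lock = threading.Lock()
        self._ratio = 1.0
        self.set_ratio(ratio)

    def set_ratio(self, ratio):
        # Hypothetical hook that a control plane could call to push a new ratio.
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("ratio must be between 0.0 and 1.0")
        with self._lock:
            self._ratio = ratio

    def should_sample(self, trace_id):
        # Decide from the low 16 hex digits (64 bits) of the trace ID, so the
        # same trace gets the same decision wherever it is evaluated.
        with self._lock:
            threshold = int(self._ratio * (1 << 64))
        return int(trace_id[-16:], 16) < threshold
```

Because the decision depends only on the trace ID and the current ratio, services that share the same ratio agree on whether a given trace is kept, without coordinating with each other.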

Looking ahead, 得物 plans to fuse tracing with diagnostic tools to enter a performance‑analysis‑driven era.

[plugins]
enables = shadower,arthas,pyroscope,chaos-agent
[shadower]
artifact_key = /javaagent/shadower-%s-final.jar
boot_class = com.shizhuang.apm.javaagent.bootstrap.AgentBootStrap
classloader = system
default_version = 115.16
[arthas]
artifact_key = /tools/arthas-bin.zip
;boot_class = com.taobao.arthas.agent334.AgentBootstrap
boot_artifact = arthas-agent.jar
premain_args = .attachments/arthas/arthas-core.jar;;ip=127.0.0.1
[pyroscope]
artifact_key = /tools/pyroscope.jar
[chaos-agent]
artifact_key = /javaagent/chaos-agent.jar
boot_class = com.chaos.platform.agent.DewuChaosAgentBootstrap
classloader = system
apply_envs = dev,test,local,pre,xdw
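
The plugin configuration above is INI-style, so it can be read with a standard parser. The following Python sketch is one possible reading, not the launcher's actual code. The function name load_plugins, the "prod" environment used in the usage note, the rule that apply_envs restricts a plugin to the listed environments, and the treatment of %s as a version placeholder filled from default_version are all assumptions.

```python
import configparser

# Copied from the configuration shown above.
LAUNCHER_CONFIG = """\
[plugins]
enables = shadower,arthas,pyroscope,chaos-agent
[shadower]
artifact_key = /javaagent/shadower-%s-final.jar
boot_class = com.shizhuang.apm.javaagent.bootstrap.AgentBootStrap
classloader = system
default_version = 115.16
[arthas]
artifact_key = /tools/arthas-bin.zip
;boot_class = com.taobao.arthas.agent334.AgentBootstrap
boot_artifact = arthas-agent.jar
premain_args = .attachments/arthas/arthas-core.jar;;ip=127.0.0.1
[pyroscope]
artifact_key = /tools/pyroscope.jar
[chaos-agent]
artifact_key = /javaagent/chaos-agent.jar
boot_class = com.chaos.platform.agent.DewuChaosAgentBootstrap
classloader = system
apply_envs = dev,test,local,pre,xdw
"""


def load_plugins(text, env):
    """Return {plugin name: artifact key} for plugins enabled in `env`."""
    # interpolation=None keeps the literal "%s" in artifact_key from being
    # rejected as configparser interpolation syntax.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    plugins = {}
    for raw_name in parser["plugins"]["enables"].split(","):
        name = raw_name.strip()
        section = parser[name]
        # Assumption: a plugin without apply_envs applies in every environment.
        envs = section.get("apply_envs")
        if envs is not None and env not in envs.split(","):
            continue
        artifact = section["artifact_key"]
        # Assumption: "%s" is a version placeholder filled from default_version.
        if "%s" in artifact and "default_version" in section:
            artifact = artifact % section["default_version"]
        plugins[name] = artifact
    return plugins
```

Under these assumptions, load_plugins(LAUNCHER_CONFIG, "prod") skips chaos-agent and resolves the shadower artifact to /javaagent/shadower-115.16-final.jar, while load_plugins(LAUNCHER_CONFIG, "dev") includes all four plugins.
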

32-character custom traceId: c0a8006b62583a724327993efd1865d8
c0a8006b         62583a72             4327993efd1865d8
    |                |                       |
High 8 (IP)   Middle 8 (Timestamp)   Low 16 (Random)

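
The layout above (8 hex digits of IP, 8 of timestamp, 16 random) can be sketched in Python as follows. The function names are hypothetical, and two details are assumptions: the IP is taken to be an IPv4 address packed into 32 bits, and the timestamp is taken to be Unix time in seconds.

```python
import ipaddress
import secrets
import time


def generate_trace_id(ip, now=None):
    """Build a 32-hex-character trace ID: 8 (IP) + 8 (timestamp) + 16 (random)."""
    ip_part = f"{int(ipaddress.IPv4Address(ip)):08x}"
    # Assumption: the timestamp field holds Unix time in seconds.
    seconds = int(time.time() if now is None else now)
    ts_part = f"{seconds & 0xFFFFFFFF:08x}"
    random_part = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    return ip_part + ts_part + random_part


def parse_trace_id(trace_id):
    """Split a trace ID with the layout above into (ip, timestamp, random)."""
    if len(trace_id) != 32:
        raise ValueError("expected 32 hex characters")
    ip = str(ipaddress.IPv4Address(int(trace_id[:8], 16)))
    timestamp = int(trace_id[8:16], 16)
    return ip, timestamp, trace_id[16:]
```

Applied to the example ID, parse_trace_id returns the address 192.168.0.107 and the timestamp 1649949298. The seconds interpretation is supported by the document's own data: the trace_id in the exemplar sample line decodes to 1654575979, which matches the whole-second part of that exemplar's timestamp.
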
shadower_virtual_field_map_operation_seconds_bucket{holder="Filter:Factory",key="WebMvcMetricsFilter",operation="get",tcl="AppClassLoader",value="Servlet3FilterMappingResolverFactory",le="0.2"} 3949.0 1654575981.216 # {span_id="48f29964fceff582",trace_id="c0a80355629ed36bcd8fb1c6c89dedfe"} 1.0 1654575979.751
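
The sample line above is a histogram bucket followed by an exemplar (the part after "#"), which is what links the metric to a concrete trace. The Python sketch below shows how the two parts can be pulled apart. It is a simplified illustration rather than a full OpenMetrics parser: the function name is hypothetical, and it assumes that " # " never appears inside a label value and that both the sample and the exemplar carry a value and a timestamp, as they do here.

```python
import re

# Matches one label pair such as le="0.2".
LABEL = re.compile(r'(\w+)="([^"]*)"')

# Copied from the sample line shown above.
SAMPLE_LINE = (
    'shadower_virtual_field_map_operation_seconds_bucket{holder="Filter:Factory",'
    'key="WebMvcMetricsFilter",operation="get",tcl="AppClassLoader",'
    'value="Servlet3FilterMappingResolverFactory",le="0.2"} 3949.0 1654575981.216 '
    '# {span_id="48f29964fceff582",trace_id="c0a80355629ed36bcd8fb1c6c89dedfe"} '
    '1.0 1654575979.751'
)


def parse_exemplar_line(line):
    """Split one sample line into its metric sample and its exemplar."""
    sample, _, exemplar = line.partition(" # ")
    name, _, rest = sample.partition("{")
    labels_text, _, values = rest.rpartition("}")
    value, timestamp = values.split()
    result = {
        "name": name,
        "labels": dict(LABEL.findall(labels_text)),
        "value": float(value),
        "timestamp": float(timestamp),
        "exemplar": None,
    }
    if exemplar:
        ex_labels, _, ex_values = exemplar.rpartition("}")
        ex_value, ex_timestamp = ex_values.split()
        result["exemplar"] = {
            "labels": dict(LABEL.findall(ex_labels)),
            "value": float(ex_value),
            "timestamp": float(ex_timestamp),
        }
    return result
```

For SAMPLE_LINE, the exemplar labels carry the trace_id and span_id of one request that was counted in the le="0.2" bucket, which is the handle needed to jump from a metric to the corresponding trace.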