Evolution of Application Monitoring at 得物: From CAT to OpenTelemetry
After rebuilding its transaction system in 2020, 得物 progressed from the basic CAT monitoring tool to OpenTracing with Prometheus, and finally adopted OpenTelemetry to unify metrics, traces, and logs via a custom vmagent‑Kafka‑Flink pipeline, dynamic sampling, and extensible javaagents, positioning the platform for a performance‑analysis‑driven future.
In March 2020, 得物 completed a three‑month rebuild of its transaction system (the 五彩石 project), moving to a microservices architecture and highlighting the need for robust application monitoring.
The first monitoring stage used the open‑source CAT system, which provided basic performance reports, health checks, and real‑time alerts but lacked end‑to‑end trace visualization and customizable dashboards.
To overcome CAT’s limits, the second stage adopted OpenTracing for distributed tracing combined with Prometheus for metrics, introducing Endpoint‑tagged metrics via Micrometer and integrating through Spring Boot Starter; however, version skew and tight coupling hindered rapid iteration.
The third stage embraced OpenTelemetry, unifying metrics, traces, and logs. Exemplar‑enabled metrics were collected through a customized vmagent → Kafka → Flink → ClickHouse pipeline, enabling 100% trace sampling. A control plane provided dynamic sampling, Arthas integration, and runtime toggles, while a custom javaagent launcher (Promise) reduced dependency conflicts. Extensions added RPC IDs, custom TraceID generation, async flags, and profiling support via Arthas and Pyroscope.
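The custom TraceID generation mentioned above can be sketched from the sample ID given in this article (c0a8006b62583a724327993efd1865d8), which is described as 8 characters of IP, 8 of timestamp, and 16 of random data. The following is a minimal sketch, not 得物's actual implementation. It assumes that the IP segment is a hex-encoded IPv4 address and that the timestamp segment is hex-encoded epoch seconds; the class and method names (TraceIdSketch, generate, ipOf, epochSecondsOf) are hypothetical.

```java
import java.security.SecureRandom;

// Hypothetical helper, not part of OpenTelemetry or of the 得物 agent.
final class TraceIdSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Encodes a dotted IPv4 address as 8 lowercase hex characters. */
    static String ipToHex(String ipv4) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ipv4);
        }
        StringBuilder sb = new StringBuilder(8);
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            sb.append(String.format("%02x", octet));
        }
        return sb.toString();
    }

    /** Builds a 32-character ID: IP (8) + epoch seconds (8) + random (16). */
    static String generate(String ipv4, long epochSeconds, long random) {
        // Assumption: the timestamp segment holds the low 32 bits of epoch seconds.
        return ipToHex(ipv4)
                + String.format("%08x", epochSeconds & 0xFFFFFFFFL)
                + String.format("%016x", random);
    }

    /** Convenience overload using the current time and a secure random value. */
    static String generate(String ipv4) {
        return generate(ipv4, System.currentTimeMillis() / 1000L, RANDOM.nextLong());
    }

    /** Recovers the dotted IPv4 address from the first 8 characters. */
    static String ipOf(String traceId) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 8; i += 2) {
            if (i > 0) {
                sb.append('.');
            }
            sb.append(Integer.parseInt(traceId.substring(i, i + 2), 16));
        }
        return sb.toString();
    }

    /** Recovers the epoch-seconds timestamp from characters 8 to 15. */
    static long epochSecondsOf(String traceId) {
        return Long.parseLong(traceId.substring(8, 16), 16);
    }
}
```

Under these assumptions, the sample ID decodes to the address 192.168.0.107 and a timestamp in April 2022. Because the result is 32 lowercase hex characters, it has the same shape as a standard OpenTelemetry trace ID, while letting an operator read the originating host and the approximate creation time directly from the ID.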
Looking ahead, 得物 plans to fuse tracing with diagnostic tools to enter a performance‑analysis‑driven era.
[plugins]
enables = shadower,arthas,pyroscope,chaos-agent
[shadower]
artifact_key = /javaagent/shadower-%s-final.jar
boot_class = com.shizhuang.apm.javaagent.bootstrap.AgentBootStrap
classloader = system
default_version = 115.16
[arthas]
artifact_key = /tools/arthas-bin.zip
;boot_class = com.taobao.arthas.agent334.AgentBootstrap
boot_artifact = arthas-agent.jar
premain_args = .attachments/arthas/arthas-core.jar;;ip=127.0.0.1
[pyroscope]
artifact_key = /tools/pyroscope.jar
[chaos-agent]
artifact_key = /javaagent/chaos-agent.jar
boot_class = com.chaos.platform.agent.DewuChaosAgentBootstrap
classloader = system
apply_envs = dev,test,local,pre,xdw
32-character custom TraceID: c0a8006b62583a724327993efd1865d8
c0a8006b  62583a72  4327993efd1865d8
|         |         |
|         |         low 16 characters (random)
|         middle 8 characters (timestamp)
high 8 characters (IP)
shadower_virtual_field_map_operation_seconds_bucket{holder="Filter:Factory",key="WebMvcMetricsFilter",operation="get",tcl="AppClassLoader",value="Servlet3FilterMappingResolverFactory",le="0.2"} 3949.0 1654575981.216 # {span_id="48f29964fceff582",trace_id="c0a80355629ed36bcd8fb1c6c89dedfe"} 1.0 1654575979.751
DeWu Technology
A platform for sharing and discussing tech knowledge, guiding you toward the cloud of technology.