
Overwatch: A Distributed Real‑Time RPC Monitoring Platform for System Observability

The article describes Overwatch, a distributed monitoring system developed by Dada‑JD Daojia that collects, aggregates, and visualizes RPC traffic in real time using consumer‑side agents, Kafka, Storm, and a Node.js CQRS architecture, enabling engineers to quickly locate and resolve service failures.

JD Tech

Background: Dada‑JD Daojia's backend consists of numerous microservices generating massive RPC traffic, making fault isolation difficult.

To address this, the Overwatch monitoring platform was developed to collect, aggregate, and visualize RPC data in real time.

Agents embedded in consumer services collect RPC metrics and publish them to Kafka. Storm aggregates the resulting streams, and the Node.js Overwatch service stores the aggregated results and serves them to clients.
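To make the aggregation step concrete, here is a minimal sketch of grouping RPC records into per-minute counts for each consumer/provider pair. The `RpcRecord` fields and the `aggregate_per_minute` helper are hypothetical names chosen for illustration. The article does not describe the actual message schema or the Storm topology, and a real stream processor would update counts incrementally rather than over an in-memory list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RpcRecord:
    """One RPC call as reported by a consumer-side agent (assumed schema)."""
    consumer: str
    provider: str
    timestamp: float  # seconds since the epoch
    success: bool
    latency_ms: float


def aggregate_per_minute(records):
    """Count total and failed calls per (consumer, provider, minute) bucket.

    Stands in for the stream aggregation stage; this is a batch
    approximation, not the platform's real topology.
    """
    buckets = defaultdict(lambda: {"total": 0, "failed": 0})
    for record in records:
        minute = int(record.timestamp // 60)
        key = (record.consumer, record.provider, minute)
        buckets[key]["total"] += 1
        if not record.success:
            buckets[key]["failed"] += 1
    return dict(buckets)


records = [
    RpcRecord("order", "inventory", 0.0, True, 12.0),
    RpcRecord("order", "inventory", 30.0, False, 250.0),
    RpcRecord("order", "inventory", 61.0, True, 9.0),
]
summary = aggregate_per_minute(records)
```

Keying on the consumer/provider pair keeps one counter per edge of the service graph, which is the shape the visualization needs.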

Two monitoring approaches were considered: provider‑side log monitoring and consumer‑side instrumentation. The latter was chosen because it records each call's outcome as the caller actually observes it, which gives an objective basis for error detection.
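As an illustration of consumer‑side instrumentation, the sketch below wraps a client call and records its outcome and latency whether it returns or raises. The `monitored` decorator, the record fields, and the list used as a sink are all assumptions made for this example. In the platform described here the agent sends metrics via Kafka, and its real interface is not shown in the article.

```python
import functools
import time


def monitored(consumer, provider, sink):
    """Wrap an RPC client function and append one record per call to `sink`.

    `sink` is any object with an `append` method; a plain list stands in
    for a message producer in this sketch.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            success = False
            try:
                result = func(*args, **kwargs)
                success = True
                return result
            finally:
                # Runs on both return and exception, so failures such as
                # timeouts are recorded before the error propagates.
                sink.append({
                    "consumer": consumer,
                    "provider": provider,
                    "success": success,
                    "latency_ms": (time.monotonic() - start) * 1000.0,
                })
        return wrapper
    return decorator


sink = []


@monitored("order", "inventory", sink)
def reserve_stock(sku, fail=False):
    """Hypothetical RPC client call used only to exercise the wrapper."""
    if fail:
        raise TimeoutError("inventory did not respond")
    return f"reserved {sku}"
```

Because the record is written in a `finally` clause, the caller still sees the original exception while the monitoring data captures the failure.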

Visualization uses directed graphs where nodes represent services and concentric circles encode recent success rates (1 min, 5 min, 15 min) with color gradients, while edge colors indicate inter‑service call health.
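The ring encoding can be sketched as two small functions: one computes a success rate over a trailing window, the other maps that rate to a color. The linear red‑to‑green gradient, the grey fallback for "no data", and the function names are assumptions for illustration. The article states only that concentric circles encode the 1, 5, and 15 minute success rates with color gradients.

```python
def success_rate(samples, now, window_seconds):
    """Return the fraction of successful calls in the trailing window.

    `samples` is an iterable of (timestamp, success) pairs. Returns None
    when the window contains no calls.
    """
    recent = [ok for ts, ok in samples if now - window_seconds < ts <= now]
    if not recent:
        return None
    return sum(recent) / len(recent)


def rate_to_color(rate):
    """Map a success rate in [0, 1] to a hex color (assumed gradient)."""
    if rate is None:
        return "#808080"  # grey: no traffic in this window
    rate = max(0.0, min(1.0, rate))
    red = round(255 * (1.0 - rate))
    green = round(255 * rate)
    return f"#{red:02x}{green:02x}00"


def node_rings(samples, now):
    """Colors for the 1, 5, and 15 minute rings of one service node."""
    return {
        f"{minutes}m": rate_to_color(success_rate(samples, now, minutes * 60))
        for minutes in (1, 5, 15)
    }


samples = [(990, False), (950, True), (800, True), (200, True)]
rings = node_rings(samples, now=1000)
```

In this sample the most recent failure weighs heavily on the 1 minute ring and much less on the 15 minute ring, which is how the nested rings distinguish a fresh incident from a recovering one.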

To support low‑latency queries, Overwatch adopts a CQRS architecture separating command (data ingestion) and query (read) models.
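A minimal sketch of the CQRS split, assuming an in‑process projection: the command side accepts aggregated call counts and forwards them to a read model that is already shaped for the query the UI needs. The class names `CommandSide` and `EdgeHealthView` and the event fields are hypothetical. The article does not describe Overwatch's storage or how its two models are kept in sync.

```python
from collections import defaultdict


class EdgeHealthView:
    """Query side: a precomputed per-edge view that is cheap to read."""

    def __init__(self):
        self._edges = defaultdict(lambda: {"total": 0, "failed": 0})

    def apply(self, event):
        """Fold one ingestion event into the read model."""
        edge = self._edges[(event["consumer"], event["provider"])]
        edge["total"] += event["total"]
        edge["failed"] += event["failed"]

    def success_rate(self, consumer, provider):
        """Return the edge's success rate, or None if it has no traffic."""
        edge = self._edges.get((consumer, provider))
        if edge is None or edge["total"] == 0:
            return None
        return (edge["total"] - edge["failed"]) / edge["total"]


class CommandSide:
    """Command side: accepts writes and keeps an append-only event log."""

    def __init__(self, read_model):
        self._log = []
        self._read_model = read_model

    def record_calls(self, consumer, provider, total, failed):
        event = {
            "consumer": consumer,
            "provider": provider,
            "total": total,
            "failed": failed,
        }
        self._log.append(event)
        # Synchronous projection for simplicity; a real system might
        # update the read model asynchronously.
        self._read_model.apply(event)


view = EdgeHealthView()
commands = CommandSide(view)
commands.record_calls("order", "inventory", total=8, failed=2)
commands.record_calls("order", "inventory", total=2, failed=0)
```

Queries never touch the event log here; they read only the projected view, which is what keeps read latency independent of ingestion volume.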

The platform has been deployed successfully, handling peak loads of 4 million orders per day, and continues to evolve with support for additional data sources, RPC protocols, and fine‑grained metrics.
