
Building an Observability System Traffic Distribution Diagram

This article explains how to design and implement a traffic distribution diagram for an observability system, covering current cloud‑native tooling, data standardization, transformation, traffic‑flow modeling, aggregation, storage with ClickHouse, and visualisation techniques such as Sankey diagrams.

Yum! Tech Team

With the rapid growth of cloud‑native technologies, observability has become a core element of modern application development, deployment, and maintenance. This article outlines the motivation, methodology, and results of constructing a system‑service traffic distribution map.

Current Situation

The Cloud Native Computing Foundation (CNCF) ecosystem offers many observability tools for metrics, logs, and tracing. In our company we use VictoriaMetrics for metrics, SkyWalking for tracing, and ELK for log collection, but integrating the data from these three systems remains a challenge.

To help developers grasp the overall system state, we propose a visual traffic distribution map.

Basic Data Modeling

2.1 Data Standardization

Standardizing and processing data is essential in observability. The following naming conventions are used to keep consistency across systems:

| Name | Meaning | Description | Level |
| --- | --- | --- | --- |
| biz | Business line | Business system of the service | 1 |
| plt | Product line | Product system of the service | 2 |
| sid | System | Independent external service system | 3 |
| mdl | Module | Collection of services with the same function | 4 |
| srv | Service | Aggregation of identical service instances | 5 |
| sc | Cluster | Aggregation of identical service instances in a single availability zone | 6 |
| si | Instance | Single service instance | 7 |
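As an illustration of how the seven fields nest, here is a minimal Python sketch. The class name `ServiceIdentity`, the helper `from_labels`, and the sample values are assumptions made for this example and are not part of our tooling; only the field names and their levels come from the table above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ServiceIdentity:
    """The seven standardized fields, declared in order from level 1 to level 7."""
    biz: str  # business line
    plt: str  # product line
    sid: str  # system
    mdl: str  # module
    srv: str  # service
    sc: str   # cluster (instances of a service in one availability zone)
    si: str   # single instance

    def prefix(self, level: int) -> tuple:
        """Return the identity truncated to the given level (1..7)."""
        names = [f.name for f in fields(self)]
        if not 1 <= level <= len(names):
            raise ValueError(f"level must be between 1 and {len(names)}")
        return tuple(getattr(self, name) for name in names[:level])


def from_labels(labels: dict) -> ServiceIdentity:
    """Build an identity from a label dict, rejecting missing or empty fields."""
    names = [f.name for f in fields(ServiceIdentity)]
    missing = [n for n in names if not labels.get(n)]
    if missing:
        raise ValueError(f"missing standardized labels: {missing}")
    # Labels outside the standard (if any) are ignored here.
    return ServiceIdentity(**{n: labels[n] for n in names})
```

Truncating with `prefix(5)` gives a service-level key, while `prefix(6)` keeps the availability-zone cluster, which is the kind of roll-up the later aggregation steps rely on.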

2.2 Data Transformation

CI/CD System Refactoring

Refactor the CI/CD pipeline so that the standardized tags are applied without intruding on user code, while ensuring that metric‑related code follows the standard.

Log Printing Refactoring

Log tags must comply with the above conventions, and each log entry should include the traceId so that logs can be correlated with tracing data.
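One way to meet this requirement in a Python service is a JSON log formatter that stamps every record with the standardized tags and the trace identifier. This is a sketch only: the class name `StandardJsonFormatter`, the JSON layout, and the convention of passing `traceId` through `extra=` are assumptions for illustration, not a description of our actual logging library.

```python
import json
import logging

STANDARD_TAGS = ("biz", "plt", "sid", "mdl", "srv", "sc", "si")


class StandardJsonFormatter(logging.Formatter):
    """Emit each record as JSON carrying the standardized tags and the traceId."""

    def __init__(self, tags: dict):
        super().__init__()
        # A KeyError here means the service was started without a required tag.
        self._tags = {name: tags[name] for name in STANDARD_TAGS}

    def format(self, record: logging.LogRecord) -> str:
        payload = dict(self._tags)
        payload.update(
            level=record.levelname,
            message=record.getMessage(),
            # traceId is expected to be attached via logger.info(..., extra={"traceId": ...}).
            traceId=getattr(record, "traceId", ""),
        )
        return json.dumps(payload, ensure_ascii=False)
```

A handler configured with this formatter would then be used as `logger.info("order created", extra={"traceId": trace_id})`, where `trace_id` comes from whatever tracing context the service already has.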

Tracing Data Collection

Since we use SkyWalking, we modify the tags applied at application start‑up so that SkyWalking's attributes align with the standardized fields.
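A hedged sketch of what such a start-up step could look like: it derives agent settings from the standardized labels and returns them as environment variables. It assumes the SkyWalking Java agent with its default `agent.config`, which (to our understanding) reads the service name from `SW_AGENT_NAME` and the instance name from `SW_AGENT_INSTANCE_NAME`; verify both against your agent version. The function name `skywalking_env` and the dot-joined naming scheme are hypothetical.

```python
SERVICE_LEVEL_FIELDS = ("biz", "plt", "sid", "mdl", "srv")


def skywalking_env(labels: dict) -> dict:
    """Map standardized labels to environment variables for the agent process."""
    # Hypothetical naming scheme: join levels 1..5 so the service name is globally unique.
    service_name = ".".join(labels[name] for name in SERVICE_LEVEL_FIELDS)
    return {
        "SW_AGENT_NAME": service_name,            # assumed to map to agent.service_name
        "SW_AGENT_INSTANCE_NAME": labels["si"],   # assumed to map to agent.instance_name
    }
```

A deployment script could merge the returned dict into the process environment before launching the application, which keeps the change outside user code.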

System Traffic Modeling

The traffic model describes how requests enter through ingress points, flow between internal services, and finally reach middleware or downstream systems.
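The model can be pictured as a directed graph whose nodes are typed. The sketch below is one possible encoding, not our production schema: the names `NodeKind`, `Node`, `Edge`, and `validate` are assumptions, and the two validation rules are our reading of the sentence above (nothing flows into an ingress point, and middleware or downstream systems are terminal).

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    INGRESS = "ingress"        # where requests enter the system
    SERVICE = "service"        # internal service
    MIDDLEWARE = "middleware"  # database, cache, message queue, ...
    DOWNSTREAM = "downstream"  # external system called by our services


@dataclass(frozen=True)
class Node:
    name: str
    kind: NodeKind


@dataclass(frozen=True)
class Edge:
    source: Node
    target: Node
    requests: int


def validate(edge: Edge) -> None:
    """Raise ValueError if the edge contradicts the assumed flow direction."""
    if edge.target.kind is NodeKind.INGRESS:
        raise ValueError(f"{edge.target.name}: traffic cannot flow into an ingress point")
    if edge.source.kind in (NodeKind.MIDDLEWARE, NodeKind.DOWNSTREAM):
        raise ValueError(f"{edge.source.name}: terminal nodes cannot originate traffic")
    if edge.requests < 0:
        raise ValueError("request count cannot be negative")
```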

Data Processing and Storage

3.1 Data Processing

To reflect the overall system status, we perform the following aggregation steps:

Time‑based aggregation (1 min, 5 min, 10 min)

Handling data from different availability zones

Abstracting ingress‑point data

Identifying micro‑service and data information
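The first two steps can be sketched as a single pass over raw call records. This is a simplified illustration under stated assumptions: the `Call` record shape, the function name `aggregate`, and the use of `"*"` to mean "all availability zones merged" are hypothetical, and timestamps are taken to be Unix seconds.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Call(NamedTuple):
    ts: int        # Unix timestamp in seconds
    az: str        # availability zone where the call was observed
    source: str
    target: str
    requests: int


def aggregate(calls: Iterable[Call], window_seconds: int, merge_zones: bool = False) -> dict:
    """Sum requests per (bucket_start, az, source, target) for one window size."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    totals = defaultdict(int)
    for call in calls:
        bucket_start = call.ts - call.ts % window_seconds  # floor to the window boundary
        az = "*" if merge_zones else call.az
        totals[(bucket_start, az, call.source, call.target)] += call.requests
    return dict(totals)
```

Calling it with `window_seconds` of 60, 300, and 600 yields the 1 min, 5 min, and 10 min views; `merge_zones=True` collapses zones when only the system-wide picture is needed.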

3.2 Data Storage

ClickHouse is chosen as the storage engine for its column‑store efficiency, fast analytical capabilities, and low‑latency query performance.
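For illustration, a table for the aggregated edges might look like the following. The table name, column names, engine choice, and partitioning are all assumptions rather than our real schema; the SQL is held in Python strings so it can be passed to whichever ClickHouse client is in use.

```python
TABLE = "traffic_edges"  # hypothetical table name

# SummingMergeTree sums the numeric non-key column (requests) for rows that
# share the same ORDER BY key when parts are merged in the background.
CREATE_TABLE_SQL = f"""
CREATE TABLE IF NOT EXISTS {TABLE} (
    bucket_start   DateTime,
    window_seconds UInt16,
    az             LowCardinality(String),
    source         String,
    target         String,
    requests       UInt64
)
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMMDD(bucket_start)
ORDER BY (window_seconds, bucket_start, az, source, target)
"""


def top_edges_sql(window_seconds: int, limit: int = 20) -> str:
    """Build a query for the busiest edges over the last hour."""
    if window_seconds <= 0 or limit <= 0:
        raise ValueError("window_seconds and limit must be positive")
    return (
        f"SELECT source, target, sum(requests) AS total_requests\n"
        f"FROM {TABLE}\n"
        f"WHERE window_seconds = {int(window_seconds)}\n"
        f"  AND bucket_start >= now() - INTERVAL 1 HOUR\n"
        f"GROUP BY source, target\n"
        f"ORDER BY total_requests DESC\n"
        f"LIMIT {int(limit)}"
    )
```

Because merges happen eventually rather than at insert time, the query still aggregates with `sum(...)` and `GROUP BY` instead of assuming one row per key.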

Data Visualization

4.1 Chart Selection

A Sankey diagram is used to display traffic distribution.
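A Sankey diagram needs its input as a list of nodes plus weighted links between them. The sketch below converts aggregated edges into that shape. The function name `to_sankey` is hypothetical, and the output layout (labels plus index-based `source`/`target`/`value` lists) mirrors the structure accepted by Plotly's Sankey trace; this article does not depend on any particular charting library, so treat the layout as an assumption.

```python
def to_sankey(edges: dict) -> dict:
    """Convert {(source, target): value} into index-based node and link lists."""
    labels = []
    index = {}

    def node_id(name: str) -> int:
        # Assign node indices in order of first appearance.
        if name not in index:
            index[name] = len(labels)
            labels.append(name)
        return index[name]

    sources, targets, values = [], [], []
    for (src, dst), value in edges.items():
        sources.append(node_id(src))
        targets.append(node_id(dst))
        values.append(value)

    return {
        "node": {"label": labels},
        "link": {"source": sources, "target": targets, "value": values},
    }
```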

4.2 Color Design

Bright colors indicate abnormal states

Neutral colors represent normal states

Additional legends are added for clarity

4.3 Data Normalization

Traffic values are normalized into five levels to keep the diagram concise and clear.
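The exact mapping to five levels is not specified here; one plausible approach, shown purely as an assumption, is logarithmic bucketing, so that a few very busy links do not flatten every other link to the thinnest level. The function name `traffic_level` is hypothetical.

```python
import math


def traffic_level(value: float, max_value: float, levels: int = 5) -> int:
    """Map a traffic value to an integer level in 1..levels on a log scale."""
    if value <= 0 or max_value <= 0:
        return 1
    # log1p keeps small values well-behaved; the ratio is 1.0 when value == max_value.
    ratio = math.log1p(value) / math.log1p(max_value)
    return min(levels, max(1, math.ceil(ratio * levels)))
```

The level can then drive link width in the diagram, while the raw value stays available for tooltips.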

4.4 Focus Filtering

Exclude configuration‑center and registry‑center data

Simplify complex service structures
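Both filtering rules can be applied to the edge set before it is drawn. In this sketch the node names in `EXCLUDED_NODES`, the `collapse` mapping, and the function name `focus` are hypothetical; the idea is simply to drop edges that touch infrastructure nodes and to merge several fine-grained services into one displayed node.

```python
EXCLUDED_NODES = frozenset({"config-center", "registry-center"})  # hypothetical names


def focus(edges: dict, excluded=EXCLUDED_NODES, collapse=None) -> dict:
    """Drop edges touching excluded nodes and merge nodes listed in `collapse`."""
    collapse = collapse or {}
    result = {}
    for (src, dst), value in edges.items():
        if src in excluded or dst in excluded:
            continue
        src = collapse.get(src, src)
        dst = collapse.get(dst, dst)
        if src == dst:
            continue  # call inside a collapsed group, not shown in the diagram
        result[(src, dst)] = result.get((src, dst), 0) + value
    return result
```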

4.5 Demo

Display by availability‑zone dimension:

[Figure: traffic distribution Sankey diagram, shown by availability zone]

Conclusion

The system‑service traffic distribution diagram provides a macro view of the overall system health, allowing developers and operators to quickly spot potential issues. The accumulated raw data can be further analyzed to continuously deliver business value.
