Building an Observability System Traffic Distribution Diagram
This article explains how to design and implement a traffic distribution diagram for an observability system, covering current cloud‑native tooling, data standardization, transformation, traffic‑flow modeling, aggregation, storage with ClickHouse, and visualization techniques such as Sankey diagrams.
With the rapid growth of cloud‑native technologies, observability has become a core element of modern application development, deployment, and maintenance. This article outlines the motivation, methodology, and results of constructing a system‑service traffic distribution map.
Current Situation
The Cloud Native Computing Foundation (CNCF) supports many observability tools for metrics, logs, and tracing. In our company we use VictoriaMetrics, SkyWalking, and ELK for metrics, tracing, and log collection respectively, but data integration remains a challenge.
To help developers grasp the overall system state, we propose a visual traffic distribution map.
Basic Data Modeling
2.1 Data Standardization
Standardizing and processing data is essential in observability. The following naming conventions are used to keep consistency across systems:
| Name | Meaning | Description | Level |
|------|---------|-------------|-------|
| biz | Business line | Business system of the service | 1 |
| plt | Product line | Product system of the service | 2 |
| sid | System | Independent external service system | 3 |
| mdl | Module | Collection of services with the same function | 4 |
| srv | Service | Aggregation of identical service instances | 5 |
| sc | Cluster | Aggregation of identical service instances in a single availability zone | 6 |
| si | Instance | Single service instance | 7 |
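The seven-level hierarchy above can be sketched as a small data structure. This is a minimal illustration, not the team's actual implementation; the class, field names, and dot-joined label format are assumptions.

```python
# Minimal sketch of the seven-level naming hierarchy from the table above.
# The dot-joined label format is an assumption for illustration only.
from dataclasses import dataclass

@dataclass(frozen=True)
class ServiceLabel:
    biz: str  # business line (level 1)
    plt: str  # product line (level 2)
    sid: str  # system (level 3)
    mdl: str  # module (level 4)
    srv: str  # service (level 5)
    sc: str   # cluster in one availability zone (level 6)
    si: str   # single instance (level 7)

    def qualified_name(self, depth: int = 7) -> str:
        """Dot-joined label truncated to the first `depth` levels."""
        parts = (self.biz, self.plt, self.sid, self.mdl, self.srv, self.sc, self.si)
        return ".".join(parts[:depth])

label = ServiceLabel("retail", "pay", "order", "checkout", "order-api", "az1", "pod-0")
```

Truncating to a given depth lets the same label serve both coarse views (service level) and fine-grained ones (instance level).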
2.2 Data Transformation
CI/CD System Refactoring
Refactor the CI/CD pipeline to be non‑intrusive to user code while ensuring metric‑related code follows the standard.
Log Printing Refactoring
Log tags must comply with the above conventions, and logs should include traceId and tracing data for correlation.
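A log line that follows these conventions might look like the sketch below. The function name, field names, and JSON layout are assumptions; the point is that every record carries the standardized tags plus a traceId for correlation with tracing data.

```python
# Hedged sketch: a structured log record carrying standardized tags plus
# traceId so logs can be joined with tracing data. Field names are assumed.
import json
import time

def make_log_record(level: str, msg: str, trace_id: str, tags: dict) -> str:
    record = {
        "ts": int(time.time() * 1000),  # epoch milliseconds
        "level": level,
        "msg": msg,
        "traceId": trace_id,            # correlation key into tracing data
        **tags,                         # biz/plt/sid/mdl/srv/sc/si tags
    }
    return json.dumps(record, sort_keys=True)

line = make_log_record("INFO", "order created", "abc123",
                       {"biz": "retail", "srv": "order-api"})
```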
Tracing Data Collection
Since we use SkyWalking, we modify application start‑up tags so that SkyWalking attributes align with the standardized data.
2.3 System Traffic Modeling
The traffic model describes how requests enter through ingress points, flow between internal services, and finally reach middleware or downstream systems.
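This model is essentially a directed, weighted graph: nodes are ingress points, services, and middleware; edges carry request counts. A minimal sketch, with illustrative node names that are not from the source:

```python
# Sketch of the traffic model as a directed, weighted graph:
# ingress -> internal services -> middleware/downstream systems.
from collections import defaultdict

class TrafficGraph:
    def __init__(self):
        self.edges = defaultdict(float)  # (source, target) -> request count

    def add_flow(self, source: str, target: str, count: float) -> None:
        self.edges[(source, target)] += count

    def outbound(self, source: str) -> float:
        """Total traffic leaving a node."""
        return sum(c for (s, _), c in self.edges.items() if s == source)

g = TrafficGraph()
g.add_flow("ingress-gw", "order-api", 120)
g.add_flow("ingress-gw", "user-api", 80)
g.add_flow("order-api", "mysql", 110)
```

The same edge set later feeds both the aggregation step and the Sankey rendering.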
Data Processing and Storage
3.1 Data Processing
To reflect the overall system status, we perform the following aggregation steps:
Time‑based aggregation (1 min, 5 min, 10 min)
Handling data from different availability zones
Abstracting ingress‑point data
Identifying micro‑service and data information
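The time-based aggregation step can be sketched as bucketing raw per-request samples into fixed windows and summing. This is an illustration under assumed input shapes, not the production pipeline:

```python
# Sketch of time-based aggregation: raw samples are bucketed into fixed
# windows (e.g. 60 s for the 1-minute view) and summed per edge.
from collections import defaultdict

def aggregate(samples, window_sec):
    """samples: iterable of (epoch_sec, source, target, count) tuples."""
    buckets = defaultdict(float)
    for ts, source, target, count in samples:
        bucket_start = ts - ts % window_sec  # align to window boundary
        buckets[(bucket_start, source, target)] += count
    return dict(buckets)

samples = [(60, "gw", "api", 3), (90, "gw", "api", 2), (130, "gw", "api", 5)]
one_min = aggregate(samples, 60)
```

Running the same function with 300 s or 600 s windows yields the 5‑minute and 10‑minute views.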
3.2 Data Storage
ClickHouse is chosen as the storage engine for its column‑store efficiency, fast analytical capabilities, and low‑latency query performance.
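One possible ClickHouse schema for the aggregated traffic edges is sketched below. The table name, column set, and partitioning scheme are assumptions, not the team's actual DDL; a MergeTree engine ordered by time and edge supports the range scans the diagram needs.

```python
# One possible ClickHouse schema for aggregated traffic edges.
# Table name, columns, and partitioning are illustrative assumptions.
TRAFFIC_EDGE_DDL = """
CREATE TABLE IF NOT EXISTS traffic_edge_1m
(
    bucket_start DateTime,
    az           LowCardinality(String),  -- availability zone
    source       LowCardinality(String),
    target       LowCardinality(String),
    requests     UInt64,
    errors       UInt64
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(bucket_start)
ORDER BY (bucket_start, source, target)
"""
```

`LowCardinality(String)` suits the small, repetitive set of service names, and ordering by `(bucket_start, source, target)` keeps queries for a single time window fast.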
Data Visualization
4.1 Chart Selection
A Sankey diagram is used to display traffic distribution.
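Sankey renderers such as ECharts or Plotly expect a node list plus indexed links. A small sketch of converting aggregated edges into that shape, with illustrative edge data:

```python
# Sketch: convert aggregated (source, target) -> value edges into the
# node/link arrays a Sankey renderer expects. Edge data is illustrative.
def to_sankey(edges):
    """edges: dict of (source, target) -> traffic value."""
    names = sorted({n for pair in edges for n in pair})
    index = {name: i for i, name in enumerate(names)}
    links = [{"source": index[s], "target": index[t], "value": v}
             for (s, t), v in sorted(edges.items())]
    return {"nodes": [{"name": n} for n in names], "links": links}

sankey = to_sankey({("gw", "api"): 200, ("api", "db"): 150})
```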
4.2 Color Design
Bright colors indicate abnormal states
Neutral colors represent normal states
Additional legends are added for clarity
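The color rule above can be sketched as a simple threshold function. The error-rate thresholds and hex values here are assumptions for illustration:

```python
# Sketch of the color rule: bright colors for abnormal edges, neutral
# colors for normal ones. Thresholds and hex values are assumptions.
def edge_color(error_rate: float) -> str:
    if error_rate >= 0.05:
        return "#e74c3c"  # bright red: abnormal
    if error_rate >= 0.01:
        return "#f39c12"  # bright orange: degraded
    return "#b0b7c3"      # neutral grey: normal

color = edge_color(0.002)  # a healthy edge stays neutral
```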
4.3 Data Normalization
Traffic values are normalized into five levels to keep the diagram concise and clear.
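One way to map raw values onto five levels is a log scale relative to the largest flow, so small flows stay visible next to dominant ones. The bucketing rule below is an assumption, not the article's exact formula:

```python
# Sketch of normalizing traffic values onto five discrete levels so link
# widths stay readable. The log-scale bucketing is an assumption.
import math

def traffic_level(value: float, max_value: float) -> int:
    """Map a traffic value to level 1..5 relative to the diagram maximum."""
    if value <= 0 or max_value <= 0:
        return 1
    ratio = min(value / max_value, 1.0)
    # each factor of 10 below the maximum drops one level, floored at 1
    return max(1, 5 + math.floor(math.log10(ratio)))

level = traffic_level(10, 100)  # one order of magnitude below the max
```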
4.4 Focus Filtering
Exclude configuration‑center and registry‑center data
Simplify complex service structures
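Focus filtering can be sketched as dropping any edge that touches an excluded infrastructure node. The excluded name set is an assumption:

```python
# Sketch of focus filtering: drop edges touching config/registry centers
# so the diagram shows only business traffic. Names are assumptions.
EXCLUDED = {"config-center", "registry-center"}

def filter_edges(edges):
    """edges: dict of (source, target) -> value; keep business edges only."""
    return {(s, t): v for (s, t), v in edges.items()
            if s not in EXCLUDED and t not in EXCLUDED}

kept = filter_edges({("api", "db"): 10, ("api", "registry-center"): 99})
```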
4.5 Demo
The demo displays traffic distribution by the availability‑zone dimension.
Conclusion
The system‑service traffic distribution diagram provides a macro view of the overall system health, allowing developers and operators to quickly spot potential issues. The accumulated raw data can be further analyzed to continuously deliver business value.
Yum! Tech Team
How we support the digital platform of China's largest restaurant group—technology behind hundreds of millions of consumers and over 12,000 stores.