
Microservice Architecture Visualization: Practices and Benefits at Alibaba

The article explains why visualizing microservice architectures is essential for high availability, describes common and advanced visualization methods, discusses how to make visualization effective, handle architectural changes, identify key components, and leverage visual data for operations and reliability improvements.

High Availability Architecture

Dec 13, 2018


Introduction: After adopting a microservice architecture, understanding the relationships and dependencies between services becomes challenging. The architecture that actually results from the migration often differs significantly from the expected model, so architects and operators need an accurate picture of the resource instances and how they interact. Alibaba engineers share their experience in microservice visualization.

Author: Yan Mingming (nickname: Xin Yuan), Senior Development Engineer in Alibaba Group’s Security Production High‑Availability Architecture Team and R&D lead of Alibaba Cloud Application High‑Availability Service (AHAS). He previously founded a cross‑border e‑commerce system and joined Taobao in 2011, focusing on Chaos Engineering.

Why Architecture Visualization Is Needed

As enterprises migrate to microservices, system complexity and frequent changes make it hard for architects or operators to remember all resource instances and their interactions. Dynamic evolution can also introduce weak dependencies, capacity bottlenecks, or excessive coupling, creating serious stability risks. Visualizing the architecture helps identify problems and build highly available systems.

(Diagram from Daniel Woods on microservices)

Benefits of architecture visualization include:

Defining System Boundaries: A good diagram clearly shows components and core call relationships, reflecting both system and business domain boundaries.

Identifying Architectural Issues: Combined with high‑availability guidelines, visual diagrams help assess risks in areas such as disaster recovery, isolation, and self‑healing. Tools like "Eagle Eye" have greatly improved developers' troubleshooting efficiency.

Improving System Availability: When a fault occurs, developers can use dependency maps to locate its source quickly, which dramatically reduces mean time to repair (MTTR). The diagram also reveals which dependencies are strong and which are weak, so weak dependencies can be degraded during peak traffic or exercised in fault‑injection tests.
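As a rough illustration of how a dependency map can narrow down a fault, the sketch below uses a simple heuristic that is our assumption, not the method described in the article: among the services currently reporting errors, the likely source is one whose own direct dependencies are all healthy. The function name `likely_sources` and the sample services are hypothetical.

```python
def likely_sources(graph, unhealthy):
    """Return unhealthy services none of whose direct dependencies are unhealthy.

    `graph` maps each service to the set of services it calls.
    `unhealthy` is the set of services currently reporting errors.
    """
    return sorted(
        service
        for service in unhealthy
        if not (graph.get(service, set()) & unhealthy)
    )


# Hypothetical call chain: web -> order -> mysql
GRAPH = {"web": {"order"}, "order": {"mysql"}, "mysql": set()}

# All three services report errors, but only mysql has no failing dependency.
candidates = likely_sources(GRAPH, {"web", "order", "mysql"})
```

A real system would also weigh timing, error types, and indirect dependencies. The sketch only shows why having the dependency map available shortens the search.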

Common Practices for Architecture Visualization

Traditional static PPT diagrams quickly become outdated. Manual updates are error‑prone, so teams often use automated methods. One common approach is the "instrumentation‑based perception" method, which relies on data‑point collection (distributed tracing, APM) to generate visualizations.

This method has several drawbacks:

Language dependence: different languages require different instrumentation packages.

Maintenance difficulty: core‑class detection must be updated when components change.

Limited extensibility: client‑side detection cannot recognize new components until the client is updated.

Scalability limits: identification performed on the client cannot draw on the big‑data analysis that server‑side recognition can use for more accurate identification.

Another approach, "boundary‑less perception", is language‑agnostic. It collects basic metadata from processes, containers, monitoring, and network data on the host, then builds the architecture graph on the server side.

What Else Can Architecture Visualization Do?

To reduce visualization cost, we use a non‑intrusive method that gathers process and network call data to construct service relationships.
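A minimal sketch of this idea, assuming each host agent reports two kinds of records: the processes it sees and the outbound connections those processes hold. The record shapes (`Process`, `Connection`) and the function `build_service_graph` are hypothetical and chosen only for illustration. The article does not describe the actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    host: str         # IP address of the host the process runs on
    pid: int
    name: str         # logical service name, e.g. "order"
    listen_port: int  # port the process accepts connections on


@dataclass(frozen=True)
class Connection:
    src_host: str  # host that opened the connection
    src_pid: int   # process that opened it
    dst_host: str  # remote IP
    dst_port: int  # remote port


def build_service_graph(processes, connections):
    """Return {caller name: set of callee names} from host-level samples."""
    by_pid = {(p.host, p.pid): p.name for p in processes}
    by_endpoint = {(p.host, p.listen_port): p.name for p in processes}

    graph = defaultdict(set)
    for c in connections:
        caller = by_pid.get((c.src_host, c.src_pid))
        if caller is None:
            continue  # connection from a process that was not sampled
        # Unknown endpoints are kept as "ip:port" so external dependencies stay visible.
        callee = by_endpoint.get((c.dst_host, c.dst_port), f"{c.dst_host}:{c.dst_port}")
        if callee != caller:
            graph[caller].add(callee)
    return dict(graph)
```

Because the join happens on the server side over plain process and socket data, nothing in the sketch depends on the language the services are written in.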

How to Make Visualization More Effective?

Effectiveness depends on the viewer's role and perspective. Developers need application‑level views, while architects and managers need system‑level views. We therefore provide multi‑layer diagrams: a process layer, a container layer, and a host layer, with region or service layers planned as future extensions.

(The diagram shows a three‑layer visualization of an Alibaba Cloud ECS instance.)
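Separately from the product's own diagrams, the layering idea can be sketched as a roll‑up: start from process‑level call edges and aggregate them by the container or host each process belongs to. The placement table, edge list, and `roll_up` function below are hypothetical examples, not the product's data model.

```python
from collections import Counter

# Hypothetical placement data: process -> (container, host).
PLACEMENT = {
    "web-1": ("web-container", "host-a"),
    "order-1": ("order-container", "host-a"),
    "mysql-1": ("db-container", "host-b"),
}

# Process-layer call edges as (caller, callee).
PROCESS_EDGES = [
    ("web-1", "order-1"),
    ("order-1", "mysql-1"),
    ("web-1", "mysql-1"),
]


def roll_up(edges, placement, layer):
    """Aggregate process-level edges to `layer`, counting how many edges merge."""
    index = {"container": 0, "host": 1}[layer]
    rolled = Counter()
    for caller, callee in edges:
        src = placement[caller][index]
        dst = placement[callee][index]
        if src != dst:  # calls that stay inside one container or host vanish at that layer
            rolled[(src, dst)] += 1
    return dict(rolled)


host_view = roll_up(PROCESS_EDGES, PLACEMENT, "host")
container_view = roll_up(PROCESS_EDGES, PLACEMENT, "container")
```

In this sample the host view collapses to a single edge from `host-a` to `host-b`, while the container view still shows all three calls. That difference is the reason for offering each audience its own layer.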

Handling Architectural Variability

No system remains static; architectures evolve with version releases. Our visualization product automatically refreshes diagrams over time and supports historical snapshots, so users can compare the architecture before and after a release and check that it still complies with high‑availability principles.
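Comparing two snapshots can be thought of as a set difference over call edges. The sketch below assumes a snapshot is stored as a mapping from each service to the set of services it calls. That storage format and the name `diff_snapshots` are assumptions made for illustration.

```python
def diff_snapshots(before, after):
    """Compare two architecture snapshots given as {caller: set of callees}."""

    def edges(snapshot):
        return {(src, dst) for src, dsts in snapshot.items() for dst in dsts}

    old, new = edges(before), edges(after)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# Hypothetical snapshots taken before and after a release.
BEFORE = {"web": {"order"}, "order": {"mysql"}}
AFTER = {"web": {"order", "redis"}, "order": {"mysql"}}

changes = diff_snapshots(BEFORE, AFTER)
```

Here the release introduced a new dependency from `web` to `redis`. A reviewer could then ask whether that dependency is strong or weak and whether a fallback exists.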

Core of Architecture Visualization

The core of visualization is deciding which elements and relationships are meaningful enough to show. A good product filters out irrelevant information and presents only the valuable views, which matters most for complex microservice call chains. Doing so requires accurately identifying processes and network calls and judging their significance.

Element Identification in Visualization

We categorize elements into three groups: own application services, external resource dependencies, and host information. External dependencies include other applications, middleware, and storage services. For cloud‑native apps, recognizing cloud services is especially important.

We have already implemented identification for 21 common third‑party components such as Redis, MySQL, and Tomcat, and the library continues to expand.
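The article does not say how the identification library recognizes a component. One plausible approach, shown here purely as an assumption, is to match a process's command line against known markers and use the component's default port as supporting evidence. The rule table and the `identify_component` function are hypothetical.

```python
# Hypothetical signature rules: (component, command-line marker, default port).
# These are illustrative and not the rules used by the real library.
SIGNATURES = [
    ("Redis", "redis-server", 6379),
    ("MySQL", "mysqld", 3306),
    ("Tomcat", "org.apache.catalina.startup.Bootstrap", 8080),
]


def identify_component(cmdline, listen_ports):
    """Return (component, confidence) for a process, or (None, None) if unrecognized."""
    for component, marker, default_port in SIGNATURES:
        if marker in cmdline:
            # A matching default port raises confidence but is not required,
            # since services are often moved to non-default ports.
            confidence = "high" if default_port in listen_ports else "medium"
            return component, confidence
    return None, None
```

For example, `identify_component("/usr/bin/redis-server *:6379", [6379])` yields `("Redis", "high")`, while a process with no matching marker is left unclassified so it can be reviewed rather than mislabeled.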

(The diagram shows node request flow and basic monitoring information.)

(The diagram displays part of the process information on the host.)

What Can Be Done After Visualization?

Visualization is not the goal but a means to achieve high availability. With the collected architecture data, we can automatically match components (e.g., MySQL, Redis, MQ) against a fault library, discover potential failures, and conduct fault‑injection drills. For high‑load Java applications, pairing the architecture data with rate‑limiting components can improve availability further.

(How architecture perception assists system rate‑limiting configuration.)
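The fault‑library matching mentioned above can be sketched as a lookup from identified component types to the fault scenarios worth rehearsing. The library contents and the `plan_drills` function are hypothetical; the real fault library is not described in the article.

```python
# Hypothetical fault library: component type -> fault scenarios to rehearse.
FAULT_LIBRARY = {
    "MySQL": ["connection pool exhausted", "slow queries", "primary unavailable"],
    "Redis": ["cache node down", "high latency"],
    "MQ": ["message backlog", "broker unavailable"],
}


def plan_drills(architecture):
    """Map each service to (component, fault) pairs for the components it uses.

    `architecture` is {service: [component type, ...]}, as produced by
    element identification.
    """
    plan = {}
    for service, components in architecture.items():
        scenarios = [
            (component, fault)
            for component in components
            for fault in FAULT_LIBRARY.get(component, [])
        ]
        if scenarios:
            plan[service] = scenarios
    return plan
```

Services whose dependencies have no entry in the library are simply left out of the plan, which also highlights where the library needs to grow.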

Visualization provides an efficient window for operations and control. By enriching cloud‑native data, integrating monitoring and container services, and mining that data for intelligent uses, we aim to turn architecture data into core enterprise value and a tool for business stability.

