
Layered Architecture of Microservice Monitoring and Key Practices

This article explains the layered architecture of microservice monitoring, detailing five monitoring levels—from infrastructure to end-user experience—along with essential monitoring points such as logs, metrics, tracing, alerts, and health checks, and presents a typical monitoring stack using agents, Kafka, ELK, and InfluxDB.

Architect

Jan 2, 2021

Monitoring is a crucial part of microservice governance; a complete monitoring system directly affects service quality, reliability, and stability.

A well‑designed microservice monitoring system can be divided into five hierarchical layers:

1. Infrastructure Monitoring

This layer is usually handled by operations staff and covers the underlying network and its hardware, such as switches and routers. Core metrics like traffic volume, packet loss, error rates, and connection counts are monitored here, because instability at this level propagates to every service running above it.

2. System Monitoring

This layer includes physical machines, virtual machines, and operating systems. Typical metrics are CPU usage, memory usage, disk I/O, and network bandwidth.
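As an illustration of how such metrics are usually derived, the sketch below computes CPU and memory usage from raw counters. It assumes the operating system reports cumulative busy and idle tick counts, so utilization is the busy share of the ticks elapsed between two samples. The names `CpuSample`, `cpu_usage_percent`, and `memory_usage_percent` are hypothetical and are not taken from any particular monitoring agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuSample:
    """Cumulative CPU counters in ticks (hypothetical shape, for illustration)."""
    busy: int  # ticks spent doing work since boot
    idle: int  # ticks spent idle since boot


def cpu_usage_percent(before: CpuSample, after: CpuSample) -> float:
    """Return CPU utilization between two samples as a percentage."""
    busy_delta = after.busy - before.busy
    idle_delta = after.idle - before.idle
    total = busy_delta + idle_delta
    if total <= 0:
        # No time elapsed, or the counters were reset between samples.
        return 0.0
    return 100.0 * busy_delta / total


def memory_usage_percent(total_bytes: int, available_bytes: int) -> float:
    """Return the share of memory in use as a percentage."""
    if total_bytes <= 0:
        return 0.0
    return 100.0 * (total_bytes - available_bytes) / total_bytes
```

Working from the difference between two samples, rather than from a single reading, is what makes the figure describe a time window instead of the average since boot.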

3. Application Monitoring

This layer is closely tied to the services themselves. For each service it tracks per‑URL performance, request volume (QPS), response time, and error rate, along with dependency‑level signals such as slow SQL queries and cache hit rates.
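A minimal sketch of how the per-service figures above could be computed from a window of request records. The `Request` record, the `summarize` function, the choice of counting HTTP 5xx responses as errors, and the nearest-rank percentile method are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Request:
    """One handled request (hypothetical record shape)."""
    latency_ms: float
    status: int  # HTTP status code


def summarize(requests: Sequence[Request], window_seconds: float,
              percentile: int = 95) -> Dict[str, float]:
    """Compute QPS, error rate, and a latency percentile for one time window."""
    if not requests or window_seconds <= 0:
        return {"qps": 0.0, "error_rate": 0.0, "latency_ms": 0.0}

    latencies = sorted(r.latency_ms for r in requests)
    # Nearest-rank percentile, computed with integer arithmetic
    # so that the rank is not affected by floating-point rounding.
    rank = max(1, (percentile * len(latencies) + 99) // 100)

    # Assumption: only server-side failures (5xx) count as errors.
    errors = sum(1 for r in requests if r.status >= 500)

    return {
        "qps": len(requests) / window_seconds,
        "error_rate": errors / len(requests),
        "latency_ms": latencies[rank - 1],
    }
```

A percentile is used instead of the mean because a small number of very slow requests can hide behind a healthy-looking average.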

4. Business Monitoring

Business monitoring focuses on key business indicators, such as user login, registration, order placement, and payment success rates for an e‑commerce site, providing data for operational and strategic decision‑making.

5. End‑User Experience Monitoring

This layer tracks client‑side performance, return codes, geographic distribution, carrier conditions, device OS, browser versions, and other factors that affect the end‑user experience.

The five essential monitoring points are:

Log monitoring

Metrics monitoring

Tracing (call‑chain) monitoring

Alerting system

Health checks
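Two of these points, health checks and alerting, can be combined in a short sketch. It assumes the probed service answers with HTTP 200 and a JSON body containing `"status": "UP"` when healthy, a shape modeled on what Spring Boot Actuator's health endpoint returns. The class `ConsecutiveFailureAlert` and its threshold are hypothetical; real alerting tools offer richer rules.

```python
import json
from dataclasses import dataclass


def is_healthy(status_code: int, body: str) -> bool:
    """Treat a 200 response whose JSON body reports status "UP" as healthy."""
    if status_code != 200:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "UP"


@dataclass
class ConsecutiveFailureAlert:
    """Fire once after `threshold` failed probes in a row (illustrative rule)."""
    threshold: int = 3
    failures: int = 0
    firing: bool = False

    def observe(self, healthy: bool) -> bool:
        """Record one probe result; return True only when the alert starts firing."""
        if healthy:
            # Any successful probe resets the streak and resolves the alert.
            self.failures = 0
            self.firing = False
            return False
        self.failures += 1
        if self.failures >= self.threshold and not self.firing:
            self.firing = True
            return True
        return False
```

Requiring several consecutive failures before firing is one common way to avoid paging someone for a single dropped probe.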

A typical monitoring architecture places an agent beside each microservice to collect metrics and logs. The agents forward this data through a message queue such as Kafka, which decouples the services from the storage backends and improves availability. Logs are handled by the ELK stack: Logstash ingests them, Elasticsearch stores and indexes them, and Kibana visualizes them. Metrics are written to a time‑series database such as InfluxDB. Frameworks such as Spring Boot expose health‑check endpoints that tools like Nagios or Zabbix can poll.
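The data flow described above can be sketched in a few lines. This is a toy model under stated assumptions, not a working integration: an in-process `queue.Queue` stands in for Kafka, and two plain lists stand in for Elasticsearch and InfluxDB. All class and function names are hypothetical. The `to_line` helper produces a simplified form of InfluxDB's line protocol and omits the escaping that real data would need.

```python
import queue
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class LogEvent:
    service: str
    message: str


@dataclass(frozen=True)
class MetricPoint:
    service: str
    name: str
    value: float
    timestamp_ns: int


Event = Union[LogEvent, MetricPoint]


class Agent:
    """Runs beside one service and publishes what it collects to a shared buffer."""

    def __init__(self, service: str, buffer: "queue.Queue[Event]") -> None:
        self.service = service
        self.buffer = buffer  # stand-in for a Kafka topic

    def log(self, message: str) -> None:
        self.buffer.put(LogEvent(self.service, message))

    def metric(self, name: str, value: float, timestamp_ns: int) -> None:
        self.buffer.put(MetricPoint(self.service, name, value, timestamp_ns))


def to_line(point: MetricPoint) -> str:
    """Format a point as 'measurement,tag=value field=value timestamp' (no escaping)."""
    return f"{point.name},service={point.service} value={point.value} {point.timestamp_ns}"


def drain(buffer: "queue.Queue[Event]", log_store: list, metric_store: list) -> None:
    """Consume all buffered events and route each one to the matching store."""
    while True:
        try:
            event = buffer.get_nowait()
        except queue.Empty:
            return
        if isinstance(event, LogEvent):
            # Stand-in for indexing a document in the log store.
            log_store.append({"service": event.service, "message": event.message})
        else:
            # Stand-in for writing a point to the time-series database.
            metric_store.append(to_line(event))
```

Because the agents only ever write to the buffer, a slow or unavailable storage backend delays consumption but does not block the services, which is the decoupling the queue is there to provide.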

Author: Chen Yuzhe. Source: https://juejin.im/post/6844903846192349191
