Tagged articles

service monitoring

17 articles

Mar 14, 2025 · Cloud Native

Integrating Hystrix Dashboard with Spring Cloud for Visual Service Circuit Breaker Monitoring – A Hands‑On Guide

This article walks through the background of Hystrix, shows how to add Hystrix Dashboard and Turbine dependencies to a Spring Cloud consumer project, configure the necessary annotations and properties, test the services, and use the dashboard and Turbine streams to visualize circuit‑breaker metrics for both single instances and clusters.

Hystrix · Hystrix Dashboard · Turbine

12 min read

Open Source Linux

Mar 4, 2025 · Operations

How Kener Transforms Service Monitoring with a Self‑Hosted Status Page

Kener is a modern self‑hosted status page tool that offers real‑time service monitoring, flexible event management, and easy deployment via manual setup or Docker, helping teams improve transparency, collaboration, and reliability of their services.

Operations · self‑hosted · service monitoring

4 min read

vivo Internet Technology

Feb 19, 2025 · Backend Development

vivo HTTPDNS End-to-End Integrated Solution: Architecture, Optimizations, and Business Impact

vivo’s end‑to‑end HTTPDNS solution integrates a client SDK, a high‑performance resolution service, a unified scheduling gateway, and full‑link monitoring. It cuts resolution latency by 36%, raises the DNS resolution success rate to 99.85%, handles 1.5 billion queries daily, and prevents hijacking and misblocking across its ecosystem.

Backend Development · DNS Resolution · HTTPDNS

19 min read

NetEase Cloud Music Tech Team

Nov 7, 2023 · Operations

How NetEase Cloud Music Built Pylon APM: A Deep Dive into Tracing, Metrics, and Automated Diagnosis

This article details the design and implementation of the Pylon APM monitoring platform for NetEase Cloud Music, covering background challenges, the choice of Pinpoint, extensions to trace models, tail‑based exception sampling, Prometheus integration, automated JStack collection, and the resulting APM product features.

APM · Java Agent · Metrics

12 min read

Java High-Performance Architecture

Jan 25, 2022 · Cloud Native

Why Is Debugging Microservices on Kubernetes So Hard? Proven Strategies to Overcome It

Debugging microservices in a Kubernetes environment is challenging due to the abstraction of pods, network complexities, infrastructure issues, and application-level faults, but by monitoring at the service layer, aggregating data, and applying machine‑learning‑based anomaly detection, teams can effectively identify and resolve problems.

Kubernetes · Microservices · Troubleshooting

6 min read

Ops Development Stories

May 25, 2021 · Backend Development

Mastering Sentinel Console: Real‑World Guide to Machine Monitoring, Flow Control, and Rule Configuration

This comprehensive tutorial walks you through Sentinel's console features, from environment setup and machine health monitoring to real‑time service metrics, flow‑control, degrade, hotspot, authorization, system, and cluster rules, including practical code examples and configuration tips for Spring Cloud Alibaba applications.

Sentinel · flow-control · service monitoring

31 min read

vivo Internet Technology

Apr 21, 2021 · Operations

System Health Check: Principles and Implementation

System health checks, akin to medical exams, are vital for modern IT infrastructure, using active and passive monitoring, failover strategies, and tools like Spring Boot Actuator to detect hardware, network, load, or software issues, prevent single points of failure, and ensure continuous high‑availability service operation.

High Availability · Monitoring · Network Reliability

12 min read

NetEase Smart Enterprise Tech+

Feb 4, 2021 · Operations

How NetEase Cloud Communication Builds a Real-Time Service Monitoring Platform

NetEase Cloud Communication’s service monitoring platform leverages data collection, preprocessing, alerting, and visualization pipelines—using HTTP APIs, Kafka, custom scripts, and NTSDB—to provide real-time insights, ensure stability, and support scalable, high‑throughput audio‑video services.

Operations · cloud communication · data pipeline

11 min read

Open Source Linux

Sep 21, 2020 · Operations

Mastering Nginx Health Checks: Configuring Reliable Service Monitoring

Learn how to configure Nginx’s three health‑check methods—TCP, HTTP, and custom—by understanding each parameter such as interval, fall, rise, timeout, and type, and see a complete upstream example that ensures timely detection of unhealthy service nodes.

Configuration · backend · health-check

7 min read

IT Architects Alliance

Sep 14, 2020 · Operations

Implementation of Service Chain Monitoring and End-to-End Process Monitoring

This article explains how to design and implement service‑chain (APM) monitoring and end‑to‑end process monitoring in distributed systems, covering concepts such as spans and traces, TRACE_ID generation, logging practices, visualisation techniques, and a practical expense‑report use case with code examples.

APM · Distributed Tracing · Microservices

15 min read

Yanxuan Tech Team

May 25, 2020 · Operations

How NetEase Cloud Music Built a Scalable Full‑Link Tracing System for Real‑Time Service Diagnosis

This article details the design, implementation, and evolution of NetEase Cloud Music's full‑link tracing platform, covering its motivations, architecture, low‑overhead data collection, multi‑dimensional analysis, service grooming, automated diagnosis, and future plans for AI‑driven anomaly detection and big‑data processing.

Observability · Tracing · distributed systems

19 min read

Efficient Ops

Apr 1, 2020 · Operations

How to Use Nagios for Business-Level Service Monitoring: A Step-by-Step Guide

This article explains why traditional server and service monitoring (e.g., Zabbix) may miss business outages, then walks through setting up Nagios on Debian to monitor web page URLs, API health checks, and related services, including configuration files, plugins, and a desktop alert tool, Nagstamon.

Linux · Monitoring · Ops

18 min read

Sohu Tech Products

Sep 18, 2019 · Backend Development

Practical Applications of OpenResty: Blacklist, Rate Limiting, AB Testing, and Service Quality Monitoring

This article demonstrates how OpenResty can be used in production to implement static and dynamic blacklists, request rate limiting, AB testing, and service quality monitoring by embedding Lua scripts into Nginx, with detailed configuration examples and code snippets.

AB testing · Blacklist · Lua

23 min read

Qunar Tech Salon

Sep 5, 2018 · Operations

Tencent SNG Operations: Business Profiling for Capacity Planning, Activity Modeling, and Multi‑Region Deployment

The article explains how Tencent's SNG operations team uses business profiling—including capacity, activity, core‑link, and SET models—to address performance testing across device types, forecast activity‑driven resource needs, identify core versus peripheral services, and plan reliable multi‑region deployments.

Operations · business profiling · capacity planning

9 min read

Meituan Technology Team

May 31, 2018 · Operations

High‑Availability Practices for Account Services at Meituan/Dianping

Meituan/Dianping ensures its critical account service stays online by combining real‑time business monitoring, circuit‑breaker‑driven graceful degradation, and active‑active cross‑region deployment with isolated dependencies, versioned data sync, and automated cache updates, dramatically extending MTBF while cutting MTTR and latency.

Data synchronization · High Availability · fault tolerance

13 min read

dbaplus Community

Oct 10, 2017 · Operations

How to Build Effective Service Monitoring: Principles, Practices, and Technical Implementation

This article explains why service monitoring is essential for large‑scale microservice environments, outlines design principles, core monitoring components, dependency mapping, call‑chain analysis, capacity planning, root‑cause analysis, and presents a practical technical architecture for implementing robust monitoring solutions.

Distributed Tracing · Operations · capacity planning

12 min read

dbaplus Community

Nov 28, 2016 · Backend Development

Adaptive Service Monitoring and Self‑Healing Calls for Microservices

This article explains how to implement service context monitoring and runtime awareness, capture performance metrics via automatic discovery, transmit data through heartbeats or message queues, and apply adaptive mechanisms such as load balancing, circuit breaking, retry and isolation to achieve resilient microservice communication.

Adaptive routing · circuit breaking · load balancing

17 min read
