Tagged articles
Spring Full-Stack Practical Cases
Jul 14, 2024 · Backend Development

Master Spring Boot Observability with @Timed, @Counted, and @MeterTag

Learn how to enable comprehensive observability in Spring Boot 3.2.5 by leveraging Micrometer’s @Timed, @Counted, and @MeterTag annotations, configuring Actuator endpoints, and customizing aspects to monitor method execution time, request counts, and parameters, complete with practical code examples and Prometheus integration.

Micrometer · Observability · Prometheus
0 likes · 7 min read
Alibaba Cloud Native
Jul 10, 2024 · Cloud Native

Migrate Self‑Hosted Prometheus + Thanos to Alibaba Cloud Managed Service

This guide explains how to move from a self‑built open‑source Prometheus + Thanos monitoring stack to Alibaba Cloud's fully managed Prometheus service, covering typical deployment scenarios, migration requirements, step‑by‑step procedures for metric collection, visualization, and alerting, and key considerations for each environment.

Alibaba Cloud · Prometheus · Thanos
0 likes · 15 min read
Cloud Native Technology Community
Jul 9, 2024 · Cloud Native

Answering the Top 9 Questions About Monitoring in Kubernetes

This article discusses essential Kubernetes monitoring topics, including cost tracking, tool selection, observability frameworks, responsibility allocation, baseline establishment, namespace best practices, the importance of monitoring, backup solutions, and a comparison of Datadog versus Splunk for metrics.

Datadog · Kubernetes · Observability
0 likes · 6 min read
DevOps Operations Practice
Jul 4, 2024 · Operations

Building an Enterprise‑Level Monitoring System: Requirements, Technology Selection, Architecture, Implementation Steps, and Maintenance

This article provides a comprehensive guide to designing and deploying an enterprise‑grade monitoring system, covering requirement analysis, tool selection such as Prometheus and Zabbix, system architecture, step‑by‑step implementation, alerting, visualization, and ongoing maintenance to ensure reliable IT operations.

Alerting · Grafana · Operations
0 likes · 7 min read
macrozheng
Jul 3, 2024 · Operations

How to Visualize SpringBoot Metrics with Grafana and Prometheus Using Docker

This guide walks through installing Grafana and Prometheus with Docker, configuring node_exporter to collect system metrics, adding Spring Boot Actuator and Micrometer for application metrics, setting up Prometheus scrape jobs, and importing ready‑made Grafana dashboards to achieve real‑time monitoring and alerting.

Alerting · Docker · Grafana
0 likes · 10 min read
Efficient Ops
Jul 1, 2024 · Cloud Native

How to Monitor Business Metrics with Prometheus in Kubernetes

This article explains the concept of observability, details Prometheus metric definitions and types, and provides Go code examples for exposing, defining, generating, and scraping business‑level metrics in a Kubernetes‑based cloud‑native environment.

Go · Kubernetes · Metrics
0 likes · 11 min read
Alibaba Cloud Observability
Jun 20, 2024 · Cloud Native

How to Achieve Unified Multi‑Cluster Monitoring with Alibaba Cloud Prometheus and ACK One

This article explains how enterprises can use Alibaba Cloud's ACK One platform together with the Prometheus‑based Observability service to build a unified, cloud‑native monitoring solution for heterogeneous, multi‑region Kubernetes clusters, addressing scalability, cost, and operational challenges.

ACK One · Cloud Native · Kubernetes
0 likes · 12 min read
Java Architect Essentials
Jun 13, 2024 · Backend Development

Injecting Version Information into Java JARs Using a Compile‑Time Annotation Processor

This article demonstrates how to create a custom compile‑time annotation processor that automatically injects the JAR version into Java constants, enabling Prometheus monitoring of component versions without manual updates, and walks through the full implementation, registration, and testing steps.

Annotation Processing · Compile-time · Gradle
0 likes · 8 min read
DevOps Operations Practice
May 30, 2024 · Operations

Introducing Karma: A Prometheus Alert Dashboard Tool

This article introduces Karma, a Docker‑deployed Prometheus alert dashboard that aggregates multiple Alertmanager instances, explains its installation requirements, and details key features such as visual alert aggregation, tag‑based grouping, and silence management, positioning it as a valuable operations tool.

Alert Dashboard · Alertmanager · Docker
0 likes · 4 min read
Tencent Cloud Developer
May 21, 2024 · Operations

Why Prometheus Metrics Aren’t 100% Accurate – The Hidden Trade‑offs Explained

The article analyzes why Prometheus sometimes returns inaccurate metric values, revealing the design trade‑offs that favor efficiency over precision, and walks through common pitfalls in rate/increase calculations, histogram P99 estimation, and practical recommendations for choosing scrape intervals and query windows.

Histogram · Metrics · Observability
0 likes · 20 min read
DevOps Operations Practice
May 19, 2024 · Operations

High‑Availability Solutions for Prometheus Monitoring

Prometheus, a leading monitoring system, can achieve high availability through several common architectures—including dual-node with external storage, federated mode with external storage, and multi-node clusters combined with Thanos and object storage—each offering data persistence and load distribution to enhance system stability and performance.

External Storage · Prometheus · Thanos
0 likes · 3 min read
Tongcheng Travel Technology Center
May 6, 2024 · Operations

Using smart-doc to Generate JMeter Performance Test Scripts and Integrate with Prometheus and Grafana

This article explains how to leverage smart-doc to automatically generate JMeter performance testing scripts from API code, import them into JMeter, set up Prometheus monitoring and Grafana dashboards, and highlights the automation benefits for backend development and operations workflows.

API documentation · Grafana · JMeter
0 likes · 7 min read
Liangxu Linux
May 1, 2024 · Operations

Master System & Application Monitoring with the USE Method and Prometheus

This guide explains how to build comprehensive system and application monitoring using the USE (Utilization‑Saturation‑Errors) method, outlines essential performance metrics, and walks through setting up a full monitoring stack with Prometheus, Grafana, and ELK components, including data collection, storage, alerting, and visualization.

ELK · Grafana · Prometheus
0 likes · 15 min read
Alibaba Cloud Native
Apr 8, 2024 · Cloud Native

How to Build a Global View for Multiple Prometheus Instances – Community and Alibaba Cloud Solutions

This article explains why a global view is needed when Prometheus metrics are scattered across many instances, compares community approaches such as Federation, Thanos, and Remote Write, and details Alibaba Cloud's Global Aggregation Instance and Remote Write solutions with configuration examples and a real‑world case study.

Federation · Global View · Prometheus
0 likes · 25 min read
Efficient Ops
Mar 27, 2024 · Operations

Master System Monitoring with the USE Method and Prometheus

This article explains how to design a comprehensive monitoring system using the concise USE (Utilization, Saturation, Errors) method, outlines essential system and application metrics, and demonstrates practical implementation with Prometheus, Grafana, and related open‑source tools.

Prometheus · USE method · monitoring
0 likes · 14 min read
DevOps Operations Practice
Mar 25, 2024 · Operations

How to Monitor MySQL with Prometheus and Grafana

This tutorial explains how to install the MySQL Exporter, configure Prometheus to scrape MySQL metrics, set up Grafana dashboards for visualization, and define alerting rules for common MySQL performance indicators, providing a complete end‑to‑end monitoring solution.

Alerting · Exporter · Grafana
0 likes · 5 min read
MaGe Linux Operations
Mar 16, 2024 · Cloud Native

Scaling Non‑CPU‑Bound Apps with HPA Using cAdvisor Network Metrics

This guide shows how to enable Horizontal Pod Autoscaling for traffic‑driven workloads by leveraging cAdvisor's container network receive and transmit byte counters, converting them to per‑second rates with Prometheus‑adapter, and validating the custom metric through Kubernetes commands and console views.

Cloud Native · HPA · Kubernetes
0 likes · 7 min read
Practical DevOps Architecture
Mar 15, 2024 · Operations

Comprehensive Practical Guide to Prometheus Configuration, Optimization, and Source Code Development

This multi‑chapter guide provides in‑depth, hands‑on instruction for configuring and optimizing all Prometheus components, exploring Kubernetes monitoring, source‑code analysis, custom exporter development, high‑availability setups, service discovery, resource‑efficient scraping, and integrating Thanos for long‑term storage.

Kubernetes · Observability · Operations
0 likes · 4 min read
DevOps Operations Practice
Mar 14, 2024 · Operations

Resolving Frequent Crashes of a Single-Node Prometheus Deployment: Analysis and Solutions

This article analyzes why a single Prometheus instance repeatedly runs out of memory and crashes, explains the underlying storage mechanisms, and presents practical solutions such as metric reduction, retention tuning, federation architecture, and remote storage integration to improve stability and scalability.

Federation · Prometheus · monitoring
0 likes · 6 min read
Efficient Ops
Mar 3, 2024 · Operations

Mastering Prometheus: From Metrics Collection to Alerting and Visualization

This comprehensive guide explains Prometheus' architecture, metric collection models, storage format, query language (PromQL), alerting workflow, configuration reload methods, metric types, custom exporters, and how to visualise data with Grafana, providing a complete end‑to‑end monitoring solution.

Grafana · Metrics · Observability
0 likes · 21 min read
Efficient Ops
Feb 19, 2024 · Operations

Mastering Prometheus: Practical Tips for Effective Application Monitoring

This article explains how to design and implement Prometheus metrics for application monitoring, covering the selection of monitoring targets, golden metrics, label conventions, naming rules, histogram bucket choices, and Grafana visualization tricks to help engineers build reliable observability pipelines.

Grafana · Metrics · Observability
0 likes · 10 min read
macrozheng
Feb 5, 2024 · Backend Development

Inject Jar Version into Java Components with Insertable Annotation Processors

This article demonstrates how to create a custom insertable annotation processor in Java to automatically inject the jar version into component constants at compile time, eliminating manual updates and enabling Prometheus monitoring of library usage across versions.

AnnotationProcessor · CompileTime · Gradle
0 likes · 9 min read
MaGe Linux Operations
Jan 27, 2024 · Cloud Native

Istio Observability Made Easy: Prometheus, Jaeger & Kiali Guide

This guide walks through Istio's observability stack, showing how to configure Prometheus for metrics collection, deploy Jaeger for distributed tracing, and set up Kiali for visualizing the service mesh, while covering annotations, TLS settings, weighted routing, and configuration validation.

Istio · Kiali · Prometheus
0 likes · 18 min read
MaGe Linux Operations
Jan 25, 2024 · Operations

Mastering Monitoring: From Concepts to Prometheus in Operations

This article explains monitoring fundamentals, distinguishes black‑box and white‑box approaches, outlines key metrics and their aggregation, and provides a comprehensive guide to Prometheus architecture, data model, query language, and practical examples for CPU, memory, and disk usage monitoring.

Metrics · Observability · Prometheus
0 likes · 18 min read
Efficient Ops
Jan 22, 2024 · Operations

Mastering Monitoring: Black‑Box vs White‑Box, Metrics, and Prometheus in Practice

This guide explains monitoring fundamentals, clears common misconceptions, compares black‑box and white‑box approaches, outlines key metrics such as latency, traffic, errors and saturation, and provides a deep dive into Prometheus architecture, data model, query language, and practical examples for CPU, memory, and disk monitoring.

Prometheus · cloud-native · monitoring
0 likes · 15 min read
Linux Code Review Hub
Jan 18, 2024 · Cloud Native

How to Build Unified Observability for Apache APISIX with DeepFlow

This article walks through deploying Apache APISIX and DeepFlow in a Kubernetes cluster, configuring eBPF‑based AutoTracing and OpenTelemetry integration, enabling Prometheus metrics, accessing logs and continuous profiling, and visualizing unified observability data via Grafana dashboards.

APISIX · DeepFlow · Kubernetes
0 likes · 16 min read
NetEase Cloud Music Tech Team
Jan 10, 2024 · Operations

Building Cloud Music's APM Metric Monitoring System Based on VictoriaMetrics

Cloud Music’s middleware team built the Pylon APM monitoring system on VictoriaMetrics, combining exporters, vmagent, Nacos, Flink‑based pre‑aggregation recording rules and vminsert for collection with Grafana, a custom Proxy and vmselect for querying, achieving millisecond‑level latency, metric‑trace correlation, stability improvements, and cost‑effective storage for nearly 700 million active time series.

APM monitoring · Flink · Metric Pre-aggregation
0 likes · 12 min read
dbaplus Community
Jan 8, 2024 · Backend Development

How We Built an Automated Payment Channel Management System with Redis and Prometheus

To handle growing payment traffic and unreliable third‑party gateways, the team at Zhuanzhuan designed an automated payment‑channel management platform that uses a custom Redis‑based time‑series store, Prometheus monitoring, and a sliding‑window failure‑rate algorithm to detect, alert, and eventually auto‑switch faulty channels.

Automation · Prometheus · fault-tolerance
0 likes · 10 min read
Zhuanzhuan Tech
Jan 5, 2024 · Operations

Building an Integrated Monitoring Platform: Architecture, Implementation, and Lessons from Zhuanzhuan

This article presents a detailed case study of how Zhuanzhuan designed and implemented a unified monitoring platform that combines business services, middleware, and operations resources. The team selected Prometheus and M3DB, automated Grafana dashboards, and created a low‑noise alerting system, achieving large‑scale observability with significant cost and efficiency gains.

Alerting · M3DB · Operations
0 likes · 21 min read
dbaplus Community
Jan 2, 2024 · Operations

How Xiaohongshu Scaled Its Metrics System Tenfold with Cloud‑Native Architecture

Facing exploding metric volumes, high resource consumption, and fragile operations, Xiaohongshu's observability team completely rebuilt its metrics pipeline using VictoriaMetrics, achieving ten‑fold performance gains, minute‑level scaling, high‑availability, cost reduction, and robust multi‑cloud active‑active deployment while preserving data safety and query speed.

Metrics · Observability · Prometheus
0 likes · 34 min read
Efficient Ops
Dec 24, 2023 · Operations

Avoid These 6 Common Prometheus Mistakes When Getting Started

This guide translates and condenses six frequent errors new Prometheus users make—high‑cardinality labels, losing valuable tags during aggregation, using bare selectors, omitting the for field, choosing too‑short rate windows, and applying rate‑related functions to wrong metric types—offering practical fixes to improve monitoring reliability.

Observability · PromQL · Prometheus
0 likes · 12 min read
Efficient Ops
Dec 10, 2023 · Cloud Native

How to Build a Complete Kubernetes Monitoring Stack with Prometheus & Grafana

This guide walks through a full Kubernetes monitoring solution using cAdvisor, node_exporter, Prometheus, and Grafana, covering architecture, data collection, service discovery, deployment steps with DaemonSets, and detailed YAML configurations for a production‑ready observability stack.

Grafana · Kubernetes · Prometheus
0 likes · 6 min read
37 Interactive Technology Team
Dec 4, 2023 · Backend Development

Root Cause Analysis of Missing Trace Data in Go Services Using Prometheus Metrics and GZIP Compression

The missing trace data in two Go services was caused by the GoFrame tracing middleware recording the gzip‑compressed /metrics response body as a UTF‑8 string, which the OpenTelemetry exporter rejected as invalid UTF‑8; disabling Prometheus compression or decompressing the body before logging resolves the issue.

Debugging · Gzip · Observability
0 likes · 16 min read
Efficient Ops
Nov 26, 2023 · Operations

Top Open‑Source Tools to Monitor HTTPS Certificate Expiration

This article reviews why HTTPS certificate expiration checks are often missed and introduces several open‑source monitoring tools—including blackbox_exporter, EaseProbe, uptime‑kuma, domain‑admin, and a simple shell script—to help operations teams ensure timely certificate renewal.

HTTPS · Prometheus · certificate expiration
0 likes · 5 min read
Alibaba Cloud Developer
Nov 8, 2023 · Cloud Native

How SLS Boosted Prometheus Query Performance Over 10× with Cloud‑Native Innovations

This article details the recent technical upgrades to Alibaba Cloud's SLS Prometheus storage engine, describing how compatibility with PromQL was retained while achieving more than tenfold query speed improvements, reducing costs through smarter aggregation writes, built‑in downsampling, global caching, parallel computation, and push‑down processing, and presenting benchmark comparisons with open‑source solutions.

Cloud Native · Prometheus · Time Series
0 likes · 17 min read
NetEase Cloud Music Tech Team
Nov 7, 2023 · Operations

How NetEase Cloud Music Built Pylon APM: A Deep Dive into Tracing, Metrics, and Automated Diagnosis

This article details the design and implementation of the Pylon APM monitoring platform for NetEase Cloud Music, covering background challenges, the choice of Pinpoint, extensions to trace models, tail‑based exception sampling, Prometheus integration, automated JStack collection, and the resulting APM product features.

APM · Backend · Java Agent
0 likes · 12 min read
Architect's Guide
Nov 6, 2023 · Operations

Comparison of Prometheus and Zabbix Monitoring Tools

This article compares the open‑source monitoring solutions Prometheus and Zabbix, outlining their histories, architectures, data collection methods, scalability, storage models, configuration complexity, community activity, and suitability for different environments such as traditional servers versus cloud‑native container platforms.

Cloud Native · Operations · Prometheus
0 likes · 8 min read
MaGe Linux Operations
Oct 27, 2023 · Cloud Native

Deploy Grafana and Prometheus on Kubernetes in Minutes

This guide walks you through preparing a Kubernetes cluster, creating deployment manifests, configuring Grafana and Prometheus, and verifying the monitoring setup, including code snippets and step‑by‑step commands for a seamless installation on a lightweight cloud server.

Cloud Native · DevOps · Grafana
0 likes · 7 min read
Architect
Oct 25, 2023 · Operations

The Importance of Logging and Distributed Log Operations in Modern Architecture

This article explores why logs are essential in software development, outlines when to record them, discusses the value of logging in large-scale distributed systems, and examines the capabilities required of log‑operation tools such as APM, metrics, tracing, ELK, Prometheus, and custom batch querying solutions.

APM · Distributed Systems · ELK
0 likes · 21 min read
Efficient Ops
Oct 24, 2023 · Operations

How to Monitor Business Metrics with Prometheus in Kubernetes

This article explains how to use Prometheus to monitor business‑level metrics in a Kubernetes environment, covering observability fundamentals, metric definitions, metric types, exposing metrics via a /metrics endpoint, and practical Go code examples for defining, recording, and scraping custom metrics.

Go · Kubernetes · Metrics
0 likes · 11 min read
Ops Development Stories
Oct 12, 2023 · Cloud Native

How to Monitor Kubernetes with OpenTelemetry Collector: Step‑by‑Step Helm Deployment

This guide walks through installing OpenTelemetry Collector on a Kubernetes cluster using Helm, configuring DaemonSet and Deployment collectors, integrating Prometheus for metrics, and customizing receivers, processors, and exporters to achieve comprehensive observability of nodes, pods, containers, and cluster resources.

Kubernetes · Observability · OpenTelemetry
0 likes · 26 min read
Alibaba Cloud Native
Oct 10, 2023 · Operations

Mastering Memcached: Features, Use Cases, and Prometheus Monitoring

This article explains Memcached’s architecture, key characteristics, suitable and unsuitable scenarios, memory management and LRU mechanisms, version details, and provides a comprehensive guide to monitoring its performance and health using Prometheus and Alibaba Cloud ARMS dashboards.

Cloud Native · Memcached · Operations
0 likes · 26 min read
Liangxu Linux
Oct 6, 2023 · Cloud Native

Why Are Kubernetes Pods Evicted? Preemption, Node Pressure & QoS Explained

This article explains why Kubernetes pods get evicted, covering preemptive eviction, node‑pressure eviction, pod scheduling, priority classes, QoS tiers, alternative eviction methods, and how to monitor evictions with Prometheus, providing concrete commands and examples.

Prometheus · QoS · node pressure
0 likes · 11 min read
Liangxu Linux
Sep 24, 2023 · Operations

Understanding Prometheus Metric Types: Counters, Gauges, Histograms, and Summaries

This article explains the fundamentals of metrics, introduces dimensional metrics, compares Prometheus, OpenMetrics, and OpenTelemetry standards, and provides detailed guidance on the four Prometheus metric types—Counters, Gauges, Histograms, and Summaries—including their use‑cases, PromQL queries, and Python client examples.

Counters · Gauges · Histograms
0 likes · 18 min read
The Dominant Programmer
Sep 21, 2023 · Backend Development

Essential SpringBoot Tricks: Flyway, JetCache, Netty, and More (Part 2)

This article compiles a set of practical Spring Boot techniques, including Flyway-based SQL version control, JetCache declarative caching, Netty WebSocket service customization, Jasypt configuration encryption, ShardingSphere data masking, Jackson response desensitization, read‑write splitting, idempotent request handling, MockMvc testing, and Prometheus‑Grafana monitoring.

Flyway · JetCache · Netty
0 likes · 3 min read
HomeTech
Sep 19, 2023 · Operations

Implementing Observability and Alerting with Grafana Unified Alerting in a Cloud‑Native Service Mesh

This article explains how the automotive platform accelerated its cloud‑native service‑mesh transformation by integrating OpenTelemetry, Prometheus, and Grafana. It then details the configuration and practical use of Grafana's unified alerting module, including installation, data source setup, alert rule definition, contact points, message templates, and silencing, to achieve comprehensive observability and automated incident response.

Alerting · Grafana · Observability
0 likes · 14 min read
Zhuanzhuan Tech
Sep 19, 2023 · Operations

Design and Implementation of an Integrated Monitoring System at Zhuanzhuan Using Prometheus, Grafana, and M3DB

This article describes how Zhuanzhuan unified dozens of legacy monitoring tools into a single, all‑in‑one observability platform by adopting Prometheus + Grafana, extending the Prometheus client to push metrics to M3DB, automating Grafana dashboard creation, and building a custom alerting service to reduce operational complexity and improve visibility across business, middleware, and infrastructure services.

Alerting · Grafana · M3DB
0 likes · 21 min read
Efficient Ops
Sep 17, 2023 · Cloud Native

Top 9 Essential Kubernetes Tools to Streamline Your Cloud‑Native Workflows

Explore nine indispensable Kubernetes tools—including Kubie, Kubespray, Helm, Minikube, K3s, Kustomize, KOps, Prometheus, and krew—that simplify cluster management, accelerate deployments, and enhance efficiency, helping you choose the right solution for smoother, more productive cloud‑native operations.

Cluster Management · Kubernetes · Prometheus
0 likes · 6 min read
Huolala Tech
Sep 14, 2023 · Operations

Designing an Effective UI for Monitoring Alerts: Insights from Huolala

This article shares Huolala's experience designing a unified monitoring platform UI, covering the evolution from open‑source dashboards to a fully self‑developed solution, simplification of PromQL, computed metrics, log and trace integration, and the challenges of alert configuration and visualization.

Alerting · Observability · Operations
0 likes · 16 min read
MaGe Linux Operations
Sep 13, 2023 · Cloud Native

Mastering Prometheus Metrics: Counters, Gauges, Histograms & Summaries Explained

This article introduces the fundamentals of metrics in IT monitoring, explains the structure of metric data points, explores dimensional metrics, and provides an in‑depth guide to Prometheus metric types—Counters, Gauges, Histograms, and Summaries—along with practical code examples and usage considerations in cloud‑native environments.

Metrics · Prometheus · monitoring
0 likes · 19 min read
Efficient Ops
Sep 12, 2023 · Operations

Understanding Prometheus Metric Types: Counters, Gauges, Histograms & Summaries

This article explains how metrics are used to monitor software performance, introduces basic metric components and dimensional metrics, compares Prometheus, OpenMetrics and OpenTelemetry standards, and provides detailed guidance on Prometheus metric types—Counter, Gauge, Histogram, and Summary—with code examples and query patterns.

Metrics · Observability · Prometheus
0 likes · 18 min read
Architect
Sep 7, 2023 · Cloud Native

How Vivo Scaled Container Monitoring with Prometheus, Kafka, and VictoriaMetrics

This article details how Vivo's container platform faced exploding metric volumes, component overload, data gaps, and storage spikes, and explains the step‑by‑step architectural redesign, metric governance, performance tuning, cAdvisor redeployment, and VictoriaMetrics upgrade that restored high‑availability, low‑latency monitoring across a large Kubernetes fleet.

Cloud Native · Kubernetes · Observability
0 likes · 18 min read
Alibaba Cloud Native
Sep 7, 2023 · Cloud Native

Unlock Real‑Time Container Network Monitoring with KubeSkoop’s eBPF Probes

This article explains how KubeSkoop leverages eBPF to provide low‑overhead, pod‑level network monitoring and real‑time diagnostics for Kubernetes clusters, covering packet flow fundamentals, traditional troubleshooting tool limitations, the exporter’s probe architecture, daily monitoring practices, and future development plans.

Grafana · KubeSkoop · Kubernetes
0 likes · 22 min read
Sohu Tech Products
Aug 23, 2023 · Operations

Implementing Global Pulsar Client Monitoring with a SkyWalking Plugin

To give the business team a global, application‑level view of Pulsar performance, the team built a SkyWalking Java‑Agent plugin that automatically collects producer and consumer metrics from the Pulsar client, exposing latency, backlog and failure counts via Prometheus without modifying the client code.

Java · Metrics · Prometheus
0 likes · 7 min read
Efficient Ops
Aug 22, 2023 · Operations

Persisting Prometheus Alertmanager Alerts with Alertsnitch, MySQL, and Grafana

This article explains how Prometheus stores alerts only as time‑series data, why that limits historical queries, and provides a complete open‑source solution using Alertmanager, Alertsnitch, MySQL, and Grafana to persist, query, and visualize alerts in production environments.

Alert Persistence · Alertmanager · Grafana
0 likes · 10 min read
vivo Internet Technology
Aug 16, 2023 · Cloud Native

Building a Scalable Container Monitoring System with Prometheus and VictoriaMetrics at vivo

The vivo Internet Container Team built a scalable, high‑availability container monitoring platform by deploying dual‑replica Prometheus clusters with a custom HA adapter, remoteWrite to VictoriaMetrics, and a Kafka forwarder, while cutting metric cardinality, tuning cAdvisor, and upgrading VictoriaMetrics to eliminate data loss and storage spikes, achieving stable, efficient monitoring.

Cloud Native · Container · Kubernetes
0 likes · 16 min read
dbaplus Community
Jul 10, 2023 · Operations

Why Most Logging and Metrics Strategies Fail – and How to Fix Them

The author reflects on the shortcomings of current logging, metrics, and tracing practices, explains why they become costly and unscalable, and offers concrete recommendations—including log level discipline, structured logging, metric aggregation, and the use of tools like Prometheus, Cortex, and Thanos—to build a more efficient observability stack.

Metrics · Observability · Prometheus
0 likes · 18 min read