Tagged articles
369 articles
Top Architect
Aug 25, 2020 · Operations

Prometheus Monitoring in Kubernetes: Principles, Exporters, Configuration, Capacity Planning, and Best Practices

This comprehensive guide explores Prometheus as a cloud‑native monitoring solution for Kubernetes, covering core principles, exporter selection, configuration snippets, Grafana dashboard creation, capacity planning, high‑cardinality challenges, rate calculations, prediction functions, high‑availability designs, and integration with Alertmanager and other operational tools.

Alertmanager · Exporter · Grafana
38 min read
Architecture Digest
Aug 25, 2020 · Operations

Best Practices and Advanced Topics for Prometheus Monitoring in Kubernetes

This article provides a comprehensive guide on using Prometheus for Kubernetes monitoring, covering fundamental principles, exporter selection, Grafana dashboard creation, memory and storage optimization, high‑availability designs, query performance, cardinality management, and integration with alerting and logging systems.

Exporters · Grafana · Kubernetes
33 min read
Java Architecture Diary
Aug 24, 2020 · Backend Development

Why Is Spring Boot Admin’s HTTP Trace Missing? How to Restore It

This article explains why the HTTP trace feature disappears in Spring Boot Admin after version 2.2.x, details the investigation steps that reveal the default disabling of the InMemoryHttpTraceRepository, and recommends using third‑party tracing solutions such as Prometheus with Grafana for observable metrics.

Grafana · HTTP Trace · Observability
3 min read
Programmer DD
Jul 30, 2020 · Cloud Native

Master Prometheus: Practical Tips, Exporter Strategies, and Scaling Challenges

This comprehensive guide explores Prometheus monitoring fundamentals, key design principles, exporter selection for Kubernetes, advanced configuration tricks, capacity planning, high‑cardinality pitfalls, HA architectures, and integration with Grafana, Alertmanager, and Thanos to help you build reliable cloud‑native observability pipelines.

Alerting · Exporter · Grafana
36 min read
dbaplus Community
Jul 26, 2020 · Big Data

How Prometheus Powers Scalable Monitoring for Massive Big Data Clusters

Facing thousands of nodes in expanding big‑data clusters, the author evaluates legacy monitoring stacks, selects Prometheus + Alertmanager + Grafana, and details its architecture, custom exporters, real‑time alerts, self‑healing mechanisms, and visual dashboards that now support ten large clusters and dozens of services.

Alertmanager · Big Data · Grafana
11 min read
Cloud Native Technology Community
Jul 15, 2020 · Operations

Building a Complete Linux Monitoring Dashboard with Prometheus, Pushgateway, and Grafana

This tutorial shows Linux system administrators and DevOps engineers how to create a fully customizable, distributed monitoring dashboard by installing and configuring Prometheus, Pushgateway, and Grafana, writing a Bash script to push process metrics, and visualizing CPU and memory usage with Grafana panels and PromQL queries.

Bash · DevOps · Grafana
14 min read
dbaplus Community
Jul 13, 2020 · Operations

14 Expert Q&A on Building an Effective SRE System for Fault Management

In this detailed Q&A, a Meitu SRE leader explains the relationship between DevOps and SRE, shares practical advice on team composition, monitoring, alerting, fault‑prevention design, and provides step‑by‑step guidance using Grafana, draw.io, and other tools to help organizations build reliable services.

DevOps · Grafana · SRE
10 min read
Top Architect
Jul 5, 2020 · Operations

Enterprise Log Monitoring System Architecture for Microservice Environments

The article describes an enterprise‑grade log monitoring solution that unifies log collection, filtering, cleaning, and visualization across hundreds of microservices using tools such as Filebeat, Elastic APM, Kafka Streams, Prometheus, Grafana and Kibana to improve troubleshooting, performance analysis, and operational efficiency.

Filebeat · Grafana · Log Monitoring
8 min read
Efficient Ops
Jun 10, 2020 · Operations

Automate Grafana Dashboard Snapshots & Email Reports with Puppeteer

This guide explains how to use Node.js, Puppeteer, and Nodemailer to capture Grafana panel images, generate email reports, and schedule automated deliveries, covering environment setup, code modules, screenshot techniques, font handling, and optional cron integration for continuous monitoring.

Grafana · Node.js · Puppeteer
14 min read
Efficient Ops
May 19, 2020 · Cloud Native

Mastering Prometheus on Kubernetes: Practical Tips, Exporter Guide, and Capacity Planning

This article explores the history and principles of Prometheus monitoring, offers guidance on version selection, highlights its limitations, details common Kubernetes exporters, shows Grafana dashboard setups, and provides in‑depth strategies for exporter aggregation, golden metrics, multi‑cluster scraping, GPU monitoring, timezone handling, memory optimization, capacity planning, and rate calculations.

Grafana · Kubernetes · Prometheus
19 min read
MaGe Linux Operations
May 10, 2020 · Databases

How to Build a Complete MySQL Monitoring Dashboard with Prometheus and Grafana

This guide walks through deploying mysqld_exporter, configuring Prometheus and Grafana, and monitoring essential MySQL metrics such as replication health, query throughput, slow‑query counts, connection usage, and InnoDB buffer‑pool statistics, while also showing how to set up alert rules for proactive database operations.

Alerting · Exporters · Grafana
15 min read
dbaplus Community
Apr 25, 2020 · Operations

Master Blackbox Exporter: Install, Configure, and Monitor with Prometheus

This guide explains the concepts of white‑box and black‑box monitoring, introduces Prometheus Blackbox Exporter, walks through installation, systemd setup, and detailed Prometheus configurations for HTTP, TCP, ICMP, POST and SSL checks, shows Grafana dashboard integration, and provides alert rule examples for reliable service health monitoring.

Alerting · Blackbox Exporter · Grafana
13 min read
Efficient Ops
Apr 6, 2020 · Databases

How to Build a MySQL Monitoring Platform with Prometheus and Grafana

This article walks through setting up a production‑grade MySQL monitoring solution using Prometheus and Grafana, covering exporter installation, MySQL user configuration, systemd service setup, Prometheus job definition, key MySQL performance metrics, and basic alerting rules.

Grafana · Metrics · Prometheus
15 min read
360 Quality & Efficiency
Apr 3, 2020 · Operations

Prometheus Monitoring System: Concepts, Architecture, and Hands‑On Deployment with Node Exporter and Grafana

This article introduces the core concepts and architecture of the open‑source Prometheus monitoring system, explains its data model and metric types, and provides a step‑by‑step guide to install a Prometheus server, collect host metrics with Node Exporter, and visualize them using Grafana.

Grafana · Metrics · Observability
10 min read
Efficient Ops
Mar 8, 2020 · Operations

Prometheus vs Zabbix: Install, Configure & Visualize with Grafana

This article compares Prometheus with Zabbix, walks through downloading and installing Prometheus, explains the key sections of prometheus.yml, shows how to add a node_exporter for machine metrics, and demonstrates integrating Grafana to create rich monitoring dashboards.

Grafana · Linux · Prometheus
11 min read
Programmer DD
Mar 4, 2020 · Frontend Development

Customize Grafana Themes Without Rebuilding the Source Code

This guide walks you through a step‑by‑step method to add and switch custom Grafana themes using the Boom Theme panel plugin and ready‑made theme packs from GitHub, enabling theme changes across dashboards without modifying Grafana's source code.

Grafana · Theme Customization · Frontend Development
5 min read
Efficient Ops
Feb 24, 2020 · Operations

How to Build an Effective Operations Monitoring Platform: Tools, Design, and Best Practices

This article explains why monitoring is essential for operations, reviews popular monitoring tools such as Cacti, Nagios, Zabbix, Ganglia, Centreon, Prometheus and Grafana, outlines a six‑layer unified monitoring platform architecture, offers selection guidance for different enterprise sizes, and shares evolution lessons from small to large scale deployments.

DevOps · Grafana · Operations
20 min read
Programmer DD
Feb 16, 2020 · Operations

How to Monitor Redis with Prometheus and Grafana: Step-by-Step Guide

Learn how to set up Prometheus and Grafana to monitor Redis instances by installing the redis_exporter plugin, configuring Prometheus scrape jobs, handling build issues, and visualizing metrics with ready-made Grafana dashboards, all illustrated with code snippets and configuration examples.

Configuration · Exporter · Grafana
4 min read
Programmer DD
Feb 15, 2020 · Operations

Understanding Prometheus: Architecture, Data Model, and Alerting Explained

This article provides a comprehensive overview of Prometheus, covering its open‑source monitoring architecture, multi‑dimensional data model, query language, storage mechanisms, service discovery, alerting workflow with Alertmanager, and visualization using Grafana, all illustrated with key diagrams and configuration examples.

Alerting · Grafana · Metrics
9 min read
Java High-Performance Architecture
Feb 10, 2020 · Backend Development

How to Monitor Spring Boot Apps with Prometheus and Grafana: Step‑by‑Step Guide

This tutorial walks through building a Spring Boot application, integrating Micrometer for metric collection, deploying Prometheus and Grafana via Docker, configuring dynamic service discovery, and creating custom request‑count metrics with AOP, providing a complete end‑to‑end monitoring solution.

Docker · Grafana · Micrometer
15 min read
360 Tech Engineering
Jan 7, 2020 · Operations

Introduction to Prometheus and Grafana for Monitoring and Alerting

This article provides a comprehensive overview of using Prometheus and Grafana for metric collection, storage, querying with PromQL, visualization, and alerting, including exporter integration, metric types, high‑availability setups, and practical examples for modern microservice architectures.

Grafana · Metrics · Prometheus
10 min read
Huajiao Technology
Dec 17, 2019 · Backend Development

Diagnosing Java Memory Leaks: JVM GC Roots, Monitoring with Spring Boot Actuator, Prometheus, Grafana, and MAT

This article explains how Java memory leaks can occur despite automatic garbage collection, describes JVM reachability analysis, shows how to monitor and detect leaks using Spring Boot Actuator, Prometheus, and Grafana, and provides step‑by‑step instructions for heap dump analysis and code fixes.

Garbage Collection · Grafana · JVM
11 min read
MaGe Linux Operations
Nov 26, 2019 · Operations

Master Prometheus: From Basics to Advanced Configuration and Alerts

This article introduces Prometheus, an open‑source monitoring system, explains its core components such as server, exporters, and Alertmanager, provides step‑by‑step installation and configuration instructions, demonstrates alert rule setup, and shows integration with tools like Grafana, Telegraf, Spring Boot and Canal.

Alertmanager · DevOps · Grafana
10 min read
dbaplus Community
Oct 28, 2019 · Big Data

Quickly Analyze Hadoop NameNode RPC with ELK and Grafana

This guide shows how to reduce excessive NameNode RPC calls caused by frequent HDFS directory listings and demonstrates a complete ELK pipeline—Filebeat, Kafka/Logstash, Elasticsearch, and Kibana—plus Grafana dashboards for real‑time monitoring of Hadoop RPC operations.

ELK · Grafana · Hadoop
9 min read
Programmer DD
Oct 10, 2019 · Operations

What’s New in Grafana 6.4? Explore the Latest Features and Improvements

Grafana 6.4, released on October 2, 2019, introduces a suite of enhancements, including Explore navigation, real‑time log viewing, new log panels, Data Link upgrades, Series Override line rendering, shared query results, an Alpine‑based Docker image, deprecation of PhantomJS, and the alpha release of grafana‑toolkit, along with numerous UI and performance improvements.

Dashboard · Grafana · Observability
7 min read
Efficient Ops
Oct 8, 2019 · Operations

Build a Docker Container Monitoring Stack with cAdvisor, InfluxDB, Grafana

To effectively monitor Dockerized services, this guide walks through selecting a monitoring solution, deploying cAdvisor, integrating it with InfluxDB for persistent storage, visualizing metrics via Grafana, and addressing common issues such as missing utilities, memory stats, and network traffic inaccuracies.

Grafana · InfluxDB · Operations
15 min read
DevOps Cloud Academy
Jun 20, 2019 · Operations

Step-by-Step Installation and Configuration of Node Exporter, Alertmanager, Prometheus, and Grafana for Monitoring and Alerting

This guide walks through downloading, extracting, and setting up Node Exporter, Alertmanager, Prometheus, and Grafana on a Linux server, configuring their systemd services, customizing alert rules, and verifying the monitoring and alerting pipeline with screenshots of each verification step.

Alertmanager · Grafana · Operations
7 min read
dbaplus Community
Apr 24, 2019 · Operations

Choosing and Tuning Open‑Source Monitoring Stacks for Large‑Scale Operations

This article reviews common open‑source monitoring tools, shares the evolution of China Unicom's big‑data platform monitoring, and provides practical guidance on selecting collectors, databases, and visualization components, with detailed configurations for Prometheus, Alertmanager, Grafana, and automated recovery techniques.

Alertmanager · Grafana · InfluxDB
19 min read
58 Tech
Apr 19, 2019 · Operations

Prometheus-Based Monitoring Solution for the 58 Cloud Search Platform

This article describes the challenges of scaling the 58 Cloud Search service, explains why Prometheus was selected as the monitoring stack, and details the architecture, data collection, storage, alerting, visualization, and future enhancements of the resulting cloud‑native monitoring system.

Alertmanager · Cloud Native · Grafana
12 min read
Efficient Ops
Apr 18, 2019 · Operations

Choosing the Right Monitoring Stack: From Nagios to Prometheus & Grafana

This article reviews common open‑source monitoring combinations, compares their strengths and weaknesses, and shares practical guidance on selecting collectors, storage back‑ends, and visualization tools such as Telegraf, InfluxDB, Prometheus, Grafana, and Alertmanager for large‑scale data platform operations.

Grafana · InfluxDB · Nagios
12 min read
JD Tech
Jan 3, 2019 · Operations

Comprehensive Monitoring Strategies for E‑commerce Platforms: Black‑Box and White‑Box Approaches

This article systematically explains how to enhance e‑commerce platform availability by implementing both black‑box monitoring to detect functional failures and white‑box monitoring to pinpoint root causes, detailing core order‑process metrics, common issues, mitigation strategies, and illustrative Grafana dashboards.

Grafana · Operations · SRE
9 min read
Liulishuo Tech Team
Dec 14, 2018 · Mobile Development

Engineering Practice: Building an Android Application Performance Management (APM) Dashboard

This article details the architectural design and engineering practices behind building a comprehensive Application Performance Management dashboard for Android applications, covering real-time monitoring, version comparison, development cycle tracking, automated data collection, and integrated test coverage analysis to ensure sustainable software quality and delivery efficiency.

APM · Android Development · Grafana
21 min read
Architects' Tech Alliance
Jan 14, 2018 · Operations

Why Some Developers Keep Coding After 40 and How Grafana Powers Their Monitoring Projects

While many believe software development ends after age 40, the article highlights veteran programmers who treat coding as a lifelong passion and showcases Dennis’s Grafana‑based monitoring solutions for Huawei storage, illustrating how open‑source dashboards, SNMP data collection, and comparisons with Kibana empower modern ops.

DevOps · Grafana · Kibana
7 min read
dbaplus Community
Nov 19, 2017 · Operations

Designing Scalable Monitoring with ELK and GPE: A Practical Guide

This article outlines a large‑scale monitoring solution for distributed microservice environments, comparing traditional ELK logging with a custom GPE stack (Grafana, Prometheus, Exporter, Consul), detailing architecture, components, workflows, and practical considerations for reliable observability.

ELK · Grafana · Prometheus
10 min read
360 Zhihui Cloud Developer
Aug 30, 2017 · Operations

Mastering Prometheus: From Metrics Basics to High‑Availability Monitoring

This article shares practical experiences of using Prometheus for monitoring complex services, covering metric types, PromQL query techniques, naming conventions, service discovery with file‑based configs, high‑availability sharding, alerting via Alertmanager, and visualisation with Grafana, providing actionable guidance for reliable observability.

Grafana · PromQL · Prometheus
15 min read
Efficient Ops
Jun 11, 2017 · Operations

How Bilibili Scaled Its Ops: From DIY Deployments to Prometheus Monitoring

Bilibili’s ops team shares the evolution, challenges, and lessons learned in moving from early manual deployments to a multi-layered monitoring stack (ELK, Zabbix, StatsD, Grafana, and Prometheus) and in building scalable, automated infrastructure for massive internet traffic.

DevOps · ELK · Grafana
8 min read

Building a Scalable Business Monitoring System: Architecture, Modules & Lessons

This article presents a comprehensive case study of a business monitoring system, covering its background, architectural analysis, module design, time‑series database selection, visualization with Grafana, alerting strategies, decision‑making logic, and intelligent monitoring experiments, followed by key takeaways and lessons learned.

Grafana · InfluxDB · Operations
12 min read
dbaplus Community
Aug 19, 2016 · Operations

Unlocking System Reliability: The Value and Complete Architecture of Monitoring for Containers

This article explains why monitoring is essential for system reliability, outlines the key components of a comprehensive monitoring framework, compares data collection methods, and presents practical container monitoring solutions—from Docker stats to cAdvisor with InfluxDB and Grafana, as well as Kubernetes and Mesos integrations.

Grafana · Kubernetes · Prometheus
14 min read