Tagged articles

prometheus

691 articles · Page 7 of 7
Efficient Ops
Jul 5, 2020 · Operations

Why Loki Beats ELK for Container Cloud Logging: A Deep Dive

This article explains how Loki, a lightweight Grafana‑based log system, addresses the heavy resource usage and complexity of ELK/EFK in Kubernetes environments by simplifying architecture, reducing cost, and improving log‑metric integration for faster incident response.

Logging · kubernetes · loki
0 likes · 7 min read
Aikesheng Open Source Community
Jun 22, 2020 · Operations

Introduction to the Prometheus Data Collection Process

This article explains the complete Prometheus data collection workflow, covering key concepts such as targets, samples, and meta labels, detailing the relabeling steps, configuration options, example use‑cases, and the final scrape and storage phases for effective monitoring.

Configuration · Monitoring · data collection
0 likes · 8 min read
Alibaba Cloud Native
Jun 19, 2020 · Cloud Native

Kubernetes News Digest: Anti‑Discrimination Docs, 1.19 Beta Freeze, Linkerd 2.8, and New Open‑Source Tools

This roundup highlights recent Kubernetes ecosystem changes, including the addition of anti‑discrimination statements to documentation, the upcoming 1.19.0‑beta.1 code freeze, Linkerd 2.8 multi‑cluster support, several upstream enhancements, and curated open‑source project and reading recommendations for cloud‑native practitioners.

Cloud Native · Linkerd · kubernetes
0 likes · 5 min read
iQIYI Technical Product Team
Jun 12, 2020 · Operations

Microservice Monitoring Practices at iQIYI: Architecture, Metrics, and Automation

iQIYI’s micro‑service monitoring combines low‑cost automatic instrumentation, declarative method metrics, and push‑gateway data into a unified multi‑dimensional schema, visualized centrally in Grafana and managed with standardized alert rules, demonstrating that simple integration, centralized dashboards, and early‑stage governance enable rapid anomaly detection and effective incident response.

Alerting · Metrics · cloud-native
0 likes · 14 min read
Efficient Ops
May 19, 2020 · Cloud Native

Mastering Prometheus on Kubernetes: Practical Tips, Exporter Guide, and Capacity Planning

This article explores the history and principles of Prometheus monitoring, offers guidance on version selection, highlights its limitations, details common Kubernetes exporters, shows Grafana dashboard setups, and provides in‑depth strategies for exporter aggregation, golden metrics, multi‑cluster scraping, GPU monitoring, timezone handling, memory optimization, capacity planning, and rate calculations.

Monitoring · grafana · kubernetes
0 likes · 19 min read
dbaplus Community
May 12, 2020 · Cloud Native

Migrating Massive Big‑Data Services to Kubernetes: Lessons from Tongcheng‑eLong

This article details how Tongcheng‑eLong transitioned from Docker‑Host deployments to a Kubernetes‑based platform for hundreds of storage and compute services, covering network integration, IP management, service synchronization, storage strategies, operator development, monitoring, logging, and the challenges and future plans they encountered.

Big Data · Cloud Native · Docker
0 likes · 17 min read
MaGe Linux Operations
May 10, 2020 · Databases

How to Build a Complete MySQL Monitoring Dashboard with Prometheus and Grafana

This guide walks through deploying mysqld_exporter, configuring Prometheus and Grafana, and monitoring essential MySQL metrics such as replication health, query throughput, slow‑query counts, connection usage, and InnoDB buffer‑pool statistics, while also showing how to set up alert rules for proactive database operations.

Alerting · Exporters · Monitoring
0 likes · 15 min read
vivo Internet Technology
Apr 29, 2020 · Cloud Native

Prometheus Architecture and Design Principles: A Deep Dive into Cloud-Native Monitoring

Prometheus, a CNCF‑graduated, cloud‑native monitoring system, combines pull‑based target discovery, a label‑rich time‑series data model, and four core metric types—gauge, counter, histogram, and summary—to provide near‑real‑time visibility, short‑term retention, alerting via Alertmanager, and integration with Grafana and remote storage for scalable observability.

Alertmanager · CNCF · Monitoring
0 likes · 11 min read
dbaplus Community
Apr 25, 2020 · Operations

Master Blackbox Exporter: Install, Configure, and Monitor with Prometheus

This guide explains the concepts of white‑box and black‑box monitoring, introduces Prometheus Blackbox Exporter, walks through installation, systemd setup, and detailed Prometheus configurations for HTTP, TCP, ICMP, POST and SSL checks, shows Grafana dashboard integration, and provides alert rule examples for reliable service health monitoring.

Alerting · Blackbox Exporter · HTTP
0 likes · 13 min read
Cloud Native Technology Community
Apr 21, 2020 · Cloud Native

Deploying Thanos on Kubernetes: Architecture, Deployment Options, and Practical Guide

This article explains the Thanos architecture, compares Sidecar and Receiver deployment modes, walks through object‑storage configuration, and provides complete Kubernetes YAML examples for Prometheus, Thanos Sidecar, Query, Store Gateway, Ruler, Compact, and Receiver to build a large‑scale cloud‑native monitoring system.

Cloud Native · Thanos · deployment
0 likes · 27 min read
Cloud Native Technology Community
Apr 8, 2020 · Operations

Decoding Thanos Architecture: From Query to Compact for Scalable Monitoring

This article provides a detailed analysis of Thanos' architecture, explaining each core component—Query, Sidecar, Store Gateway, Ruler, Compact, and the upcoming Receiver—how they enable global view, high availability, and long‑term storage for distributed Prometheus deployments, and discusses design trade‑offs and optimization strategies.

Cloud Native · Long‑term Storage · Monitoring
0 likes · 12 min read
UCloud Tech
Apr 8, 2020 · Cloud Native

Migrating Spring Cloud Microservices to UK8S for Scalable Cloud‑Native Operations

This article details how the Chinese travel platform “要出发” transformed its Spring Cloud‑based micro‑service architecture to a UK8S‑powered Kubernetes environment, introducing Spring Cloud Kubernetes discovery, Prometheus JVM monitoring, HPA‑driven autoscaling, Elastic APM tracing, and Istio service governance to achieve higher elasticity, observability, and operational efficiency.

Istio · UK8S · elastic apm
0 likes · 11 min read
Efficient Ops
Apr 6, 2020 · Databases

How to Build a MySQL Monitoring Platform with Prometheus and Grafana

This article walks through setting up a production‑grade MySQL monitoring solution using Prometheus and Grafana, covering exporter installation, MySQL user configuration, systemd service setup, Prometheus job definition, key MySQL performance metrics, and basic alerting rules.

Metrics · Monitoring · grafana
0 likes · 15 min read
360 Quality & Efficiency
Apr 3, 2020 · Operations

Prometheus Monitoring System: Concepts, Architecture, and Hands‑On Deployment with Node Exporter and Grafana

This article introduces the core concepts and architecture of the open‑source Prometheus monitoring system, explains its data model and metric types, and provides a step‑by‑step guide to install a Prometheus server, collect host metrics with Node Exporter, and visualize them using Grafana.

Metrics · Monitoring · Node Exporter
0 likes · 10 min read
Cloud Native Technology Community
Mar 30, 2020 · Cloud Native

Building a Cloud‑Native Large‑Scale Distributed Monitoring System with Prometheus

This article explains how to design and implement a cloud‑native, large‑scale distributed monitoring system using Prometheus, covering its limitations, service‑level sharding, centralized storage, federation, and high‑availability strategies to overcome scaling challenges in Kubernetes environments.

Cloud Native · Federation · High Availability
0 likes · 12 min read
Efficient Ops
Mar 8, 2020 · Operations

Prometheus vs Zabbix: Install, Configure & Visualize with Grafana

This article compares Prometheus with Zabbix, walks through downloading and installing Prometheus, explains the key sections of prometheus.yml, shows how to add a node_exporter for machine metrics, and demonstrates integrating Grafana to create rich monitoring dashboards.

Linux · Monitoring · Zabbix
0 likes · 11 min read
dbaplus Community
Mar 2, 2020 · Operations

How Jiangsu Mobile Built a Billion‑Call Real‑Time Monitoring Platform with Prometheus

Facing the explosion of 5G traffic and billions of daily call records, Jiangsu Mobile’s IT operations team adopted Prometheus as the core time‑series database, designing a high‑availability, low‑latency monitoring platform that captures, stores, visualizes and predicts performance metrics across their massive billing system.

5G · Operations · Time-series database
0 likes · 9 min read
Efficient Ops
Feb 24, 2020 · Operations

How to Build an Effective Operations Monitoring Platform: Tools, Design, and Best Practices

This article explains why monitoring is essential for operations, reviews popular monitoring tools such as Cacti, Nagios, Zabbix, Ganglia, Centreon, Prometheus and Grafana, outlines a six‑layer unified monitoring platform architecture, offers selection guidance for different enterprise sizes, and shares evolution lessons from small to large scale deployments.

Operations · Zabbix · devops
0 likes · 20 min read
Programmer DD
Feb 16, 2020 · Operations

How to Monitor Redis with Prometheus and Grafana: Step-by-Step Guide

Learn how to set up Prometheus and Grafana to monitor Redis instances by installing the redis_exporter plugin, configuring Prometheus scrape jobs, handling build issues, and visualizing metrics with ready-made Grafana dashboards, all illustrated with code snippets and configuration examples.

Configuration · Exporter · grafana
0 likes · 4 min read
Programmer DD
Feb 15, 2020 · Operations

Understanding Prometheus: Architecture, Data Model, and Alerting Explained

This article provides a comprehensive overview of Prometheus, covering its open‑source monitoring architecture, multi‑dimensional data model, query language, storage mechanisms, service discovery, alerting workflow with Alertmanager, and visualization using Grafana, all illustrated with key diagrams and configuration examples.

Alerting · Metrics · Ops
0 likes · 9 min read
Java High-Performance Architecture
Feb 10, 2020 · Backend Development

How to Monitor Spring Boot Apps with Prometheus and Grafana: Step‑by‑Step Guide

This tutorial walks through building a Spring Boot application, integrating Micrometer for metric collection, deploying Prometheus and Grafana via Docker, configuring dynamic service discovery, and creating custom request‑count metrics with AOP, providing a complete end‑to‑end monitoring solution.

Docker · Spring Boot · grafana
0 likes · 15 min read
360 Tech Engineering
Jan 7, 2020 · Operations

Introduction to Prometheus and Grafana for Monitoring and Alerting

This article provides a comprehensive overview of using Prometheus and Grafana for metric collection, storage, querying with PromQL, visualization, and alerting, including exporter integration, metric types, high‑availability setups, and practical examples for modern microservice architectures.

Metrics · Monitoring · grafana
0 likes · 10 min read
Aikesheng Open Source Community
Dec 25, 2019 · Operations

Deploying Thanos for Unified Prometheus Monitoring and Long‑Term Storage

This guide explains the background, key features, architecture, and step‑by‑step deployment of Thanos—including Sidecar, Store, Query, Compact, Bucket, Rule, and Check components—to provide a unified, high‑availability Prometheus monitoring view with unlimited historical data storage using object storage.

Cloud Native · Long‑term Storage · Monitoring
0 likes · 9 min read
Efficient Ops
Dec 24, 2019 · Operations

Scaling Real‑Time Monitoring for Billion‑Call Billing with Prometheus

Jiangsu Mobile’s IT operations team partnered with Newland to build a high‑availability, real‑time performance management platform using Prometheus, achieving billion‑level call‑record monitoring, low‑latency queries, data compression, and advanced forecasting, dramatically improving system health visibility and operational efficiency.

performance management · prometheus · time_series_database
0 likes · 10 min read
Huajiao Technology
Dec 17, 2019 · Backend Development

Diagnosing Java Memory Leaks: JVM GC Roots, Monitoring with Spring Boot Actuator, Prometheus, Grafana, and MAT

This article explains how Java memory leaks can occur despite automatic garbage collection, describes JVM reachability analysis, shows how to monitor and detect leaks using Spring Boot Actuator, Prometheus, and Grafana, and provides step‑by‑step instructions for heap dump analysis and code fixes.

Garbage Collection · JVM · Java
0 likes · 11 min read
Alibaba Cloud Native
Nov 30, 2019 · Cloud Native

How Alibaba Cloud Manages Over 10,000 Kubernetes Clusters at Double‑11 Scale

This article explains how Alibaba Cloud Container Service (ACK) designs a unit‑based, tiered management system, capacity planning model, global observability architecture, and pluggable components to reliably operate more than ten thousand diverse Kubernetes clusters during the massive Double‑11 shopping event.

ACK · Alibaba Cloud · Observability
0 likes · 13 min read
MaGe Linux Operations
Nov 26, 2019 · Operations

Master Prometheus: From Basics to Advanced Configuration and Alerts

This article introduces Prometheus, an open‑source monitoring system, explains its core components such as server, exporters, and Alertmanager, provides step‑by‑step installation and configuration instructions, demonstrates alert rule setup, and shows integration with tools like Grafana, Telegraf, Spring Boot and Canal.

Alertmanager · Monitoring · devops
0 likes · 10 min read
Alibaba Cloud Native
Nov 18, 2019 · Cloud Native

How Kubernetes Monitoring Evolved: From Heapster to Metrics‑Server and Prometheus

This article explains the fundamentals of monitoring and logging in large‑scale Kubernetes clusters, classifies monitoring types, traces the evolution from Heapster to the lightweight metrics‑server, outlines the three Kubernetes monitoring APIs, reviews Prometheus as the de‑facto standard, and describes Alibaba Cloud’s enhanced monitoring and logging solutions.

Logging · Metrics Server · kubernetes
0 likes · 24 min read
Alibaba Cloud Native
Nov 14, 2019 · Cloud Native

What’s New in Cloud Native: Helm 3, Kubernetes 1.17, Istio Updates and More

This roundup highlights the latest cloud‑native announcements, including Helm 3’s stable release, the GitHub Octoverse language trends, upcoming KubeCon North America, CNCF’s Prometheus report, Kubernetes 1.17 code freeze, key upstream feature improvements, and a curated list of open‑source projects and reading recommendations.

helm · kubernetes · open source
0 likes · 9 min read
dbaplus Community
Oct 28, 2019 · Operations

Avoid Common Prometheus Pitfalls: Best Practices for Reliable Monitoring

This article shares practical Prometheus best‑practice tips, covering the accuracy‑reliability trade‑off, self‑monitoring setups, avoiding NFS storage, pruning high‑cardinality metrics, handling rate‑function traps, alert‑graph mismatches, group_interval effects, and the overarching goal of stable, cost‑effective observability.

Alerting · Operations · best practices
0 likes · 9 min read
Efficient Ops
Oct 22, 2019 · Operations

How Modern IT Monitoring Systems Keep Your Services Running Smoothly

This article explains the purpose, core functions, classification, layered architecture, and popular implementations of IT monitoring systems, covering log‑based, trace‑based, and metric‑based approaches as well as a comparison of Zabbix and Prometheus.

IT monitoring · Observability · Zabbix
0 likes · 17 min read
Programmer DD
Sep 20, 2019 · Operations

Master Prometheus: Key Features, Architecture, and Query Essentials

This article introduces Prometheus, an open‑source cloud‑native monitoring and alerting system, covering its main characteristics, core components, architecture diagram, typical use cases, query language syntax, built‑in functions, time‑series types, and practical tips for reliable operation.

Alerting · Monitoring · Operations
0 likes · 9 min read
dbaplus Community
Sep 16, 2019 · Operations

How to Build Effective Monitoring for Microservices: Logs, Tracing, and Metrics Explained

This article explains the three main monitoring approaches—log collection, distributed tracing, and metric gathering—in microservice architectures, outlines the layered monitoring model, lists key system, application, and user metrics, and reviews popular open‑source time‑series monitoring tools such as Prometheus, OpenTSDB, and InfluxDB.

Metrics · Microservices · Monitoring
0 likes · 10 min read
DevOps Cloud Academy
Sep 5, 2019 · Operations

An Overview of the Prometheus Monitoring System

Prometheus, an open‑source monitoring and alerting toolkit originally developed by SoundCloud and now a CNCF project, offers multidimensional data models, flexible queries, pull‑based data collection, various metric types (counter, gauge, summary, histogram), local and remote storage, service discovery, and integrates with Grafana for visualization.

Cloud Native · Metrics · Monitoring
0 likes · 8 min read
Programmer DD
Aug 13, 2019 · Operations

Mastering Prometheus Histograms: How Cumulative Buckets Simplify Metrics

This article explains the fundamentals of Prometheus histogram metrics, illustrates why they are cumulative, shows how to drop unwanted buckets with relabeling, and demonstrates quantile calculations using the histogram_quantile function, providing practical examples and code snippets for effective monitoring.

Histogram · Metrics · Monitoring
0 likes · 7 min read
dbaplus Community
Jul 23, 2019 · Cloud Native

How Xiaomi Scaled Kubernetes Monitoring with Prometheus and Open‑Falcon

This article details Xiaomi's Ocean elastic scheduling platform's challenges in monitoring massive Kubernetes clusters, the transition from Open‑Falcon to a Prometheus‑based solution with remote storage, partitioned deployment strategies, performance testing, and future plans for automated scaling and data analytics.

Cloud Native · Remote Storage · kubernetes
0 likes · 16 min read
360 Zhihui Cloud Developer
Jul 18, 2019 · Operations

Why Bosun Beats Alertmanager and Kapacitor for Container Alerting

This article compares three container alerting frameworks—Alertmanager, Kapacitor, and Bosun—explains why Bosun was chosen for its flexible HTTP API rule deployment and low learning curve, and provides step‑by‑step configuration, rule definition, notification, and templating examples for integrating Bosun with Prometheus.

Alerting · Bosun · Configuration
0 likes · 9 min read
dbaplus Community
Jul 17, 2019 · Databases

Rethinking Prometheus TSDB: From V2 Bottlenecks to the Scalable V3 Design

This article examines the limitations of Prometheus's original V2 time‑series storage, proposes a block‑oriented V3 architecture that tackles series churn, write amplification, and indexing inefficiencies, and validates the new design with extensive benchmarks showing dramatic reductions in memory, CPU, and disk usage.

Indexing · TSDB · kubernetes
0 likes · 36 min read
ITPUB
Jun 21, 2019 · Cloud Native

Building a Scalable, High‑Availability Kubernetes Monitoring System with Prometheus

This article details the design and evolution of a highly available, persistent, and dynamically adjustable Kubernetes monitoring solution at Xiaomi, covering initial Falcon‑based approaches, the transition to Prometheus with remote storage via OpenTSDB, federation‑based partitioning, deployment strategies, performance testing, and future enhancements.

Cloud Native · FALCON · OpenTSDB
0 likes · 17 min read
DevOps Cloud Academy
Jun 20, 2019 · Operations

Step-by-Step Installation and Configuration of Node Exporter, Alertmanager, Prometheus, and Grafana for Monitoring and Alerting

This guide walks through downloading, extracting, and setting up Node Exporter, Alertmanager, Prometheus, and Grafana on a Linux server, configuring their systemd services, customizing alert rules, and verifying the monitoring and alerting pipeline with screenshots of each verification step.

Alertmanager · Monitoring · Node Exporter
0 likes · 7 min read
DevOps Cloud Academy
Jun 9, 2019 · Operations

Prometheus Metric Definitions, Types, and Data Samples

This article explains Prometheus metric naming conventions, label usage, metric types such as Counter, Gauge, Summary, and Histogram, and describes the structure of data samples, providing examples and best‑practice guidelines for defining and classifying metrics in monitoring systems.

Metrics · Monitoring · Observability
0 likes · 5 min read
dbaplus Community
Apr 24, 2019 · Operations

Choosing and Tuning Open‑Source Monitoring Stacks for Large‑Scale Operations

This article reviews common open‑source monitoring tools, shares the evolution of China Unicom's big‑data platform monitoring, and provides practical guidance on selecting collectors, databases, and visualization components, with detailed configurations for Prometheus, Alertmanager, Grafana, and automation recovery techniques.

Alertmanager · InfluxDB · Monitoring
0 likes · 19 min read
58 Tech
Apr 19, 2019 · Operations

Prometheus-Based Monitoring Solution for the 58 Cloud Search Platform

This article describes the challenges of scaling the 58 Cloud Search service, explains why Prometheus was selected as the monitoring stack, and details the architecture, data collection, storage, alerting, visualization, and future enhancements of the resulting cloud‑native monitoring system.

Alertmanager · Cloud Native · grafana
0 likes · 12 min read
Efficient Ops
Apr 18, 2019 · Operations

Choosing the Right Monitoring Stack: From Nagios to Prometheus & Grafana

This article reviews common open‑source monitoring combinations, compares their strengths and weaknesses, and shares practical guidance on selecting collectors, storage back‑ends, and visualization tools such as Telegraf, InfluxDB, Prometheus, Grafana, and Alertmanager for large‑scale data platform operations.

InfluxDB · Monitoring · Operations
0 likes · 12 min read
Programmer DD
Jan 24, 2019 · Cloud Native

What’s New in Nacos 0.8.0? Key Features, Installation & First‑Run Guide

The article introduces Nacos 0.8.0, highlighting its three major production features—user login, Prometheus metrics, and namespace isolation—while providing step‑by‑step download links, startup commands for Linux and Windows, and instructions to access the default login console.

Cloud Native · installation guide · prometheus
0 likes · 4 min read
360 Tech Engineering
Dec 18, 2018 · Cloud Native

Design and Implementation of 360 Container Platform Monitoring System

The article describes how 360 built a Kubernetes‑based container platform monitoring system using Prometheus, ELK, Grafana and custom components, detailing its architecture, monitoring dimensions, log collection, alerting, selection rationale, high‑availability design, and future evolution for scalable cloud‑native operations.

Monitoring · container platform · kubernetes
0 likes · 12 min read
Liulishuo Tech Team
Dec 14, 2018 · Mobile Development

Engineering Practice: Building an Android Application Performance Management (APM) Dashboard

This article details the architectural design and engineering practices behind building a comprehensive Application Performance Management dashboard for Android applications, covering real-time monitoring, version comparison, development cycle tracking, automated data collection, and integrated test coverage analysis to ensure sustainable software quality and delivery efficiency.

APM · Android Development · CI/CD
0 likes · 21 min read
Efficient Ops
Jun 11, 2018 · Operations

How to Build Low-Cost Automated Operations with Prometheus, Ansible, and Jenkins

This guide walks small teams through step‑by‑step implementation of low‑cost automated operations, covering basic monitoring with Prometheus, configuration versioning via Ansible, CI/CD pipelines using Jenkins, and scaling practices, enabling gradual evolution toward enterprise‑grade DevOps architectures.

Ansible · CI/CD · Jenkins
0 likes · 12 min read
UCloud Tech
Nov 22, 2017 · Backend Development

Master Go Microservices: gRPC, TLS, Tracing & Prometheus Monitoring

This article shares practical Go microservice building experiences, covering gRPC-based communication, TLS security, request tracing, and comprehensive monitoring with Prometheus, including metric selection, alerting, and log management using Logrus and Graylog, to help reduce coupling and improve system observability.

Logging · Microservices · Monitoring
0 likes · 10 min read
dbaplus Community
Nov 19, 2017 · Operations

Designing Scalable Monitoring with ELK and GPE: A Practical Guide

This article outlines a large‑scale monitoring solution for distributed microservice environments, comparing traditional ELK logging with a custom GPE stack (Grafana, Prometheus, Exporter, Consul), detailing architecture, components, workflows, and practical considerations for reliable observability.

ELK · Monitoring · grafana
0 likes · 10 min read
Programmer DD
Sep 18, 2017 · Operations

Mastering Prometheus: From Metrics Collection to Alerting and Visualization

This guide explains how to choose between push and pull monitoring models, introduces Prometheus architecture and metric syntax, shows Node.js client integration with code examples, and covers Alertmanager features and Grafana visualization for effective application monitoring.

Alertmanager · Metrics · Monitoring
0 likes · 8 min read
360 Zhihui Cloud Developer
Aug 30, 2017 · Operations

Mastering Prometheus: From Metrics Basics to High‑Availability Monitoring

This article shares practical experiences of using Prometheus for monitoring complex services, covering metric types, PromQL query techniques, naming conventions, service discovery with file‑based configs, high‑availability sharding, alerting via Alertmanager, and visualisation with Grafana, providing actionable guidance for reliable observability.

Monitoring · PromQL · grafana
0 likes · 15 min read
DevOps
Jul 12, 2017 · Cloud Native

Container Monitoring: Challenges, Metrics Collection, and Best Practices

This article examines the unique challenges of monitoring containers, outlines three categories of metrics to collect, compares host‑centric and layered monitoring architectures, provides detailed methods for gathering CPU, memory, I/O and network data via cgroup files and Docker commands, and shares practical insights, tooling recommendations, and a Q&A session for effective container observability.

Docker · Monitoring · Ops
0 likes · 18 min read
Efficient Ops
Jun 11, 2017 · Operations

How Bilibili Scaled Its Ops: From DIY Deployments to Prometheus Monitoring

From early manual deployments to a sophisticated, multi-layered monitoring stack—including ELK, Zabbix, Statsd, Grafana, and Prometheus—Bilibili’s ops team shares the evolution, challenges, and lessons learned in building scalable, automated infrastructure for massive internet traffic.

ELK · Monitoring · Operations
0 likes · 8 min read
dbaplus Community
Jun 5, 2017 · Cloud Native

How to Tackle Performance Optimization in Large‑Scale Kubernetes PaaS Platforms

This article examines the daunting performance‑optimization challenges of a complex PaaS architecture, breaks the system into control, data, and monitoring subsystems, defines concrete metrics, demonstrates testing with Prometheus and other tools, and shares practical automation techniques to accelerate iterative improvements.

Cloud Native · PaaS · kubernetes
0 likes · 16 min read
dbaplus Community
Aug 19, 2016 · Operations

Unlocking System Reliability: The Value and Complete Architecture of Monitoring for Containers

This article explains why monitoring is essential for system reliability, outlines the key components of a comprehensive monitoring framework, compares data collection methods, and presents practical container monitoring solutions—from Docker stats to cAdvisor with InfluxDB and Grafana, as well as Kubernetes and Mesos integrations.

cAdvisor · grafana · kubernetes
0 likes · 14 min read