Topic: monitoring

Collection size: 1674 articles
DevOps
Oct 26, 2023 · Operations

Design and Implementation of SLA for Object Storage Services

This article explains how to design SLA metrics for object storage services, describes the S3 protocol, proposes availability calculations, outlines monitoring and alerting rules, and provides practical implementation examples using s3cmd, Python boto, and the Java SDK to ensure reliable cloud storage operations.

DevOps, SLA, cloud
0 likes · 16 min read
Practical DevOps Architecture
Mar 5, 2025 · Operations

Zabbix Agent Active Mode Workflow and Configuration Guide

This article explains the Zabbix agent active-mode workflow, in which the agent opens a TCP connection to the Zabbix server, requests its list of monitoring items, and sends the collected data back. It also gives step-by-step configuration for the agent and server, including template cloning and essential parameters.

Agent Configuration, Linux, Zabbix
0 likes · 6 min read
Practical DevOps Architecture
Oct 30, 2024 · Operations

Bash Scripts for Real‑Time Network Traffic and Disk Usage Monitoring

This article provides two Bash scripts: one that continuously displays inbound and outbound traffic of a specified network interface, and another that remotely checks disk usage on up to 100 servers, issuing warnings when partitions exceed a defined threshold.

Linux, bash, disk
0 likes · 3 min read
Practical DevOps Architecture
Jun 13, 2024 · Operations

Comprehensive Data Center Operations Training Course Overview

This extensive training program covers everything a data center operations engineer needs—from foundational infrastructure management and server hardware maintenance to advanced network configuration, security hardening, monitoring, fault handling, and practical hands‑on skills for real‑world challenges.

Infrastructure, Server management, data center
0 likes · 6 min read
Practical DevOps Architecture
May 9, 2024 · Operations

Monitoring SSL Certificate Expiration with Zabbix Using a Shell Script

This guide explains how to create a shell script that checks SSL certificate expiration dates and integrates it with Zabbix by configuring a user parameter, testing the script, and setting up monitoring items, triggers, graphs, and alerts to ensure services remain available.

SSL, Shell script, Zabbix
0 likes · 3 min read
Practical DevOps Architecture
Sep 30, 2022 · Operations

Resolving Filebeat Startup Failure: EOF Error in Registrar State

This guide explains how to troubleshoot Filebeat failing to start due to an EOF error while loading registrar state, by inspecting logs, resetting the registry directory, and restarting the service on a Linux host.

Filebeat, Linux, Logstash
0 likes · 4 min read
DevOps Cloud Academy
May 31, 2024 · Cloud Native

Optimizing RabbitMQ Performance on Kubernetes

This guide explains how to deploy RabbitMQ on Kubernetes and improve its performance through Helm installation, resource tuning, monitoring, scaling, security hardening, and advanced configuration techniques, providing practical code examples for each step.

Performance Optimization, RabbitMQ, helm
0 likes · 9 min read
DevOps Cloud Academy
Feb 27, 2023 · Operations

Understanding GitOps: History, Principles, Benefits, and Practical Implementation

This article explains the origins of GitOps, defines its core principles of declarative infrastructure, versioned desired state, automated approval, and compliance monitoring, and outlines its benefits and a concrete practice using tools such as GitLab, ArgoCD, Kubernetes, Terraform, Prometheus, and Grafana.

GitOps, Terraform, ci/cd
0 likes · 18 min read
DevOps Cloud Academy
Mar 2, 2022 · Operations

Promoter: Rendering AlertManager Graphs for DingTalk Notifications Using Go

The article introduces Promoter, a Go‑based webhook that fetches Prometheus metrics, renders alert graphs with gonum/plot, stores the images in S3‑compatible object storage, and embeds them in DingTalk notifications, providing deployment instructions, template customization, and core implementation details.

Alertmanager, DingTalk, Prometheus
0 likes · 10 min read
DevOps Cloud Academy
Jan 25, 2021 · Cloud Native

Blackbox Monitoring with Prometheus Blackbox Exporter in Kubernetes

This guide explains how to complement Prometheus white‑box monitoring with black‑box probes by deploying the Blackbox Exporter in a Kubernetes cluster, configuring ConfigMaps, Deployments, Services, and Prometheus scrape jobs for HTTP, DNS, TCP, and ICMP checks, and using annotations for automatic service discovery.

Blackbox Exporter, Observability, Prometheus
0 likes · 10 min read
Top Architect
Dec 5, 2024 · Databases

Database Monitoring and Slow Query Log Management Guide

This article explains how database administrators can monitor system resource usage with commands like top, iostat, and vmstat, and configure MySQL slow query logging, including enabling the log, setting thresholds, viewing logs, and applying best‑practice recommendations for analysis and issue resolution.

Database Administration, Linux Commands, MySQL
0 likes · 8 min read
Top Architect
May 5, 2023 · Backend Development

Using Redis Sentinel for High Availability: Design and Implementation

This article introduces Redis Sentinel as the official high‑availability solution for Redis, explains its core functions, provides configuration examples, compares three ways to receive failover notifications (script, client subscription, and indirect service), and offers design recommendations for robust production deployments.

DevOps, Failover, High Availability
0 likes · 10 min read
DataFunSummit
Mar 22, 2024 · Artificial Intelligence

Risk Control Model Construction for Online Small Loans: Pre‑loan, In‑loan, Post‑loan and Monitoring

This article presents a comprehensive overview of risk control model building for online small‑loan scenarios, covering pre‑loan, in‑loan and post‑loan stages, the associated data pipelines, model deployment strategies, optimization attempts, and monitoring frameworks to ensure accuracy, stability and effectiveness.

credit scoring, data pipeline, loan management
0 likes · 16 min read
Deepin Linux
Feb 12, 2025 · Operations

Comprehensive Guide to Linux Server Fault Diagnosis and Troubleshooting

This article provides a detailed overview of common Linux server failures, a step‑by‑step methodology for fault isolation, practical monitoring tools and commands, and a real‑world case study illustrating diagnosis and remediation techniques for production environments.

Linux, Performance, Sysadmin
0 likes · 26 min read
Deepin Linux
Feb 26, 2024 · Operations

Linux System Performance Metrics and Monitoring Tools

This article explains the key Linux performance indicators—CPU, memory, disk I/O, file system, and network—describes how to monitor them with commands like top, vmstat, iostat, iotop, and smem, and provides practical guidance on interpreting the results to identify and resolve system bottlenecks.

CPU, Linux, Performance
0 likes · 70 min read
Java Captain
Feb 21, 2021 · Operations

Exposing Spring Boot Metrics with Prometheus and Visualizing Them in Grafana

This tutorial explains how to add Actuator and Prometheus dependencies to a Spring Boot application, configure security, expose metrics endpoints, run Prometheus and Grafana via Docker, and set up Grafana dashboards for real‑time monitoring of Spring Boot services.

Actuator, Docker, Grafana
0 likes · 4 min read
php中文网 Courses
Sep 27, 2024 · Backend Development

Developing Real-Time Monitoring Applications with PHP and WebSocket

This article explains how to build real-time monitoring applications using PHP and the WebSocket protocol, covering the fundamentals of WebSocket, setting up a Ratchet server, creating client-side JavaScript connections, and providing complete code examples such as a stock price monitor.

JavaScript, Ratchet, WebSocket
0 likes · 7 min read
FunTester
Jun 5, 2025 · Cloud Native

Automating Thread Dump Generation and Retrieval in Kubernetes for Efficient Fault Diagnosis

The article explains how to automate thread dump creation and download in Kubernetes using tools like Fabric8, Prometheus, and CI/CD pipelines. It argues that automation speeds up fault diagnosis, centralizes dump data, enables real-time capture, and integrates with testing frameworks, replacing manual, error-prone processes with streamlined operations.

Thread Dump, automation, ci/cd
0 likes · 6 min read
Lobster Programming
Mar 10, 2025 · Operations

How to Build a Complete SpringBoot Monitoring System with Prometheus and Grafana

This guide walks you through integrating SpringBoot with Prometheus and Grafana, covering dependency setup, YAML configuration, a test controller, Prometheus scrape jobs, and Grafana dashboard creation to achieve real‑time application monitoring and performance analysis.

Actuator, Grafana, Prometheus
0 likes · 7 min read
Architecture Development Notes
Mar 18, 2024 · Operations

Designing an Operations Platform: Architecture, Core Components, and Extensions

This article explains how an operations platform can automate and streamline IT management by detailing its core value, essential components such as CMDB, monitoring, automation tools, ticketing, and analytics, and outlining implementation steps, technology choices, and advanced extensions like AI and DevOps integration.

CMDB, DevOps, Platform Architecture
0 likes · 7 min read