Tagged articles
29 articles
Ops Community
Apr 18, 2026 · Operations

Master Linux Host Monitoring: Prometheus, Node Exporter, Thresholds & Scripts

This comprehensive guide walks you through building a robust Linux host monitoring system with Prometheus and node_exporter, covering CPU, memory, disk, and network metrics, practical threshold formulas, ready‑to‑run Bash scripts, Alertmanager rules, Grafana dashboards, and best‑practice recommendations for reliable operations.

Alertmanager · Grafana · Linux monitoring
0 likes · 49 min read
MaGe Linux Operations
Aug 16, 2025 · Operations

How to Build a Real-Time Linux Performance Alert System

Discover why conventional monitoring often fails and learn to construct a robust, three‑layer Linux performance alert system using Prometheus, Grafana, and Alertmanager, with detailed metric definitions, smart alert rules, visual dashboards, predictive capacity planning, automation scripts, and best‑practice guidelines for reliable operations.

Grafana · Linux monitoring
0 likes · 13 min read
Selected Java Interview Questions
May 30, 2025 · Operations

Batch Installation of Node Exporter on Linux Hosts Using Ansible, JumpServer, and a Static File Server

This guide explains three practical methods for deploying the Prometheus node_exporter collector across large numbers of Linux servers—using a JumpServer with Ansible, a standalone Ansible playbook, or a custom Bash script combined with an internal static file server—complete with configuration, service setup, and integration into Consul and vmagent monitoring.

Ansible · Consul · Linux monitoring
0 likes · 10 min read
JavaEdge
Apr 2, 2023 · Operations

How Big Tech Analyzes System Performance: The Proven RESAR 7‑Step Method

The article presents the RESAR seven‑step performance analysis method used by large tech companies, detailing how to build a performance‑analysis decision tree, collect and correlate system counters, and combine global and targeted monitoring to uncover bottleneck evidence chains with concrete Linux commands and diagrams.

CPU profiling · Linux monitoring · RESAR Method
0 likes · 17 min read
MaGe Linux Operations
Dec 22, 2022 · Operations

Essential System Performance Metrics and Linux Server Monitoring Guide

This article explains key system performance testing metrics such as response time, concurrency, click‑through rate, throughput, TPS/QPS, PV/UV, and details essential Linux server indicators like CPU usage, memory utilization, load average, and disk I/O, providing formulas, interpretation guidelines, and useful command‑line tools.

CPU usage · Linux monitoring · Load Average
0 likes · 17 min read
Aikesheng Open Source Community
May 25, 2022 · Operations

Diagnosing High CPU Load Caused by Frequent Short‑Lived Processes in a MongoDB Environment Using execsnoop

The article describes how a MongoDB test environment on a single VM experienced persistent high CPU load despite low visible QPS, how the root cause was traced to thousands of short‑lived processes spawned by Zabbix monitoring, and how execsnoop was used to identify and eliminate the offending processes.

CPU load · Linux monitoring · MongoDB
0 likes · 6 min read
MaGe Linux Operations
Feb 3, 2022 · Operations

Master Linux Process Monitoring with htop: Install, Features, and Usage

This guide explains what htop is, highlights its user‑friendly features over the traditional top command, provides step‑by‑step installation instructions for major Linux distributions and from source, and shows how to navigate its interface to monitor and manage processes efficiently.

Installation · Linux monitoring · System Administration
0 likes · 6 min read
dbaplus Community
Oct 11, 2020 · Operations

Mastering CPU and Load: A Practical Guide to Linux Performance Troubleshooting

This article explains how to monitor and interpret CPU usage and load average on Linux servers, details the calculations behind these metrics, illustrates their meaning with examples and images, and provides step‑by‑step troubleshooting methods for high load, high CPU, and high load with low CPU scenarios.

CPU · Linux monitoring · Load Average
0 likes · 19 min read
MaGe Linux Operations
Jan 16, 2020 · Operations

How to Quickly Diagnose and Fix High CPU Usage in a Data Platform

This guide walks through a real‑world incident where a data platform’s CPU spiked to 98.94%, showing step‑by‑step how to identify the high‑load process, pinpoint the offending Java thread, analyze the root cause in the time‑utility code, and implement a performance‑focused solution that reduced load thirtyfold.

CPU troubleshooting · Java profiling · Linux monitoring
0 likes · 7 min read
Alibaba Cloud Developer
Dec 2, 2019 · Backend Development

Master Java Performance: Proven Strategies to Identify and Fix CPU, Memory, and I/O Bottlenecks

This article presents a comprehensive guide to Java performance optimization, covering common code pitfalls, CPU and memory analysis techniques, disk and network I/O troubleshooting, and a collection of essential Linux commands, enabling engineers to pinpoint and resolve critical bottlenecks efficiently.

CPU optimization · Java performance · Linux monitoring
0 likes · 24 min read
FunTester
Jul 15, 2019 · Operations

Installing and Localizing Netdata: A Real‑Time Linux Performance Monitoring Tool

This article explains how to install Netdata, a web‑based real‑time Linux performance monitoring tool, and provides a step‑by‑step guide to applying a Chinese localization script, including required dependencies, installation commands, and an overview of its key monitoring capabilities.

Installation · Linux monitoring · Netdata
0 likes · 5 min read
MaGe Linux Operations
Sep 18, 2018 · Operations

Essential Bash Scripting Tips & Practical Linux Monitoring Scripts

This guide presents essential Bash scripting best practices and a collection of practical Linux monitoring scripts, covering topics such as random string generation, colored output functions, batch user creation, package and service checks, host ping, CPU/memory/disk utilization monitoring, remote disk checks, and website availability testing.

Bash · Linux monitoring · Shell scripting
0 likes · 5 min read
Hujiang Technology
Dec 1, 2017 · Operations

Practical Guide to Performance Testing and Troubleshooting in Linux Environments

This article outlines a comprehensive, step‑by‑step approach to performance testing and root‑cause analysis for backend services, covering environment validation, tool selection, Linux system limits, dependency checks, empty‑endpoint verification, throughput calculation, log monitoring, and essential Linux commands such as netstat, vmstat, mpstat, iostat, top and free.

JMeter · Linux monitoring · Performance Testing
0 likes · 20 min read
MaGe Linux Operations
Jun 24, 2017 · Operations

Essential Bash Scripting Tips and Practical Monitoring Scripts

This guide provides concise Bash scripting best practices, code snippets for generating random strings, color output functions, bulk user creation, package and service checks, host liveness verification, resource utilization monitoring, and website availability testing, all useful for system administration and interview preparation.

Bash · Linux monitoring · Shell scripting
0 likes · 5 min read
Efficient Ops
Jan 4, 2017 · Information Security

How Deep Defense and Log Analysis Can Thwart Intrusions

This article explains Google’s BeyondCorp concept, the need for deep defense of internal and perimeter networks, and provides practical Linux scripts for monitoring processes, ports, command usage, system events, file changes, and SFTP activity to detect and mitigate host intrusions.

Deep Defense · Linux monitoring · host intrusion detection
0 likes · 10 min read
Java High-Performance Architecture
Mar 5, 2016 · Databases

Real‑Time Linux & MySQL Monitoring with OrzDBA

OrzDBA, a Perl‑based monitoring script from Taobao's DBA team, provides real‑time insight into Linux system metrics and MySQL performance indicators, offering commands to view load, CPU, swap, disk I/O, network traffic, and detailed MySQL statistics.

DBA tools · Linux monitoring · OrzDBA
0 likes · 4 min read
MaGe Linux Operations
Jul 11, 2014 · Operations

Master dstat: Real-Time Linux System Monitoring Made Easy

Learn how to install, configure, and use dstat—a versatile, Python‑based Linux monitoring tool that replaces vmstat, iostat, netstat, and more—covering its features, command‑line options, plugins, CSV export, and real‑time performance insights for effective system administration.

CLI · CSV export · Linux monitoring
0 likes · 9 min read