Topic: monitoring

Collection size: 1767 articles
Open Source Linux
Mar 11, 2022 · Operations

Essential Linux Ops Tools: Monitoring, Performance, and Security Utilities

This article presents a curated list of practical Linux operations tools, including Nethogs, IOzone, IOTop, IPtraf, IFTop, HTop, NMON, MultiTail, Fail2ban, Tmux, Agedu, NMap, and Httperf. For each tool it covers the purpose, download link, installation commands, and basic usage, to help system administrators improve monitoring, performance testing, and security on Linux servers.

Linux · Operations · Performance
0 likes · 12 min read
Open Source Linux
Feb 10, 2022 · Operations

Unlock Linux Performance: Understanding Load, CPU Context Switches, and Memory Optimization

This guide explains Linux performance optimization by covering key metrics such as throughput, latency, load average, CPU context switching, and memory management, and demonstrates how to use built‑in tools like vmstat, pidstat, perf, and cachetop to diagnose and resolve bottlenecks.

CPU · Linux · Performance
0 likes · 48 min read
Open Source Linux
Jan 5, 2022 · Operations

Designing Scalable High‑Availability Prometheus Architectures

This article explains how to build both small‑scale and large‑scale high‑availability Prometheus setups using local and remote storage, federation, keepalived, and PostgreSQL + TimescaleDB adapters to ensure reliable monitoring and alerting across growing infrastructures.

Federation · High Availability · Prometheus
0 likes · 6 min read
Open Source Linux
Nov 28, 2021 · Operations

Boost Linux Server Performance: 20 Proven Optimization Techniques

This guide presents twenty practical Linux server optimization methods—from kernel elevator tuning and daemon reduction to TCP tweaks, secure backups, and effective monitoring commands—helping administrators enhance reliability, speed, and security while reducing resource consumption.

Linux · kernel · monitoring
0 likes · 14 min read
Open Source Linux
Nov 25, 2021 · Operations

How to Build a Full‑Stack Monitoring System with Prometheus, Grafana, and OneAlert

This guide walks you through installing Prometheus, configuring node_exporter and mysqld_exporter for remote Linux and MySQL monitoring, visualizing metrics with Grafana, and setting up multi‑level alerts using Grafana integrated with OneAlert for a robust 24/7 operations monitoring solution.

Alerting · Grafana · MySQL Exporter
0 likes · 10 min read
Open Source Linux
Nov 21, 2021 · Operations

Building a Scalable Prometheus Monitoring Stack with Thanos on Kubernetes

This article explains how to design and deploy a robust monitoring solution using Prometheus, Thanos, Pushgateway, and Alertmanager on Kubernetes, covering metric collection, naming conventions, query language, high‑availability strategies, and practical YAML configurations for a production‑grade observability platform.

Alertmanager · Observability · Prometheus
0 likes · 20 min read
Open Source Linux
Nov 14, 2021 · Databases

Essential Redis Monitoring Metrics Every Engineer Should Know

This guide outlines the key Redis monitoring metrics—including performance, memory, basic activity, persistence, and error indicators—explains their meanings, shows how to retrieve them with Redis commands, and provides practical tips for effective performance and health tracking.

Metrics · Performance · Persistence
0 likes · 6 min read
Open Source Linux
Oct 31, 2021 · Operations

Designing Effective Metrics: From Requirements to Labels and Buckets

This guide explains how to define, name, and organize monitoring metrics for reliable observability of diverse services. It covers Google's four golden signals, system‑specific measurement objects, vector selection, label conventions, bucket design, and practical Grafana tips.

Grafana · Metrics · Observability
0 likes · 10 min read
Open Source Linux
Oct 19, 2021 · Operations

Essential Ops Practices: Prevent Disasters with Backups, Security, and Monitoring

This guide shares practical Linux operations lessons—ranging from cautious command use, rigorous backup habits, and secure SSH configurations to comprehensive monitoring and performance tuning—to help teams avoid costly mistakes and maintain stable, reliable services.

Backup · Linux · Operations
0 likes · 12 min read
Open Source Linux
Oct 11, 2021 · Operations

10 Essential Ops Principles Every Engineer Should Follow

This article shares ten practical operations guidelines—from avoiding duplicated work and embracing mistakes to emphasizing monitoring, backup roles, clear division of labor, and continuous improvement—aimed at boosting reliability, efficiency, and team cohesion for both engineers and managers.

Operations · best practices · monitoring
0 likes · 10 min read
Open Source Linux
Sep 27, 2021 · Databases

Why Redis Became a Bottleneck: Diagnosing High CPU with Slowlog and Command Stats

A Monday‑morning surge in user traffic exposed a Redis performance crisis: CPU usage spiked to 100% because of a flood of KEYS * commands. The investigation, using Grafana, the Redis INFO command with its commandstats section, and SLOWLOG, revealed the root cause and led to a temporary mitigation strategy.

CommandStats · Operations · Performance
0 likes · 5 min read
Open Source Linux
Sep 27, 2021 · Operations

Step-by-Step Guide to Installing Zabbix 5 on CentOS 7

This article provides a comprehensive, hands‑on tutorial for installing and configuring Zabbix 5 on CentOS 7, covering system overview, key terminology, disabling SELinux and firewalls, setting up repositories, installing server, agent, frontend, MariaDB, database initialization, configuration tweaks, and final web‑UI setup.

CentOS · Installation · Operations
0 likes · 9 min read
Open Source Linux
Sep 6, 2021 · Operations

How to Diagnose Linux Server Issues in the First 60 Seconds with 10 Essential Commands

This article explains how Netflix's performance team uses ten standard Linux commands, built on nine tools (uptime, dmesg, vmstat, mpstat, pidstat, iostat, free, sar, and top, with sar run twice using different options), to quickly assess system health, resource saturation, and errors within the first minute of a performance incident.

Linux · Server · command line
0 likes · 18 min read
Open Source Linux
Aug 24, 2021 · Operations

Why Prometheus Became the Leading Cloud‑Native Monitoring Solution

This article explains how Prometheus evolved from an internal SoundCloud project, inspired by Google's Borgmon monitoring system, into a CNCF‑graduated, top‑ranked time‑series database and full‑stack monitoring ecosystem, detailing its history, core features, architecture, and the roles of its components such as Exporters, Pushgateway, Service Discovery, and Alertmanager.

Cloud Native · Observability · Prometheus
0 likes · 19 min read
Open Source Linux
Aug 26, 2021 · Cloud Native

Why Switch from Prometheus to Thanos? Boost Metric Retention & Cut Costs

This article explains the limitations of a traditional Prometheus‑based monitoring stack for Kubernetes, demonstrates how integrating Thanos improves metric retention, scalability, and storage cost, and provides a complete multi‑cluster deployment example with Terraform and Helm configurations.

Cloud Native · Observability · Prometheus
0 likes · 15 min read
Open Source Linux
Jul 27, 2021 · Operations

How to Effectively Locate and Debug Production Issues Using Logs and Remote Debugging

This guide walks beginners through understanding logs, using them for error tracing, applying monitoring and alerts, and performing remote debugging to quickly pinpoint and resolve production problems, emphasizing practical steps and best practices for reliable system maintenance.

Debugging · Operations · Troubleshooting
0 likes · 7 min read
Open Source Linux
Jun 3, 2021 · Operations

Master Kubernetes Capacity Planning: Detect & Optimize Unused Resources

This guide explains Kubernetes capacity planning, showing how to detect idle CPU and memory, identify wasteful namespaces, use open‑source tools like kube‑state‑metrics and cAdvisor, and apply PromQL queries to optimize resource requests and measure the impact of your improvements.

PromQL · capacity planning · kubernetes
0 likes · 10 min read
Open Source Linux
May 28, 2021 · Fundamentals

What Is SNMP? A Complete Guide to Versions, Architecture, and Operations

This article explains the Simple Network Management Protocol (SNMP), covering its three versions, system architecture—including NMS, agents, managed objects, and MIB—along with query, set, trap, and inform operations, message formats, security levels, and default UDP ports used for network management.

MIB · Network management · SNMP
0 likes · 16 min read
Open Source Linux
May 20, 2021 · Cloud Native

Stabilizing Unstable Kubernetes Clusters: CI/CD, Monitoring, Logging Blueprint

This article analyzes the root causes of a company's unstable Kubernetes clusters and presents a comprehensive solution covering a revamped CI/CD pipeline, federated monitoring and alerting, centralized logging, documentation practices, and clear traffic routing to achieve high reliability and stability.

CI/CD · DevOps · Operations
0 likes · 10 min read
Open Source Linux
May 17, 2021 · Cloud Native

Master Docker: From Fundamentals to Advanced Container Management

This comprehensive guide walks you through Docker's core concepts, installation on multiple operating systems, image handling, container lifecycle commands, building web services with Apache, Nginx, Python and MySQL, and advanced monitoring techniques using cAdvisor, Prometheus, Grafana, and Kubernetes, providing practical examples and command‑line snippets for each step.

Cloud Native · Containers · DevOps
0 likes · 49 min read