Topic: Monitoring

Collection size: 1711 articles
Page 85 of 86
Open Source Linux
Jan 13, 2025 · Operations

Key Lessons from 2024 Major Service Outages and How to Prevent Future Downtime

The article reviews major 2024 service outages—from Alibaba Cloud to OpenAI—highlights their root causes, and offers practical operations strategies such as disaster recovery, regular backups, load balancing, monitoring, performance tuning, and capacity planning to reduce future downtime.

Capacity Planning · Load Balancing · Monitoring
0 likes · 5 min read
Open Source Linux
Sep 19, 2024 · Operations

Mastering Linux Performance: From CPU/Memory Profiling to Flame Graphs

This guide explains how to systematically diagnose Linux performance issues using tools such as top, vmstat, perf, and flame graphs, covering CPU, memory, disk I/O, network, and load analysis, and demonstrates a real-world nginx case study with step‑by‑step commands and visualizations.

Linux · Monitoring · flame graphs
0 likes · 21 min read
Open Source Linux
Sep 13, 2024 · Operations

Essential Bash Scripts for Server Monitoring, Automation, and Security

This article presents a collection of practical Bash scripts that cover file consistency checks, scheduled log management, network traffic monitoring, numeric analysis, FTP downloads, user input handling, Nginx 502 detection, variable assignments, bulk file renaming, text processing, port scanning, word filtering, command menus, SSH automation with Expect, user creation, Apache monitoring, password rotation, iptables rate‑limiting, and IP validation, providing sysadmins with ready‑to‑use solutions for everyday Linux operations.

Linux · Monitoring · Security
0 likes · 25 min read
Open Source Linux
Aug 23, 2024 · Operations

10 Proven Ops Practices to Prevent System Failures

This article shares ten practical operations strategies—including change rollbacks, safe handling of destructive commands, prompt customization, rigorous backup and verification, production environment discipline, careful handovers, robust alerting, cautious automatic failover, meticulous checks, and simplicity—to dramatically improve system reliability and availability.

Linux · Monitoring · MySQL
0 likes · 17 min read
Open Source Linux
Aug 5, 2024 · Operations

How to Manage Over 10,000 Network Devices with Systematic, Automated Operations

This guide outlines a comprehensive, automated strategy for operating more than ten thousand network devices, covering asset documentation, topology planning, unified monitoring, automation scripts, emergency response, security management, regular maintenance, staff training, and visual management tools.

Monitoring · Network Operations · Security
0 likes · 6 min read
Open Source Linux
Aug 1, 2024 · Operations

Top 10 Essential Ops Tools Every Engineer Should Master

This article introduces ten indispensable tools for operations engineers, detailing each tool's functionality, ideal use cases, key advantages, and practical examples, while also providing code snippets and visual illustrations to help readers understand and apply them effectively.

Configuration Management · Infrastructure · Monitoring
0 likes · 8 min read
Open Source Linux
Jun 27, 2024 · Operations

Comprehensive Guide to Building a Resilient, High‑Performance Web Infrastructure

This guide outlines essential steps for creating a robust, high‑availability website architecture, covering domain acquisition, DNS management, CDN deployment, image caching, data center selection, monitoring, DDoS mitigation, redundancy, server configuration, database replication, testing environments, security practices, and operational tooling.

DDoS protection · Monitoring · Operations
0 likes · 12 min read
Open Source Linux
Apr 11, 2024 · Operations

7 Practical Linux Performance Optimization Techniques Every Engineer Should Know

This article consolidates community‑sourced Linux performance optimization practices, covering key factors that affect system speed, rapid troubleshooting steps for CPU, memory, disk and network issues, load‑analysis methods, top‑resource identification commands, memory‑stat nuances, swap usage scenarios, and detailed TCP tuning recommendations.

Linux · Monitoring · Optimization
0 likes · 20 min read
Open Source Linux
Dec 8, 2023 · Operations

Top 5 Log Management Tools Every DevOps Engineer Should Know

This article reviews five leading log management solutions—Graylog, LogDNA, ELK Stack, Grafana Loki, and Splunk—detailing their core components, key features, and why they are valuable for monitoring, troubleshooting, and securing modern IT environments.

DevOps · ELK Stack · Grafana Loki
0 likes · 7 min read
Open Source Linux
Dec 1, 2023 · Operations

10 Essential Ops Tools Every Engineer Should Master

This article introduces ten indispensable tools for operations engineers, detailing each tool's functionality, suitable scenarios, advantages, and real‑world examples, and includes practical code snippets to help automate, monitor, and manage infrastructure efficiently.

Infrastructure · Monitoring · Operations
0 likes · 8 min read
Open Source Linux
Nov 24, 2023 · Operations

Master Essential Linux Shell Commands for File Management, Monitoring, and Automation

This guide presents a collection of practical Linux shell commands and scripts for locating and moving files, batch extracting archives, using sed for text manipulation, checking directories, monitoring disk usage, analyzing logs, and configuring firewalls, all illustrated with clear examples and explanations.

File Management · Linux · Monitoring
0 likes · 9 min read
Open Source Linux
Jul 27, 2023 · Operations

17 Essential Linux Ops Tricks to Boost Your Productivity

This article compiles seventeen practical Linux administration techniques—from batch file handling and directory checks to log analysis, disk monitoring, firewall rules, and network capture—each illustrated with ready‑to‑run shell commands and concise explanations for sysadmins.

Linux · Monitoring · Sysadmin
0 likes · 8 min read
Open Source Linux
Jul 4, 2023 · Operations

Master Redis Monitoring, Migration, and Cluster Management with Prometheus and CacheCloud

This guide walks through essential Redis operations, covering real‑time monitoring with the INFO command and Prometheus‑compatible exporters, data migration using Redis‑shake, consistency verification via Redis‑full‑check, and comprehensive cluster management with CacheCloud, providing practical tools for reliable Redis administration.

Monitoring · Operations · Prometheus
0 likes · 11 min read
Open Source Linux
Jun 30, 2023 · Cloud Native

Essential Kubernetes Tools to Boost Your DevOps Workflow

This article reviews a curated set of open‑source Kubernetes tools—including Helm, Flagger, Kubewatch, Gitkube, kube‑state‑metrics, Kamus, Untrak, Scope, Dashboard, Kops, cAdvisor, Kubespray, K9s, Kubetail, PowerfulSeal, and Popeye—that enhance management, security, monitoring, and deployment within DevOps pipelines.

Cloud Native · Deployment · DevOps
0 likes · 11 min read
Open Source Linux
Jun 21, 2023 · Cloud Native

From Monolith to Microservices: A Real‑World Journey and Lessons Learned

An online supermarket startup evolves its simple monolithic website into a fully distributed microservice architecture. The article details each transformation stage, the challenges encountered (code duplication, database bottlenecks, deployment complexity), and solutions such as service decomposition, monitoring, tracing, circuit breaking, and a service mesh.

Distributed Systems · Microservices · Monitoring
0 likes · 23 min read
Open Source Linux
May 5, 2023 · Operations

Essential Ops Lessons from 3.5 Years of Real-World Crises

Drawing from three and a half years of operations work, this article shares hard‑earned best practices on testing, backups, security, monitoring, performance tuning, and the right mindset to avoid costly mistakes such as data loss, service outages, and security breaches.

Linux · Monitoring · Operations
0 likes · 12 min read
Open Source Linux
Mar 31, 2023 · Operations

Boost Your Ops Efficiency: 5 Python Scripts Every Engineer Should Know

This article explains how Python can automate common operations tasks—remote command execution, log parsing, system monitoring with alerts, batch software deployment, and backup/recovery—providing code examples and practical tips to improve efficiency and reduce manual errors.

Deployment · Monitoring · Python
0 likes · 9 min read
Open Source Linux
Mar 9, 2023 · Operations

Prometheus vs Zabbix: Which Monitoring Tool Wins for Modern Ops?

An in‑depth comparison of Prometheus and Zabbix examines their histories, architectures, data storage, scalability, and container support, highlighting Prometheus’s cloud‑native pull model and Go‑based performance versus Zabbix’s mature, relational‑database approach, to help teams choose the right monitoring solution.

Cloud Native · Monitoring · Operations
0 likes · 8 min read
Open Source Linux
Dec 13, 2022 · Operations

Master Zabbix 6.2: New Features & Step‑by‑Step Deployment on CentOS 8

This guide introduces Zabbix 6.2's latest features, including problem suppression, CyberArk vault integration, AWS EC2 templates, and enhanced proxy management, then provides a comprehensive command‑line walkthrough for installing and configuring Zabbix 6.2 on a CentOS 8 server, covering prerequisites, database setup, web UI, and service startup.

CentOS8 · DevOps · Installation
0 likes · 28 min read
Open Source Linux
Dec 8, 2022 · Operations

Master Prometheus: From Metrics Collection to Alerting and Visualization

Prometheus is an open‑source monitoring solution that covers metric exposition, scraping, storage, querying, visualization, and alerting, and this guide walks through its architecture, configuration, custom exporters, PromQL queries, Grafana integration, and alert management, providing a comprehensive introduction for developers and ops engineers.

Alerting · Exporter · Grafana
0 likes · 22 min read