Essential Linux Ops Practices: Prevent Disasters and Boost Stability

Drawing from three and a half years of Linux operations, this guide outlines practical standards for testing, confirming commands, avoiding concurrent edits, mandatory backups, data safety, security hardening, continuous monitoring, performance tuning, and the right mindset to keep production environments stable and secure.

Liangxu Linux

Introduction

The author shares personal incidents from three‑plus years of Linux operations—data loss, server hijacking, accidental deletions—and distills lessons into actionable best‑practice guidelines.

1. Online Operation Standards

Test before using: Try every change on a virtual machine first, and do not carry test-lab habits, such as relying on frequent snapshot restores, into production.

Confirm before pressing Enter: Mistyped commands such as rm -rf /var or reversed rsync paths can erase critical data instantly.

Avoid multi‑person edits: Simultaneous changes by several operators lead to configuration drift and confusion.

Backup before any change: Always copy configuration files (e.g., .conf) and keep database snapshots; a missing backup turns a simple mistake into a disaster.
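The backup-before-change rule above could be scripted roughly as follows. This is a minimal sketch rather than the author's own tooling: the function name backup_file and the timestamped .bak naming scheme are assumptions made for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_file(path: str, backup_dir: Optional[str] = None) -> Path:
    """Copy a file to a timestamped sibling before it is edited.

    Hypothetical helper: the naming scheme (<name>.<timestamp>.bak) is an
    assumption for illustration, not a convention from the article.
    """
    src = Path(path)
    if not src.is_file():
        raise FileNotFoundError(f"nothing to back up: {src}")

    # Default to the same directory so the copy is easy to find later.
    dest_dir = Path(backup_dir) if backup_dir else src.parent
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Second-level resolution: two backups within the same second
    # would overwrite each other.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.name}.{stamp}.bak"

    # copy2 also preserves modification time and permission bits.
    shutil.copy2(src, dest)
    return dest
```

Calling backup_file on a configuration file just before editing it leaves a dated copy to roll back to. Databases need their own dump or snapshot tooling, which this sketch does not cover.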

2. Data Handling

Use rm -rf with extreme caution: A single slip can delete entire databases; verify intent rigorously.

Prioritize backups: The author’s companies performed full backups every two hours (payment gateway) or every 20 minutes (loan platform).

Stability over speed: Prefer proven, stable software stacks in production; avoid untested versions of Nginx, PHP‑FPM, etc.

Confidentiality: Protect sensitive data and prevent leaks through proper access controls and network hardening.
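One way to build the "verify intent" habit from the rm -rf item above into tooling is a wrapper that refuses obviously dangerous targets and makes the operator retype the resolved path. The sketch below is illustrative only: guarded_rmtree and the PROTECTED list are hypothetical names, and the list would need adjusting for a real host.

```python
import shutil
from pathlib import Path

# Hypothetical deny-list of paths that must never be removed wholesale.
PROTECTED = {
    Path("/"), Path("/boot"), Path("/etc"),
    Path("/home"), Path("/usr"), Path("/var"),
}


def guarded_rmtree(target: str, typed_confirmation: str, dry_run: bool = True) -> str:
    """Delete a directory tree only after two checks pass.

    1. The resolved path is not on the deny-list (exact match only;
       subdirectories such as /var/lib are NOT covered by this sketch).
    2. The operator retyped the resolved path exactly.
    """
    # resolve() collapses "..", so "/var/../var" is seen as "/var".
    path = Path(target).resolve()

    if path in PROTECTED:
        raise ValueError(f"refusing to delete protected path: {path}")
    if typed_confirmation != str(path):
        raise ValueError("confirmation does not match the resolved path")

    # Default to a dry run so the first call only reports the intent.
    if dry_run:
        return f"would delete {path}"

    shutil.rmtree(path)
    return f"deleted {path}"
```

Defaulting to a dry run means the destructive call has to be requested explicitly, which mirrors the pause-before-Enter advice in section 1.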

3. Security Measures

Change the default SSH port, recognizing that a determined attacker can still find the new port with a scan.

Disable direct root login.

Adopt a normal‑user + key‑based authentication + sudo rules + IP restrictions model.

Deploy intrusion-prevention tools (e.g., DenyHosts, which adds offending addresses to /etc/hosts.deny) to block repeated brute-force attempts.

Audit the accounts listed in /etc/passwd, along with any other critical accounts.

Enable a firewall with a default‑deny policy, opening only required ports.

Apply least‑privilege principles: run services as non‑root users and limit permissions.

Use third-party intrusion detection and log monitoring to watch critical files (/etc/passwd, /etc/my.cnf, etc.) and log streams (/var/log/secure, /var/log/messages).
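To make the idea of watching critical files concrete, here is a minimal sketch of hash-based change detection. It is not a substitute for a real intrusion-detection product; the function names are hypothetical, and where the baseline is stored (ideally off the monitored host) is left to the reader.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def sha256_of(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths: Iterable[str]) -> Dict[str, str]:
    """Map each existing file to its current digest; missing files are skipped."""
    return {p: sha256_of(p) for p in paths if Path(p).is_file()}


def changed_files(baseline: Dict[str, str], current: Dict[str, str]) -> List[str]:
    """List files that were modified, added, or removed since the baseline."""
    all_paths = baseline.keys() | current.keys()
    return sorted(p for p in all_paths if baseline.get(p) != current.get(p))
```

A scheduled job could take a snapshot of files such as /etc/passwd, compare it with a stored baseline, and raise an alert whenever changed_files returns a non-empty list.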

4. Daily Monitoring

System health: Track CPU, memory, disk, network, and OS login activity.

Service health: Monitor web, database, and load‑balancer metrics to spot performance bottlenecks early.

Log monitoring: Collect hardware, OS, and application error logs and alert on them, so problems surface as soon as they arise.
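As a rough illustration of the system-health checks above, the sketch below samples disk usage and load average with the Python standard library and compares them against limits. The metric names and the threshold values in the usage note are assumptions, and os.getloadavg is only available on Unix-like systems. A production setup would normally use a dedicated monitoring stack instead.

```python
import os
import shutil
from typing import Dict, List


def collect_metrics(mount: str = "/") -> Dict[str, float]:
    """Sample disk usage for one mount point and the 1-minute load per CPU."""
    usage = shutil.disk_usage(mount)
    load_1min, _, _ = os.getloadavg()  # Unix-only
    return {
        "disk_used_pct": usage.used / usage.total * 100,
        "load_per_cpu": load_1min / (os.cpu_count() or 1),
    }


def breaches(metrics: Dict[str, float], limits: Dict[str, float]) -> List[str]:
    """Return one message per metric that exceeds its limit."""
    return [
        f"{name}={metrics[name]:.2f} exceeds {limit:.2f}"
        for name, limit in limits.items()
        if name in metrics and metrics[name] > limit
    ]
```

For example, breaches(collect_metrics(), {"disk_used_pct": 90.0, "load_per_cpu": 2.0}) returns an empty list on a healthy host and a list of messages to forward to an alerting channel otherwise.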

5. Performance Tuning

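Understand the underlying execution mechanisms of software before tweaking parameters (e.g., why Nginx's event-driven model often handles many concurrent connections more efficiently than Apache's process- or thread-per-connection model).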

Follow a tuning framework: diagnose the bottleneck, analyze logs, define a direction, then adjust OS/hardware before touching database configs.

Change only one parameter at a time to isolate effects.

Conduct benchmark tests to verify improvements and ensure they reflect real‑world workloads.
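The last two rules, changing one parameter at a time and benchmarking the result, can be sketched as a small harness. Everything here is illustrative: the function names are hypothetical, and the settings in the usage note are just example keys. A real benchmark should drive a workload that resembles production traffic.

```python
import statistics
import time
from typing import Any, Callable, Dict, List

_MISSING = object()  # distinguishes "key absent" from a value of None


def changed_keys(before: Dict[str, Any], after: Dict[str, Any]) -> List[str]:
    """List the settings whose values differ between two configurations."""
    keys = before.keys() | after.keys()
    return sorted(
        k for k in keys
        if before.get(k, _MISSING) != after.get(k, _MISSING)
    )


def median_runtime(workload: Callable[[], None], repeats: int = 5) -> float:
    """Run the workload several times and return the median wall-clock time.

    The median is less sensitive to a single slow outlier than the mean.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)
```

Before a tuning run, checking that changed_keys(old_config, new_config) contains exactly one entry, for instance ["worker_processes"], confirms that any difference in median_runtime can be attributed to that single change.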

6. Ops Mindset

Control emotions: Avoid making critical changes when stressed; pause if you feel rushed.

Take responsibility for data: Recognize that production data is not a playground; lack of backup has severe consequences.

Root‑cause analysis: After fixing an issue, investigate why it happened (e.g., OOM kills, MySQL bugs).

Separate test and production environments: Verify operations on test machines first, and limit the number of open terminals during critical tasks so a command is not typed into the wrong host.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Liangxu Linux

Liangxu, a self-taught IT professional now working as a Linux development engineer at a Fortune 500 multinational, shares extensive Linux knowledge covering fundamentals, applications, and tools, plus Git, databases, Raspberry Pi, and more.
