5 Game-Changing One-Liner Shell Commands Every Ops Engineer Should Know
This article shares five practical one‑line Shell commands—covering bulk health checks, rapid log analysis, process ranking, network diagnostics, and precise disk cleanup—each explained with its scenario, inner workings, and real‑world performance impact for production environments.
As an experienced operations engineer, the author presents five battle‑tested one‑line Shell commands that can resolve critical incidents quickly, turning complex troubleshooting into simple data‑flow pipelines.
Weapon 1: Bulk Server Health‑Check
Scenario: At 3 am an alarm fires and 100 servers must be inspected.
for ip in $(cat servers.txt); do echo -n "$ip: "; timeout 5 ssh -o ConnectTimeout=3 "$ip" "uptime | awk -F'load average: ' '{print \"Load:\", \$2}' && free -h | grep Mem | awk '{print \"Mem:\", \$3\"/\"\$2}' && df -h / | tail -1 | awk '{print \"Disk:\", \$5}'" 2>/dev/null || echo "UNREACHABLE"; done
timeout 5 prevents SSH from hanging.
ConnectTimeout=3 skips unresponsive nodes.
The remote command collects load, memory and disk usage in a single pass. The \$ escapes keep the local shell from expanding awk's field references before the command reaches the remote host.
2>/dev/null silences error output.
Effect: Health check of 100 servers finishes in about 3 minutes, roughly 20 times faster than traditional scripts.
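The remote half of this one-liner is three small awk filters, and they can be tried locally against captured output before being sent over SSH. The sketch below is a minimal illustration under that assumption; the function names (parse_load, parse_mem, parse_disk) are hypothetical and not part of the original command.

```shell
# Hypothetical helpers: each reads the output of one tool on stdin.

# uptime output -> everything after "load average: "
parse_load() { awk -F'load average: ' '{print "Load:", $2}'; }

# free -h output -> used/total taken from the "Mem:" row
parse_mem() { awk '/^Mem/ {print "Mem:", $3 "/" $2}'; }

# df -h / output -> Use% column of the last row
parse_disk() { awk '{use = $5} END {print "Disk:", use}'; }

# Typical use on a live host (not executed here):
#   uptime | parse_load
#   free -h | parse_mem
#   df -h / | parse_disk
```

Checking the filters on saved output first makes it easier to spot a wrong field number than debugging through nested SSH quoting.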
Weapon 2: Instant Log‑Analysis for Slow Requests
Scenario: Application response is slow; need to pinpoint abnormal requests in massive logs.
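tail -f /var/log/nginx/access.log | awk '$10 > 5000 {print strftime("%H:%M:%S"), $1, $7, $10"ms", $9; fflush()}' | while IFS= read -r line; do echo -e "\033[31m$line\033[0m"; done
$10 > 5000 filters requests taking more than 5 seconds. This assumes a custom log_format in which field 10 is the request time in milliseconds; in nginx's default combined format, field 10 is the response size in bytes.
strftime("%H:%M:%S") adds a real-time timestamp. It is a gawk extension, so it may be missing from other awk implementations.
fflush() pushes each match through the pipe immediately instead of waiting for awk's output buffer to fill.
\033[31m highlights anomalies in red.
Processes logs continuously without waiting for rotation.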
Effect: From gigabytes of logs the tool extracts slow requests instantly, cutting average fault‑location time from 30 minutes to about 2 minutes.
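The filter stage can be checked on its own, without tail -f, by feeding it a few lines. This is a minimal sketch: the function name slow_requests and the two log lines are made up for illustration, and it assumes the same custom log layout as above (field 10 is the request time in milliseconds).

```shell
# slow_requests THRESHOLD_MS: read access-log lines on stdin and print
# client IP, path, duration and status for requests slower than the threshold.
# Assumption: field 10 holds the request time in milliseconds.
slow_requests() {
    local threshold=${1:-5000}
    awk -v threshold="$threshold" '$10 + 0 > threshold {print $1, $7, $10 "ms", $9}'
}

# Example run against two made-up log lines; only the first exceeds 5000 ms.
slow_requests 5000 <<'EOF'
10.0.0.1 - - [01/Jan/2024:03:00:00 +0000] "GET /api/slow HTTP/1.1" 200 7342
10.0.0.2 - - [01/Jan/2024:03:00:01 +0000] "GET /api/fast HTTP/1.1" 200 120
EOF
```

Passing the threshold with -v keeps the awk program in single quotes, so no shell escaping of $10 is needed.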
Weapon 3: Process Resource Ranking
Scenario: CPU spikes on a server; need to find the biggest resource consumers.
ps aux --sort=-%cpu,-%mem | awk 'NR<=11{printf "%-8s %-6s %-6s %-10s %s\n", $1, $3"%", $4"%", $2, $11}' | column -t
--sort=-%cpu,-%mem orders processes by CPU usage, then memory usage, both descending. Each key needs its own minus sign; without it, the memory key sorts ascending.
NR<=11 shows the top 10 processes plus header.
printf formats output for readability.
column -t aligns columns automatically.
Effect: Identifies resource‑hungry processes in seconds, eliminating the need for repeated top inspections.
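The awk stage only trims and reorders columns, so it can be exercised with saved ps output. A minimal sketch, assuming ps aux-style input on stdin; the function name top_rows is hypothetical.

```shell
# top_rows N: keep the header plus the first N process rows of
# `ps aux`-style input and print user, %CPU, %MEM, PID and command.
top_rows() {
    local n=${1:-10}
    awk -v n="$n" 'NR <= n + 1 {printf "%-8s %-6s %-6s %-10s %s\n", $1, $3, $4, $2, $11}'
}

# Typical use on a live host (not executed here):
#   ps aux --sort=-%cpu,-%mem | top_rows 10
```

Because the input is already sorted by ps, the function only has to count rows; the header occupies row 1, which is why the cutoff is n + 1.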
Weapon 4: Quick Network Connection Diagnosis
Scenario: Application connections surge; need to analyze connection‑state distribution.
netstat -an | awk '/^tcp/ {++state[$6]} END {for(key in state) printf "%-12s %s\n", key, state[key]}' | sort -k2 -nr
Counts all TCP connection states.
++state[$6] increments the counter for the state column.
sort -k2 -nr orders states by number of connections.
Provides a one-liner alternative to complex scripts.
Effect: Instantly reveals connection distribution, helping detect leaks or DDoS attacks.
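The counting logic is an awk associative array keyed by the state column. The sketch below wraps it in a function that reads netstat -an style text from stdin, so it can be tried on a saved capture; the function name count_states is an assumption for illustration.

```shell
# count_states: tally the sixth column (connection state) of every line
# that starts with "tcp" and print the states, most frequent first.
count_states() {
    awk '/^tcp/ {++state[$6]} END {for (key in state) printf "%-12s %s\n", key, state[key]}' |
        sort -k2 -nr
}

# Typical use on a live host (not executed here):
#   netstat -an | count_states
```

Lines for other protocols (for example udp) are skipped by the /^tcp/ pattern, while tcp6 lines are counted because they also start with "tcp".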
Weapon 5: Precise Disk‑Space Cleanup
Scenario: Disk space is critically low; need to locate and safely delete large files.
find /var/log -type f -size +100M -exec ls -lh {} + | awk '{print $5, $9}' | sort -hr | head -20 | while read -r size file; do echo "$size $file"; read -p "Delete? (y/N): " answer </dev/tty; [[ $answer == "y" ]] && rm "$file" && echo "Deleted: $file"; done
find locates files larger than 100 MB.
ls -lh shows human-readable sizes.
sort -hr orders files by size descending.
Interactive confirmation prevents accidental mass deletion. The prompt reads from /dev/tty because the loop's standard input is the pipe carrying the file list; without the redirect, read -p would consume the next file name as the answer.
Effect: Accurately targets space‑hogs while ensuring safety, avoiding risky bulk deletions.
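Before deleting anything, it can help to run the search half alone as a dry run. This is a minimal sketch of that idea, not the author's command: the function name large_files and its parameters are hypothetical, and it reports allocated space via du -k rather than the ls -lh sizes used above.

```shell
# large_files DIR MIN_KB: list regular files under DIR larger than MIN_KB
# kibibytes, biggest first, as "<allocated-KB> <path>". Nothing is deleted.
large_files() {
    local dir=${1:-/var/log}
    local min_kb=${2:-102400}    # 102400 KiB = 100 MiB
    find "$dir" -type f -size +"${min_kb}k" -exec du -k {} + 2>/dev/null |
        sort -rn |
        head -20
}

# Typical use (not executed here):
#   large_files /var/log 102400
```

Reviewing this list first makes the interactive pass shorter and reduces the chance of answering "y" to a file that is still needed.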
Ops Mindset: Core Principles
Pipe Thinking
One‑liner Shell commands embody pipeline thinking: each command is a processing node, allowing complex problems to be broken into simple data‑flow steps.
Error Handling
Use timeout to avoid hangs.
Redirect errors with 2>/dev/null.
Provide fallback logic with ||.
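The three rules above combine into one small pattern. A minimal sketch using local commands instead of SSH; the function name guarded is hypothetical, and the "UNREACHABLE" marker is borrowed from the health-check one-liner.

```shell
# guarded LIMIT CMD...: run CMD with a time limit, hide its stderr,
# and print a fallback marker if it fails or runs past the limit.
guarded() {
    local limit=$1
    shift
    timeout "$limit" "$@" 2>/dev/null || echo "UNREACHABLE"
}

guarded 2 echo "ok"    # command succeeds, so its own output is printed
guarded 1 sleep 3      # command exceeds the limit, so the fallback is printed
```

The || branch fires on any non-zero exit status, so a timeout and an ordinary failure look the same to the caller; that is acceptable for a quick triage pass but hides the cause.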
Performance Optimization
Prefer built‑in utilities (awk, grep, sed).
Avoid unnecessary process creation.
Leverage caching and pipelines wisely.
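As one illustration of avoiding unnecessary processes, a grep that feeds awk can usually be folded into awk's own pattern. The sketch below compares the two forms on a made-up free -h sample; the function names two_step and one_step are hypothetical.

```shell
# Made-up sample of `free -h` output.
sample='               total        used        free
Mem:            15Gi       6.2Gi       1.1Gi
Swap:          2.0Gi          0B       2.0Gi'

# Two processes: grep selects the row, awk formats it.
two_step() { grep Mem | awk '{print $3 "/" $2}'; }

# One process: awk both selects and formats.
one_step() { awk '/Mem/ {print $3 "/" $2}'; }

# Both print the same used/total pair for the Mem row.
printf '%s\n' "$sample" | two_step
printf '%s\n' "$sample" | one_step
```

The saving per call is small, but it adds up when the pipeline runs once per host across a large fleet.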
Advanced Refinement Guide
Tip 1: Alias for Speed
alias healthcheck='for ip in $(cat servers.txt); do echo -n "$ip: "; timeout 5 ssh -o ConnectTimeout=3 "$ip" "uptime | awk -F'\''load average: '\'' '\''{print \"Load:\", \$2}'\'' && free -h | grep Mem | awk '\''{print \"Mem:\", \$3\"/\"\$2}'\'' && df -h / | tail -1 | awk '\''{print \"Disk:\", \$5}'\''" 2>/dev/null || echo "UNREACHABLE"; done'
Tip 2: Parameterized Function
log_monitor() {
    local threshold=${1:-5000}
    tail -f /var/log/nginx/access.log | awk -v threshold="$threshold" '$10 > threshold {print strftime("%H:%M:%S"), $1, $7, $10"ms", $9}'
}
Tip 3: Persist Results
echo "=== Health Check $(date) ===" >> health_report.log
# Execute health-check command and append output to the log
Maintain a personal toolbox of these snippets, understand each flag's purpose, test in staging before production, and continuously tune parameters to match real-world workloads.
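For the persist-results tip, one possible shape is a wrapper that writes the dated header and then appends whatever the wrapped command prints. This is a minimal sketch; the function name persist_run is hypothetical, and the header text is copied from the echo line in Tip 3.

```shell
# persist_run LOGFILE CMD...: append a dated header and CMD's output
# (stdout and stderr) to LOGFILE while still showing it on the terminal.
persist_run() {
    local logfile=$1
    shift
    echo "=== Health Check $(date) ===" >> "$logfile"
    "$@" 2>&1 | tee -a "$logfile"
}

# Typical use (not executed here):
#   persist_run health_report.log df -h /
```

The wrapped command must be a real command or shell function; an alias such as healthcheck is not expanded when passed this way, so it would need to be turned into a function first.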
Repository references (technical): https://github.com/raymond999999 https://gitee.com/raymond9
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
