Master the Must‑Know Linux Commands Every Ops Engineer Needs

This comprehensive guide lists essential Linux commands for file handling, system monitoring, text processing, process control, network troubleshooting, compression, backup, security, and scripting, providing practical examples and interview tips to boost an operations engineer's efficiency and expertise.

Raymond Ops

1. File and Directory Operations

Basic commands for viewing and managing files include cat, more, less, head, and tail. Interviewers often ask about the difference between more and less: more only moves forward, while less supports forward and backward navigation, uses less memory, and offers search.

# View file content
cat /etc/passwd
more /var/log/messages
less /var/log/syslog
head -20 /var/log/nginx.log
tail -f /var/log/apache.log

Advanced file‑search commands include find, locate, which, and whereis for locating recent logs, large files, or binaries.

# Find log files modified in the last 7 days
find /var/log -name "*.log" -mtime -7
# Find files larger than 100 MiB
find /home -type f -size +100M
# Locate a file quickly (requires updatedb)
updatedb
locate nginx.conf
# Find command path
which python3
whereis nginx
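
The size and age tests are easy to try on a scratch directory before pointing them at /var or /home. The following is a minimal sketch, assuming GNU find and coreutils; the directory and file names are made up for illustration.

```shell
# Build a throwaway directory so no real system path is touched.
demo_dir=$(mktemp -d)
truncate -s 2M "$demo_dir/big.log"            # sparse 2 MiB file
echo "small" > "$demo_dir/small.log"
touch -d '10 days ago' "$demo_dir/old.log"    # empty file with an old mtime

# Files larger than 1 MiB (sizes are rounded up to whole MiB units)
large_files=$(find "$demo_dir" -type f -size +1M -printf '%f\n')
# *.log files modified within the last 7 days
recent_logs=$(find "$demo_dir" -name '*.log' -mtime -7 -printf '%f\n' | sort)

echo "large:"
echo "$large_files"
echo "recent:"
echo "$recent_logs"

rm -rf "$demo_dir"
```

Only big.log passes the size test, and old.log is the one file excluded by the age test.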

Permission management uses ls -la, chmod, chown, and chgrp, with special bits like sticky and SUID.

# View permissions
ls -la /etc/passwd
# Change mode
chmod 755 /usr/local/bin/script.sh
chmod u+x,g+r,o-w filename
# Change owner and group
chown nginx:nginx /var/www/html
chgrp www-data /var/log/nginx/
# Set sticky bit and SUID (u+s sets SUID only; a bare +s would also set SGID)
chmod +t /tmp
chmod u+s /usr/bin/passwd
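
To see how symbolic modes and special bits map to octal values, it can help to apply them to a temporary file and read the result back. This is a sketch assuming GNU coreutils (`stat -c`); the variable names are illustrative only.

```shell
demo_file=$(mktemp)
demo_dir=$(mktemp -d)          # mktemp -d creates the directory with mode 700

# Start from 644 (rw-r--r--), then adjust with symbolic clauses:
# u+x -> owner rwx (7), g+r -> group r (4), o-r -> others none (0)
chmod 644 "$demo_file"
chmod u+x,g+r,o-r "$demo_file"
mode_symbolic=$(stat -c '%a' "$demo_file")

# SUID shows up as a leading 4 in the octal mode
chmod u+s "$demo_file"
mode_suid=$(stat -c '%a' "$demo_file")

# The sticky bit shows up as a leading 1 on the directory
chmod +t "$demo_dir"
mode_sticky=$(stat -c '%a' "$demo_dir")

echo "symbolic: $mode_symbolic  suid: $mode_suid  sticky: $mode_sticky"
stat -c '%A %n' "$demo_file" "$demo_dir"   # human-readable form of the same bits

rm -rf "$demo_file" "$demo_dir"
```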

2. System Monitoring and Performance Analysis

2.1 Resource Monitoring

Use top or htop for real‑time CPU/memory, ps for process details, and free or cat /proc/meminfo for memory stats.

# Real‑time system view
top
htop
# Filter processes
ps aux | grep nginx
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu | head -10
# Memory usage
free -h
cat /proc/meminfo
vmstat 1 5
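
For scripts, a single used-memory percentage is often more convenient than the full free output. The sketch below parses the /proc/meminfo format with awk; the function name `mem_used_pct` and the sample values are hypothetical, and "used" is taken here to mean MemTotal minus MemAvailable.

```shell
# mem_used_pct FILE: print used memory as an integer percentage,
# computed as (MemTotal - MemAvailable) / MemTotal.
mem_used_pct() {
    awk '/^MemTotal:/     {total = $2}
         /^MemAvailable:/ {avail = $2}
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' "$1"
}

# Canned sample in /proc/meminfo format (values in kB, made up).
sample=$(mktemp)
cat > "$sample" <<'EOF'
MemTotal:        8000000 kB
MemFree:         1000000 kB
MemAvailable:    2000000 kB
EOF

mem_used_pct "$sample"
# On a live Linux host: mem_used_pct /proc/meminfo
```

With the sample values this reports 75, since 6,000,000 of 8,000,000 kB are not available.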

2.2 Disk Space Management

Check usage with df, explore directory sizes with du, and monitor I/O via iostat and iotop.

# Disk usage
df -h
du -sh /var/log/*
du -ah /home | sort -rh | head -20
# I/O stats
iostat -x 1
iotop

2.3 Network Monitoring

Inspect listening ports with netstat or ss, find which process holds a port with lsof, watch bandwidth with iftop or nethogs, and capture packets with tcpdump.

# Port listening
netstat -tulpn
ss -tulpn
lsof -i :80
# Traffic monitoring
iftop
nethogs
tcpdump -i eth0 port 80

3. Text Processing and Log Analysis

3.1 The "Three Musketeers" of Text Processing

grep searches, sed edits, and awk analyzes columns and patterns.

# grep examples
grep -r "error" /var/log/
grep -i "failed" /var/log/auth.log
grep -v "INFO" /var/log/app.log | head -20
grep -E "192\.168\.1\.[0-9]+" access.log
# sed examples
sed 's/old/new/g' file.txt
sed -n '10,20p' file.txt
sed -i 's/DEBUG/INFO/g' config.conf
sed '/^#/d' config.conf
# awk examples
awk '{print $1}' /var/log/nginx/access.log
awk -F: '{print $1}' /etc/passwd
awk '$3 > 100 {print $0}' data.txt
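
The three tools are usually chained rather than used alone. As a small sketch on a made-up key=value file (the file contents and variable names are assumptions for illustration), sed cleans the input, awk extracts a column, and grep counts matches:

```shell
conf=$(mktemp)
cat > "$conf" <<'EOF'
# sample settings (made up)
log_level=DEBUG

port=8080
# another comment
workers=4
EOF

# sed: drop comment lines and blank lines, rewrite DEBUG to INFO.
# Without -i the result goes to stdout and the file itself is unchanged.
cleaned=$(sed -e '/^#/d' -e '/^$/d' -e 's/DEBUG/INFO/' "$conf")

# awk: split each line on "=" and keep only the key
keys=$(printf '%s\n' "$cleaned" | awk -F= '{print $1}')

# grep: count lines in the original file that mention DEBUG
debug_count=$(grep -c 'DEBUG' "$conf")

printf '%s\n' "$cleaned"
rm -f "$conf"
```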

3.2 Log Analysis in Practice

Combine awk, sort, uniq, and wc to find top IPs, count 404 errors, and summarize status codes.

# Top 10 IPs by request count
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -10
# Count 404 responses
awk '$9 == 404 {print $0}' access.log | wc -l
# Requests per hour (the hour is the second colon-separated part of the timestamp field)
awk '{print $4}' access.log | cut -d: -f2 | sort | uniq -c
# Status code frequencies
awk '{print $9}' access.log | sort | uniq -c | sort -nr
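
These one-liners depend on the field positions of the access log, so it is worth checking them against a few known lines first. The sketch below assumes the common nginx/Apache layout where the client IP is field 1, the timestamp is field 4, and the status code is field 9; the log entries are invented.

```shell
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 1024
10.0.0.2 - - [10/Oct/2023:13:56:01 +0000] "GET /missing HTTP/1.1" 404 512
10.0.0.1 - - [10/Oct/2023:14:02:10 +0000] "POST /api HTTP/1.1" 200 256
10.0.0.3 - - [10/Oct/2023:14:05:44 +0000] "GET /missing HTTP/1.1" 404 512
10.0.0.1 - - [10/Oct/2023:14:07:00 +0000] "GET /index.html HTTP/1.1" 200 1024
EOF

# Busiest client: count per IP, sort by count, keep the IP of the top row
top_ip=$(awk '{print $1}' "$log" | sort | uniq -c | sort -nr | head -1 | awk '{print $2}')

# Number of 404 responses, counted inside awk
count_404=$(awk '$9 == 404 {n++} END {print n + 0}' "$log")

# Requests per hour, printed as hour=count
per_hour=$(awk '{print $4}' "$log" | cut -d: -f2 | sort | uniq -c | awk '{print $2 "=" $1}')

echo "top ip: $top_ip, 404s: $count_404"
echo "$per_hour"
rm -f "$log"
```

If a custom log_format moves the status code to another column, only the `$9` reference needs to change.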

4. Process Management and Service Control

4.1 Process Management

Terminate or query processes with kill, killall, pkill, and pgrep. Use nohup, jobs, bg, and fg for background tasks.

# Kill a process
kill -9 PID
killall nginx
pkill -f "python script"
pgrep -f nginx
# Background execution
nohup command &
jobs
bg %1
fg %1
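
kill -9 gives a process no chance to clean up, so a common pattern is to send SIGTERM first and escalate only if the process is still running after a grace period. The following is one possible sketch; `stop_gracefully` is a hypothetical helper name, and a `sleep` process stands in for a real service.

```shell
# stop_gracefully PID [TIMEOUT]: send SIGTERM, wait up to TIMEOUT seconds,
# and fall back to SIGKILL only if the PID still exists.
stop_gracefully() {
    local pid=$1
    local timeout=${2:-5}
    kill -TERM "$pid" 2>/dev/null || return 0       # already gone
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0      # exited after SIGTERM
        sleep 1
        timeout=$((timeout - 1))
    done
    kill -KILL "$pid" 2>/dev/null || true           # last resort
}

sleep 300 &                      # stand-in for a long-running service
target=$!
stop_gracefully "$target" 3
wait "$target" 2>/dev/null || true   # reap the child so its PID is released
```

`kill -0` sends no signal; it only tests whether the PID exists, which is why the loop can use it as a liveness check.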

4.2 Service Management (systemd)

Control services with systemctl and view logs via journalctl.

# Service actions
systemctl start nginx
systemctl stop nginx
systemctl restart nginx
systemctl reload nginx
systemctl enable nginx
systemctl disable nginx
systemctl status nginx
# Log inspection
journalctl -u nginx
journalctl -f -u nginx

5. Network Configuration and Troubleshooting

5.1 IP and Routing

Show interfaces and routes with ip or legacy ifconfig and route.

# IP address and routes
ip addr show
ip route show
ip link show
# Legacy commands
ifconfig eth0
route -n
arp -a

5.2 Connectivity Tests

Use ping, traceroute, mtr, telnet, and nc to verify reachability and ports.

# Basic connectivity
ping -c 4 google.com
traceroute google.com
mtr google.com
# Port checks
telnet 192.168.1.1 80
nc -zv 192.168.1.1 80

6. Compression and Backup

6.1 Archiving

Compress and extract with tar, zip, gzip, and gunzip.

# tar examples
tar -czf backup.tar.gz /var/www/
tar -xzf backup.tar.gz
tar -tzf backup.tar.gz
# Date‑stamped backup
tar -czf backup-$(date +%Y%m%d).tar.gz /etc/
# zip utilities
zip -r backup.zip /var/www/
unzip backup.zip
gzip file.txt
gunzip file.txt.gz
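
A backup is only useful if it restores cleanly, so it is worth pairing the create step with a list and a test extraction. This sketch works entirely inside a temporary directory; the paths and file contents are made up, and `-C` is used so the archive stores relative paths rather than absolute ones.

```shell
work=$(mktemp -d)
mkdir -p "$work/site/css"
echo "<h1>hello</h1>" > "$work/site/index.html"
echo "body {}" > "$work/site/css/main.css"

archive="$work/site-$(date +%Y%m%d).tar.gz"

# Create: -C switches to $work first, so entries are stored as site/...
tar -czf "$archive" -C "$work" site

# List contents without extracting
tar -tzf "$archive"

# Extract into a separate directory and compare with the source tree
mkdir "$work/restore"
tar -xzf "$archive" -C "$work/restore"
diff -r "$work/site" "$work/restore/site" && echo "restore matches source"
```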

6.2 Data Synchronization

Synchronize directories with rsync, optionally excluding patterns.

# Basic sync
rsync -avz /var/www/ user@remote:/backup/
# Delete extraneous files on destination
rsync -avz --delete /var/www/ /backup/
# Exclude logs
rsync -avz --exclude='*.log' /var/www/ /backup/

7. System Security and User Management

7.1 User Management

Create, modify, and delete users with useradd, usermod, passwd, and userdel. Query information via id, who, w, and last.

# User operations
useradd -m -s /bin/bash username
usermod -aG sudo username
passwd username
userdel -r username
# View user info
id username
who
w
last

7.2 Security Checks

Monitor authentication logs (/var/log/auth.log on Debian and Ubuntu; RHEL-family systems write to /var/log/secure), search for failed logins, and verify file integrity with md5sum and sha256sum.

# Log monitoring
tail -f /var/log/auth.log
grep "Failed password" /var/log/auth.log
grep "sudo" /var/log/auth.log
# File integrity
md5sum file.txt
sha256sum file.txt
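
Printing a checksum is half of an integrity check; the other half is comparing it later against a stored value. A minimal sketch, assuming GNU coreutils sha256sum (for the `--status` option) and an invented file name:

```shell
work=$(mktemp -d)
artifact="$work/artifact.bin"
echo "release build 1" > "$artifact"

# Record the checksum next to the file
sha256sum "$artifact" > "$artifact.sha256"

# --status prints nothing and reports the result through the exit code
if sha256sum -c --status "$artifact.sha256"; then
    first_check=ok
else
    first_check=failed
fi

# Simulate tampering, then verify again
echo "tampered" >> "$artifact"
if sha256sum -c --status "$artifact.sha256"; then
    second_check=ok
else
    second_check=failed
fi

echo "before change: $first_check, after change: $second_check"
rm -rf "$work"
```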

8. Advanced Command Techniques

8.1 Pipelines and Compound Commands

Combine utilities to filter, sort, and act on data, e.g., killing nginx processes or extracting top request IPs.

# Kill nginx processes
ps aux | grep nginx | grep -v grep | awk '{print $2}' | xargs kill -9
# Top 10 IPs by GET requests
cat /var/log/nginx/access.log | grep "GET" | awk '{print $1}' | sort | uniq -c | sort -nr | head -10
# Find log files containing "error" and list details (-r skips ls when nothing matches)
find /var/log -name "*.log" -exec grep -l "error" {} + | xargs -r ls -la
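
Pipelines that pass file names through xargs can break on names containing spaces. One way around this, sketched below with GNU find and xargs, is to pass names NUL-separated with `-print0` and `-0`; the log files here are invented for the demonstration.

```shell
logs=$(mktemp -d)
echo "error: disk full" > "$logs/app server.log"   # note the space in the name
echo "all good"         > "$logs/web.log"
echo "ERROR: timeout"   > "$logs/db.log"

# -print0 / -0 separate names with NUL bytes; -r skips grep when find matches nothing
matches=$(cd "$logs" && find . -name '*.log' -print0 \
    | xargs -0 -r grep -l -i 'error' \
    | sort)

echo "$matches"
rm -rf "$logs"
```

The file with a space in its name is reported as a single entry instead of being split into two arguments.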

8.2 Scripting for Automation

A sample Bash script checks disk, memory, and load-average thresholds and appends any warnings to a log file.

#!/bin/bash
LOG_FILE="/var/log/health_check.log"
DATE=$(date '+%Y-%m-%d %H:%M:%S')

echo "[$DATE] Starting system health check" >> $LOG_FILE

# Disk usage warning
DISK_USAGE=$(df -h | grep -E "8[0-9]%|9[0-9]%|100%")
if [ -n "$DISK_USAGE" ]; then
  echo "[$DATE] Warning: High disk usage" >> $LOG_FILE
  echo "$DISK_USAGE" >> $LOG_FILE
fi

# Memory usage warning
MEM_USAGE=$(free | awk '/Mem/ {printf "%.2f", $3/$2 * 100}')
if (( $(echo "$MEM_USAGE > 90" | bc -l) )); then
  echo "[$DATE] Warning: High memory usage: $MEM_USAGE%" >> $LOG_FILE
fi

# Load average warning
LOAD_AVG=$(uptime | awk -F'load average:' '{print $2}' | cut -d, -f1 | tr -d ' ')
if (( $(echo "$LOAD_AVG > 2.0" | bc -l) )); then
  echo "[$DATE] Warning: High load average: $LOAD_AVG" >> $LOG_FILE
fi

echo "[$DATE] System health check completed" >> $LOG_FILE
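
The disk check above matches percentage strings with a regular expression, which fixes the threshold at 80%. A possible variation, shown here as a sketch rather than part of the original script, compares the Capacity column numerically so the threshold becomes a parameter. The helper name `disks_over` and the sample devices are hypothetical, and the parsing assumes `df -P` output with mount points that contain no spaces.

```shell
# disks_over THRESHOLD: read `df -P` output on stdin and print
# "mountpoint percent" for every filesystem at or above THRESHOLD percent.
disks_over() {
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                    # "91%" -> "91"
        if (use + 0 >= limit + 0) print $6, use
    }'
}

# Canned df -P output (made-up devices) so the example is repeatable.
sample='Filesystem     1024-blocks     Used Available Capacity Mounted on
/dev/sda1         41152736 35391352   3647960      91% /
/dev/sdb1        103081248 41232499  56589325      43% /data
tmpfs              4005800  3204640    801160      80% /run'

printf '%s\n' "$sample" | disks_over 80
# On a live host: df -P | disks_over 80
```

With the sample data and a threshold of 80, the root filesystem and /run are reported and /data is not.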

9. Interview Q&A Highlights

9.1 Performance Tuning

Check system load with uptime, cat /proc/loadavg, or w. Diagnose high CPU usage using top -p PID, strace -p PID, or perf top.

9.2 Storage Management

Identify large files with du -ah /var | sort -rh | head -20 or find /var -type f -size +100M -exec ls -lh {} \;. Monitor filesystem usage with df -h and watch changes via inotifywait -m /var/log/.

10. Practical Scenarios

10.1 Server Fault‑Isolation Workflow

Step‑by‑step checks: basic system info, process list, network listeners, and recent logs.

# System basics
uptime && free -h && df -h
# Process snapshot
ps aux | head -20
top -bn 1 | head -20
# Network listeners
netstat -tulpn | grep LISTEN
ss -tulpn
# Log tail
tail -50 /var/log/messages
journalctl -xe

10.2 Daily Maintenance Script

A Bash script (shown in section 8.2) performs health checks and records warnings.

Conclusion

Mastering these Linux commands equips operations engineers to excel in interviews and, more importantly, to troubleshoot efficiently, automate routine tasks, and maintain secure, high‑performing systems.
