Master the Must‑Know Linux Commands Every Ops Engineer Needs
This comprehensive guide lists essential Linux commands for file handling, system monitoring, text processing, process control, network troubleshooting, compression, backup, security, and scripting, providing practical examples and interview tips to boost an operations engineer's efficiency and expertise.
1. File and Directory Operations
Basic commands for viewing and managing files include cat, more, less, head, and tail. Interviewers often ask how more and less differ: more is a simple pager built mainly for paging forward, while less scrolls in both directions, supports searching, and starts quickly on large files because it does not need to read the whole file before displaying it.
# View file content
cat /etc/passwd
more /var/log/messages
less /var/log/syslog
head -20 /var/log/nginx.log
tail -f /var/log/apache.log
Advanced file‑search commands include find, locate, which, and whereis for locating recent logs, large files, or binaries.
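The find tests used in this section can be tried safely in a scratch directory before pointing them at real data. The sketch below builds a throwaway tree (all file names are invented for the demonstration), backdates one file with GNU touch -d, and runs the same -mtime and -size tests on it.

```shell
# Sketch: exercise find's -mtime and -size tests on throwaway data.
# All file names below are made up for the demonstration.
demo_dir=$(mktemp -d)

# A fresh log, a log backdated by 30 days (GNU touch syntax), and a 2 MiB file.
echo "recent entry" > "$demo_dir/app.log"
echo "old entry" > "$demo_dir/old.log"
touch -d "30 days ago" "$demo_dir/old.log"
dd if=/dev/zero of="$demo_dir/big.bin" bs=1024 count=2048 2>/dev/null

# Logs modified within the last 7 days: only app.log qualifies.
recent_logs=$(find "$demo_dir" -name "*.log" -mtime -7)
echo "$recent_logs"

# Regular files larger than 1 MiB: only big.bin qualifies.
large_files=$(find "$demo_dir" -type f -size +1M)
echo "$large_files"

rm -rf "$demo_dir"
```

Note that find rounds sizes up to whole units before comparing, so -size +1M matches files that occupy more than one full MiB.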
# Find log files modified in the last 7 days
find /var/log -name "*.log" -mtime -7
# Find files larger than 100 MiB
find /home -type f -size +100M
# Locate a file quickly (run updatedb first, typically as root, to refresh the index)
updatedb
locate nginx.conf
# Find command path
which python3
whereis nginx
Permission management uses ls -la, chmod, chown, and chgrp, with special bits like sticky and SUID.
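To see how symbolic and octal modes relate, it can help to apply them to a scratch file and read the result back. This is a minimal sketch that assumes GNU stat (where -c '%a' prints the octal mode); the file and directory names are placeholders.

```shell
# Sketch: compare symbolic and octal chmod forms on scratch files.
# Assumes GNU stat (-c '%a' prints the octal mode).
perm_dir=$(mktemp -d)
touch "$perm_dir/script.sh"

chmod 644 "$perm_dir/script.sh"            # rw-r--r--
chmod u+x,g+r,o-w "$perm_dir/script.sh"    # adds x for the owner; g+r and o-w change nothing here
mode_symbolic=$(stat -c '%a' "$perm_dir/script.sh")

chmod 755 "$perm_dir/script.sh"            # rwxr-xr-x
mode_octal=$(stat -c '%a' "$perm_dir/script.sh")

# The sticky bit appears as a leading 1 in the octal mode (as on /tmp).
mkdir "$perm_dir/shared"
chmod 1777 "$perm_dir/shared"
mode_sticky=$(stat -c '%a' "$perm_dir/shared")

echo "$mode_symbolic $mode_octal $mode_sticky"   # prints: 744 755 1777
rm -rf "$perm_dir"
```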
# View permissions
ls -la /etc/passwd
# Change mode
chmod 755 /usr/local/bin/script.sh
chmod u+x,g+r,o-w filename
# Change owner and group
chown nginx:nginx /var/www/html
chgrp www-data /var/log/nginx/
# Set the sticky bit and SUID (u+s sets SUID only; a bare +s would also set SGID)
chmod +t /tmp
chmod u+s /usr/bin/passwd
2. System Monitoring and Performance Analysis
2.1 Resource Monitoring
Use top or htop for real‑time CPU/memory, ps for process details, and free or cat /proc/meminfo for memory stats.
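When memory figures are needed in a script rather than on screen, /proc/meminfo can be parsed directly. The helper below is a sketch, not a standard tool: the function name mem_available_pct is made up here, and it assumes a kernel that exposes the MemAvailable field.

```shell
# mem_available_pct [FILE]: print MemAvailable as a whole-number percentage
# of MemTotal. FILE defaults to /proc/meminfo; the parameter exists only so
# the function can be tried against a saved copy.
mem_available_pct() {
    awk '/^MemTotal:/     {total = $2}
         /^MemAvailable:/ {avail = $2}
         END {if (total > 0) printf "%d\n", avail * 100 / total}' "${1:-/proc/meminfo}"
}

mem_available_pct
```

MemAvailable is generally a better estimate of headroom than MemFree, because it accounts for page cache that the kernel can reclaim.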
# Real‑time system view
top
htop
# Filter processes
ps aux | grep nginx
ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%cpu | head -10
# Memory usage
free -h
cat /proc/meminfo
vmstat 1 5
2.2 Disk Space Management
Check usage with df, explore directory sizes with du, and monitor I/O via iostat and iotop.
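For threshold checks in scripts, the percentage column of df is easier to handle as a plain number. The following is a sketch with a hypothetical helper name (disk_usage_pct); it relies on df -P, the POSIX output format that keeps each filesystem on one line.

```shell
# disk_usage_pct PATH: print the Use% of the filesystem holding PATH,
# without the trailing % sign.
disk_usage_pct() {
    df -P "$1" | awk 'NR == 2 {sub(/%/, "", $5); print $5}'
}

# Example: warn when the root filesystem is at 80% or more.
root_pct=$(disk_usage_pct /)
if [ "$root_pct" -ge 80 ]; then
    echo "Warning: / is ${root_pct}% full"
fi
```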
# Disk usage
df -h
du -sh /var/log/*
du -ah /home | sort -rh | head -20
# I/O stats
iostat -x 1
iotop
2.3 Network Monitoring
Inspect listening ports with netstat or ss, list open files and sockets with lsof, watch live bandwidth with iftop or nethogs, and capture packets with tcpdump.
# Port listening
netstat -tulpn
ss -tulpn
lsof -i :80
# Traffic monitoring
iftop
nethogs
tcpdump -i eth0 port 80
3. Text Processing and Log Analysis
3.1 The "Three Musketeers" of Text Processing
grep searches for patterns, sed edits streams of text, and awk processes columns and fields.
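The division of labor between the three tools is easiest to see on a small file. This sketch writes an invented key=value config to a temporary file and applies one command from each tool; nothing here touches real configuration.

```shell
# Sketch: one grep, one sed, and one awk command on an invented config file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# listen settings
port=8080
# logging
level=DEBUG
workers=4
EOF

# grep: count the lines that are not comments.
setting_count=$(grep -vc '^#' "$sample")

# sed: drop comment lines and swap DEBUG for INFO, printing to stdout
# (without -i the file itself is left untouched).
cleaned=$(sed -e '/^#/d' -e 's/DEBUG/INFO/' "$sample")

# awk: split on "=" and print the key of every non-comment line.
keys=$(awk -F= '!/^#/ {print $1}' "$sample")

echo "$setting_count"
echo "$cleaned"
echo "$keys"
rm -f "$sample"
```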
# grep examples
grep -r "error" /var/log/
grep -i "failed" /var/log/auth.log
grep -v "INFO" /var/log/app.log | head -20
grep -E "192\.168\.1\.[0-9]+" access.log
# sed examples
sed 's/old/new/g' file.txt
sed -n '10,20p' file.txt
sed -i 's/DEBUG/INFO/g' config.conf
sed '/^#/d' config.conf
# awk examples
awk '{print $1}' /var/log/nginx/access.log
awk -F: '{print $1}' /etc/passwd
awk '$3 > 100 {print $0}' data.txt
3.2 Log Analysis in Practice
Combine awk, sort, uniq, and wc to find top IPs, count 404 errors, and summarize status codes.
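The field numbers in these pipelines depend on the log format. As a sketch, the sample below assumes the common access-log layout, where field 1 is the client IP, field 4 the timestamp, and field 9 the status code; the four log lines are invented test data.

```shell
# Sketch: run the analysis pipelines against four invented log lines.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 1024
10.0.0.2 - - [10/Oct/2024:13:56:01 +0000] "GET /missing HTTP/1.1" 404 153
10.0.0.1 - - [10/Oct/2024:14:02:10 +0000] "POST /login HTTP/1.1" 302 0
10.0.0.1 - - [10/Oct/2024:14:03:44 +0000] "GET /missing HTTP/1.1" 404 153
EOF

# Busiest client: count requests per IP, sort descending, keep the first IP.
top_ip=$(awk '{print $1}' "$log" | sort | uniq -c | sort -nr | head -1 | awk '{print $2}')

# Number of 404 responses, counted inside awk.
not_found=$(awk '$9 == 404 {n++} END {print n + 0}' "$log")

# Requests per hour: the hour is the second colon-separated part of field 4.
per_hour=$(awk '{print $4}' "$log" | cut -d: -f2 | sort | uniq -c | awk '{print $2 "=" $1}')

echo "$top_ip"      # 10.0.0.1
echo "$not_found"   # 2
echo "$per_hour"    # 13=2 and 14=2, one per line
rm -f "$log"
```

If a server uses a custom log format, the field numbers need to be adjusted to match it.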
# Top 10 IPs by request count
awk '{print $1}' access.log | sort | uniq -c | sort -nr | head -10
# Count 404 responses
awk '$9 == 404 {print $0}' access.log | wc -l
# Requests per hour (field 4 is the timestamp; its second colon-separated part is the hour)
awk '{print $4}' access.log | cut -d: -f2 | sort | uniq -c
# Status code frequencies
awk '{print $9}' access.log | sort | uniq -c | sort -nr
4. Process Management and Service Control
4.1 Process Management
Terminate or query processes with kill, killall, pkill, and pgrep. Use nohup, jobs, bg, and fg for background tasks.
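Because SIGKILL (kill -9) cannot be caught, a process that receives it gets no chance to flush data or remove lock files. A common pattern is to send SIGTERM first and escalate only if the process does not exit. The function below is a sketch of that pattern; the name stop_gracefully and the default timeout are choices made for this example.

```shell
# stop_gracefully PID [TIMEOUT]: send SIGTERM, wait up to TIMEOUT seconds
# (default 5), then fall back to SIGKILL if the process is still present.
stop_gracefully() {
    sg_pid=$1
    sg_timeout=${2:-5}
    kill -TERM "$sg_pid" 2>/dev/null || return 0    # no such process: nothing to stop
    while [ "$sg_timeout" -gt 0 ]; do
        kill -0 "$sg_pid" 2>/dev/null || return 0   # process is gone
        sleep 1
        sg_timeout=$((sg_timeout - 1))
    done
    kill -KILL "$sg_pid" 2>/dev/null                # last resort
    return 0
}

# Demo on a disposable background job.
sleep 300 &
demo_pid=$!
stop_gracefully "$demo_pid" 3
```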
# Kill a process (-9 sends SIGKILL, which cannot be caught; try a plain kill first)
kill -9 PID
killall nginx
pkill -f "python script"
pgrep -f nginx
# Background execution
nohup command &
jobs
bg %1
fg %1
4.2 Service Management (systemd)
Control services with systemctl and view logs via journalctl.
# Service actions
systemctl start nginx
systemctl stop nginx
systemctl restart nginx
systemctl reload nginx
systemctl enable nginx
systemctl disable nginx
systemctl status nginx
# Log inspection
journalctl -u nginx
journalctl -f -u nginx
5. Network Configuration and Troubleshooting
5.1 IP and Routing
Show interfaces and routes with ip or legacy ifconfig and route.
# IP address and routes
ip addr show
ip route show
ip link show
# Legacy commands
ifconfig eth0
route -n
arp -a
5.2 Connectivity Tests
Use ping, traceroute, mtr, telnet, and nc to verify reachability and ports.
# Basic connectivity
ping -c 4 google.com
traceroute google.com
mtr google.com
# Port checks
telnet 192.168.1.1 80
nc -zv 192.168.1.1 80
6. Compression and Backup
6.1 Archiving
Compress and extract with tar, zip, gzip, and gunzip.
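A backup is only useful if it restores cleanly, so it is worth practicing the full create, list, and extract cycle. The sketch below does this in temporary directories with an invented one-file site; the -C option makes tar change directory first, so the archive stores relative paths.

```shell
# Sketch: create, list, and restore an archive using scratch directories.
tar_src=$(mktemp -d)
tar_out=$(mktemp -d)
mkdir -p "$tar_src/site"
echo "hello" > "$tar_src/site/index.html"

# Create: -C switches to $tar_src first, so paths in the archive start at "site/".
tar -czf "$tar_out/site.tar.gz" -C "$tar_src" site

# List the contents without extracting.
listing=$(tar -tzf "$tar_out/site.tar.gz")

# Extract into a separate directory and read the restored file back.
mkdir "$tar_out/restore"
tar -xzf "$tar_out/site.tar.gz" -C "$tar_out/restore"
restored=$(cat "$tar_out/restore/site/index.html")

echo "$listing"
echo "$restored"
rm -rf "$tar_src" "$tar_out"
```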
# tar examples
tar -czf backup.tar.gz /var/www/
tar -xzf backup.tar.gz
tar -tzf backup.tar.gz
# Date‑stamped backup
tar -czf backup-$(date +%Y%m%d).tar.gz /etc/
# zip utilities
zip -r backup.zip /var/www/
unzip backup.zip
gzip file.txt
gunzip file.txt.gz
6.2 Data Synchronization
Synchronize directories with rsync, optionally excluding patterns.
# Basic sync
rsync -avz /var/www/ user@remote:/backup/
# Delete extraneous files on destination
rsync -avz --delete /var/www/ /backup/
# Exclude logs
rsync -avz --exclude='*.log' /var/www/ /backup/
7. System Security and User Management
7.1 User Management
Create, modify, and delete users with useradd, usermod, passwd, and userdel. Query information via id, who, w, and last.
# User operations
useradd -m -s /bin/bash username
usermod -aG sudo username
passwd username
userdel -r username
# View user info
id username
who
w
last
7.2 Security Checks
Monitor authentication logs, search for failed logins, and verify file integrity with md5sum and sha256sum.
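A checksum is most useful when it is recorded once and compared later. As a sketch of that workflow, the example below stores a baseline with sha256sum, verifies it with the -c option, and shows the check failing after the file changes. The manifest name SHA256SUMS is just a convention chosen for this example.

```shell
# Sketch: record a SHA-256 baseline, then detect a later modification.
sum_dir=$(mktemp -d)
echo "original content" > "$sum_dir/file.txt"

# Record the baseline (run inside the directory so the stored path is relative).
( cd "$sum_dir" && sha256sum file.txt > SHA256SUMS )

# Verification succeeds while the file is unchanged.
( cd "$sum_dir" && sha256sum -c SHA256SUMS ) >/dev/null 2>&1 && before=ok || before=changed

# After a modification, the same check fails.
echo "tampered" >> "$sum_dir/file.txt"
( cd "$sum_dir" && sha256sum -c SHA256SUMS ) >/dev/null 2>&1 && after=ok || after=changed

echo "$before $after"   # prints: ok changed
rm -rf "$sum_dir"
```

For security-sensitive checks, sha256sum is the safer choice of the two, since MD5 is no longer collision-resistant.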
# Log monitoring
tail -f /var/log/auth.log
grep "Failed password" /var/log/auth.log
grep "sudo" /var/log/auth.log
# File integrity
md5sum file.txt
sha256sum file.txt
8. Advanced Command Techniques
8.1 Pipelines and Compound Commands
Combine utilities to filter, sort, and act on data, e.g., killing nginx processes or extracting top request IPs.
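Pipelines that pass file names through xargs can misbehave when a name contains spaces, because xargs splits its input on whitespace by default. One way around this is to pass NUL-separated names, as in the sketch below. It uses invented file names and assumes GNU xargs for the -r option.

```shell
# Sketch: a find | xargs pipeline that survives spaces in file names.
pipe_dir=$(mktemp -d)
echo "an error occurred" > "$pipe_dir/app server.log"   # note the space in the name
echo "all good" > "$pipe_dir/clean.log"

# -print0 and -0 pass NUL-separated names, so the space stays part of the name;
# -r (GNU xargs) skips running grep when find produces no input.
matches=$(find "$pipe_dir" -name "*.log" -print0 | xargs -0 -r grep -l "error")
echo "$matches"

rm -rf "$pipe_dir"
```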
# Kill nginx processes
ps aux | grep nginx | grep -v grep | awk '{print $2}' | xargs kill -9
# Top 10 IPs by GET requests
cat /var/log/nginx/access.log | grep "GET" | awk '{print $1}' | sort | uniq -c | sort -nr | head -10
# Find log files containing "error" and list their details (-r skips ls when nothing matches)
find /var/log -name "*.log" -exec grep -l "error" {} \; | xargs -r ls -la
8.2 Scripting for Automation
A sample Bash script gathers system information, checks disk and memory thresholds, and logs warnings.
#!/bin/bash
LOG_FILE="/var/log/health_check.log"
DATE=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$DATE] Starting system health check" >> "$LOG_FILE"
# Disk usage warning: any filesystem at 80% or more
DISK_USAGE=$(df -h | grep -E "8[0-9]%|9[0-9]%|100%")
if [ -n "$DISK_USAGE" ]; then
    echo "[$DATE] Warning: High disk usage" >> "$LOG_FILE"
    echo "$DISK_USAGE" >> "$LOG_FILE"
fi
# Memory usage warning: used/total above 90% (requires bc)
MEM_USAGE=$(free | awk '/^Mem/ {printf "%.2f", $3/$2 * 100}')
if (( $(echo "$MEM_USAGE > 90" | bc -l) )); then
    echo "[$DATE] Warning: High memory usage: $MEM_USAGE%" >> "$LOG_FILE"
fi
# Load average warning: 1-minute load above 2.0
LOAD_AVG=$(uptime | awk -F'load average:' '{print $2}' | cut -d, -f1 | tr -d ' ')
if (( $(echo "$LOAD_AVG > 2.0" | bc -l) )); then
    echo "[$DATE] Warning: High load average: $LOAD_AVG" >> "$LOG_FILE"
fi
echo "[$DATE] System health check completed" >> "$LOG_FILE"
9. Interview Q&A Highlights
9.1 Performance Tuning
Check system load with uptime, cat /proc/loadavg, or w. Diagnose high CPU usage using top -p PID, strace -p PID, or perf top.
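A load average only means something relative to the number of CPUs: a load of 4 is heavy on a 2-core machine and light on a 32-core one. The sketch below normalizes the 1-minute load by the CPU count. The helper name load_per_cpu is invented, and nproc is assumed to be available (it ships with GNU coreutils).

```shell
# load_per_cpu LOAD CPUS: print LOAD divided by CPUS to two decimals.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN {printf "%.2f\n", load / cpus}'
}

# Live values: the first field of /proc/loadavg is the 1-minute load.
current_load=$(cut -d' ' -f1 /proc/loadavg)
cpu_count=$(nproc)
load_per_cpu "$current_load" "$cpu_count"
```

A result that stays well above 1.00 suggests more runnable (or I/O-blocked) tasks than the CPUs can serve.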
9.2 Storage Management
Identify large files with du -ah /var | sort -rh | head -20 or find /var -type f -size +100M -exec ls -lh {} \;. Monitor filesystem usage with df -h and watch changes via inotifywait -m /var/log/.
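When exact byte counts are preferred over human-readable sizes, find can print them directly. This is a sketch that assumes GNU find (for -printf); the function name largest_files is made up for the example.

```shell
# largest_files DIR [N]: print the N largest regular files under DIR
# (default 5), biggest first, as "SIZE_IN_BYTES PATH".
# Assumes GNU find, which provides -printf.
largest_files() {
    find "$1" -type f -printf '%s %p\n' 2>/dev/null | sort -nr | head -n "${2:-5}"
}

largest_files /var/log 5
```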
10. Practical Scenarios
10.1 Server Fault‑Isolation Workflow
Step‑by‑step checks: basic system info, process list, network listeners, and recent logs.
# System basics
uptime && free -h && df -h
# Process snapshot
ps aux | head -20
top -b -n 1 | head -20
# Network listeners
netstat -tulpn | grep LISTEN
ss -tulpn
# Log tail
tail -50 /var/log/messages
journalctl -xe
10.2 Daily Maintenance Script
A Bash script (shown in section 8.2) performs health checks and records warnings.
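Once the script has run for a while, its log can be summarized to spot recurring problems. The sketch below assumes the line format written by the section 8.2 script, "[YYYY-MM-DD HH:MM:SS] Warning: ...", and counts warnings per day. The function name warnings_per_day is hypothetical.

```shell
# warnings_per_day LOG: print "DATE COUNT" for each day that has warnings.
# Relies on the "[YYYY-MM-DD HH:MM:SS] Warning: ..." line format, where
# characters 2 through 11 of each line hold the date.
warnings_per_day() {
    grep 'Warning:' "$1" | cut -c2-11 | sort | uniq -c | awk '{print $2, $1}'
}

# Usage against the log path used in section 8.2, if it exists:
[ -r /var/log/health_check.log ] && warnings_per_day /var/log/health_check.log
```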
Conclusion
Mastering these Linux commands equips operations engineers to excel in interviews and, more importantly, to troubleshoot efficiently, automate routine tasks, and maintain secure, high‑performing systems.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
