10 Essential Linux Commands Every Sysadmin Must Master
This guide walks system administrators through the most frequently used Linux commands (top/htop, df/du, free, ss/netstat/ping/traceroute, ps/kill, grep/sed/awk, tail/less, uname/hostname/uptime, who/w/last, and tar/rsync), explaining core options, output interpretation, common pitfalls, and practical troubleshooting scenarios.
1. Process and Resource Monitoring: top / htop
1.1 top command
top is the basic real‑time process monitor; it refreshes every 3 seconds by default and shows overall resource usage and a process list.
# Basic usage: run directly, press q to quit
top
# Common options
top -d 1 # refresh every second (default 3 s)
top -p 1234 # monitor a specific PID
top -u www-data # show processes of a specific user
top -b -n 5 # batch mode, exit after 5 updates (useful in scripts)
Typical output parsing:
top - 14:23:45 up 12 days, 3:22, 2 users, load average: 0.52, 0.48, 0.41
Tasks: 142 total, 1 running, 141 sleeping, 0 stopped, 0 zombie
%Cpu(s): 5.2 us, 2.1 sy, 0.0 ni, 92.1 id, 0.0 wa, 0.0 hi, 0.6 si, 0.0 st
MiB Mem : 18272.4 total, 8456.2 free, 7234.1 used, 2582.1 buff/cache
MiB Swap: 4096.0 total, 4096.0 free, 0.0 used. 9982.3 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12345 root 20 0 2.423g 345.2m 12.3m S 8.2 1.9 2:34.56 python
12346 www-data 20 0 512.0m 45.1m 9.8m S 2.1 0.2 0:01.23 nginx
Key columns:
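PR : scheduling priority; a negative NI (nice value) lowers PR, which means higher priority
VIRT : virtual memory size (address space reserved, not necessarily in use); a large value alone is rarely a problem, while steadily growing RES is the stronger sign of a memory leak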
RES : resident memory (actual physical consumption)
SHR : shared memory (libraries shared among processes)
%CPU : CPU usage relative to a single core; values above 100% mean the process is running on more than one core in parallel
TIME+ : cumulative CPU time; very high values merit investigation
Interactive keys (while top runs):
M : sort by memory %
P : sort by CPU %
c : show full command line and arguments
k : kill a specific PID
1 : toggle per‑CPU usage view
H : show threads instead of processes
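For scripted checks, batch mode output can be parsed like any other text. The sketch below pulls the 1-minute load average out of top's summary line. The function name load_1min is illustrative (not part of top), and the pattern assumes the "load average: a, b, c" layout shown in the sample output, with a period as the decimal separator.

```shell
# Extract the 1-minute load average from top's batch output.
# Reads text on stdin and prints the first load value it finds.
# Assumption: the summary line contains "load average: a, b, c".
load_1min() {
  sed -n 's/.*load average: \([0-9.]*\),.*/\1/p' | head -n 1
}
```

A typical call would be `LC_ALL=C top -b -n 1 | load_1min`; forcing the C locale keeps the decimal separator predictable.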
1.2 htop command
If the system supports it, install htop as a friendlier alternative to top:
# Ubuntu/Debian
sudo apt-get install htop
# CentOS/RHEL (htop is packaged in the EPEL repository)
sudo yum install htop
# Core functions
htop # start
htop -u www-data # filter by user
htop -p 1234,5678 # monitor specific PIDs
# Kill from the UI: press F9, select SIGKILL, press Enter
2. Disk Usage Analysis: df / du
2.1 df – Disk space overview
# Basic usage
df -h # human‑readable sizes (KB/MB/GB)
df -T # show filesystem type
df -i # show inode usage (useful for certain failures)
# Typical output
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext4 100G 45G 55G 45% /
/dev/sdb1 xfs 500G 320G 180G 64% /data
tmpfs tmpfs 7.8G 12M 7.8G 1% /dev/shm
Common problem checks:
Use% > 90% : disk space is tight; clean‑up is needed.
IUse% > 90% (from df -i) : inodes are nearly exhausted; once they run out, new files cannot be created even with free space (common with many tiny files).
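The 90% rule of thumb above is easy to automate. The following is a minimal sketch, assuming the POSIX output format of `df -P` (usage percentage in column 5, mount point in column 6) and mount points without spaces. The function name df_over_threshold is hypothetical.

```shell
# Print "mountpoint usage%" for every filesystem at or above a threshold.
# Reads `df -P` style output on stdin; the threshold defaults to 90.
# Assumption: mount points contain no whitespace.
df_over_threshold() {
  awk -v limit="${1:-90}" 'NR > 1 {
    pct = $5
    sub(/%/, "", pct)                 # strip the percent sign
    if (pct + 0 >= limit + 0) print $6, $5
  }'
}
```

Usage would look like `df -P | df_over_threshold 90`. With GNU df, `df -Pi | df_over_threshold 90` should apply the same check to inode usage, since the columns keep the same positions.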
2.2 du – Directory/file size analysis
# Show total size of current directory
du -sh .
# Show size of each subdirectory, sorted
du -h --max-depth=1 /var/log | sort -h
# Show the 10 largest entries (files and directories, recursive)
du -ah /home | sort -rh | head -10
# Show only directories, not files
du -sh /var/*/ | sort -rh
# Parameter explanations
# -s: summarize (total only)
# -h: human‑readable
# -a: include files and directories
# --max-depth: limit directory depth
When a disk‑space alarm fires, use du to locate large directories:
# Start from root and drill down (-x stays on one filesystem, skipping /proc and other mounts)
du -xh --max-depth=1 / 2>/dev/null | sort -h
# Assume /var is largest, continue deeper
du -h --max-depth=1 /var | sort -h
# Continue until the offending directory or file is identified
3. Memory Diagnosis: free
# Basic usage
free -h # human‑readable format
free -m # display in MB
free -s 5 # refresh every 5 seconds
free -c 10 -s 5 # refresh 10 times then exit
# Typical output
total used free shared buff/cache available
Mem: 32Gi 8.5Gi 12Gi 200Mi 11Gi 23Gi
Swap: 4.0Gi 0B 4.0Gi
Key observations:
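available vs free : Linux uses idle memory for buffers/cache and releases it on demand; available estimates how much memory new applications can obtain without swapping, so judge headroom by available rather than free.
swap used > 0 : a small, stable amount is often harmless; steadily growing swap usage or ongoing swap‑in/swap‑out activity indicates memory pressure and degrades performance.
buff/cache growth : normal; the kernel caches file data in otherwise idle memory and reclaims it when applications need it, so a large buff/cache value alone is not a problem.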
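Since available is the figure that matters, a script can reduce the free output to a single percentage. This is a sketch only: the function name mem_available_pct is made up for illustration, and it assumes the column order shown in the sample output (available as the last column of the Mem: row) with untranslated labels.

```shell
# Print available memory as an integer percentage of total memory.
# Reads `free -m` style output on stdin.
# Assumption: the Mem: row is "Mem: total used free shared buff/cache available".
mem_available_pct() {
  awk '/^Mem:/ && $2 > 0 { printf "%d\n", ($7 / $2) * 100 }'
}
```

For example, `LC_ALL=C free -m | mem_available_pct` prints one number that can be compared against an alert threshold.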
# Monitor memory trend (once per second)
watch -n 1 free -h
4. Network Connection and Diagnosis: ss / netstat / ping / traceroute
4.1 ss – Socket statistics (recommended replacement for netstat)
# Show all listening TCP sockets
ss -tlnp
# Show non‑listening TCP sockets
ss -tnp
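# Show UDP sockets (-a includes unconnected/listening sockets)
ss -uanp
# Find which process occupies a specific port (e.g., HTTP); the filter matches the port exactly, whereas grep :80 would also match :8080
ss -tlnp 'sport = :80'
# Find MySQL port usage
ss -tlnp 'sport = :3306'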
# Summary statistics
ss -s
# Example output:
# Total: 245 (kernel 248)
# TCP: 123 (established 45, closed 12, orphaned 0, synrecv 0, timewait 0)
# Filter by state
ss state established # established connections
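ss state time-wait # a large TIME_WAIT count may need attention
ss state syn-sent # outbound connection attempts still waiting for a reply; many may point to an unreachable or filtered peer
ss state syn-recv # half‑open inbound connections; many may indicate a SYN flood or an overloaded service
4.2 netstat (still used in some scenarios)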
# Routing table
netstat -rn # -r: route table, -n: numeric output (no DNS lookup)
# Equivalent: ip route
# Interface statistics
netstat -i # network interface stats, same as: ip -s link
# Port usage (ss is preferred)
ss -tlnp | grep :443
4.3 ping – Connectivity test
# Basic usage
ping -c 4 8.8.8.8 # send 4 ICMP packets then exit
ping -c 100 -i 0.2 8.8.8.8 # rapid test of 100 packets (packet‑loss measurement)
# Interpreting output
# 64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=12.3 ms
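# sustained time > 100 ms usually indicates high latency (the baseline depends on distance to the target)
# "100% packet loss" means no replies were received (host down, route broken, or ICMP filtered)
4.4 traceroute – Route tracing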
# Trace each hop to the target (max 15 hops)
traceroute -m 15 8.8.8.8
# IPv6 equivalent (traceroute6 on some systems)
traceroute -6 2001:4860:4860::8888
# Commonly used public targets (Cloudflare and Google DNS)
traceroute 1.1.1.1
traceroute 8.8.8.8
5. Process Management: ps / kill
5.1 ps – Process snapshot
# Common combos: show all processes with full format
ps aux # a: all users, u: detailed, x: include processes without a terminal
ps -ef # -e: all processes, -f: full format (easier to read)
# Custom column output, sorted by CPU usage, top 20
ps -eo pid,user,%cpu,%mem,cmd --sort=-%cpu | head -20
# Find a specific process
ps aux | grep nginx
ps -ef | grep "[p]ython" # [] avoids matching the grep itself
# Show processes of a specific user
ps -u www-data -o pid,%cpu,%mem,cmd
Difference between ps and top: ps provides a snapshot at execution time, while top updates continuously.
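Because ps emits a one-off snapshot, its output is convenient to aggregate in scripts. As a sketch, the function below sums the %MEM column per user. The name mem_by_user is illustrative, and the input is assumed to be two columns (user, %mem) with a header line, as produced by `ps -eo user,%mem`.

```shell
# Sum memory percentage per user and list the heaviest users first.
# Reads "USER %MEM" rows on stdin; the first line is treated as a header.
mem_by_user() {
  awk 'NR > 1 { sum[$1] += $2 }
       END { for (u in sum) printf "%s %.1f\n", u, sum[u] }' |
    LC_ALL=C sort -k2,2 -rn
}
```

A typical call would be `ps -eo user,%mem | mem_by_user`. Note that shared pages are counted once per process, so the totals are a rough ranking rather than exact accounting.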
5.2 kill – Terminate a process
# Basic signals
kill 12345 # send SIGTERM (15), graceful termination
kill -9 12345 # send SIGKILL (9), forceful termination, cannot be caught
kill -15 12345 # explicit SIGTERM
# List available signals
kill -l
# Common signal cheat sheet
# 1 (HUP): reload configuration (e.g., nginx -s reload)
# 9 (KILL): force kill, cannot be caught
# 15 (TERM): normal termination, process can clean up
# 20 (TSTP): pause (Ctrl+Z)
5.3 killall / pkill – Bulk termination
killall nginx # terminate all nginx processes
pkill -u www-data nginx # terminate nginx processes owned by www-data
killall -9 nginx # force kill all nginx processes (zombies are already dead and disappear only when their parent reaps them)
Dangerous operation: on production servers, always verify that a process can be safely restarted before running killall -9.
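A safer habit than reaching for -9 first is to send SIGTERM, wait, and escalate only if the process is still there. The sketch below shows one way to do that. The function name graceful_stop and the 10-second default are assumptions for illustration, not a standard tool.

```shell
# Ask a process to exit with SIGTERM, then escalate to SIGKILL after a timeout.
# Usage: graceful_stop PID [TIMEOUT_SECONDS]   (timeout defaults to 10)
graceful_stop() {
  gs_pid="$1"
  gs_wait="${2:-10}"
  gs_i=0
  # SIGTERM lets the process clean up; failure usually means it is already gone.
  kill -TERM "$gs_pid" 2>/dev/null || return 0
  while [ "$gs_i" -lt "$gs_wait" ]; do
    # kill -0 sends no signal; it only tests whether the PID still exists.
    kill -0 "$gs_pid" 2>/dev/null || return 0
    sleep 1
    gs_i=$((gs_i + 1))
  done
  # Still present after the timeout: force termination.
  kill -KILL "$gs_pid" 2>/dev/null
  return 0
}
```

For example, `graceful_stop 12345 15` gives PID 12345 up to 15 seconds to shut down cleanly before it is killed.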
6. File and Text Processing: grep / sed / awk
6.1 grep – Text search
# Basic search
grep "error" /var/log/syslog
# Recursive search
grep -r "Exception" /var/log/
# Highlight matches
grep --color=auto "error" file.log
# Show line numbers
grep -n "error" file.log
# Exclude certain content
grep -v "DEBUG" file.log # filter out DEBUG lines
grep "ERROR" file.log | grep -v "AuthError" # exclude AuthError from ERROR lines
# Count matching lines
grep -c "error" file.log # returns number of matching lines
# Count occurrences
grep -o "error" file.log | wc -l
# Extended regex
grep -E "error|warning|fatal" file.log
# Match date pattern (POSIX ERE has no \d; use [0-9], or grep -P for Perl syntax)
grep -E "[0-9]{4}-[0-9]{2}-[0-9]{2}" file.log
6.2 sed – Stream editor
# Replace (most common)
sed -i 's/old/new/g' file.txt # -i edits file in place
sed 's/error/ERROR/g' file.txt > new.txt # output to new file
# Print specific lines
sed -n '10,20p' file.log # print lines 10‑20
sed -n '/ERROR/,/WARN/p' file.log # print from ERROR up to WARN
# Delete matching lines
sed -i '/DEBUG/d' file.log # remove all lines containing DEBUG
# Batch replace in multiple files with backup
sed -i.bak 's/old/new/g' *.txt # creates .bak backup files
6.3 awk – Text analysis
# Print specific columns
awk '{print $1, $3}' file.txt # print column 1 and 3
# Use a field separator
awk -F: '{print $1, $7}' /etc/passwd # colon‑separated fields
# Conditional filtering
awk '$3 > 100 {print $1, $3}' file.txt # rows where column 3 > 100
# Aggregate calculation
awk '{sum += $3} END {print sum}' file.txt
# Log analysis example: count HTTP status codes (field 9 in the common nginx/Apache combined log format)
awk '{print $9}' access.log | sort | uniq -c | sort -rn
7. Log Viewing: tail / less
7.1 tail – View file end
# Show last 100 lines
tail -n 100 /var/log/syslog
# Follow mode (real‑time)
tail -f /var/log/syslog
# Follow a specific log
tail -f /var/log/nginx/access.log
# Combined usage
# Skip first 1000 lines, then show next 100 (-n +K starts output at line K)
tail -n +1001 access.log | head -100
# Real‑time view but filter only errors
tail -f error.log | grep -E "ERROR|WARN"
7.2 less – Paginated view
less /var/log/syslog
# Inside less:
# /keyword – forward search
# ?keyword – backward search
# n – next match
# N – previous match
# g – go to start of file
# G – go to end of file
# q – quit
# -N – toggle line numbers (or start with less -N)
7.3 Combined tricks
# Real‑time error log monitoring
tail -f /var/log/nginx/error.log | grep -E "error|critical"
# Monitor multiple log files (requires multitail)
multitail -e "error" /var/log/app1.log -e "ERROR" /var/log/app2.log
# systemd journalctl examples
journalctl -u nginx --since "1 hour ago" # logs from last hour
journalctl -f -u nginx # follow mode
journalctl -p err # show entries at error level and more severe
8. System Information: uname / hostname / uptime
# System information
uname -a # kernel version, architecture, compile time
# Example output:
# Linux srv-001 5.15.0-91-generic #101-Ubuntu SMP x86_64 GNU/Linux
# Kernel version with compiler and build details
cat /proc/version
# Hostname
hostname
hostname -I # all IP addresses
# System uptime
uptime
# Example: 14:23:45 up 30 days, 3:22, 2 users, load average: 0.52, 0.48, 0.41
# CPU information
grep -m1 "model name" /proc/cpuinfo
nproc # number of processing units (logical CPUs) available
lscpu # detailed CPU info
# Load average interpretation
# load average: 0.52, 0.48, 0.41 (1‑min, 5‑min, 15‑min)
# Load consistently above the number of CPU cores indicates that tasks are queuing
9. User and Session Management: who / w / last
# Current logged‑in users
who
# Example output:
# root pts/0 2025-04-27 10:00 (192.168.1.100)
# www-data pts/1 2025-04-27 10:05 (192.168.1.101)
# Detailed login info (including running commands)
w
# Login history
last # all login/logout records
last reboot # system reboot history
last -n 10 # most recent 10 entries
last -x | grep shutdown # shutdown events
# Failed login attempts (requires root)
lastb
10. Archiving and Compression: tar / rsync
10.1 tar – Archiving tool
# Create archive
tar -cvf archive.tar /data/dir # -c create, -v verbose, -f file name
tar -czvf archive.tar.gz /data/dir # gzip compression
# Extract archive
tar -xvf archive.tar # extract to current directory
tar -xzvf archive.tar.gz -C /target/ # extract to specific directory
# List contents without extracting
tar -tzvf archive.tar.gz
# Other compression formats
tar -cjf archive.tar.bz2 /data/dir # bzip2 (higher compression than gzip, slower)
tar -cJf archive.tar.xz /data/dir # xz (usually the highest compression, slowest)
10.2 rsync – Incremental sync
# Local sync
rsync -av /source/ /target/ # -a archive mode, -v verbose; the trailing slash on /source/ copies its contents rather than the directory itself
# Remote sync
rsync -avz -e ssh /source/ user@host:/target/
# Common parameter combos
rsync -avz --delete /source/ user@host:/target/ # delete files on target that no longer exist on source
rsync -avz --exclude '*.log' /source/ /target/ # exclude log files
rsync -avz --dry-run /source/ user@host:/target/ # simulate run (no actual copy)
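# Daily dated backup via cron (each run writes a full copy into a new directory; adding --link-dest=<previous backup> hard‑links unchanged files to make it incremental; % must be escaped as \% in crontab)
0 2 * * * rsync -a --delete /data/ /backup/data_$(date +\%Y\%m\%d)/
Quick‑Reference Troubleshooting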
Service response slow : run top or htop to check if CPU or memory is saturated.
Disk space alert : run df -h to identify which partition is full.
Port not reachable : run ss -tlnp | grep :<port> to verify whether the service is listening.
Network latency high : combine ping and traceroute to determine if the delay is local or in the network path.
Process disappeared : run ps aux | grep <process_name> to inspect process status.
Log appending slowly : use tail -f to observe log output in real time.
Login failure : run who and last to review sessions, and lastb (as root) to list failed login attempts.
Memory leak : run free -h and ps aux --sort=-rss to locate the process consuming the most resident memory.
Configuration file error : run grep -n error /var/log/... to search logs for error keywords.
Bulk replace : run sed -i 's/old/new/g' files to quickly fix multiple occurrences in configuration files.
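The "Port not reachable" check above has one pitfall: grep :80 also matches :8080. A stricter test compares the port field exactly. The sketch below does so by parsing `ss -tln` output. The function name port_listening is hypothetical, and it assumes the local address is the fourth column, as in current iproute2 output.

```shell
# Succeed (exit 0) if a given port appears as a local listening port.
# Reads `ss -tln` style output on stdin; the first line is the header.
# Assumption: column 4 holds "address:port" (IPv6 as "[::]:port").
port_listening() {
  awk -v port="$1" 'NR > 1 {
    n = split($4, parts, ":")         # the port is the last ":"-separated piece
    if (parts[n] == port) found = 1
  }
  END { exit !found }'
}
```

Usage: `ss -tln | port_listening 443 && echo "listening" || echo "not listening"`.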
