15 Essential Shell Scripts to Supercharge Your Sysadmin Tasks
A comprehensive guide for system administrators that presents 15 practical Bash scripts—covering resource monitoring, process checks, network surveillance, log analysis, batch operations, backups, security hardening, performance tuning, Docker cleanup, and log rotation—to automate repetitive tasks and boost operational efficiency.
Shell Script Practical: 15 Automation Scripts for Sysadmins
As a system administrator, you can automate most repetitive tasks with Bash scripts. Below are 15 practical scripts covering monitoring, process checks, network monitoring, log analysis, batch operations, backups, security hardening, performance tuning, Docker cleanup, and log rotation.
1. System Monitoring Script
Monitors CPU, memory, and disk usage and sends alerts when thresholds are exceeded.
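The alerting logic comes down to comparing a measured percentage, which may be a decimal, against a fixed limit. The following is a minimal sketch of that comparison using awk instead of bc; the helper name `exceeds_threshold` is an assumption for illustration and does not appear in the script below.

```bash
#!/bin/bash
# exceeds_threshold VALUE LIMIT (hypothetical helper)
# Returns 0 (true) when VALUE is numerically greater than LIMIT.
# awk compares decimals such as 85.50, which [ -gt ] cannot handle.
exceeds_threshold() {
    awk -v v="$1" -v t="$2" 'BEGIN { exit !(v > t) }'
}

cpu_usage=91.5
cpu_limit=80
if exceeds_threshold "$cpu_usage" "$cpu_limit"; then
    echo "CPU alert: ${cpu_usage}% is above ${cpu_limit}%"
fi
```

Using one helper for every metric keeps the CPU, memory, and disk checks consistent and avoids a dependency on bc being installed.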
#!/bin/bash
# resource_monitor.sh - Server resource monitoring script
CPU_THRESHOLD=80
MEM_THRESHOLD=85
DISK_THRESHOLD=90
check_resources() {
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
MEM_TOTAL=$(free -m | awk 'NR==2{print $2}')
MEM_USED=$(free -m | awk 'NR==2{print $3}')
MEM_USAGE=$(echo "scale=2; $MEM_USED*100/$MEM_TOTAL" | bc)
DISK_USAGE=$(df -h / | awk 'NR==2{print $5}' | cut -d'%' -f1)
echo "[$(date '+%Y-%m-%d %H:%M:%S')] CPU: $CPU_USAGE%, MEM: $MEM_USAGE%, DISK: $DISK_USAGE%" >> /var/log/resource_monitor.log
if (( $(echo "$CPU_USAGE > $CPU_THRESHOLD" | bc -l) )); then
send_alert "CPU usage alert" "Current CPU usage: $CPU_USAGE%"
fi
if (( $(echo "$MEM_USAGE > $MEM_THRESHOLD" | bc -l) )); then
send_alert "Memory usage alert" "Current memory usage: $MEM_USAGE%"
fi
if [ "$DISK_USAGE" -gt "$DISK_THRESHOLD" ]; then
send_alert "Disk usage alert" "Current disk usage: $DISK_USAGE%"
fi
}
send_alert() {
local subject=$1
local message=$2
echo "$message" | mail -s "$subject - $(hostname)" [email protected]
}
while true; do
check_resources
sleep 60
done
2. Process Liveness Detection Script
Automatically checks critical processes and restarts them if they crash.
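The core of this approach is a check, restart, verify cycle. Here is a small sketch of that cycle with the check and start commands passed in as strings, so it can be tried without a real service. The function name `ensure_running` and the marker-file demo are assumptions for illustration only.

```bash
#!/bin/bash
# ensure_running NAME CHECK_CMD START_CMD (hypothetical helper)
# Runs CHECK_CMD; if it fails, runs START_CMD and checks once more.
# Returns 0 when the service is up at the end, 1 otherwise.
ensure_running() {
    local name=$1 check_cmd=$2 start_cmd=$3
    if bash -c "$check_cmd" >/dev/null 2>&1; then
        echo "$name: running"
        return 0
    fi
    echo "$name: down, attempting restart"
    bash -c "$start_cmd" >/dev/null 2>&1
    if bash -c "$check_cmd" >/dev/null 2>&1; then
        echo "$name: recovered"
        return 0
    fi
    echo "$name: restart failed"
    return 1
}

# Demo: a marker file stands in for a real service
marker="${TMPDIR:-/tmp}/ensure_running_demo.$$"
ensure_running demo "test -f '$marker'" "touch '$marker'"
rm -f "$marker"
```

In a real deployment the check would typically be `pgrep -x nginx` or `systemctl is-active --quiet nginx`, and the return code decides which alert to send.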
#!/bin/bash
# process_guard.sh - Process monitoring script
declare -A PROCESSES=(
["nginx"]="/usr/sbin/nginx"
["mysql"]="systemctl start mysql"
["redis"]="/usr/bin/redis-server /etc/redis/redis.conf"
)
check_process() {
local process_name=$1
local start_command=$2
if ! pgrep -x "$process_name" > /dev/null; then
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $process_name is not running, attempting to restart..."
eval "$start_command"
sleep 5
if pgrep -x "$process_name" > /dev/null; then
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $process_name restarted successfully"
echo "$process_name has been restarted on $(hostname)" | mail -s "Process Recovered: $process_name" [email protected]
else
echo "[$(date '+%Y-%m-%d %H:%M:%S')] Failed to restart $process_name"
echo "Failed to restart $process_name on $(hostname)" | mail -s "CRITICAL: Process Down - $process_name" [email protected]
fi
fi
}
while true; do
for process in "${!PROCESSES[@]}"; do
check_process "$process" "${PROCESSES[$process]}"
done
sleep 30
done
3. Network Connection Monitoring Script
Monitors active network connections and alerts on abnormal activity.
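Spotting abnormal activity usually starts with counting established connections per remote address. The sketch below does that for `netstat -ant` style lines read from stdin. The function name `top_remote_ips` is hypothetical, and the sample addresses come from documentation ranges.

```bash
#!/bin/bash
# top_remote_ips [N] (hypothetical helper)
# Reads netstat-style lines on stdin and prints the N remote addresses
# with the most ESTABLISHED connections (default 10).
top_remote_ips() {
    awk '$6 == "ESTABLISHED" { sub(/:[0-9]+$/, "", $5); print $5 }' \
        | sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Demo with canned input instead of live netstat output
top_remote_ips 5 <<'EOF'
tcp        0      0 10.0.0.5:443     203.0.113.7:51000   ESTABLISHED
tcp        0      0 10.0.0.5:443     203.0.113.7:51001   ESTABLISHED
tcp        0      0 10.0.0.5:22      198.51.100.9:40000  ESTABLISHED
tcp        0      0 0.0.0.0:22       0.0.0.0:*           LISTEN
EOF
```

Stripping only the trailing `:port` keeps IPv6 addresses intact, which a plain `cut -d: -f1` would truncate.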
#!/bin/bash
# network_monitor.sh - Network connection monitoring
MAX_CONNECTIONS=1000
SUSPICIOUS_PORT_THRESHOLD=100
monitor_connections() {
total_conn=$(netstat -an | grep ESTABLISHED | wc -l)
netstat -an | grep ESTABLISHED | awk '{print $4}' | awk -F: '{print $NF}' | sort | uniq -c | sort -rn | head -20 >> /tmp/network_report.txt
netstat -an | grep ESTABLISHED | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -rn | head -10 >> /tmp/network_report.txt
if [ "$total_conn" -gt "$MAX_CONNECTIONS" ]; then
echo "[ALERT] Connection count exceeds threshold: $total_conn" >> /tmp/network_report.txt
cat /tmp/network_report.txt | mail -s "Network Connection Alert - $(hostname)" [email protected]
fi
echo "[$(date '+%Y-%m-%d %H:%M:%S')] Total connections: $total_conn" >> /var/log/network_monitor.log
}
while true; do
monitor_connections
sleep 300
done
4. Web Access Log Analysis Script
Analyzes Nginx/Apache access logs to produce PV, UV, top URLs, IPs, status codes, and slow requests.
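All of these metrics depend on field positions in the access log. As a sketch, assuming the default "combined" format (client IP in field 1, request path in field 7, status code in field 9, response size in field 10), a status code tally can be written as follows. The function name `status_counts` is an assumption for illustration.

```bash
#!/bin/bash
# status_counts [FILE...] (hypothetical helper)
# Tallies HTTP status codes from combined-format access log lines.
# Reads the given files, or stdin when no file is passed.
status_counts() {
    awk '{ count[$9]++ } END { for (code in count) print code, count[code] }' "$@" \
        | sort -n
}

# Demo with canned log lines
status_counts <<'EOF'
203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/8.0"
203.0.113.7 - - [10/Oct/2023:13:55:40 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"
198.51.100.9 - - [10/Oct/2023:14:01:02 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/8.0"
EOF
```

If your `log_format` differs from the default, check the field numbers against a real log line before trusting any of the report sections.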
#!/bin/bash
# web_log_analyzer.sh - Web log analysis tool
LOG_FILE="/var/log/nginx/access.log"
REPORT_FILE="/tmp/web_analysis_$(date +%Y%m%d).txt"
analyze_log() {
echo "=== Web Access Log Analysis Report ===" > "$REPORT_FILE"
echo "Analysis Time: $(date '+%Y-%m-%d %H:%M:%S')" >> "$REPORT_FILE"
echo "Log File: $LOG_FILE" >> "$REPORT_FILE"
echo >> "$REPORT_FILE"
pv=$(wc -l < "$LOG_FILE")
echo "Total PV: $pv" >> "$REPORT_FILE"
uv=$(awk '{print $1}' "$LOG_FILE" | sort -u | wc -l)
echo "Unique Visitors (UV): $uv" >> "$REPORT_FILE"
echo -e "\n=== Top 10 URLs ===" >> "$REPORT_FILE"
awk '{print $7}' "$LOG_FILE" | sort | uniq -c | sort -rn | head -10 >> "$REPORT_FILE"
echo -e "\n=== Top 10 IPs ===" >> "$REPORT_FILE"
awk '{print $1}' "$LOG_FILE" | sort | uniq -c | sort -rn | head -10 >> "$REPORT_FILE"
echo -e "\n=== HTTP Status Code Distribution ===" >> "$REPORT_FILE"
awk '{print $9}' "$LOG_FILE" | sort | uniq -c | sort -rn >> "$REPORT_FILE"
echo -e "\n=== Slow Requests (>1s) ===" >> "$REPORT_FILE"
# Assumes a custom log_format whose 10th field is the request time in milliseconds;
# in the default "combined" format, field 10 is the response size in bytes.
awk '$10 > 1000 {print $7, $10"ms"}' "$LOG_FILE" | sort -k2 -rn | head -10 >> "$REPORT_FILE"
echo -e "\n=== Hourly Traffic Trend ===" >> "$REPORT_FILE"
awk '{print substr($4,14,2)}' "$LOG_FILE" | sort | uniq -c | sort -k2 -n >> "$REPORT_FILE"
}
analyze_log
cat "$REPORT_FILE"
5. Error Log Intelligent Analysis Script
Extracts and categorizes system error logs for quick troubleshooting.
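Categorizing errors here means counting how often each severity keyword appears. The sketch below shows one way to do that on lines from stdin. It uses `grep -w` so that only whole words are counted, which is a deliberate variation on the script below, and the function name `count_error_types` is hypothetical.

```bash
#!/bin/bash
# count_error_types (hypothetical helper)
# Reads log lines on stdin and prints keyword=count pairs.
# -o prints each match on its own line, -w matches whole words only,
# -i ignores case, so "ERROR" and "error" land in the same bucket.
count_error_types() {
    grep -owiE 'error|fail|fatal|critical|warning' \
        | tr '[:upper:]' '[:lower:]' \
        | sort | uniq -c \
        | awk '{ print $2 "=" $1 }'
}

# Demo with canned log lines
count_error_types <<'EOF'
Oct 10 13:55:36 web1 app: ERROR: disk full
Oct 10 13:56:01 web1 app: error: retry scheduled
Oct 10 13:57:12 web1 app: Warning: slow response
Oct 10 13:58:00 web1 app: info: ok
EOF
```

Whole-word matching trades recall for precision: it will not count "failed" as "fail", so extend the keyword list if you need those variants.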
#!/bin/bash
# error_log_analyzer.sh - Intelligent error log analysis
LOG_FILES=(
"/var/log/syslog"
"/var/log/messages"
"/var/log/nginx/error.log"
"/var/log/mysql/error.log"
)
ERROR_KEYWORDS="error|fail|fatal|critical|alert|emergency|warning"
analyze_errors() {
local log_file=$1
local output_file="/tmp/error_analysis_$(basename $log_file).txt"
echo "=== Error Log Analysis: $log_file ===" > "$output_file"
echo "Analysis Time: $(date '+%Y-%m-%d %H:%M:%S')" >> "$output_file"
error_count=$(grep -iE "$ERROR_KEYWORDS" "$log_file" 2>/dev/null | wc -l)
echo "Total Errors: $error_count" >> "$output_file"
echo -e "\n=== Error Type Distribution ===" >> "$output_file"
grep -iE "$ERROR_KEYWORDS" "$log_file" 2>/dev/null | grep -oiE "$ERROR_KEYWORDS" | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn >> "$output_file"
echo -e "\n=== Recent 10 Errors ===" >> "$output_file"
grep -iE "$ERROR_KEYWORDS" "$log_file" 2>/dev/null | tail -10 >> "$output_file"
echo -e "\n=== Error Time Distribution (Last 24h) ===" >> "$output_file"
grep -iE "$ERROR_KEYWORDS" "$log_file" 2>/dev/null | awk '{print $1, $2, $3}' | cut -d: -f1 | sort | uniq -c >> "$output_file"
echo $error_count
}
total_errors=0
for log_file in "${LOG_FILES[@]}"; do
if [ -f "$log_file" ]; then
errors=$(analyze_errors "$log_file")
total_errors=$((total_errors + errors))
fi
done
if [ $total_errors -gt 100 ]; then
echo "Detected a large number of error logs: $total_errors errors" | mail -s "Error Log Alert - $(hostname)" [email protected]
fi
6. Batch Server Health Check Script
Runs health checks on multiple servers in parallel.
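Running checks in parallel relies on starting each one as a background job and then waiting for all of them. A plain `wait` discards the individual exit codes, so the sketch below records each PID and waits on them one at a time to count failures. `run_parallel` and `demo_check` are hypothetical names used only for this illustration.

```bash
#!/bin/bash
# run_parallel CHECK_CMD HOST... (hypothetical helper)
# Starts CHECK_CMD HOST in the background for every host, then waits
# for each job and counts how many returned a non-zero status.
run_parallel() {
    local check_cmd=$1; shift
    local -a pids=()
    local host pid failed=0
    for host in "$@"; do
        "$check_cmd" "$host" &
        pids+=("$!")
    done
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "failed=$failed"
    [ "$failed" -eq 0 ]
}

# Demo: a stand-in check that fails for one host name
demo_check() { [ "$1" != "unreachable-host" ]; }
run_parallel demo_check web1 unreachable-host web3 || echo "at least one check failed"
```

In practice the check function would wrap the `ssh` call, and the failure count could drive the final summary or an alert.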
#!/bin/bash
# batch_health_check.sh - Batch server inspection
SERVERS=("192.168.1.10" "192.168.1.11" "192.168.1.12")
SSH_USER="admin"
SSH_PORT=22
SSH_KEY="$HOME/.ssh/id_rsa"
health_check() {
local server=$1
ssh -p $SSH_PORT -i $SSH_KEY -o ConnectTimeout=5 $SSH_USER@$server <<'EOF'
echo "Hostname: $(hostname)"
echo "System Time: $(date)"
echo "Uptime: $(uptime)"
echo ""
echo "CPU Usage:"
top -bn1 | head -5
echo ""
echo "Memory Usage:"
free -h
echo ""
echo "Disk Usage:"
df -h
echo ""
echo "Network Connections:"
netstat -tunlp 2>/dev/null | head -10
echo ""
echo "Recent Logins:"
last -n 5
EOF
if [ $? -eq 0 ]; then
echo "[$server] Inspection completed ✓"
else
echo "[$server] Inspection failed ✗"
fi
echo "-----------------------------------"
}
for server in "${SERVERS[@]}"; do
health_check "$server" &
done
wait
echo "All server inspections completed!"
7. Batch File Distribution Script
Distributes a file to multiple servers using rsync.
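A common pitfall when looping over a server list is that `ssh` or `rsync` inside the loop reads from the same stdin as `read`, silently consuming the remaining lines. One way around this, sketched below, is to read the list on file descriptor 3. The function name `for_each_server` is an assumption for illustration.

```bash
#!/bin/bash
# for_each_server FILE CMD [ARGS...] (hypothetical helper)
# Runs CMD ARGS SERVER for each entry in FILE, skipping blank lines
# and comments. The list is read on fd 3, so CMD may use stdin freely.
for_each_server() {
    local file=$1; shift
    local server
    # "|| [ -n ... ]" also handles a last line without a trailing newline
    while IFS= read -r server <&3 || [ -n "$server" ]; do
        [[ -z "$server" || "$server" =~ ^[[:space:]]*# ]] && continue
        "$@" "$server"
    done 3< "$file"
}

# Demo: print what would happen instead of copying anything
list=$(mktemp)
printf '%s\n' '# web tier' '192.168.1.10' '' '192.168.1.11' > "$list"
for_each_server "$list" echo "would copy to"
rm -f "$list"
```

The same effect can be had by redirecting the inner command's stdin from `/dev/null`, which is the simpler choice when only one command in the loop is affected.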
#!/bin/bash
# file_distribution.sh - Batch file distribution tool
SOURCE_FILE="$1"
DEST_PATH="$2"
SERVERS_FILE="servers.txt"
if [ $# -ne 2 ]; then
echo "Usage: $0 <source_file> <destination_path>"
exit 1
fi
if [ ! -f "$SOURCE_FILE" ]; then
echo "Error: Source file does not exist"
exit 1
fi
distribute_file() {
local server=$1
local source=$2
local dest=$3
echo -n "Distributing to $server ... "
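# </dev/null stops rsync/ssh from consuming the server list that the calling loop reads on stdin.
# --delete is dropped: it only matters when synchronizing directories, not a single file.
rsync -av "$source" "root@$server:$dest" </dev/null 2>/dev/null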
if [ $? -eq 0 ]; then
echo "Success ✓"
return 0
else
echo "Failed ✗"
return 1
fi
}
success_count=0
fail_count=0
failed_servers=()
while IFS= read -r server; do
[[ -z "$server" || "$server" =~ ^# ]] && continue
distribute_file "$server" "$SOURCE_FILE" "$DEST_PATH"
if [ $? -eq 0 ]; then
((success_count++))
else
((fail_count++))
failed_servers+=("$server")
fi
done < "$SERVERS_FILE"
echo "=== Distribution Summary ==="
echo "Success: $success_count"
echo "Failed: $fail_count"
if [ ${#failed_servers[@]} -gt 0 ]; then
echo "Failed servers:"
printf '%s\n' "${failed_servers[@]}"
fi
8. MySQL Automatic Backup Script
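Performs a full logical backup of a MySQL database with mysqldump, compresses it, and removes backups older than the retention period. (Incremental backups would additionally require binary logs and are not covered by this script.)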
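When a dump is piped into gzip, `$?` reports only gzip's exit status, so a failed dump can go unnoticed. The sketch below checks every stage of the pipeline through Bash's `PIPESTATUS` array. The function name `dump_and_compress` is hypothetical, and the demo uses `echo` in place of a real dump command.

```bash
#!/bin/bash
# dump_and_compress OUT_FILE CMD [ARGS...] (hypothetical helper)
# Pipes the output of CMD through gzip into OUT_FILE and fails if
# either stage fails, removing the partial file in that case.
dump_and_compress() {
    local out_file=$1; shift
    local -a status
    "$@" | gzip > "$out_file"
    status=("${PIPESTATUS[@]}")   # must be captured immediately after the pipeline
    if [ "${status[0]}" -ne 0 ] || [ "${status[1]}" -ne 0 ]; then
        rm -f "$out_file"
        return 1
    fi
}

# Demo: echo stands in for mysqldump
demo_file=$(mktemp)
if dump_and_compress "$demo_file" echo "-- demo dump"; then
    echo "backup ok"
fi
rm -f "$demo_file"
```

`set -o pipefail` achieves a similar result with less code, while `PIPESTATUS` lets you tell which stage failed.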
#!/bin/bash
# mysql_backup.sh - MySQL automatic backup script
DB_HOST="localhost"
DB_USER="backup_user"
DB_PASS="backup_password"
DB_NAME="your_database"
BACKUP_DIR="/backup/mysql"
RETENTION_DAYS=7
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"
backup_database() {
local backup_file="$BACKUP_DIR/${DB_NAME}_$DATE.sql.gz"
echo "Starting backup of database: $DB_NAME"
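# Note: --password on the command line is visible in the process list; a client option file is safer.
# pipefail makes the pipeline report a mysqldump failure instead of only gzip's exit status.
set -o pipefail
mysqldump --host="$DB_HOST" --user="$DB_USER" --password="$DB_PASS" --single-transaction --routines --triggers --events --default-character-set=utf8mb4 "$DB_NAME" | gzip > "$backup_file"
if [ $? -eq 0 ]; then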
echo "Backup successful: $backup_file"
echo "File size: $(du -h "$backup_file" | cut -f1)"
if [ -s "$backup_file" ]; then
echo "Backup file verification passed"
else
echo "Error: Backup file is empty"
exit 1
fi
else
echo "Backup failed"
exit 1
fi
}
cleanup_old_backups() {
echo "Cleaning backups older than $RETENTION_DAYS days..."
find "$BACKUP_DIR" -name "${DB_NAME}_*.sql.gz" -mtime +$RETENTION_DAYS -delete
echo "Cleanup complete"
}
generate_report() {
echo ""
echo "=== Backup Report ==="
echo "Backup Time: $(date)"
echo "Database: $DB_NAME"
echo "Backup File: ${DB_NAME}_$DATE.sql.gz"
echo "Current Backup List:"
ls -lh "$BACKUP_DIR"/${DB_NAME}_*.sql.gz 2>/dev/null | tail -5
}
backup_database
cleanup_old_backups
generate_report
9. Configuration File Version Management Script
Tracks changes to configuration files using Git and supports rollback.
#!/bin/bash
# config_version_control.sh - Configuration file version management
CONFIG_DIR="/etc/nginx"
BACKUP_DIR="/backup/configs"
GIT_REPO="$BACKUP_DIR/git_repo"
init_repo() {
if [ ! -d "$GIT_REPO/.git" ]; then
mkdir -p "$GIT_REPO"
cd "$GIT_REPO"
git init
echo "Git repository initialized"
fi
}
backup_configs() {
local timestamp=$(date +%Y%m%d_%H%M%S)
local commit_msg="Backup at $timestamp"
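# Exclude .git: with --delete, rsync would otherwise remove the repository metadata from the destination
rsync -av --delete --exclude='.git' "$CONFIG_DIR/" "$GIT_REPO/"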
cd "$GIT_REPO"
if [ -n "$(git status --porcelain)" ]; then
git add -A
git commit -m "$commit_msg"
echo "Configuration backed up: $commit_msg"
echo "Changes:"
# git show also works for the very first commit, where HEAD~1 does not exist
git show --stat --oneline HEAD
else
echo "No configuration changes"
fi
}
show_history() {
cd "$GIT_REPO"
echo "=== Configuration Change History ==="
git log --oneline -10
}
rollback() {
local version="$1"
if [ -z "$version" ]; then
echo "Please specify a version to rollback"
show_history
return 1
fi
cd "$GIT_REPO"
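backup_configs
git checkout "$version" -- .
# Exclude .git so repository metadata is not copied into the live config directory
rsync -av --exclude='.git' "$GIT_REPO/" "$CONFIG_DIR/"
echo "Rolled back to version: $version"
# Validate the restored configuration before reloading
nginx -t && systemctl reload nginx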
}
case "$1" in
init) init_repo ;;
backup) backup_configs ;;
history) show_history ;;
rollback) rollback "$2" ;;
*) echo "Usage: $0 {init|backup|history|rollback <version>}"; exit 1 ;;
esac
10. System Security Baseline Check Script
Performs comprehensive security checks and generates a report with a score.
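Several of the checks read a directive from `sshd_config`. A directive that is missing or commented out means the built-in default applies, and sshd uses the first value it finds for a keyword. The sketch below captures both rules; the function name `sshd_option` is hypothetical, and it deliberately ignores `Match` blocks and `Include` files.

```bash
#!/bin/bash
# sshd_option FILE KEY DEFAULT (hypothetical helper)
# Prints the first uncommented value of KEY in FILE (keywords are
# case-insensitive), or DEFAULT when the directive is absent.
sshd_option() {
    local file=$1 key=$2 default=$3 value
    value=$(awk -v k="$key" 'tolower($1) == tolower(k) { print $2; exit }' "$file")
    echo "${value:-$default}"
}

# Demo with a throwaway config file
cfg=$(mktemp)
printf '%s\n' '#Port 22' 'PermitRootLogin no' 'Port 2222' > "$cfg"
echo "Port: $(sshd_option "$cfg" Port 22)"
echo "PasswordAuthentication: $(sshd_option "$cfg" PasswordAuthentication yes)"
rm -f "$cfg"
```

On a live system, `sshd -T` prints the effective configuration and is the more reliable source when it is available.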
#!/bin/bash
# security_baseline_check.sh - System security baseline check
REPORT_FILE="/tmp/security_report_$(date +%Y%m%d).txt"
SCORE=100
echo "=== System Security Baseline Check Report ===" > "$REPORT_FILE"
echo "Check Time: $(date)" >> "$REPORT_FILE"
echo "Hostname: $(hostname)" >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"
check_ssh() {
echo "## SSH Security Check" >> "$REPORT_FILE"
if grep -q "^PermitRootLogin no" /etc/ssh/sshd_config; then
echo "✓ Root SSH login disabled" >> "$REPORT_FILE"
else
echo "✗ Root SSH login not disabled (-5)" >> "$REPORT_FILE"
SCORE=$((SCORE-5))
fi
if grep -q "^PasswordAuthentication no" /etc/ssh/sshd_config; then
echo "✓ SSH password authentication disabled" >> "$REPORT_FILE"
else
echo "✗ SSH password authentication not disabled (-3)" >> "$REPORT_FILE"
SCORE=$((SCORE-3))
fi
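ssh_port=$(grep "^Port" /etc/ssh/sshd_config | awk '{print $2}' | head -n 1)
# No explicit Port directive means sshd listens on the default port 22
ssh_port=${ssh_port:-22}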
if [ "$ssh_port" != "22" ]; then
echo "✓ SSH port changed: $ssh_port" >> "$REPORT_FILE"
else
echo "⚠ Default SSH port 22 (-2)" >> "$REPORT_FILE"
SCORE=$((SCORE-2))
fi
}
check_password_policy() {
echo -e "\n## Password Policy Check" >> "$REPORT_FILE"
if [ -f /etc/pam.d/common-password ]; then
if grep -q "pam_pwquality.so" /etc/pam.d/common-password; then
echo "✓ Password complexity enabled" >> "$REPORT_FILE"
else
echo "✗ Password complexity not enabled (-5)" >> "$REPORT_FILE"
SCORE=$((SCORE-5))
fi
fi
pass_max_days=$(grep "^PASS_MAX_DAYS" /etc/login.defs | awk '{print $2}')
if [ "$pass_max_days" -le 90 ]; then
echo "✓ Password max age: $pass_max_days days" >> "$REPORT_FILE"
else
echo "✗ Password max age too long: $pass_max_days days (-3)" >> "$REPORT_FILE"
SCORE=$((SCORE-3))
fi
}
check_firewall() {
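echo -e "\n## Firewall Check" >> "$REPORT_FILE"
if systemctl is-active firewalld >/dev/null 2>&1; then
echo "✓ Firewalld enabled" >> "$REPORT_FILE"
# "Chain INPUT" is printed even when no rules exist, so look for at least one appended rule instead
elif iptables -S 2>/dev/null | grep -q '^-A'; then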
echo "✓ Iptables firewall configured" >> "$REPORT_FILE"
else
echo "✗ Firewall not enabled (-10)" >> "$REPORT_FILE"
SCORE=$((SCORE-10))
fi
}
check_updates() {
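echo -e "\n## System Updates Check" >> "$REPORT_FILE"
# Default to 0 so the comparison below still works when neither yum nor apt is present
local updates=0
if command -v yum >/dev/null 2>&1; then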
updates=$(yum check-update --quiet | wc -l)
elif command -v apt >/dev/null 2>&1; then
apt update >/dev/null 2>&1
updates=$(apt list --upgradable 2>/dev/null | wc -l)
fi
if [ "$updates" -gt 10 ]; then
echo "✗ $updates packages pending updates (-5)" >> "$REPORT_FILE"
SCORE=$((SCORE-5))
else
echo "✓ System update status good" >> "$REPORT_FILE"
fi
}
check_dangerous_services() {
echo -e "\n## Dangerous Services Check" >> "$REPORT_FILE"
dangerous_services=("telnet" "rsh" "rlogin" "vsftpd")
for service in "${dangerous_services[@]}"; do
if systemctl is-active "$service" >/dev/null 2>&1; then
echo "✗ Dangerous service $service running (-5)" >> "$REPORT_FILE"
SCORE=$((SCORE-5))
fi
done
echo "✓ Dangerous services check completed" >> "$REPORT_FILE"
}
check_ssh
check_password_policy
check_firewall
check_updates
check_dangerous_services
echo -e "\n## Security Score: $SCORE/100" >> "$REPORT_FILE"
if [ $SCORE -ge 90 ]; then
echo "Security Level: Excellent" >> "$REPORT_FILE"
elif [ $SCORE -ge 70 ]; then
echo "Security Level: Good" >> "$REPORT_FILE"
elif [ $SCORE -ge 60 ]; then
echo "Security Level: Pass" >> "$REPORT_FILE"
else
echo "Security Level: Needs Immediate Improvement" >> "$REPORT_FILE"
fi
cat "$REPORT_FILE"
11. IP Firewall Automatic Blocking Script
Detects malicious IPs from Nginx access logs and blocks them with iptables.
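Detection here means counting requests per client IP and flagging those above a limit, unless the address is whitelisted. The sketch below isolates that step so it can be tested without touching iptables. The function name `find_offenders` is an assumption, and the sample addresses come from documentation ranges.

```bash
#!/bin/bash
# find_offenders MAX WHITELIST_FILE (hypothetical helper)
# Reads access log lines on stdin and prints "IP COUNT" for every
# client with more than MAX requests that is not whitelisted.
find_offenders() {
    local max=$1 whitelist=$2 count ip
    awk '{ print $1 }' | sort | uniq -c | while read -r count ip; do
        [ "$count" -gt "$max" ] || continue
        # -x matches the whole line, -F treats the IP as a literal string
        grep -qxF "$ip" "$whitelist" 2>/dev/null && continue
        echo "$ip $count"
    done
}

# Demo: three requests from one client, one from another, empty whitelist
find_offenders 2 /dev/null <<'EOF'
203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 612
203.0.113.7 - - [10/Oct/2023:13:55:37 +0000] "GET / HTTP/1.1" 200 612
203.0.113.7 - - [10/Oct/2023:13:55:38 +0000] "GET / HTTP/1.1" 200 612
198.51.100.9 - - [10/Oct/2023:13:55:39 +0000] "GET / HTTP/1.1" 200 612
EOF
```

Keeping detection separate from blocking makes it easy to run the detector in report-only mode before letting it change firewall rules.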
#!/bin/bash
# auto_ban_ip.sh - Automatic IP blocking script
LOG_FILE="/var/log/nginx/access.log"
BAN_TIME=3600
MAX_ATTEMPTS=50
TIME_WINDOW=60
WHITELIST="/etc/whitelist.txt"
BLACKLIST="/etc/blacklist.txt"
detect_malicious_ip() {
tail -n 1000 "$LOG_FILE" | awk '{print $1}' | sort | uniq -c | while read count ip; do
grep -qxF "$ip" "$WHITELIST" 2>/dev/null && continue
if [ "$count" -gt "$MAX_ATTEMPTS" ]; then
ban_ip "$ip" "$count"
fi
done
}
ban_ip() {
local ip=$1
local count=$2
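# Skip IPs that already have a DROP rule so repeated scans do not add duplicate rules or alerts
iptables -C INPUT -s "$ip" -j DROP 2>/dev/null && return 0
echo "Blocking IP: $ip (Attempts: $count)"
iptables -I INPUT -s "$ip" -j DROP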
echo "$(date '+%Y-%m-%d %H:%M:%S') $ip $count" >> "$BLACKLIST"
echo "iptables -D INPUT -s $ip -j DROP" | at now + $((BAN_TIME/60)) minutes 2>/dev/null
echo "IP $ip has been automatically blocked (Attempts: $count)" | mail -s "Security Alert: IP Blocked - $(hostname)" [email protected]
}
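show_banned_ips() {
echo "=== Currently Banned IPs ==="
iptables -L INPUT -n | grep DROP | awk '{print $4}'
}
# unban_ip is called by the "unban" action below, so it must be defined
unban_ip() {
local ip=$1
if [ -z "$ip" ]; then
echo "Usage: $0 unban <ip>"
return 1
fi
iptables -D INPUT -s "$ip" -j DROP && echo "Unbanned IP: $ip"
}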
case "${1:-detect}" in
detect)
while true; do
detect_malicious_ip
sleep 60
done
;;
list) show_banned_ips ;;
unban) unban_ip "$2" ;;
*) echo "Usage: $0 {detect|list|unban <ip>}"; exit 1 ;;
esac
12. System Performance Tuning Script
Optimizes kernel parameters, system limits, network stack, and disk I/O for better performance.
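Appending tuning parameters to a config file with `>>` adds duplicate entries every time the script runs. A minimal sketch of an idempotent alternative follows: replace the key if it is already present, otherwise append it. The function name `set_sysctl_value` is hypothetical, and the demo works on a temporary file rather than `/etc/sysctl.conf`.

```bash
#!/bin/bash
# set_sysctl_value FILE KEY VALUE (hypothetical helper)
# Rewrites the "KEY = ..." line in FILE with VALUE, or appends the
# setting when KEY is not present. Running it twice changes nothing.
set_sysctl_value() {
    local file=$1 key=$2 value=$3 tmp
    tmp=$(mktemp) || return 1
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "[ \t]*=[ \t]*" }
        $1 == k { if (!seen) print k " = " v; seen = 1; next }
        { print }
        END { if (!seen) print k " = " v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps owner and mode of FILE
    local rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Demo on a scratch file
conf=$(mktemp)
echo "vm.swappiness = 60" > "$conf"
set_sysctl_value "$conf" vm.swappiness 10
set_sysctl_value "$conf" fs.file-max 655350
cat "$conf"
rm -f "$conf"
```

Another common choice is to write all settings to a dedicated file under `/etc/sysctl.d/` and overwrite that file on every run.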
#!/bin/bash
# system_tuning.sh - Automatic system performance tuning
backup_configs() {
cp /etc/sysctl.conf /etc/sysctl.conf.bak.$(date +%Y%m%d)
cp /etc/security/limits.conf /etc/security/limits.conf.bak.$(date +%Y%m%d)
echo "Configuration files backed up"
}
optimize_kernel() {
echo "Optimizing kernel parameters..."
cat >> /etc/sysctl.conf <<EOF
# === Performance Optimization Parameters ===
# Network tuning
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
# net.ipv4.tcp_tw_recycle was removed in Linux 4.12 and is intentionally not set here
net.ipv4.ip_local_port_range = 10000 65000
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_max_tw_buckets = 5000
# Filesystem tuning
fs.file-max = 655350
fs.nr_open = 1048576
# Memory tuning
vm.swappiness = 10
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
# Connection tracking
net.netfilter.nf_conntrack_max = 262144
net.netfilter.nf_conntrack_tcp_timeout_established = 3600
EOF
sysctl -p
echo "Kernel parameter optimization complete"
}
optimize_limits() {
echo "Optimizing system limits..."
cat >> /etc/security/limits.conf <<EOF
# === System Limits Optimization ===
* soft nofile 65535
* hard nofile 65535
* soft nproc 32768
* hard nproc 32768
* soft memlock unlimited
* hard memlock unlimited
EOF
echo "System limits optimization complete"
}
optimize_network() {
echo "Optimizing network stack..."
if modprobe tcp_bbr 2>/dev/null; then
echo "net.core.default_qdisc = fq" >> /etc/sysctl.conf
echo "net.ipv4.tcp_congestion_control = bbr" >> /etc/sysctl.conf
sysctl -p
echo "BBR enabled"
fi
for interface in $(ls /sys/class/net/ | grep -v lo); do
ethtool -G $interface rx 4096 tx 4096 2>/dev/null
ethtool -K $interface gso on gro on tso on 2>/dev/null
done
echo "Network stack optimization complete"
}
optimize_disk() {
echo "Optimizing disk I/O..."
for disk in $(ls /sys/block/ | grep -E '^sd|^nvme'); do
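# Multi-queue kernels (5.0+) call the no-op scheduler "none"; older kernels use "noop"
echo none > /sys/block/$disk/queue/scheduler 2>/dev/null || echo noop > /sys/block/$disk/queue/scheduler 2>/dev/null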
echo 256 > /sys/block/$disk/queue/nr_requests 2>/dev/null
done
systemctl disable bluetooth 2>/dev/null
systemctl disable cups 2>/dev/null
echo "Disk I/O optimization complete"
}
generate_report() {
echo ""
echo "=== System Performance Optimization Report ==="
echo "Optimization Time: $(date)"
echo ""
echo "Current kernel parameters:"
sysctl -a 2>/dev/null | grep -E "tcp_fin_timeout|tcp_keepalive_time|file-max"
echo ""
echo "Current system limits:"
ulimit -a
echo ""
echo "Network configuration:"
ss -s
}
echo "Starting system performance optimization..."
backup_configs
optimize_kernel
optimize_limits
optimize_network
optimize_disk
generate_report
echo ""
echo "Optimization complete! It is recommended to reboot the system for all changes to take effect."
13. Docker Container Cleanup Script
Removes stopped containers, dangling images, unused images, volumes, and networks, and reports disk usage before and after cleanup.
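Because cleanup commands are destructive, it helps to preview them first. The sketch below wraps each command in a dry-run helper and uses Docker's built-in `prune` subcommands with an `until` filter as an age-based alternative to the manual loops in the script. The `run` helper and the `DRY_RUN` variable are assumptions for illustration.

```bash
#!/bin/bash
KEEP_DAYS=7
DRY_RUN=1   # set to 0 to actually execute the commands

# run CMD [ARGS...] (hypothetical helper)
# Prints the command in dry-run mode, executes it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "[dry-run] $*"
    else
        "$@"
    fi
}

# Remove stopped containers and unused images older than KEEP_DAYS
max_age="$((KEEP_DAYS * 24))h"
run docker container prune -f --filter "until=$max_age"
run docker image prune -a -f --filter "until=$max_age"
```

With `DRY_RUN=1` nothing is removed and Docker does not even need to be installed, which makes the script safe to try on any machine.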
#!/bin/bash
# docker_cleanup.sh - Docker resource cleanup script
KEEP_DAYS=7
LOG_FILE="/var/log/docker_cleanup.log"
log_message() {
echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1" | tee -a "$LOG_FILE"
}
cleanup_containers() {
log_message "Starting cleanup of stopped containers..."
stopped_containers=$(docker ps -a -q -f status=exited)
if [ -n "$stopped_containers" ]; then
container_count=$(echo "$stopped_containers" | wc -l)
log_message "Found $container_count stopped containers"
docker rm $stopped_containers
log_message "Stopped containers cleaned"
else
log_message "No containers to clean"
fi
}
cleanup_dangling_images() {
log_message "Starting cleanup of dangling images..."
dangling_images=$(docker images -q -f dangling=true)
if [ -n "$dangling_images" ]; then
image_count=$(echo "$dangling_images" | wc -l)
log_message "Found $image_count dangling images"
docker rmi $dangling_images
log_message "Dangling images cleaned"
else
log_message "No dangling images"
fi
}
cleanup_unused_images() {
log_message "Starting cleanup of unused images..."
unused_images=$(docker images --format "{{.ID}}:{{.Repository}}:{{.Tag}}" | while IFS=: read id repo tag; do
if [ "$repo" != "<none>" ] && [ "$tag" != "<none>" ]; then
if ! docker ps -a --format "{{.Image}}" | grep -q "$repo:$tag"; then
echo "$id"
fi
fi
done)
if [ -n "$unused_images" ]; then
image_count=$(echo "$unused_images" | wc -l)
log_message "Found $image_count unused images"
echo "$unused_images" | xargs docker rmi 2>/dev/null
log_message "Unused images cleaned"
else
log_message "No unused images"
fi
}
cleanup_volumes() {
log_message "Starting cleanup of unused volumes..."
docker volume prune -f
log_message "Unused volumes cleaned"
}
cleanup_networks() {
log_message "Starting cleanup of unused networks..."
docker network prune -f
log_message "Unused networks cleaned"
}
show_disk_usage() {
echo ""
echo "=== Docker Disk Usage ==="
docker system df
echo ""
echo "=== System Disk Usage ==="
df -h /var/lib/docker
}
log_message "========== Starting Docker Cleanup =========="
echo "Before cleanup:"
show_disk_usage
cleanup_containers
cleanup_dangling_images
cleanup_unused_images
cleanup_volumes
cleanup_networks
echo ""
echo "After cleanup:"
show_disk_usage
log_message "========== Docker Cleanup Completed =========="
14. Log Rotation and Compression Script
Rotates large log files, compresses old logs, removes logs older than a set age, and monitors disk usage.
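Rotating by renaming a log only works when the application can be told to reopen its file. For programs that cannot, a copy-then-truncate approach is a common alternative, and it is sketched below. Note that this is a different technique from the rename used in the script; the function name `rotate_copytruncate` is hypothetical, and lines written between the copy and the truncate can be lost.

```bash
#!/bin/bash
# rotate_copytruncate LOG_FILE MAX_BYTES (hypothetical helper)
# If LOG_FILE is larger than MAX_BYTES, copies it to a timestamped
# file, empties the original in place, and compresses the copy.
rotate_copytruncate() {
    local log_file=$1 max_bytes=$2 size rotated
    [ -f "$log_file" ] || return 1
    size=$(( $(wc -c < "$log_file") ))
    [ "$size" -gt "$max_bytes" ] || return 0
    rotated="${log_file}.$(date +%Y%m%d_%H%M%S)"
    # ": >" truncates without replacing the file, so open handles stay valid
    cp -p "$log_file" "$rotated" && : > "$log_file" && gzip "$rotated"
}

# Demo in a scratch directory
demo_dir=$(mktemp -d)
printf '%0200d\n' 0 > "$demo_dir/app.log"
rotate_copytruncate "$demo_dir/app.log" 100
ls "$demo_dir"
rm -rf "$demo_dir"
```

For production systems, logrotate implements both strategies (`create` and `copytruncate`) and is usually preferable to a hand-written rotator.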
#!/bin/bash
# log_rotation.sh - Log rotation and compression management
LOG_DIRS=("/var/log/nginx" "/var/log/mysql" "/var/log/application")
MAX_SIZE="100M"
MAX_AGE=30
COMPRESS_AGE=1
rotate_log() {
local log_file=$1
local timestamp=$(date +%Y%m%d_%H%M%S)
local rotated_file="${log_file}.${timestamp}"
if [ -f "$log_file" ]; then
size=$(du -h "$log_file" | cut -f1)
mv "$log_file" "$rotated_file"
touch "$log_file"
if [[ "$log_file" == *nginx* ]]; then
nginx -s reopen 2>/dev/null
elif [[ "$log_file" == *mysql* ]]; then
mysqladmin flush-logs 2>/dev/null
fi
echo "Rotated log: $log_file -> $rotated_file (size: $size)"
gzip "$rotated_file"
echo "Compressed: ${rotated_file}.gz"
fi
}
compress_old_logs() {
local dir=$1
echo "Compressing old logs in $dir..."
find "$dir" -name "*.log.*" ! -name "*.gz" -mtime +$COMPRESS_AGE -exec gzip {} \;
}
cleanup_old_logs() {
local dir=$1
echo "Cleaning old logs in $dir..."
find "$dir" -name "*.log.*.gz" -type f -mtime +$MAX_AGE -delete
local remaining=$(find "$dir" -name "*.log*" -type f | wc -l)
echo "Remaining log files: $remaining"
}
check_disk_space() {
local dir=$1
local usage=$(df "$dir" | tail -1 | awk '{print $5}' | cut -d% -f1)
if [ $usage -gt 80 ]; then
echo "Warning: $dir disk usage high: $usage%"
find "$dir" -name "*.log.*.gz" -type f -mtime +7 -delete
echo "Performed emergency cleanup"
fi
}
generate_report() {
echo ""
echo "=== Log Management Report ==="
echo "Execution Time: $(date)"
echo ""
for dir in "${LOG_DIRS[@]}"; do
if [ -d "$dir" ]; then
echo "Directory: $dir"
echo "File count: $(find $dir -name "*.log*" -type f | wc -l)"
echo "Total size: $(du -sh $dir | cut -f1)"
echo "---"
fi
done
echo ""
echo "Disk usage:"
df -h | grep -E "^/|Filesystem"
}
echo "Starting log management task..."
for dir in "${LOG_DIRS[@]}"; do
if [ -d "$dir" ]; then
echo "Processing directory: $dir"
check_disk_space "$dir"
find "$dir" -name "*.log" -type f -size +$MAX_SIZE | while read log_file; do
rotate_log "$log_file"
done
compress_old_logs "$dir"
cleanup_old_logs "$dir"
else
echo "Directory does not exist: $dir"
fi
done
generate_report
echo "Log management task completed!"
15. Log Rotation and Compression Script (Alternative)
Provides another implementation for rotating, compressing, and cleaning logs.
#!/bin/bash
# log_rotation.sh - Log rotation and compression management (alternative)
LOG_DIRS=("/var/log/nginx" "/var/log/mysql" "/var/log/application")
MAX_SIZE="100M"
MAX_AGE=30
COMPRESS_AGE=1
rotate_log() {
local log_file=$1
local timestamp=$(date +%Y%m%d_%H%M%S)
local rotated_file="${log_file}.${timestamp}"
if [ -f "$log_file" ]; then
size=$(du -h "$log_file" | cut -f1)
mv "$log_file" "$rotated_file"
touch "$log_file"
if [[ "$log_file" == *nginx* ]]; then
nginx -s reopen 2>/dev/null
elif [[ "$log_file" == *mysql* ]]; then
mysqladmin flush-logs 2>/dev/null
fi
echo "Rotated log: $log_file -> $rotated_file (size: $size)"
gzip "$rotated_file"
echo "Compressed: ${rotated_file}.gz"
fi
}
compress_old_logs() {
local dir=$1
echo "Compressing old logs in $dir..."
find "$dir" -name "*.log.*" ! -name "*.gz" -mtime +$COMPRESS_AGE -exec gzip {} \;
}
cleanup_old_logs() {
local dir=$1
echo "Cleaning old logs in $dir..."
find "$dir" -name "*.log.*.gz" -type f -mtime +$MAX_AGE -delete
local remaining=$(find "$dir" -name "*.log*" -type f | wc -l)
echo "Remaining log files: $remaining"
}
check_disk_space() {
local dir=$1
local usage=$(df "$dir" | tail -1 | awk '{print $5}' | cut -d% -f1)
if [ $usage -gt 80 ]; then
echo "Warning: $dir disk usage high: $usage%"
find "$dir" -name "*.log.*.gz" -type f -mtime +7 -delete
echo "Performed emergency cleanup"
fi
}
generate_report() {
echo ""
echo "=== Log Management Report ==="
echo "Execution Time: $(date)"
echo ""
for dir in "${LOG_DIRS[@]}"; do
if [ -d "$dir" ]; then
echo "Directory: $dir"
echo "File count: $(find $dir -name "*.log*" -type f | wc -l)"
echo "Total size: $(du -sh $dir | cut -f1)"
echo "---"
fi
done
echo ""
echo "Disk usage:"
df -h | grep -E "^/|Filesystem"
}
echo "Starting log management task..."
for dir in "${LOG_DIRS[@]}"; do
if [ -d "$dir" ]; then
echo "Processing directory: $dir"
check_disk_space "$dir"
find "$dir" -name "*.log" -type f -size +$MAX_SIZE | while read log_file; do
rotate_log "$log_file"
done
compress_old_logs "$dir"
cleanup_old_logs "$dir"
else
echo "Directory does not exist: $dir"
fi
done
generate_report
echo "Log management task completed!"
Conclusion
This collection of 15 Shell scripts covers the most common operational scenarios, from system monitoring and security hardening to backup, performance tuning, Docker cleanup, and log management. By adapting and extending these scripts to fit your environment, you can dramatically reduce manual effort, improve reliability, and free up time for higher‑level engineering work.
Remember, the best sysadmin is not the one who memorizes every command, but the one who automates repetitive work. Use these scripts as a foundation, customize them for your needs, and keep iterating to achieve greater efficiency.
Ops Community
A leading IT operations community where professionals share and grow together.
