8 Common Shell Script Mistakes Junior Ops Engineers Make (Are You Guilty?)
This article examines the eight most frequent errors junior and mid‑level Linux operations engineers make when writing Bash scripts—such as missing quotes, wrong comparison operators, incomplete file checks, ignoring return codes, mishandling spaces, concurrency issues, lack of error handling, and absent logging—and provides concrete examples, detailed analysis, and corrected code snippets to improve script reliability and maintainability.
Background and Problem
Shell scripts are a core tool for Linux operations engineers, used for everything from simple log cleanup to complex automated deployments. Many junior and mid‑level engineers write scripts that work in happy paths but fail under edge cases, often due to subtle mistakes.
Mistake 1: Missing Quotes Around Variable References
Incorrect Example
#!/bin/bash
# Variable without quotes – spaces break the command
filename="my document.txt"
ls -la $filename
rm $filename
cat $filename
today=$(date +%Y-%m-%d)
# Command substitution result without quotes
tar -czf backup-$today.tar.gz /data
# Test expression without quotes
name="John Smith"
if [ $name = "John" ]; then
echo "Found John"
fi
Analysis
When a variable contains spaces, Bash splits it into separate words, so commands receive the wrong arguments. For example, ls -la $filename expands to ls -la my document.txt, which Bash passes to ls as two separate arguments (my and document.txt).
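The effect is easy to demonstrate without touching any real files; this sketch uses set -- to re-parse a value into the positional parameters, making the word splitting directly observable:

```shell
#!/bin/bash
# set -- re-parses a value into the positional parameters,
# so $# shows how many words Bash produced.
filename="my document.txt"

set -- $filename        # unquoted: split on the space
unquoted_count=$#       # 2 words

set -- "$filename"      # quoted: stays one word
quoted_count=$#         # 1 word

echo "unquoted: $unquoted_count arguments, quoted: $quoted_count argument"
```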
Correct Approach
#!/bin/bash
# Always quote variable expansions
filename="my document.txt"
ls -la "$filename"
rm "$filename"
cat "$filename"
today=$(date +%Y-%m-%d)
# Quote the result in the command
tar -czf "backup-${today}.tar.gz" /data
# Quote variables inside test
name="John Smith"
if [ "$name" = "John" ]; then
echo "Found John"
fi
Special Cases
# Double quotes allow variable and command substitution
name="world"
echo "Hello $name" # -> Hello world
echo 'Hello $name' # -> Hello $name (no substitution)
Mistake 2: Using Wrong Operators for String Comparison
Incorrect Example
#!/bin/bash
# Unquoted variable in the test – breaks on empty values or spaces
name="abc"
if [ $name = "abc" ]; then echo "match"; fi
# Using == inside [] (non‑POSIX)
if [ $name == "abc" ]; then echo "match"; fi
# Using string = for numeric comparison – compares characters, not values
count=10
if [ $count = 10 ]; then echo "count is 10"; fi
# Using string operator inside [[ ]] for numbers (incorrect)
num=100
if [[ $num = 100 ]]; then echo "match"; fi
Analysis
Bash provides distinct operators: = and != for strings; -eq, -ne, -lt, -gt, and friends for numbers. Inside [[ ]], == with an unquoted right-hand side performs glob pattern matching, not literal equality.
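A short sketch of the pattern-matching pitfall (the variable names are illustrative):

```shell
#!/bin/bash
num=100

# Unquoted right-hand side inside [[ ]] is a glob pattern:
# "100" matches 1* even though the strings are not equal.
if [[ "$num" == 1* ]]; then pattern_match=yes; else pattern_match=no; fi

# Quoting the pattern forces a literal string comparison.
if [[ "$num" == "1*" ]]; then literal_match=yes; else literal_match=no; fi

# Numeric equality uses -eq.
if [[ "$num" -eq 100 ]]; then numeric_match=yes; else numeric_match=no; fi

echo "pattern: $pattern_match, literal: $literal_match, numeric: $numeric_match"
```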
Correct Approach
#!/bin/bash
# String comparison – use = or == inside [[ ]]
name="abc"
if [ "$name" = "abc" ]; then echo "string match"; fi
if [[ "$name" == "abc" ]]; then echo "match"; fi
# Numeric comparison – use -eq, -gt, etc.
count=10
if [ "$count" -eq 10 ]; then echo "count is 10"; fi
if [ "$count" -gt 5 ]; then echo "count > 5"; fi
# Pattern matching inside [[ ]]
filename="report.txt"
if [[ "$filename" == *.txt ]]; then echo "text file"; fi
Mistake 3: Incomplete File Existence Checks
Incorrect Example
#!/bin/bash
# Only checks existence, not type
if [ -e /path/to/file ]; then cat /path/to/file; fi
# Ignores symlink target
if [ -f /path/to/symlink ]; then cat /path/to/symlink; fi
# Logical errors when combining tests
if [ -f /path/a ] && [ -f /path/b ]; then diff /path/a /path/b; fi
# Variable used without quoting
filepath="/path/to file with spaces"
if [ -f $filepath ]; then echo "exists"; fi
Analysis
-e only checks that a path exists; -f checks for a regular file; symlinks need -L (or -h) plus a check on the resolved target. Combining multiple tests also requires correct short-circuit logic.
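The difference between -e and -f can be sketched with a throwaway mktemp directory:

```shell
#!/bin/bash
# -e is true for any existing path; -f is true only for regular files.
tmpdir=$(mktemp -d)
touch "$tmpdir/regular"

[ -e "$tmpdir" ]         && dir_exists=yes   || dir_exists=no
[ -f "$tmpdir" ]         && dir_is_file=yes  || dir_is_file=no
[ -f "$tmpdir/regular" ] && file_is_file=yes || file_is_file=no

echo "dir -e: $dir_exists, dir -f: $dir_is_file, file -f: $file_is_file"
rm -rf "$tmpdir"
```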
Correct Approach
#!/bin/bash
filepath="/path/to/file"
# Proper quoting
if [ -f "$filepath" ]; then echo "regular file exists"; fi
# Check symlink
if [ -L "$filepath" ]; then echo "is a symlink"; target=$(readlink -f "$filepath"); if [ -f "$target" ]; then echo "link target exists: $target"; else echo "warning: broken symlink"; fi; fi
# Directory check
if [ -d "/path/to/dir" ]; then echo "directory exists"; fi
# Multiple files – all must exist
file1="/path/a"
file2="/path/b"
if [ -f "$file1" ] && [ -f "$file2" ]; then diff "$file1" "$file2"; fi
# Any one file exists
if [ -f "$file1" ] || [ -f "$file2" ]; then echo "at least one file exists"; fi
# Permission checks
if [ -r "$filepath" ]; then echo "file is readable"; fi
if [ -w "$filepath" ]; then echo "file is writable"; fi
if [ -x "$filepath" ]; then echo "file is executable"; fi
# Safe file‑read function
safe_cat() {
local file="$1"
if [ -f "$file" ] && [ -r "$file" ]; then cat "$file"; else echo "Error: Cannot read file: $file" >&2; return 1; fi
}
Mistake 4: Ignoring Command Return Values
Incorrect Example
#!/bin/bash
# No checks after cd
cd /some/directory
ls -la
# Pipeline only checks last command
grep "error" /var/log/app.log | cut -d ' ' -f1 | sort | uniq
# Test without checking
[ -f /path/to/file ]
echo "File check completed"
# Function without return status
function do_something() {
some_command    # exit status is never checked or propagated
}
# Variable assignment without checking command success
files=$(ls /nonexistent_directory)
echo "Files: $files"
Analysis
Every command exits with a status (0 = success). Ignoring these statuses can hide failures and make debugging hard.
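The pipeline behavior in particular is worth a quick demonstration; this sketch captures the exit status with and without pipefail:

```shell
#!/bin/bash
# Without pipefail, a pipeline's exit status is the last command's,
# so the failure of an earlier stage is silently masked.
false | true && without_pipefail=$? || without_pipefail=$?
# -> 0: the failure of false is hidden

set -o pipefail
false | true && with_pipefail=$? || with_pipefail=$?
# -> 1: the failure now surfaces

echo "without pipefail: $without_pipefail, with: $with_pipefail"
```

The `&& x=$? || x=$?` idiom records the status on both branches, so the snippet behaves the same whether or not the caller has set -e enabled.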
Correct Approach
#!/bin/bash
# Check explicitly where you want a custom error message
cd /some/directory || { echo "Failed to change directory" >&2; exit 1; }
# Strict mode: abort on errors, unset variables, and pipeline failures
set -euo pipefail
# Check grep result explicitly
if grep -q "error" /var/log/app.log; then echo "Found errors in log"; else echo "No errors found"; fi
# Proper function return handling
function do_something() {
# some operations
some_command || return 1    # propagate the failure
return 0
}
if ! do_something; then echo "Operation failed"; exit 1; fi
# Capture output while checking status
if ! output=$(ls /nonexistent 2>&1); then echo "Command failed: $output" >&2; fi
# Common patterns
command_that_might_fail || { echo "Failed"; exit 1; }
command_that_might_fail && echo "Success" || echo "Failed"
Mistake 5: Not Handling Spaces and Special Characters
Incorrect Example
#!/bin/bash
# Loop over ls output – breaks on spaces
for file in $(ls *.txt); do cat "$file"; done
# Read while losing spaces
cat file.txt | while read line; do echo "$line"; done
# Array with spaces treated as separate words
names="John Jane Bob"
for name in $names; do echo "$name"; done
# Path with spaces
path="/path/with spaces/file"
ls -la $path
# User input with spaces
read -p "Enter name: " name
echo "Hello $name"
Analysis
Bash performs word splitting based on IFS. Unquoted variables containing spaces are split, leading to wrong arguments or loop iterations.
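The difference shows up immediately once a filename contains a space; this sketch compares a naive ls loop with a glob array in a scratch mktemp directory:

```shell
#!/bin/bash
tmpdir=$(mktemp -d)
touch "$tmpdir/plain.txt" "$tmpdir/with space.txt"

# Unquoted $(ls ...) splits "with space.txt" into two loop iterations
count_ls=0
for f in $(ls "$tmpdir"); do count_ls=$((count_ls + 1)); done

# A glob expands to one word per file, spaces intact
files=("$tmpdir"/*.txt)
count_glob=${#files[@]}

echo "ls loop: $count_ls items, glob array: $count_glob items"
rm -rf "$tmpdir"
```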
Correct Approach
#!/bin/bash
# Use arrays or proper quoting
files=(*.txt)
for file in "${files[@]}"; do cat "$file"; done
# Find with -print0 and read -d ''
while IFS= read -r -d '' file; do echo "Processing: $file"; cat "$file"; done < <(find . -name "*.txt" -print0)
# Proper array handling
names=("John Smith" "Jane Doe" "Bob Anderson")
for name in "${names[@]}"; do echo "Name: $name"; done
# Path with spaces – always quote
path="/path/with spaces/file"
ls -la "$path"
# Safe user input – read -r preserves backslashes; quote the expansion
read -r -p "Enter name: " name
echo "Hello, $name"
Mistake 6: Ignoring Concurrency
Incorrect Example
#!/bin/bash
# Multiple processes write to same file
for i in {1..10}; do echo "Process $i" >> /tmp/shared.txt; done
# Counter updated in background subshells – the updates are lost
counter=0
for i in {1..100}; do (counter=$((counter + 1))) & done
wait
echo $counter # prints 0: each subshell incremented its own copy
# PID file not checked – duplicate starts
if [ -f /var/run/myapp.pid ]; then echo "Already running"; exit 1; fi
echo $$ > /var/run/myapp.pid
# Temporary file conflict
for i in {1..10}; do sort data.txt > /tmp/sorted.txt; done
Analysis
Concurrent writes cause race conditions, data loss, and duplicate executions. Proper locking, PID checks, and isolated temporary files are required.
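As a sketch of the locking idea, flock (from util-linux, standard on most Linux distributions) can serialize appends from concurrent subshells; the paths here are throwaway mktemp files:

```shell
#!/bin/bash
# Each subshell takes an exclusive lock on FD 9 before appending,
# so no two writers touch the file at the same time.
outfile=$(mktemp)
lockfile=$(mktemp)

for i in $(seq 1 20); do
(
flock 9                       # block until the lock on FD 9 is free
echo "line $i" >> "$outfile"
) 9>"$lockfile" &
done
wait

lines=$(wc -l < "$outfile")
echo "wrote $lines lines"     # all 20 appends survive
rm -f "$outfile" "$lockfile"
```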
Correct Approach
#!/bin/bash
# Lock directory method – mkdir is atomic, so only one instance can win
LOCKDIR="/var/run/myapp.lock"
if ! mkdir "$LOCKDIR" 2>/dev/null; then echo "Another instance is running"; exit 1; fi
trap 'rmdir "$LOCKDIR"' EXIT
# flock (recommended)
exec 200>"/var/run/myapp.lock"
if ! flock -n 200; then echo "Another instance is running"; exit 1; fi
# PID file with real process check
PIDFILE="/var/run/myapp.pid"
if [ -f "$PIDFILE" ]; then
oldpid=$(cat "$PIDFILE")
if kill -0 "$oldpid" 2>/dev/null; then echo "Already running with PID $oldpid"; exit 1; fi
rm -f "$PIDFILE"
fi
echo $$ > "$PIDFILE"
trap 'rm -f "$PIDFILE"' EXIT
# Parallel processing with separate temp files
tmpdir=$(mktemp -d)
for i in {1..10}; do
(sort "data_${i}.txt" > "$tmpdir/sorted_${i}.txt") &
done
wait
cat "$tmpdir"/sorted_*.txt | sort -m > /tmp/final.txt
rm -rf "$tmpdir"
# Semaphore example – limit to 5 concurrent jobs (wait -n needs bash 4.3+)
semaphore() {
local max_jobs=$1; shift
local cmd
for cmd in "$@"; do
while [ "$(jobs -rp | wc -l)" -ge "$max_jobs" ]; do
wait -n   # block until any background job finishes
done
eval "$cmd" &
done
wait
}
semaphore 5 "task1.sh" "task2.sh" "task3.sh" "task4.sh" "task5.sh"
Mistake 7: No Error Handling
Incorrect Example
#!/bin/bash
# Continue after failures
cd /nonexistent
echo "Current directory: $(pwd)"
# No verification of critical resources
cp /data/important.db /backup/
rm -rf /data/
# No network check
curl -s http://api.example.com/data > /tmp/data.json
# Blind delete
rm -rf /tmp/build/*
# Function without error handling
function do_critical_work() { some_command; }
Analysis
Without proper error handling, any unexpected failure can corrupt data or leave the system in an inconsistent state.
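A minimal sketch of the trap-based variant of this idea: an ERR trap reports the failing point before strict mode aborts. The demo runs in a child bash so the failure stays contained:

```shell
#!/bin/bash
# The child script enables strict mode and installs an ERR trap;
# the trap fires when false fails, and set -e then aborts the child.
output=$(bash -c '
set -euo pipefail
trap '\''echo "ERROR: command failed near line $LINENO" >&2'\'' ERR
echo "step 1 ok"
false
echo "never reached"
' 2>&1) || true
echo "$output"
```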
Correct Approach
#!/bin/bash
set -euo pipefail
# Central error handler
error_exit() { echo "ERROR: $1" >&2; exit "${2:-1}"; }
# Verify resources before proceeding
verify_resources() {
[ -d "/data" ] || error_exit "Directory /data does not exist"
[ -d "/backup" ] || error_exit "Directory /backup does not exist"
available=$(df -k /backup | awk 'NR==2 {print $4}')
[ "$available" -gt 1000000 ] || error_exit "Insufficient disk space"
curl -s --max-time 5 http://api.example.com/health > /dev/null || error_exit "Cannot connect to API"
}
# Safe backup with verification
safe_backup() {
local src="/data/important.db" dst="/backup/important.db"
[ -f "$src" ] || error_exit "Source file not found: $src"
local tmp_dst="${dst}.tmp.$$"
cp "$src" "$tmp_dst" || error_exit "Backup copy failed"
cmp -s "$src" "$tmp_dst" || { rm -f "$tmp_dst"; error_exit "Backup verification failed"; }
mv "$tmp_dst" "$dst" || { rm -f "$tmp_dst"; error_exit "Backup move failed"; }
echo "Backup successful: $dst"
}
# Signal handling and cleanup
cleanup() { echo "Cleaning up..."; rm -rf /tmp/myapp_*; [ -f /var/run/myapp.pid ] && kill $(cat /var/run/myapp.pid) 2>/dev/null || true; }
trap cleanup EXIT INT TERM
# Main flow
verify_resources
safe_backup
Mistake 8: No Logging
Incorrect Example
#!/bin/bash
# All output goes to stdout
echo "Starting process"
some_command
echo "Process completed"
# No error logging
error_command
# No timestamps
long_running_task
echo "Done"
# Scattered echo redirections
echo "Starting" > /tmp/process.log
echo "Continuing" >> /tmp/process.log
echo "Finished" >> /tmp/process.log
# No log rotation – file grows indefinitely
while true; do echo "$(date): heartbeat" >> /var/log/myapp.log; sleep 60; done
Analysis
Without structured logging, troubleshooting becomes guesswork, and unmanaged log files grow without bound.
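One technique that replaces the scattered >> redirections shown above: a single exec redirect sends everything below it to the log. A sketch with a scratch mktemp log file:

```shell
#!/bin/bash
# Redirect the whole script's output once instead of appending
# ">> logfile" after every echo.
logfile=$(mktemp)

exec 3>&1                  # save the original stdout on FD 3
exec >>"$logfile" 2>&1     # from here on, all output goes to the log

echo "$(date '+%Y-%m-%d %H:%M:%S') [INFO] job started"
echo "$(date '+%Y-%m-%d %H:%M:%S') [INFO] job finished"

exec 1>&3 2>&1 3>&-        # restore stdout/stderr for the summary
entries=$(wc -l < "$logfile")
echo "log contains $entries entries"
rm -f "$logfile"
```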
Correct Approach
#!/bin/bash
# Logging configuration
LOGFILE="/var/log/myapp/app.log"
LOGLEVEL="INFO" # DEBUG, INFO, WARN, ERROR
declare -A LOG_LEVELS=([DEBUG]=0 [INFO]=1 [WARN]=2 [ERROR]=3)
log() {
local level="${1:-INFO}"; shift
local message="$*"
local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
local current_level=${LOG_LEVELS[$LOGLEVEL]}
local message_level=${LOG_LEVELS[$level]}
[ "$message_level" -lt "$current_level" ] && return
echo "[$timestamp] [$level] $message" >> "$LOGFILE"
if [ "$message_level" -ge "${LOG_LEVELS[INFO]}" ]; then echo "[$level] $message"; fi
}
log_debug() { log DEBUG "$@"; }
log_info() { log INFO "$@"; }
log_warn() { log WARN "$@"; }
log_error() { log ERROR "$@"; }
# Example usage
log_info "Script started"
log_debug "Configuration: debug=true"
if ! command_that_might_fail; then log_error "Task failed"; exit 1; fi
log_info "Task completed successfully"
# Log rotation (10 MB limit, keep 7 files)
rotate_log() {
local logfile="$1"
local max_size=$((10*1024*1024))
if [ -f "$logfile" ] && [ "$(stat -c%s "$logfile")" -gt "$max_size" ]; then
local stamp; stamp=$(date +%Y%m%d-%H%M%S)
mv "$logfile" "${logfile}.${stamp}" && gzip "${logfile}.${stamp}"
ls -1t "${logfile}".*.gz 2>/dev/null | tail -n +8 | xargs -r rm -f
fi
}
init_log() { mkdir -p "$(dirname "$LOGFILE")"; touch "$LOGFILE"; rotate_log "$LOGFILE"; }
init_log
Best‑Practice Checklist
Quote all variable expansions.
Use the correct operators for string (=, !=) and numeric (-eq, -lt, …) comparisons.
Perform complete file checks: existence, type, permissions, and symlink resolution.
Check return values of every critical command.
Handle spaces and special characters by quoting and using arrays or find -print0.
Guard against concurrent execution with lock files, flock, or PID checks.
Implement robust error handling, custom exit functions, and cleanup traps.
Use a structured logging system with levels, timestamps, and rotation.
Ops Community
A leading IT operations community where professionals share and grow together.