Master Essential Linux Shell Tools for Text Processing and Automation
This guide introduces the most commonly used Linux shell utilities—find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—explaining their core options and providing practical examples to help you efficiently manipulate and process text files from the command line.
Working in the Linux shell is a fundamental skill for interacting with the operating system; mastering its text‑processing utilities improves productivity and deepens your understanding of how Linux works.
1. find – file search
Search for files by name, type, time, size, permissions, or owner, and perform actions such as delete or exec.
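    find . \( -name "*.txt" -o -name "*.pdf" \) -print   # match either extension
    find . -regex ".*\(\.txt\|\.pdf\)$"                  # same idea, as a regex over the whole path
    find . ! -name "*.txt" -print                        # everything except *.txt
    find . -maxdepth 1 -type f                           # files in the current directory only
    find . -type d -print                                # list directories only
    find . -atime -7 -type f -print                      # accessed within the last 7 days
    find . -type f -size +2k                             # larger than 2 KiB
    find . -type f -perm 644 -print                      # permissions exactly 644
    find . -type f -user weber -print                    # owned by user weber
    find . -type f -name "*.swp" -delete                 # delete matching files
    find . -type f -user root -exec chown weber {} \;    # change the owner of each match
    # copy .txt files modified more than 10 days ago to OLD, then run a script on each
    find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \; -exec ./commands.sh {} \;
Use -print0 to separate file names with a null character, so that names containing spaces are handled safely.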
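The remark about -print0 can be checked on a scratch directory. The sketch below assumes GNU find and xargs (-print0 and -0 are extensions, though widely available); the directory and file names are made up for illustration.

```shell
# Scratch directory containing one file name with a space (names are illustrative).
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/plain.txt"
printf 'c\n'    > "$workdir/with space.txt"

# Null-delimited names reach xargs intact, so cat receives exactly two files.
# With plain -print, "with space.txt" would be split into two bogus arguments.
total=$(find "$workdir" -type f -name "*.txt" -print0 | xargs -0 cat | wc -l)
echo "lines across both files: $total"

rm -rf "$workdir"
```

The same pairing of -print0 with xargs -0 applies to any command placed after xargs.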
2. grep – text search
Search files for matching patterns with a rich set of options.
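As a small, hedged illustration of the counting, case-insensitive, and line-number options, the sketch below runs grep against a temporary file with known contents. The file and its contents are assumptions made up for the example. Note that -c counts matching lines, not individual matches.

```shell
# Sample input; the file contents are illustrative only.
sample=$(mktemp)
printf 'class Foo\nvirtual void run()\nClass Bar\nint x\n' > "$sample"

count=$(grep -c "class" "$sample")                     # matching lines, case-sensitive
count_ci=$(grep -ci "class" "$sample")                 # same, ignoring case
numbered=$(grep -n -e "class" -e "virtual" "$sample")  # either pattern, with line numbers

echo "$count"
echo "$count_ci"
echo "$numbered"

rm -f "$sample"
```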
    grep -c "text" filename               # count matching lines
    grep -n "pattern" file                # show line numbers
    grep -i "pattern" file                # case‑insensitive
    grep -l "pattern" *                   # list names of matching files
    grep -R -n "class" .                  # recursive search with line numbers
    grep -e "class" -e "virtual" file     # match either pattern
    grep -lZ "test" file* | xargs -0 rm   # delete files whose contents match "test"
3. xargs – build command lines from input
Convert input streams into arguments for other commands.
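    cat file.txt | xargs          # join multi-line input onto a single line
    cat single.txt | xargs -n 3   # re-wrap input, three arguments per line
Key options: -d to set the input delimiter (by default, arguments are split on blanks and newlines), -n to limit the number of arguments per command, -I {} to substitute each input item for a placeholder, and -0 to read null-delimited input.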
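The effect of -n and -I is easiest to see with echo as the command. This is a minimal sketch on made-up input, not a recipe for any particular script.

```shell
# -n 2: at most two arguments per echo invocation, so six words become three lines.
grouped=$(printf '1 2 3\n4 5 6\n' | xargs -n 2 echo)
echo "$grouped"

# -I {}: run the command once per input line, substituting the line for {}.
labelled=$(printf 'alpha\nbeta\n' | xargs -I {} echo "item: {}")
echo "$labelled"
```

With -I, xargs treats each input line as a single item, which is why it runs one command per line rather than batching arguments.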
    cat file.txt | xargs -I {} ./command.sh -p {} -1
    find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l   # count lines in every .cpp file
4. sort – ordering lines
Sort text files numerically, alphabetically, in reverse, or by specific fields.
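To make numeric, reverse, and field-based sorting concrete, here is a short sketch on hypothetical two-column data (a count followed by a name). The data is invented for the example.

```shell
# Hypothetical data: a count followed by a name.
data='3 pear
10 apple
2 fig'

# -n compares numerically, -r reverses, -k 1 starts the sort key at the first field.
by_count=$(printf '%s\n' "$data" | sort -nrk 1)
# -k 2,2 restricts the key to the second field, ordering by name instead.
by_name=$(printf '%s\n' "$data" | sort -k 2,2)

echo "$by_count"
echo "$by_name"
```

Without -n, "10" would sort before "2" and "3" because the comparison would be character by character.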
    sort -nrk 1 data.txt   # numeric, reverse order, keyed on the first field
    sort -bd data          # dictionary order, ignoring leading blanks
5. uniq – remove duplicate lines
Filter adjacent duplicate lines and optionally count occurrences.
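Because uniq only compares neighbouring lines, it is normally fed sorted input. The sketch below, on invented input, shows the difference.

```shell
input='b
a
b
c'

# Unsorted: no two equal lines are adjacent, so uniq removes nothing.
unsorted_count=$(printf '%s\n' "$input" | uniq | wc -l)
# Sorted: the two "b" lines become neighbours and collapse into one.
sorted_unique=$(printf '%s\n' "$input" | sort | uniq)
# -d keeps only the lines that occurred more than once.
dupes=$(printf '%s\n' "$input" | sort | uniq -d)

echo "$unsorted_count"
echo "$sorted_unique"
echo "$dupes"
```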
    sort unsort.txt | uniq      # drop duplicate lines
    sort unsort.txt | uniq -c   # count each unique line
    sort unsort.txt | uniq -d   # show only duplicated lines
6. tr – translate or delete characters
Common uses include character substitution, deletion, complement, and squeezing repeats.
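    echo 12345 | tr '0-9' '9876543210'   # simple substitution cipher: prints 87654
    cat text | tr '\t' ' '               # tabs to spaces
    cat file | tr -d '0-9'               # delete all digits
    cat file | tr -c '0-9\n' ' '         # replace every non-digit (except newline) with a space
    cat file | tr -d -c '0-9\n'          # delete everything except digits and newlines
    cat file | tr -s ' '                 # squeeze runs of spaces into one
Character classes such as [:lower:], [:upper:], and [:digit:] can be used: tr '[:lower:]' '[:upper:]'.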
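A brief sketch of the character-class, complement-delete, and squeeze forms on invented strings:

```shell
upper=$(echo "hello world" | tr '[:lower:]' '[:upper:]')   # translate via character classes
digits=$(echo "tel: 555-0100" | tr -d -c '0-9\n')          # keep only digits and the newline
squeezed=$(echo "a    b   c" | tr -s ' ')                  # collapse repeated spaces

echo "$upper"
echo "$digits"
echo "$squeezed"
```

The newline is kept in the complement set so that line structure survives when the same command is run on a multi-line file.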
7. cut – extract columns
Extract specific fields or characters from each line.
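Field and character extraction can be tried on a single made-up record. The sketch assumes a semicolon as the delimiter, and the record itself is hypothetical.

```shell
# Hypothetical semicolon-delimited record.
record='alice;30;london;engineer'

second=$(echo "$record" | cut -d ';' -f2)     # one field
picked=$(echo "$record" | cut -d ';' -f1,3)   # several fields, delimiter preserved
prefix=$(echo "$record" | cut -c1-5)          # a character range

echo "$second"
echo "$picked"
echo "$prefix"
```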
    cut -f2,4 filename              # 2nd and 4th columns (tab-delimited by default)
    cut -f3 --complement filename   # all but the 3rd column
    cut -d ";" -f2 filename         # 2nd field, using ; as the delimiter
    cut -c1-5 file                  # first five characters
8. paste – merge columns
Combine files line‑by‑line, using a tab or a custom delimiter.
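The following sketch merges two small temporary files, first with the default tab and then with a comma. The file names and contents are assumptions for the example.

```shell
# Two illustrative input files in a scratch directory.
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/left"
printf '1\n2\n' > "$workdir/right"

tabbed=$(paste "$workdir/left" "$workdir/right")         # columns joined by a tab
comma=$(paste -d "," "$workdir/left" "$workdir/right")   # columns joined by a comma

echo "$tabbed"
echo "$comma"

rm -rf "$workdir"
```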
    paste file1 file2          # joined with a tab
    paste file1 file2 -d ","   # joined with a comma
9. wc – word, line, and byte count
Quickly obtain file statistics.
    wc -l file   # lines
    wc -w file   # words
    wc -c file   # bytes
10. sed – stream editor for substitutions
Perform in‑place or output‑only text transformations.
    sed 's/text/replace_text/' file       # replace the first occurrence on each line
    sed 's/text/replace_text/g' file      # replace all occurrences
    sed -i 's/text/replace_text/g' file   # edit the file in place
    sed '/^$/d' file                      # delete empty lines
    echo "this is an example" | sed 's/\w\+/[&]/g'   # GNU sed; & is the matched text: [this] [is] [an] [example]
Using double quotes instead of single quotes allows shell variable expansion inside the expression.
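The quoting difference can be seen side by side. In this sketch the variable names old and new are invented for the example, and the values are assumed to contain no / or & characters, which would need escaping.

```shell
old="text"
new="replace_text"

# Double quotes: the shell expands $old and $new before sed sees the expression.
expanded=$(echo "some text here" | sed "s/$old/$new/")
# Single quotes: sed receives the characters $old literally and finds no match.
literal=$(echo "some text here" | sed 's/$old/$new/')

echo "$expanded"
echo "$literal"
```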
11. awk – powerful pattern‑scanning and processing language
Typical script structure: BEGIN{...} { ... } END{...}.
    echo -e "line1\nline2" | awk 'BEGIN{print "start"} {print} END{print "end"}'
    awk '{print $2, $3}' file           # print the 2nd and 3rd fields
    awk 'END{print NR}' file            # total number of lines
    awk 'NR<5' file                     # first four lines
    awk -F: '{print $NF}' /etc/passwd   # last field of each line
Common built‑in functions include index(), sub(), match(), and length(). Formatting can be done with printf.
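A hedged sketch of the BEGIN / main / END structure combined with sub(), length(), index(), and printf follows. The colon-separated input is made up and only loosely resembles /etc/passwd.

```shell
result=$(printf 'alice:x:1000\nbob:x:1001\n' | awk -F: '
    BEGIN { printf "%-8s %s\n", "name", "len" }   # header, printed once
    {
        name = $1
        sub(/^a/, "A", name)                      # replace the first match in name
        printf "%-8s %d\n", name, length($1)      # left-aligned name, then its length
        if (index($3, "1001") > 0) hits++         # index returns 0 when not found
    }
    END { printf "rows=%d hits=%d\n", NR, hits }')

echo "$result"
```

sub() modifies its third argument in place and replaces only the first match; gsub() would replace every match.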
12. Iterating over file content
Loop through lines, words, or characters using while read, awk, or Bash parameter expansion.
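As a self-contained sketch of line and word iteration, the loop below counts both in a temporary file. The file contents are invented, and IFS= together with read -r is a defensive choice rather than something the loop strictly requires.

```shell
# Illustrative input file.
input=$(mktemp)
printf 'one two\nthree\n' > "$input"

lines=0
words=0
# IFS= and -r keep leading whitespace and backslashes in each line intact.
while IFS= read -r line; do
    lines=$((lines + 1))
    # Unquoted $line is split on whitespace, giving one word per iteration.
    for word in $line; do
        words=$((words + 1))
    done
done < "$input"

echo "lines=$lines words=$words"
rm -f "$input"
```

Redirecting the file into the loop, rather than piping cat into it, keeps the loop in the current shell so the counters are still set afterwards.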
    while IFS= read -r line; do echo "$line"; done < file.txt   # each line of a file
    for word in $line; do echo "$word"; done                    # each word of $line
    for ((i=0;i<${#word};i++)); do echo "${word:i:1}"; done     # each character of $word (Bash)
Note: {} in find -exec is replaced by the current file name for each match.
Open Source Linux
Focused on sharing Linux/Unix content, covering fundamentals, system development, network programming, automation/operations, cloud computing, and related professional knowledge.