Master Essential Linux Shell Tools for Text Processing and Automation

This guide introduces the most commonly used Linux shell utilities—find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—explaining their core options and providing practical examples to help you efficiently manipulate and process text files from the command line.

Open Source Linux

Sep 8, 2021

Working in the Linux shell is a fundamental skill for interacting with the operating system; mastering its text‑processing utilities greatly improves productivity and deepens your understanding of how Linux works.

1. find – file search

Search for files by name, type, time, size, permissions, or owner, and run actions on the matches with -delete or -exec.

find . \( -name "*.txt" -o -name "*.pdf" \) -print

find . -regex ".*\(\.txt\|\.pdf\)$"   # GNU find: -regex matches the whole path

find . ! -name "*.txt" -print

find . -maxdepth 1 -type f

find . -type d -print   # list directories only

find . -atime -7 -type f -print   # accessed within the last 7 days (-atime 7 means exactly 7 days ago)

find . -type f -size +2k

find . -type f -perm 644 -print

find . -type f -user weber -print

find . -type f -name "*.swp" -delete

find . -type f -user root -exec chown weber {} \;

find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;

find . -type f -exec ./commands.sh {} \;   # run a script once per matching file

Use -print0 to separate filenames with a null character, handling spaces safely.
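The -print0 option is usually paired with xargs -0 on the receiving side. The sketch below shows that pairing on a scratch directory; the directory and file names are hypothetical and exist only for the demonstration.

```shell
# Hypothetical scratch directory and file names, created only for this demo.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt" "$demo_dir/notes.md"

# -print0 ends each name with a NUL byte and xargs -0 splits on NUL,
# so "with space.txt" reaches ls as a single argument.
txt_count=$(find "$demo_dir" -type f -name "*.txt" -print0 | xargs -0 ls | wc -l)
echo "$txt_count"   # 2

rm -rf "$demo_dir"
```

Without -print0 and -0, xargs would split "with space.txt" into two arguments and ls would report two missing files.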

2. grep – text search

Search files for matching patterns with a rich set of options.

grep -c "text" filename          # count matching lines

grep -n "pattern" file          # show line numbers

grep -i "pattern" file          # case‑insensitive

grep -l "pattern" *            # list matching file names

grep -Rn "class" .             # recursive search with line numbers

grep -e "class" -e "virtual" file

grep "test" file* -lZ | xargs -0 rm   # delete files whose contents contain "test"
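To see how these options differ, here is a small sketch run against a temporary file. The file contents are invented for illustration.

```shell
# Hypothetical sample file with four lines.
demo_file=$(mktemp)
printf '%s\n' 'class Foo' 'virtual void run()' 'int x' 'Class Bar' > "$demo_file"

match_lines=$(grep -c "class" "$demo_file")              # lines containing "class"
ci_lines=$(grep -ci "class" "$demo_file")                # same, ignoring case
numbered=$(grep -n "virtual" "$demo_file")               # line number, colon, line text
either=$(grep -c -e "class" -e "virtual" "$demo_file")   # lines matching either pattern

echo "$match_lines $ci_lines $either"   # 1 2 2
echo "$numbered"                        # 2:virtual void run()

rm -f "$demo_file"
```

Note that -c reports the number of matching lines, not the number of individual matches within them.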

3. xargs – build command lines from input

Convert input streams into arguments for other commands.

cat file.txt | xargs

cat single.txt | xargs -n 3

Key options: -d to set the input delimiter (GNU xargs; by default input is split on blanks and newlines), -n to limit the number of arguments per command invocation, -I {} to substitute each input item for a placeholder, and -0 to split input on null characters.

cat file.txt | xargs -I {} ./command.sh -p {} -1

find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l
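The effect of -n and -I is easiest to see with echo as the target command. This is a minimal sketch with made-up input words.

```shell
# -n 2 runs echo once for every two arguments, so six words become three lines.
grouped=$(printf '%s\n' a b c d e f | xargs -n 2 echo)
echo "$grouped"

# -I {} runs the command once per input line and substitutes the whole line
# wherever {} appears.
labelled=$(printf '%s\n' one two | xargs -I {} echo "item: {}")
echo "$labelled"
```

The first command prints "a b", "c d", and "e f" on separate lines; the second prints "item: one" and "item: two".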

4. sort – ordering lines

Sort text files numerically, alphabetically, in reverse, or by specific fields.

sort -nrk 1 data.txt

sort -bd data   # -b ignores leading blanks, -d sorts in dictionary order
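Sorting by a specific field combines -t (field separator) with -k (key). The sketch below uses hypothetical name:score records to contrast numeric and textual comparison of the same key.

```shell
# Hypothetical name:score records.
records=$(printf '%s\n' 'carol:7' 'alice:30' 'bob:9')

# -t: sets the field separator; -k2,2 restricts the key to field 2;
# the n and r modifiers compare numerically and reverse the order.
by_score=$(printf '%s\n' "$records" | sort -t: -k2,2nr)
echo "$by_score"

# Without n the key is compared as text, so "30" sorts before "7" and "9".
as_text=$(printf '%s\n' "$records" | sort -t: -k2,2)
echo "$as_text"
```

The numeric sort yields alice, bob, carol; the textual sort yields alice, carol, bob.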

5. uniq – remove duplicate lines

Filter adjacent duplicate lines and optionally count occurrences.

sort unsort.txt | uniq

sort unsort.txt | uniq -c   # count each unique line

sort unsort.txt | uniq -d   # show only duplicated lines
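Because uniq only compares neighbouring lines, it is almost always fed from sort. A common pattern, sketched here on an invented word list, is a frequency ranking.

```shell
words=$(printf '%s\n' apple pear apple fig pear apple)

# uniq collapses adjacent duplicates only, hence the sort first.
# The second sort orders the counts from highest to lowest, and awk
# normalises the padded "count word" output of uniq -c.
ranking=$(printf '%s\n' "$words" | sort | uniq -c | sort -nr | awk '{print $2 "=" $1}')
echo "$ranking"

# The same input without sorting keeps non-adjacent repeats.
adjacent_only=$(printf '%s\n' a b a | uniq | wc -l)         # 3
after_sort=$(printf '%s\n' a b a | sort | uniq | wc -l)     # 2
echo "$adjacent_only $after_sort"
```

The ranking prints apple=3, pear=2, and fig=1 on separate lines.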

6. tr – translate or delete characters

Common uses include character substitution, deletion, complement, and squeezing repeats.

echo 12345 | tr '0-9' '9876543210'

cat text | tr '\t' ' '

cat file | tr -d '0-9'

cat file | tr -c '0-9\n' ' '   # replace every character that is not a digit or newline with a space

cat file | tr -d -c '0-9 \n'   # delete everything except digits, spaces, and newlines

cat file | tr -s ' '

Character classes such as [:lower:], [:upper:], [:digit:], etc., can be used: tr '[:lower:]' '[:upper:]'.
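The following sketch applies character classes together with -d, -c, and -s to short invented strings, so the effect of each option can be checked by eye.

```shell
# Translate: map every lowercase letter to its uppercase counterpart.
upper=$(echo "hello world" | tr '[:lower:]' '[:upper:]')
echo "$upper"          # HELLO WORLD

# Delete the complement: keep only digits (and the trailing newline).
digits_only=$(echo "tel: 555-0100" | tr -d -c '[:digit:]\n')
echo "$digits_only"    # 5550100

# Squeeze: collapse each run of spaces into a single space.
squeezed=$(echo "a    b   c" | tr -s ' ')
echo "$squeezed"       # a b c
```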

7. cut – extract columns

Extract specific fields or characters from each line.

cut -f2,4 filename

cut -f3 --complement filename   # all but the 3rd column

cut -d ";" -f2 filename

cut -c1-5 file   # first five characters
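As a quick illustration, the sketch below cuts fields out of a single colon-separated line in the style of /etc/passwd. The line itself is a made-up sample; cut splits on tabs unless -d says otherwise.

```shell
# Hypothetical passwd-style record.
line='root:x:0:0:root:/root:/bin/bash'

user=$(echo "$line" | cut -d: -f1)         # field 1
ids=$(echo "$line" | cut -d: -f3,4)        # fields 3 and 4, delimiter preserved
shell_path=$(echo "$line" | cut -d: -f7)   # field 7
prefix=$(echo "$line" | cut -c1-4)         # first four characters

echo "$user $ids $shell_path $prefix"      # root 0:0 /bin/bash root
```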

8. paste – merge columns

Combine files line‑by‑line, using a tab or a custom delimiter.

paste file1 file2

paste -d "," file1 file2
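A minimal sketch of the line-by-line merge, using two hypothetical files created in a scratch directory:

```shell
# Hypothetical input files, one value per line.
demo_dir=$(mktemp -d)
printf '%s\n' alice bob > "$demo_dir/names.txt"
printf '%s\n' 30 25 > "$demo_dir/ages.txt"

# Line N of the first file is joined to line N of the second with a comma.
joined=$(paste -d "," "$demo_dir/names.txt" "$demo_dir/ages.txt")
echo "$joined"

rm -rf "$demo_dir"
```

This prints "alice,30" and "bob,25" on separate lines.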

9. wc – word, line, and byte count

Quickly obtain file statistics.

wc -l file   # lines

wc -w file   # words

wc -c file   # bytes
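When the count feeds a script, reading from standard input keeps the file name out of the output. The sketch below assumes a small temporary file created for the purpose.

```shell
# Hypothetical two-line sample file (14 bytes including newlines).
sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"

# With input redirection wc prints only the number, not the file name.
lines=$(wc -l < "$sample")
words=$(wc -w < "$sample")
bytes=$(wc -c < "$sample")
echo "$lines $words $bytes"

rm -f "$sample"
```

The three counts are 2 lines, 3 words, and 14 bytes; some wc implementations pad the numbers with leading spaces.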

10. sed – stream editor for substitutions

Perform in‑place or output‑only text transformations.

sed 's/text/replace_text/' file          # replace the first occurrence on each line

sed 's/text/replace_text/g' file        # replace all occurrences

sed -i 's/text/replace_text/g' file      # edit file in place

sed '/^$/d' file                        # delete empty lines

echo "this is an example" | sed 's/\w\+/[&]/g'   # wrap each word in brackets (GNU sed; & is the matched text)

Wrapping the sed expression in double quotes instead of single quotes lets the shell expand variables inside it.
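The quoting difference can be checked directly. In this sketch the variable names old and new are illustrative only.

```shell
# Illustrative variables holding the search and replacement strings.
old="text"
new="replacement"

# Double quotes: the shell expands $old and $new before sed sees the expression.
result=$(echo "some text here" | sed "s/$old/$new/")
echo "$result"      # some replacement here

# Single quotes: sed receives the literal characters $old, which do not
# occur in the input, so the line passes through unchanged.
unchanged=$(echo "some text here" | sed 's/$old/$new/')
echo "$unchanged"   # some text here
```

If a variable may contain a slash, a different delimiter such as s|old|new| avoids breaking the expression.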

11. awk – powerful pattern‑scanning and processing language

Typical script structure: BEGIN{...} { ... } END{...}.

echo -e "line1\nline2" | awk 'BEGIN{print "start"} {print} END{print "end"}'

awk '{print $2, $3}' file

awk 'END{print NR}' file   # total number of lines

awk 'NR<5' file           # first four lines

awk -F: '{print $NF}' /etc/passwd

Common built‑in functions include index(), sub(), match(), and length(). Formatting can be done with printf.
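As a rough sketch of how printf and the built-in functions fit into the BEGIN/main/END structure, the following processes two hypothetical name:score records.

```shell
# Hypothetical name:score records, split on ":" with -F.
report=$(printf '%s\n' 'alice:30' 'bob:9' | awk -F: '
    {
        total += $2
        # length() returns the string length; index() returns the 1-based
        # position of a substring, or 0 when it is absent.
        printf "%s len=%d pos=%d score=%03d\n", $1, length($1), index($1, "b"), $2
    }
    END { printf "total=%d\n", total }
')
echo "$report"
```

The output is "alice len=5 pos=0 score=030", "bob len=3 pos=1 score=009", and "total=39". Unlike print, printf adds no newline on its own, so the format string must include \n.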

12. Iterating over file content

Loop through lines, words, or characters using while read, awk, or Bash parameter expansion.

while IFS= read -r line; do echo "$line"; done < file.txt   # IFS= and -r keep leading blanks and backslashes intact

for word in $line; do echo "$word"; done   # $line is left unquoted so it splits into words

for ((i=0;i<${#word};i++)); do echo "${word:i:1}"; done   # Bash only: one character per iteration
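Put together, the line and word loops nest naturally. The sketch below counts lines, words, and characters in a hypothetical two-line file; it uses ${#word} for the word length instead of a per-character loop so that it also runs in a plain POSIX shell.

```shell
# Hypothetical input: two lines, three words, six non-blank characters.
demo_file=$(mktemp)
printf '%s\n' 'ab cd' '  ef' > "$demo_file"

line_count=0
word_count=0
char_count=0
while IFS= read -r line; do
    line_count=$((line_count + 1))
    for word in $line; do                       # unquoted on purpose: split into words
        word_count=$((word_count + 1))
        char_count=$((char_count + ${#word}))   # ${#word} is the length of the word
    done
done < "$demo_file"
echo "$line_count $word_count $char_count"      # 2 3 6

rm -f "$demo_file"
```

Because the loop reads from a redirection rather than a pipe, the counters keep their values after the loop ends.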

Note: {} in find -exec is replaced by the current file name for each match.
