
Master Essential Linux Shell Tools for Text Processing and Automation

This guide introduces the most commonly used Linux shell utilities—find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—explaining their core options and providing practical examples to help you efficiently manipulate and process text files from the command line.

Open Source Linux

Working in the Linux shell is a fundamental skill for interacting with the operating system; mastering its text‑processing utilities improves productivity and deepens your understanding of how Linux works.

1. find – file search

Search for files by name, type, time, size, permissions, or owner, and run actions such as -delete or -exec on the matches.

find . \( -name "*.txt" -o -name "*.pdf" \) -print
find . -regex ".*\(\.txt\|\.pdf\)$"   # GNU find default regex syntax: escape the group, the alternation, and the dots
find . ! -name "*.txt" -print
find . -maxdepth 1 -type f
find . -type d -print   # list directories only
find . -atime -7 -type f -print   # accessed within the last 7 days (-atime 7 means exactly 7 days ago)
find . -type f -size +2k
find . -type f -perm 644 -print
find . -type f -user weber -print
find . -type f -name "*.swp" -delete
find . -type f -user root -exec chown weber {} \;
find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;
find . -type f -exec ./commands.sh {} \;   # run a script once per matching file

Use -print0 to separate file names with a null character so that names containing spaces or newlines are handled safely; pair it with xargs -0 on the receiving side.
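
As a minimal sketch of why the null separator matters, the following assumes GNU find and xargs and uses a scratch directory with made-up file names, one of which contains a space. With plain -print, xargs would split "my notes.txt" into two arguments and cat would fail.

```shell
# Scratch directory with hypothetical file names, one containing a space.
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/my notes.txt"
printf 'c\n' > "$workdir/plain.txt"

# NUL-separated names survive the pipe intact; xargs -0 splits on NUL only.
total_lines=$(find "$workdir" -type f -name "*.txt" -print0 | xargs -0 cat | wc -l)
echo "total lines: $total_lines"

rm -rf "$workdir"
```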

2. grep – text search

Search files for matching patterns with a rich set of options.

grep -c "text" filename          # count matching lines
grep -n "pattern" file          # show line numbers
grep -i "pattern" file          # case‑insensitive
grep -l "pattern" *            # list matching file names
grep -R -n "class" .           # recursive search with line numbers
grep -e "class" -e "virtual" file
grep -lZ "test" file* | xargs -0 rm   # delete files whose contents match "test"
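
The options above can be tried safely on throwaway input. This is a small sketch, assuming two hypothetical files (a.h and b.c) created in a scratch directory.

```shell
workdir=$(mktemp -d)
printf 'class Foo\nvirtual void run()\nclass Bar\n' > "$workdir/a.h"
printf 'int main() { return 0; }\n' > "$workdir/b.c"

match_count=$(grep -c "class" "$workdir/a.h")          # number of matching lines
first_hit=$(grep -n "virtual" "$workdir/a.h")          # prefixed with the line number
matching_files=$(cd "$workdir" && grep -l "class" *)   # names of files that match

echo "$match_count"
echo "$first_hit"
echo "$matching_files"
rm -rf "$workdir"
```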

3. xargs – build command lines from input

Convert input streams into arguments for other commands.

cat file.txt | xargs
cat single.txt | xargs -n 3

Key options: -d sets the input delimiter (by default xargs splits on blanks and newlines), -n limits the number of arguments passed to each command invocation, -I {} substitutes each input item for the placeholder {}, and -0 expects null‑delimited input.

cat file.txt | xargs -I {} ./command.sh -p {} -1
find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l
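
A quick way to see what -n and -I do is to let xargs run echo. The sketch below uses inline sample input; the "backup-" prefix is only an illustration.

```shell
# -n 2 runs echo once per two arguments, so six words become three lines.
grouped=$(printf 'a b c\nd e f\n' | xargs -n 2 echo)

# -I {} runs the command once per input line and substitutes the line for {}.
renamed=$(printf 'one\ntwo\n' | xargs -I {} echo "backup-{}")

echo "$grouped"
echo "$renamed"
```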

4. sort – ordering lines

Sort text files numerically, alphabetically, in reverse, or by specific fields.

sort -nrk 1 data.txt   # numeric, reverse order, keyed on field 1
sort -bd data   # ignore leading blanks, dictionary order
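
Sorting by a specific field usually also needs a field separator. The following sketch assumes hypothetical colon-separated name:score records and uses the standard -t and -k options.

```shell
# Hypothetical data: name and score separated by a colon.
data=$(printf 'alice:30\nbob:5\ncarol:100\n')

# -t sets the field separator, -k2,2 restricts the key to field 2,
# n compares numerically and r reverses the order.
by_score=$(printf '%s\n' "$data" | sort -t: -k2,2nr)
echo "$by_score"
```

Without the n flag the scores would be compared as text, and "5" would sort after "100".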

5. uniq – remove duplicate lines

Filter adjacent duplicate lines and optionally count occurrences.

sort unsort.txt | uniq
sort unsort.txt | uniq -c   # count each unique line
sort unsort.txt | uniq -d   # show only duplicated lines
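
Because uniq only compares adjacent lines, the sort step is what makes these pipelines work. A small sketch with inline sample words (not from any real file) shows the difference and a common frequency-count idiom.

```shell
words=$(printf 'pear\napple\npear\nfig\napple\npear\n')

# No two equal lines are adjacent, so uniq alone removes nothing.
unsorted=$(printf '%s\n' "$words" | uniq | wc -l)
# After sorting, duplicates sit next to each other and collapse.
distinct=$(printf '%s\n' "$words" | sort | uniq | wc -l)
# Count each value, order by count, and keep the most frequent one.
top=$(printf '%s\n' "$words" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2}')

echo "$unsorted $distinct $top"
```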

6. tr – translate or delete characters

Common uses include character substitution, deletion, complement, and squeezing repeats.

echo 12345 | tr '0-9' '9876543210'
cat text | tr '\t' ' '
cat file | tr -d '0-9'
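cat file | tr -c '0-9\n' ' '   # replace everything except digits and newlines with a space
cat file | tr -d -c '0-9 \n'   # delete everything except digits, spaces, and newlines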
cat file | tr -s ' '

Character classes such as [:lower:], [:upper:], [:digit:], etc., can be used: tr '[:lower:]' '[:upper:]'.
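
The character classes combine with -d, -c, and -s in the same way as explicit ranges. A brief sketch on inline sample strings:

```shell
upper=$(echo "hello world" | tr '[:lower:]' '[:upper:]')   # HELLO WORLD
digits=$(echo "tel: 555-0100" | tr -d -c '[:digit:]\n')    # keep only digits and the newline
squeezed=$(echo "a    b   c" | tr -s ' ')                  # collapse runs of spaces

echo "$upper"
echo "$digits"
echo "$squeezed"
```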

7. cut – extract columns

Extract specific fields or characters from each line.

cut -f2,4 filename
cut -f3 --complement filename   # all but the 3rd column
cut -d ";" -f2 filename
cut -c1-5 file   # first five characters
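
To make the field and character selections concrete, here is a sketch that cuts a single sample record in the style of /etc/passwd; the record itself is illustrative.

```shell
line='root:x:0:0:root:/root:/bin/bash'    # sample passwd-style record

user=$(echo "$line" | cut -d: -f1)          # first colon-separated field
login_shell=$(echo "$line" | cut -d: -f7)   # seventh field
first_four=$(echo "$line" | cut -c1-4)      # characters 1 through 4

echo "$user $login_shell $first_four"
```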

8. paste – merge columns

Combine files line‑by‑line, using a tab or a custom delimiter.

paste file1 file2
paste -d "," file1 file2
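
A minimal sketch of the merge, assuming two hypothetical single-column files (ids and names) created in a scratch directory:

```shell
workdir=$(mktemp -d)
printf '1\n2\n' > "$workdir/ids"
printf 'ann\nbob\n' > "$workdir/names"

# Line N of the first file is joined to line N of the second with a comma.
joined=$(paste -d "," "$workdir/ids" "$workdir/names")
echo "$joined"

rm -rf "$workdir"
```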

9. wc – word, line, and byte count

Quickly obtain file statistics.

wc -l file   # lines
wc -w file   # words
wc -c file   # bytes

10. sed – stream editor for substitutions

Perform in‑place or output‑only text transformations.

sed 's/text/replace_text/' file          # replace first occurrence on each line
sed 's/text/replace_text/g' file        # replace all occurrences
sed -i 's/text/replace_text/g' file      # edit file in place
sed '/^$/d' file                        # delete empty lines
echo "this is an example" | sed 's/\w\+/[&]/g'   # wrap each word in brackets (GNU sed; + must be escaped in basic regex)

Using double quotes allows variable expansion inside the expression.
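
The difference between the two quoting styles can be seen side by side. In this sketch the variable names old and new are illustrative; note that values containing / or & would need escaping before being placed in the expression.

```shell
old="text"
new="replace_text"

# Double quotes: the shell expands $old and $new before sed sees the script.
expanded=$(echo "some text here" | sed "s/$old/$new/")
# Single quotes: sed receives the literal string $old, which matches nothing here.
literal=$(echo "some text here" | sed 's/$old/$new/')

echo "$expanded"
echo "$literal"
```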

11. awk – powerful pattern‑scanning and processing language

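Typical script structure: BEGIN{...} { ... } END{...}. The BEGIN block runs once before any input is read, the middle block runs for each input line, and the END block runs once after the last line.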

echo -e "line1\nline2" | awk 'BEGIN{print "start"} {print} END{print "end"}'
awk '{print $2, $3}' file
awk 'END{print NR}' file   # total number of lines
awk 'NR<5' file           # first four lines
awk -F: '{print $NF}' /etc/passwd

Common built‑in functions include index(), sub(), match(), and length(). Formatting can be done with printf.
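
As a short sketch of these functions together with printf, the following formats two sample records; the data and the output layout are illustrative only.

```shell
report=$(printf 'alice 30\nbob 5\n' | awk '{
    name = $1
    sub(/^a/, "A", name)                          # replace a leading "a" in the copy
    printf "%s:%03d:%d\n", name, $2, length($1)   # name, zero-padded number, name length
}')
echo "$report"

# index() returns the 1-based position of the substring, or 0 if absent.
pos=$(awk 'BEGIN { print index("shell tools", "tools") }')
echo "$pos"
```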

12. Iterating over file content

Loop through lines, words, or characters using while read, awk, or Bash parameter expansion.

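while IFS= read -r line; do echo "$line"; done < file.txt   # IFS= and -r keep each line intact
for word in $line; do echo "$word"; done   # $line is left unquoted so it splits into words
for ((i=0;i<${#word};i++)); do echo "${word:i:1}"; done   # Bash only: one character per iteration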
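
The line and word loops can be combined into one pass. The sketch below assumes a scratch file with made-up content and counts characters with ${#word} instead of a per-character loop, so it does not depend on Bash-specific syntax.

```shell
workdir=$(mktemp -d)
printf '  two words\nback\\slash\n' > "$workdir/file.txt"

line_count=0
word_count=0
char_count=0
# IFS= preserves leading blanks; -r keeps backslashes literal.
while IFS= read -r line; do
    line_count=$((line_count + 1))
    if [ "$line_count" -eq 1 ]; then first_line=$line; fi
    for word in $line; do                       # unquoted on purpose: split on whitespace
        word_count=$((word_count + 1))
        char_count=$((char_count + ${#word}))   # ${#word} is the word length
    done
    last_line=$line
done < "$workdir/file.txt"

echo "$line_count lines, $word_count words, $char_count characters"
rm -rf "$workdir"
```
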
Note: {} in find -exec is replaced by the current file name for each match.