
Master Linux Text Processing: Essential find, grep, sort, awk, and More

This guide walks through the most common Linux shell tools for text processing—including find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—explaining their core options, practical examples, and how to combine them for powerful command‑line workflows.

Programmer DD

Common Linux Shell Text Processing Tools

Tools covered: find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, awk.

01 find – File Search

Search for files by name, pattern, depth, type, time, size, permission, or owner.

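find . \( -name "*.txt" -o -name "*.pdf" \) -print   # .txt or .pdf files (parentheses must be escaped from the shell)
find . -regex ".*\(\.txt\|\.pdf\)$"                  # same match by regex (GNU find's default Emacs-style syntax)
find . ! -name "*.txt" -print                        # everything except .txt files
find . -maxdepth 1 -type f                           # regular files in the current directory only
find . -type d -print                                # directories only
find . -atime 7 -type f -print                       # last accessed 7 days ago (+7 for longer ago, -7 for more recently)
find . -type f -size +2k                             # files larger than 2 KiB
find . -type f -perm 644 -print                      # permissions exactly 644
find . -type f -user weber -print                    # files owned by user weber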

Delete matching files:

find . -type f -name "*.swp" -delete

Run a command on each match with -exec, where {} is replaced by the matched file name:

find . -type f -user root -exec chown weber {} \;
find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;

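Use -print0 to terminate each result with a NUL byte instead of a newline, and pair it with xargs -0, so that filenames containing spaces or newlines are passed through intact.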
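The following is a minimal sketch of the -print0 and xargs -0 pairing, run in a throwaway directory. The file names are made up for illustration, and it assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do).

```shell
# Work in a temporary directory so nothing real is touched.
demo_dir=$(mktemp -d)
touch "$demo_dir/plain.swp" "$demo_dir/with space.swp" "$demo_dir/keep.txt"

# -print0 separates names with NUL bytes; xargs -0 splits on the same byte,
# so "with space.swp" reaches rm as one argument instead of two.
find "$demo_dir" -type f -name "*.swp" -print0 | xargs -0 rm

# Only keep.txt should remain.
remaining=$(ls "$demo_dir")
echo "$remaining"
```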

02 grep – Text Search

Basic pattern matching and useful options.

grep pattern file
grep -c "text" filename

Recursive search with line numbers:

grep -Rn "class" .

Match multiple patterns:

grep -e "class" -e "virtual" file

Combine with xargs to delete the files that contain a match (-l prints file names only, -Z terminates each name with a NUL byte):

grep "test" file* -lZ | xargs -0 rm
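As a small illustration of the options above, the sketch below builds two sample files (hypothetical names and contents) and captures what -c, repeated -e, and -l report.

```shell
demo_dir=$(mktemp -d)
printf 'class Foo\nvirtual void run();\nint x;\n' > "$demo_dir/a.h"
printf 'int y;\n' > "$demo_dir/b.h"

# -c counts matching lines instead of printing them.
class_count=$(grep -c "class" "$demo_dir/a.h")

# -e may be repeated; a line matches if any of the patterns matches.
either_count=$(grep -c -e "class" -e "virtual" "$demo_dir/a.h")

# -l prints only the names of files that contain a match.
matching_files=$(cd "$demo_dir" && grep -l "virtual" *.h)

echo "$class_count $either_count $matching_files"
```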

xargs converts standard input into command-line arguments, which makes it useful alongside commands such as grep and find.

cat file.txt | xargs          # join multi-line input into a single line
cat single.txt | xargs -n 3   # split single-line input into lines of three arguments
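To make the two behaviours concrete, here is a short sketch that feeds xargs from printf instead of files. When no command is given, xargs runs echo on its arguments.

```shell
# With no command, xargs runs echo, so multi-line input is joined into one line.
joined=$(printf '1\n2\n3\n4\n5\n' | xargs)

# -n 3 passes at most three arguments per echo call, one output line per call.
grouped=$(printf '1 2 3 4 5\n' | xargs -n 3)

echo "$joined"
echo "$grouped"
```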

03 sort – Sorting

Numeric vs. dictionary order, reverse, column selection.

sort -nrk 1 data.txt   # numeric order, reversed, keyed on column 1
sort -bd data          # ignore leading blanks, dictionary order
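The difference between numeric and text ordering is easiest to see on a tiny data set. The sketch below uses made-up rows and forces the C locale for the text sort so the byte-wise ordering is predictable.

```shell
data=$(printf '10 pear\n9 apple\n100 fig\n')

# -n compares the key as a number, -r reverses, -k 1 sorts on the first field.
numeric_desc=$(printf '%s\n' "$data" | sort -nrk 1)

# Without -n the lines are compared as text, so "100" sorts before "9".
text_order=$(printf '%s\n' "$data" | LC_ALL=C sort)

echo "$numeric_desc"
echo "$text_order"
```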

04 uniq – Remove Duplicate Lines

Basic deduplication and counting.

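sort unsort.txt | uniq      # remove duplicate lines
sort unsort.txt | uniq -c   # prefix each line with its number of occurrences
sort unsort.txt | uniq -d   # print only the lines that are duplicated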

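Restrict the comparison with -s N, which skips the first N characters of each line, and -w N, which compares at most N characters (-w is not part of POSIX). Because uniq only compares adjacent lines, sort the input first.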
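A brief sketch of counting, duplicate detection, and the -s and -w comparisons follows. The sample lines are invented, and the -w example assumes GNU uniq.

```shell
# uniq only compares adjacent lines, so the input is sorted first.
counts=$(printf 'b\na\nb\nb\n' | sort | uniq -c | awk '{print $2 "=" $1}')
dupes=$(printf 'b\na\nb\nb\n' | sort | uniq -d)

# -s 3 skips the first three characters ("01 ", "02 ", ...) before comparing.
by_suffix=$(printf '01 apple\n02 apple\n03 pear\n' | uniq -s 3)

# -w 5 compares only the first five characters (GNU uniq; not in POSIX).
by_prefix=$(printf 'apple1\napple2\npear1\n' | uniq -w 5)

printf '%s\n' "$counts" "$dupes" "$by_suffix" "$by_prefix"
```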

05 tr – Translate Characters

General translation, deletion, complement, and squeezing.

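echo 12345 | tr '0-9' '9876543210'       # map each digit to its counterpart: prints 87654
cat text | tr '\t' ' '                   # tabs to spaces
cat file | tr -d '0-9'                   # delete all digits
cat file | tr -c '0-9\n' ' '             # replace everything except digits and newlines with a space
cat file | tr -d -c '0-9 \n'             # delete everything except digits, spaces, and newlines
cat file | tr -s ' '                     # squeeze runs of spaces into a single space
cat file | tr '[:lower:]' '[:upper:]'    # lowercase to uppercase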
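The sketch below runs three of these translations on literal strings so the effect of each flag is visible. The input strings are illustrative only.

```shell
# Character classes translate lowercase letters to uppercase.
upper=$(echo "hello world" | tr '[:lower:]' '[:upper:]')

# -c complements the set and -d deletes: keep only digits and the newline.
digits_only=$(echo "tel: 555-0100" | tr -d -c '0-9\n')

# -s squeezes each run of spaces into a single space.
squeezed=$(echo "a   b    c" | tr -s ' ')

printf '%s\n' "$upper" "$digits_only" "$squeezed"
```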

06 cut – Column Extraction

Extract specific fields, complement, set delimiters, and select ranges.

cut -f2,4 filename
cut -f3 --complement filename
cut -d';' -f2 filename
cut -c1-5 file
cut -c-2 file
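As a quick illustration, the sketch below applies field and character selection to one semicolon-delimited record. The record itself is a made-up example.

```shell
row='alice;30;paris;engineer'

# -d sets the delimiter, -f picks fields by number.
second=$(echo "$row" | cut -d';' -f2)
fields_2_4=$(echo "$row" | cut -d';' -f2,4)

# -c selects by character position; an open range such as -2 means "up to 2".
first_five=$(echo "$row" | cut -c1-5)
first_two=$(echo "$row" | cut -c-2)

printf '%s\n' "$second" "$fields_2_4" "$first_five" "$first_two"
```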

07 paste – Merge Columns

Combine two files side‑by‑side; default delimiter is a tab, can be changed.

paste file1 file2
paste -d ',' file1 file2
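A minimal sketch of both delimiters, using two temporary files whose names and contents are invented for the example:

```shell
demo_dir=$(mktemp -d)
printf 'alice\nbob\n' > "$demo_dir/names"
printf '30\n25\n' > "$demo_dir/ages"

# Default: corresponding lines are joined with a tab.
tabbed=$(paste "$demo_dir/names" "$demo_dir/ages")

# -d replaces the tab with the given delimiter.
comma=$(paste -d ',' "$demo_dir/names" "$demo_dir/ages")

echo "$tabbed"
echo "$comma"
```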

08 wc – Count Lines, Words, Bytes

wc -l file
wc -w file
wc -c file
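The three counters can be checked against a file with known contents. In this sketch the input is redirected into wc so that the file name is left out of the output, which makes the numbers easier to capture in a script.

```shell
sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"

# Reading from stdin keeps the file name out of the output.
lines=$(wc -l < "$sample")
words=$(wc -w < "$sample")
bytes=$(wc -c < "$sample")

echo "$lines $words $bytes"
```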

09 sed – Stream Editing

Single substitution, global substitution, in‑place edit, delete empty lines, capture groups, variable expansion.

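sed 's/text/replace_text/' file      # replace the first match on each line
sed 's/text/replace_text/g' file     # replace every match on each line
sed -i 's/text/replace_text/g' file  # same, but edit the file in place
sed '/^$/d' file                     # delete empty lines
sed 's/^.\{3\}/[&]/' file            # & stands for the matched text: wrap the first three characters in brackets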
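The sketch below runs the same kinds of substitution on literal input so the results can be inspected. It assumes basic regular expressions, where interval braces are written \{3\}, and uses [&] to show what the matched text expands to.

```shell
# Without g only the first match on the line is replaced.
first_only=$(echo "a a a" | sed 's/a/b/')
every=$(echo "a a a" | sed 's/a/b/g')

# /^$/d deletes lines that contain nothing at all.
no_blank=$(printf 'x\n\ny\n' | sed '/^$/d')

# & in the replacement is the text that the pattern matched.
bracketed=$(echo "hello" | sed 's/^.\{3\}/[&]/')

printf '%s\n' "$first_only" "$every" "$no_blank" "$bracketed"
```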

Use double quotes for variable expansion:

p=pattern; r=replace; echo "line with $p" | sed "s/$p/$r/g"
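To show why the quoting matters, this sketch runs the same substitution with double and with single quotes. In the single-quoted form the shell leaves $p and $r alone, so sed searches for the literal text "$p" and finds nothing.

```shell
p=pattern; r=replace

# Double quotes: the shell expands $p and $r before sed sees the script.
expanded=$(echo "line with $p" | sed "s/$p/$r/g")

# Single quotes: sed receives the literal characters $p and $r, so nothing matches.
literal=$(echo "line with pattern" | sed 's/$p/$r/g')

echo "$expanded"
echo "$literal"
```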

10 awk – Data‑Stream Processing

Program structure, built‑in variables (NR, NF, $0, $1…), pattern actions, printing, loops, functions, external variable passing, and file/command input.

awk 'BEGIN{print "start"} {print} END{print "End"}'
awk '{print NR ":" $0 "-" $1 "-" $2}'
awk '{print $2, $3}' file
awk 'END{print NR}' file
awk 'BEGIN{sum=0} {sum+=$1} END{print sum}' file
var=1000; awk -v v=$var 'BEGIN{print v}'
awk 'NR<5' file
awk '/linux/' file
awk -F: '{print $NF}' /etc/passwd
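awk 'BEGIN{"grep root /etc/passwd" | getline cmdout; print cmdout}'   # read one line of command output
awk 'BEGIN{for(i=0;i<10;i++){print i}}'                               # loops must sit inside an action block
awk '{array[NR]=$0} END{for(i in array){print array[i]}}' file        # for-in visits array keys in unspecified order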
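As a compact illustration of the built-in variables, an END block, and -v, the sketch below processes two made-up records. The variable name min and the threshold value are hypothetical.

```shell
data=$(printf 'alice 30\nbob 25\n')

# NR is the current record number, NF the number of fields in that record.
numbered=$(printf '%s\n' "$data" | awk '{print NR ":" $1 " (" NF " fields)"}')

# Accumulate column 2 on every line, then print the total once input ends.
total=$(printf '%s\n' "$data" | awk '{sum+=$2} END{print sum}')

# -v passes a shell value into awk as the variable min.
threshold=26
older=$(printf '%s\n' "$data" | awk -v min="$threshold" '$2 > min {print $1}')

printf '%s\n' "$numbered" "$total" "$older"
```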

Implement head and tail with awk:

awk 'NR<=10{print}' filename   # head
awk '{buffer[NR%10]=$0} END{for(i=NR-9;i<=NR;i++) if(i>0) print buffer[i%10]}' filename   # tail: replay the ring buffer in input order
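The same head and tail idea can be parameterised. This sketch uses a hypothetical variable n for the line count and replays the ring buffer from the oldest retained line, so the output stays in input order even when the line count is not a multiple of n.

```shell
input=$(printf '%s\n' 1 2 3 4 5 6 7)

# head: print while the record number is at most n.
first_three=$(printf '%s\n' "$input" | awk -v n=3 'NR<=n')

# tail: keep the last n lines in a ring buffer indexed by NR % n,
# then print records NR-n+1 .. NR in order.
last_three=$(printf '%s\n' "$input" |
  awk -v n=3 '{buf[NR%n]=$0} END{for(i=NR-n+1;i<=NR;i++) if(i>0) print buf[i%n]}')

echo "$first_three"
echo "$last_three"
```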

Print specific columns or ranges:

ls -lrt | awk '{print $6}'
awk '{print $2, $3}' file
awk 'NR==4,NR==6{print}' file
awk '/start_pattern/,/end_pattern/' filename
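Range patterns select everything from the first condition through the second, inclusive. The sketch below shows a line-number range and a pattern range picking out the same block of an invented five-line input.

```shell
log=$(printf 'a\nBEGIN\nb\nEND\nc\n')

# Lines 2 through 4, selected by record number.
by_number=$(printf '%s\n' "$log" | awk 'NR==2,NR==4')

# From the first line matching /BEGIN/ through the next line matching /END/.
by_pattern=$(printf '%s\n' "$log" | awk '/BEGIN/,/END/')

echo "$by_number"
echo "$by_pattern"
```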

Common built‑in functions: index, sub, match, length, printf.

awk 'BEGIN{"grep root /etc/passwd" | getline cmdout; print length(cmdout)}'
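A short sketch of four of these on a sample string follows. The text is made up, and index positions count from 1.

```shell
text="hello linux world"

# length returns the number of characters in its argument.
len=$(echo "$text" | awk '{print length($0)}')

# index returns the 1-based position of the substring, or 0 if absent.
pos=$(echo "$text" | awk '{print index($0, "linux")}')

# sub replaces the first match in $0 in place.
replaced=$(echo "$text" | awk '{sub(/linux/, "awk"); print}')

# printf formats output; it does not add a newline on its own.
formatted=$(echo "3.14159" | awk '{printf "%.2f\n", $1}')

printf '%s\n' "$len" "$pos" "$replaced" "$formatted"
```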
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
