Master Essential Linux Shell Text Tools: Find, Grep, Awk, Sed & More
This guide introduces the most frequently used Linux shell utilities for text processing—including find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—explaining their core options, practical examples, and how to combine them for powerful command‑line workflows.
Shell scripting is a fundamental Linux skill. Its syntax is quirky and less readable than that of languages like Python, but mastering it reveals many aspects of the Linux system and remains essential for everyday scripting.
Key tools for text handling in Linux include find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk. The examples and parameters shown are the most common and practical. The author prefers one-line commands and advises using Python for more complex tasks.
1. find – File Search
Search for *.txt and *.pdf files:
find . \( -name "*.txt" -o -name "*.pdf" \) -print
Regex search (in GNU find's default regex syntax, alternation is written \|, and the pattern must match the whole path):
find . -regex ".*\(\.txt\|\.pdf\)$"
Case-insensitive regex:
find . -iregex ".*\.txt$"
Negate a pattern (exclude txt files):
find . ! -name "*.txt" -print
Limit search depth (depth = 1):
find . -maxdepth 1 -type f
Search by type:
find . -type d -print # directories only
find . -type f -print # regular files
Search by time (all three are measured in days):
atime – access time
mtime – modification time
ctime – metadata change time
find . -atime -7 -type f -print # accessed within the last 7 days (-7 = less than 7 days ago, +7 = more than 7, 7 = exactly 7)
Search by size (e.g., larger than 2 KB):
find . -type f -size +2k
Search by permission:
find . -type f -perm 644 -print
Search by owner:
find . -type f -user weber -print
Actions after finding:
Delete:
find . -type f -name "*.swp" -delete
Execute a command (the powerful -exec):
find . -type f -user root -exec chown weber {} \;
Note: {} is replaced by each matched file name.
find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;
Combine multiple commands by putting them in a script:
-exec ./commands.sh {} \;
Output delimiters:
Default: -print ends each file name with '\n'.
-print0 ends each file name with '\0', which handles spaces in file names safely.
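The -print0 delimiter matters most when a file name contains a space. The following is a minimal sketch, assuming find and xargs versions that support -print0 and -0 (GNU and BSD both do); the scratch directory and file names are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/with space.txt" "$demo/notes.pdf"

# With plain -print, xargs would split "with space.txt" on the space.
# -print0 paired with xargs -0 keeps each path intact: one echo per file.
txt_count=$(find "$demo" -type f -name "*.txt" -print0 | xargs -0 -n 1 echo | wc -l)

# -exec hands each path to the command as a single argument, so it is safe too.
find "$demo" -type f -name "*.txt" -exec wc -c {} \;

echo "txt files found: $txt_count"
rm -r "$demo"
```

The count reflects the two .txt files only; the .pdf file is filtered out by -name.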
2. grep – Text Search
Basic usage:
grep match_pattern file
Common options:
-o output only the matching part
-v invert the match (print non-matching lines)
-c count matching lines
-n show line numbers
-i ignore case
-l list only the names of matching files
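The options above are easiest to compare on a small file. This is an illustrative sketch only; the sample file and its contents are invented for the demonstration.

```shell
# Build a four-line sample file in a temporary location.
sample=$(mktemp)
printf 'class Foo\nvirtual void run()\nCLASS Bar\nint x\n' > "$sample"

grep -n "class" "$sample"                     # prints 1:class Foo

match_count=$(grep -ic "class" "$sample")     # lines matching, ignoring case
non_match=$(grep -vic "class" "$sample")      # lines that do NOT match
only=$(grep -o "virtual" "$sample")           # just the matched text

echo "matching=$match_count non-matching=$non_match only=$only"
rm "$sample"
```

Note that -c reports the number of matching lines, not the number of individual matches on those lines.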
Recursive search (a favorite for code):
grep "class" . -R -n
Match multiple patterns:
grep -e "class" -e "virtual" file
Use -Z together with -l to terminate the output file names with \0:
grep "test" file* -lZ | xargs -0 rm
3. xargs – Build Command Lines
xargs converts its input into command-line arguments, which makes it a natural partner for grep and find.
Convert multi-line input to a single line:
cat file.txt | xargs
Convert a single line to multiple lines (e.g., three arguments per line):
cat single.txt | xargs -n 3
Key options:
-d define the input delimiter (by default, input is split on blanks and newlines)
-n number of arguments per command line
-I {} replace the placeholder {} with each input item
-0 use \0 as the input delimiter
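A short sketch of -n and -I in action, using printf to stand in for real input; the numbers and file names are placeholders rather than anything from a real system.

```shell
# -n 3: at most three arguments per invocation of the command (echo by default).
grouped=$(printf '1 2 3 4 5 6 7\n' | xargs -n 3)
echo "$grouped"

# -I {}: run the command once per input line, substituting the line for {}.
labeled=$(printf 'a.txt\nb.txt\n' | xargs -I {} echo "copy of {}")
echo "$labeled"
```

With -n 3 the seven inputs are spread over three invocations, so the last output line holds the single leftover argument.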
find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l
4. sort – Sorting
Options:
-n numeric sort
-d dictionary order
-r reverse
-k N sort by column N
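The difference between text and numeric ordering is a common source of surprises. A small sketch, using an invented three-line data file, of how the options change the result:

```shell
data=$(mktemp)
printf '10 apple\n9 pear\n100 fig\n' > "$data"

# Plain sort compares text, so "9 pear" lands last ("9" sorts after "1").
text_last=$(sort "$data" | tail -n 1)

# -n compares the leading number by value instead.
smallest=$(sort -n "$data" | head -n 1)

# -r reverses the order; -k 1 names the column to sort on.
largest=$(sort -nr -k 1 "$data" | head -n 1)

# -k 2 sorts on the second column (the names).
first_name=$(sort -k 2 "$data" | head -n 1)

echo "text_last=$text_last smallest=$smallest largest=$largest first_name=$first_name"
rm "$data"
```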
sort -nrk 1 data.txt
sort -bd data # ignore leading blanks
5. uniq – Remove Duplicate Lines
sort unsort.txt | uniq
Count occurrences:
sort unsort.txt | uniq -c
Show only duplicated lines:
sort unsort.txt | uniq -d
Limit the comparison to part of each line with -s N (skip the first N characters) and -w N (compare at most N characters).
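uniq only collapses adjacent duplicate lines, which is why every example pipes through sort first. The sketch below, on a made-up list of HTTP methods, shows the difference and the common "frequency table" idiom built from sort and uniq -c.

```shell
log=$(mktemp)
printf 'GET\nPOST\nGET\nGET\nPOST\nPUT\n' > "$log"

# Without sorting, only the adjacent pair of GET lines is collapsed.
unsorted_lines=$(uniq "$log" | wc -l)

# After sorting, each distinct value appears once.
distinct=$(sort "$log" | uniq | wc -l)

# Frequency table: count each value, then order by count, highest first.
top=$(sort "$log" | uniq -c | sort -nr | head -n 1)

# Values that occur more than once.
dups=$(sort "$log" | uniq -d)

echo "unsorted=$unsorted_lines distinct=$distinct top=$top"
rm "$log"
```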
6. tr – Translate/Replace Characters
echo 12345 | tr '0-9' '9876543210' # simple cipher
cat text | tr '\t' ' ' # tabs to spaces
Delete characters:
cat file | tr -d '0-9'
Complement set (-c):
cat file | tr -c '0-9\n' ' ' # replace everything except digits and newlines with spaces
cat file | tr -d -c '0-9\n' # delete everything except digits and newlines
Compress repeated characters (-s), often used to squeeze spaces:
cat file | tr -s ' '
Character classes (e.g., [:lower:], [:digit:], [:space:]):
tr '[:lower:]' '[:upper:]'
7. cut – Column Extraction
cut -f2,4 filename
cut -f3 --complement filename # all but column 3
Specify the delimiter with -d:
cut -f2 -d ";" filename
Ranges:
N- from field N to the end
-M the first M fields
N-M fields N through M
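The range forms can be tried on a single delimited record. This is a small sketch; the record and its field values are invented for illustration.

```shell
record='alice;30;london;engineer'

second=$(printf '%s\n' "$record" | cut -d ';' -f 2)        # one field
from_third=$(printf '%s\n' "$record" | cut -d ';' -f 3-)   # N- : field 3 to the end
first_two=$(printf '%s\n' "$record" | cut -d ';' -f -2)    # -M : first two fields
middle=$(printf '%s\n' "$record" | cut -d ';' -f 2-3)      # N-M: fields 2 through 3

echo "$second | $from_third | $first_two | $middle"
```

When more than one field is selected, cut joins them with the same delimiter it split on.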
Units:
-b bytes
-c characters
-f fields (delimiter-based)
cut -c1-5 file # first 5 characters
cut -c-2 file # first 2 characters
8. paste – Merge Columns
paste file1 file2
Change the delimiter (default tab) with -d:
paste file1 file2 -d ","
9. wc – Count Lines, Words, Bytes
wc -l file # lines
wc -w file # words
wc -c file # bytes
10. sed – Stream Editing
Replace the first occurrence on each line:
sed 's/text/replace_text/' file
Global replace (every occurrence on each line):
sed 's/text/replace_text/g' file
Edit the file in place:
sed -i 's/text/replace_text/g' file
Delete empty lines:
sed '/^$/d' file
Use double quotes so the shell expands variables inside the expression:
p=pattern; r=replace; echo "a line with $p" | sed "s/$p/$r/g"
11. awk – Data-Stream Processing
Structure:
awk 'BEGIN{...} { ... } END{...}' file
Built-in variables: NR (record number), NF (field count), $0 (whole line), $1, $2, … (individual fields)
awk '{print $2, $3}' file
Count lines:
awk 'END{print NR}' file
Sum the first column:
awk '{sum+=$1} END{print sum}' file
Pass an external variable:
var=1000; awk -v v="$var" '{print v}' file
Set the field separator:
awk -F: '{print $NF}' /etc/passwd
Read command output (getline reads one line of it; BEGIN is used so no input file is needed):
awk 'BEGIN{"grep root /etc/passwd" | getline out; print out}'
Implement head and tail:
awk 'NR<=10{print}' file # head
awk '{buf[NR%10]=$0} END{for(i=(NR>10?NR-9:1);i<=NR;i++) print buf[i%10]}' file # tail, printed in original order
Common functions: index(), sub(), match(), length(), printf().
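The BEGIN, main, and END blocks can be combined in a single program. The following sketch sums a column per key; the colon-separated sales file and its region names are hypothetical.

```shell
sales=$(mktemp)
printf 'north:120\nsouth:80\nnorth:40\n' > "$sales"

report=$(awk -F: '
    BEGIN { print "region total" }        # runs once, before any input is read
    { total[$1] += $2; grand += $2 }      # runs for every record
    END {                                 # runs once, after the last record
        print "north", total["north"]
        print "south", total["south"]
        print "all", grand, "from", NR, "records"
    }' "$sales")

echo "$report"
rm "$sales"
```

The associative array total is keyed by the first field, so repeated regions accumulate into one entry, and NR in the END block still holds the number of records read.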
12. Iterating Lines, Words, and Characters
While‑read loop:
while IFS= read -r line; do echo "$line"; done < file.txt
Awk alternative:
cat file.txt | awk '{print}'
Iterate over the words in a line:
for word in $line; do echo "$word"; done
Iterate over characters using Bash substring syntax:
for ((i=0;i<${#word};i++)); do echo "${word:i:1}"; done
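A plain read trims leading whitespace and treats backslashes as escape characters, so loops over arbitrary text are usually written with IFS= and -r. A minimal sketch of the difference, using a made-up one-line input file:

```shell
input=$(mktemp)
# The file holds one line: two leading spaces, then padded\tline
# (a literal backslash followed by t, not a tab).
printf '  padded\\tline\n' > "$input"

# Plain read: leading spaces are stripped and the backslash is consumed.
naive=$(while read line; do printf '%s' "$line"; done < "$input")

# IFS= keeps the leading spaces; -r keeps the backslash.
exact=$(while IFS= read -r line; do printf '%s' "$line"; done < "$input")

printf 'naive=[%s]\nexact=[%s]\n' "$naive" "$exact"
rm "$input"
```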
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
