Master Essential Linux Shell Tools for Text Processing
This article provides a comprehensive guide to core Linux shell utilities—including find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—showing practical examples, common options, and how to combine them for efficient text manipulation and system tasks.
Working in the Linux shell is a fundamental skill. Despite the quirky syntax, mastering find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk is essential for text processing and system tasks.
1. find File Search
Examples of searching for .txt and .pdf files, using regex, negation, depth, size, permissions, user, and performing actions like delete or exec.
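As a rough sketch of how the name, negation, and depth tests behave, the snippet below builds a throwaway directory and runs a few of the searches against it. The directory layout and file names are invented for illustration, and -maxdepth assumes GNU or BSD find.

```shell
# Work in a throwaway directory so the results are predictable.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/docs/deep"
touch "$demo_dir/a.txt" "$demo_dir/b.pdf" "$demo_dir/c.log" "$demo_dir/docs/deep/d.txt"

# Files matching either name pattern, at any depth (sorted for stable output).
txt_or_pdf=$(find "$demo_dir" -type f \( -name "*.txt" -o -name "*.pdf" \) | sort)

# Regular files in the top-level directory only.
top_level_count=$(find "$demo_dir" -maxdepth 1 -type f | wc -l)

# Regular files whose name does NOT end in .txt.
not_txt=$(find "$demo_dir" -type f ! -name "*.txt" | sort)

printf '%s\n' "$txt_or_pdf"
echo "top-level files: $top_level_count"

rm -rf "$demo_dir"
```

Trying destructive actions such as -delete or -exec in a scratch directory like this first is a cheap way to confirm the expression matches only what is intended.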
find . \( -name "*.txt" -o -name "*.pdf" \) -print
find . -regex ".*\(\.txt\|\.pdf\)$"
find . ! -name "*.txt" -print
find . -maxdepth 1 -type f
find . -type d -print
find . -atime 7 -type f -print
find . -type f -size +2k
find . -type f -perm 644 -print
find . -type f -user weber -print
find . -type f -name "*.swp" -delete
find . -type f -user root -exec chown weber {} \;
find . -mtime +10 -name "*.txt" -exec cp {} OLD \;
2. grep Text Search
Basic usage and common options: -c count, -n line number, -i ignore case, -l list filenames, recursive search, multiple patterns, and null‑terminated output.
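A minimal sketch of these options on made-up input may help; the two header files and their contents are assumptions for the example, and -r relies on GNU or BSD grep.

```shell
demo_dir=$(mktemp -d)
printf 'class Foo\nvirtual void run();\nint x;\nCLASS Bar\n' > "$demo_dir/a.h"
printf 'nothing here\n' > "$demo_dir/b.h"

# -c prints the number of matching lines (not the number of matches).
count=$(grep -c "class" "$demo_dir/a.h")

# -i ignores case, -n prefixes each hit with its line number.
numbered=$(grep -in "class" "$demo_dir/a.h")

# Several -e patterns are OR-ed together.
either=$(grep -c -e "class" -e "virtual" "$demo_dir/a.h")

# -r recurses into the directory, -l lists only names of files with a match.
matching_files=$(grep -rl "class" "$demo_dir")

printf '%s\n' "$numbered"
expected_file="$demo_dir/a.h"
rm -rf "$demo_dir"
```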
grep -c "text" filename
grep -n "class" . -R
grep -e "class" -e "virtual" file
grep -lZ "test" * | xargs -0 rm
3. xargs Argument Conversion
xargs converts input into command‑line arguments, useful with grep, find, etc. Shows converting multiline to single line, specifying delimiters, and using -I for placeholder.
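The behaviours described here can be sketched with inline input instead of files. The sample values are invented; -0 and -I are assumed to be available, as they are in GNU and BSD xargs.

```shell
# With no command given, xargs runs echo, joining the lines into one.
one_line=$(printf '1\n2\n3\n4\n5\n' | xargs)

# -n 2 passes at most two arguments per invocation, regrouping the input.
grouped=$(printf '1 2 3 4 5\n' | xargs -n 2)

# -0 splits on NUL bytes, so items containing spaces stay intact.
null_safe=$(printf 'a b\0c d\0' | xargs -0 -n 1 echo)

# -I {} runs the command once per input line, substituting the line for {}.
labelled=$(printf 'a\nb\n' | xargs -I {} echo "item: {}")

printf '%s\n' "$grouped"
```

Pairing find -print0 or grep -Z with xargs -0 follows the same idea: the NUL separator keeps file names with spaces or newlines from being split.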
cat file.txt | xargs
cat single.txt | xargs -n 3
cat file.txt | xargs -I {} ./command.sh {}
find source_dir -type f -name "*.cpp" -print0 | xargs -0 wc -l
4. sort Sorting
Numeric vs dictionary sort, reverse, key field selection.
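To make the difference between text and numeric ordering concrete, here is a small sketch on invented data. LC_ALL=C is set only so the text comparison is byte-based and reproducible; in other locales the plain sort may order these lines differently.

```shell
data='10 pear
9 apple
100 fig'

# Plain sort compares text, so "10" and "100" come before "9".
text_order=$(printf '%s\n' "$data" | LC_ALL=C sort)

# -n compares numerically, -r reverses, -k 1 uses the first field as the key.
numeric_desc=$(printf '%s\n' "$data" | LC_ALL=C sort -nrk 1)

# -k 2,2 sorts on the second field only (the fruit name here).
by_name=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2,2)

printf '%s\n' "$numeric_desc"
```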
sort -nrk 1 data.txt
sort -bd data.txt
5. uniq Remove Duplicates
Remove duplicate lines, count occurrences, show only duplicates, and limit comparison range.
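A short sketch on invented input shows why uniq is normally fed sorted data: it only collapses adjacent duplicates. The -w option in the last command is a GNU extension, so that line is an assumption about the environment.

```shell
lines='b
a
b
c
a
b'

# Sort first so identical lines become adjacent, then collapse them.
distinct=$(printf '%s\n' "$lines" | sort | uniq)

# -c prefixes each line with its count; awk strips the padding uniq adds.
counts=$(printf '%s\n' "$lines" | sort | uniq -c | awk '{print $1, $2}')

# -d prints only lines that occur more than once.
repeated=$(printf '%s\n' "$lines" | sort | uniq -d)

# -s 2 skips the first two characters, -w 2 compares only the next two.
by_range=$(printf 'x-aa1\ny-aa2\nz-bb3\n' | uniq -s 2 -w 2)

printf '%s\n' "$counts"
```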
sort unsort.txt | uniq
sort unsort.txt | uniq -c
sort unsort.txt | uniq -d
sort unsort.txt | uniq -s 2 -w 5
6. tr Transform
Character translation, deletion, complement, and squeezing.
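The four modes can be compared side by side on short literal strings. This is only an illustrative sketch; the input strings are made up.

```shell
# Translate: map each lowercase letter to its uppercase counterpart.
upper=$(echo "hello" | tr 'a-z' 'A-Z')

# Delete: -d removes every character in the set.
no_digits=$(echo "a1b2c3" | tr -d '0-9')

# Complement + delete: -c inverts the set, so only digits and newlines survive.
only_digits=$(echo "a1b2c3" | tr -d -c '0-9\n')

# Squeeze: -s collapses each run of the listed character into one.
squeezed=$(echo "a   b    c" | tr -s ' ')

echo "$upper $no_digits $only_digits"
```

Note that -c on its own needs either -d or a second set to translate into, which is why it appears together with -d here.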
echo 12345 | tr '0-9' '9876543210'
cat text | tr '\t' ' '
cat file | tr -d '0-9'
cat file | tr -d -c '0-9\n'
cat file | tr -s ' '
7. cut Column Extraction
Extract specific fields or characters, specify delimiter.
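A minimal sketch with inline sample rows, which are assumptions made for the example. The --complement flag is a GNU cut extension.

```shell
# Tab-separated sample row; tab is cut's default field delimiter.
row=$(printf 'id\tname\tcity\tzip')

# Fields 2 and 4.
second_and_fourth=$(printf '%s\n' "$row" | cut -f2,4)

# Every field except the third (GNU cut only).
without_third=$(printf '%s\n' "$row" | cut -f3 --complement)

# -d sets a custom single-character delimiter.
csv_second=$(echo "a;b;c" | cut -d ";" -f2)

# -c selects character positions instead of fields.
first_five=$(echo "abcdefgh" | cut -c1-5)

echo "$csv_second $first_five"
```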
cut -f2,4 filename
cut -f3 --complement filename
cut -d ";" -f2 filename
cut -c1-5 file
8. paste Column Merging
Combine files column‑wise, default tab delimiter, custom delimiter with -d.
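As a small illustration, the sketch below pastes two temporary files together; the file names nums and words and their contents are invented for the example.

```shell
demo_dir=$(mktemp -d)
printf '1\n2\n3\n' > "$demo_dir/nums"
printf 'one\ntwo\nthree\n' > "$demo_dir/words"

# Default: corresponding lines are joined with a tab.
tabbed=$(paste "$demo_dir/nums" "$demo_dir/words")

# -d replaces the tab with another delimiter.
comma=$(paste -d "," "$demo_dir/nums" "$demo_dir/words")

printf '%s\n' "$comma"
rm -rf "$demo_dir"
```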
paste file1 file2
paste file1 file2 -d ","
9. wc Word/Line Count
Count lines (-l), words (-w), and bytes (-c); use -m to count characters in multibyte text.
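A quick sketch on a two-line string (invented for the example) shows what each counter reports. Reading from a pipe makes wc print only the number, without a file name.

```shell
sample='one two
three'

# printf '%s\n' adds the final newline, so the input is exactly two lines.
line_count=$(printf '%s\n' "$sample" | wc -l)
word_count=$(printf '%s\n' "$sample" | wc -w)
byte_count=$(printf '%s\n' "$sample" | wc -c)   # bytes, newlines included

echo "lines=$line_count words=$word_count bytes=$byte_count"
```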
wc -l file
wc -w file
wc -c file
10. sed Stream Editing
Substitution, global replace, in‑place editing, delete empty lines, use variables, and advanced patterns.
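The sketch below walks through these cases on literal strings so the effect of each flag is visible. The inputs and the variable name word are made up; -E (extended regular expressions) is assumed to be supported, as it is in GNU and BSD sed.

```shell
# Without g, only the first match on each line is replaced.
first_only=$(echo "aa aa" | sed 's/aa/X/')

# With g, every match on the line is replaced.
all_matches=$(echo "aa aa" | sed 's/aa/X/g')

# /^$/d deletes empty lines.
no_blank=$(printf 'a\n\nb\n' | sed '/^$/d')

# Double quotes let the shell expand a variable inside the sed script.
word="example"
with_var=$(echo "this is an example" | sed "s/$word/[&]/")

# -E enables + without a backslash; & stands for the matched text.
bracketed=$(echo "this is an example" | sed -E 's/[a-z]+/[&]/g')

echo "$bracketed"
```

In-place editing with -i rewrites the file itself, so testing the expression without -i first, as above, is a safer habit.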
sed 's/text/replace_text/' file
sed 's/text/replace_text/g' file
sed -i 's/text/replace_text/g' file
sed '/^$/d' file
echo "this is an example" | sed -E 's/\w+/[&]/g'
sed -E 's/^.{3}/&\//g' file
11. awk Data‑flow Processing
Structure with BEGIN, main, END blocks; printing, field handling, built‑in variables, functions, loops, and implementing head/tail.
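One way to see the three-block structure and the head/tail idiom in action is the sketch below. The sample records are invented, and the tail version uses a ring buffer indexed by NR % n, which is one possible approach rather than the only one.

```shell
data='alice 30
bob 25
carol 41'

# BEGIN runs before any input, the middle block once per line, END after the last.
report=$(printf '%s\n' "$data" | awk '
  BEGIN { print "start" }
        { total += $2; print NR ": " $1 }   # NR is the current line number
  END   { print "lines=" NR, "total=" total }
')

# head -n 2 equivalent: a pattern with no action prints the matching lines.
first_two=$(printf '%s\n' "$data" | awk 'NR <= 2')

# tail -n 2 equivalent: keep the last n lines in a ring buffer, print in order.
last_two=$(printf '%s\n' "$data" | awk -v n=2 '
  { buf[NR % n] = $0 }
  END { for (i = NR - n + 1; i <= NR; i++) if (i > 0) print buf[i % n] }
')

printf '%s\n' "$report"
```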
awk 'BEGIN{print "start"} {print} END{print "End"}' file
awk '{print NR":"$0"-"$1"-"$2}' file
awk 'NR<=10{print}' filename
awk '{buffer[NR%10]=$0} END{for(i=NR-9;i<=NR;i++) if(i>0) print buffer[i%10]}' filename
12. Iterating Lines, Words, Characters
While‑read loop, awk, for‑in word loop, and Bash string slicing for character iteration.
while IFS= read -r line; do echo "$line"; done < file.txt
awk '{print}' file.txt
for word in $line; do echo "$word"; done
for ((i=0;i<${#word};i++)); do echo "${word:i:1}"; done
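Putting the three loops together, a hedged Bash sketch might look like the following. The temporary file and its contents are invented for the example, and the C-style for loop and ${word:i:1} slicing require Bash rather than plain sh.

```shell
tmp=$(mktemp)
printf 'hello world\nfoo  bar\n' > "$tmp"

line_total=0
word_total=0

# IFS= and -r keep leading whitespace and backslashes in each line intact.
while IFS= read -r line; do
  line_total=$((line_total + 1))
  # $line is unquoted on purpose so the shell splits it into words.
  # (Glob characters in the data would also expand here.)
  for word in $line; do
    word_total=$((word_total + 1))
  done
done < "$tmp"

# Character iteration: ${#word} is the length, ${word:i:1} one character.
word="abc"
chars=""
for ((i = 0; i < ${#word}; i++)); do
  chars="$chars${word:i:1} "
done

echo "lines=$line_total words=$word_total chars=$chars"
rm -f "$tmp"
```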
MaGe Linux Operations
