Master Essential Linux Shell Tools: find, grep, awk, and More
This guide presents a comprehensive overview of the most frequently used Linux shell utilities for text processing—such as find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—offering practical examples, key options, and best‑practice recommendations for efficient command‑line workflows.
The Linux shell is a fundamental skill. Its syntax is quirky and hard to read, and scripting languages such as Python often replace it for larger tasks. It is still worth mastering, because working with shell scripts exposes many aspects of how a Linux system works.
The most commonly used tools for text processing in Linux are: find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk.
1. find – File Search
Search for txt and pdf files:
find . \( -name "*.txt" -o -name "*.pdf" \) -print
Regex search for .txt and .pdf files (GNU find's default regex syntax writes alternation as \|):
find . -regex ".*\(\.txt\|\.pdf\)$"
Case-insensitive regex:
find . -iregex ".*\.txt$"
Find all non-txt files:
find . ! -name "*.txt" -print
Limit search depth (depth 1):
find . -maxdepth 1 -type f
Search by type (directories only):
find . -type d -print
Search by time:
-atime: access time (days)
-mtime: modification time (file content, days)
-ctime: change time (metadata such as permissions or ownership, days)
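The sign in front of the number changes what these time tests match: -N means less than N days ago, +N means more than N days ago, and a bare N means exactly N days ago. The following is a minimal sketch of that behavior, assuming GNU find and GNU touch (for -printf and -d); the scratch directory and file names are hypothetical.

```shell
# Sketch only: demo_dir, recent.txt and old.txt are made-up names.
demo_dir=$(mktemp -d)
touch -d '2 days ago'  "$demo_dir/recent.txt"
touch -d '20 days ago' "$demo_dir/old.txt"

# -mtime -7: modified less than 7 days ago
recent=$(find "$demo_dir" -type f -mtime -7 -printf '%f\n')
# -mtime +7: modified more than 7 days ago
old=$(find "$demo_dir" -type f -mtime +7 -printf '%f\n')

echo "modified within 7 days: $recent"
echo "modified more than 7 days ago: $old"
rm -rf "$demo_dir"
```

The same sign convention applies to -atime and -ctime.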
Files accessed within the last 7 days (-atime 7 alone matches only files accessed exactly 7 days ago; -7 means less than 7 days, +7 more than 7 days):
find . -atime -7 -type f -print
Search by size (greater than 2 KiB):
find . -type f -size +2k
Search by permission (exactly 644):
find . -type f -perm 644 -print
Search by user:
find . -type f -user weber -print
Delete all *.swp files in the current directory and its subdirectories:
find . -type f -name "*.swp" -delete
Execute a command on each matched file (change ownership to user weber):
find . -type f -user root -exec chown weber {} \;
Note: {} is a placeholder that is replaced by the current file name for each match.
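How often the command runs depends on how -exec is terminated. The sketch below contrasts the \; form used above, which runs the command once per file, with the + form, which passes as many file names as fit into a single invocation. The scratch directory and file names are hypothetical, and echo stands in for a real command such as chown.

```shell
# Sketch only: demo_dir and the a/b/c.txt files are made-up names.
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"

# \; runs the command once for every matched file (three lines of output)
per_file=$(find "$demo_dir" -type f -name '*.txt' -exec echo run {} \; | wc -l)
# + batches the matched files into one command line (one line of output)
batched=$(find "$demo_dir" -type f -name '*.txt' -exec echo run {} + | wc -l)

echo "invocations with \\;: $per_file"
echo "invocations with +: $batched"
rm -rf "$demo_dir"
```

For commands that accept many file arguments, the + form usually saves a large number of process launches.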
Copy found files to another directory:
find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;
Combine multiple commands by invoking a script with -exec:
find . -type f -print -exec ./commands.sh {} \;
2. grep – Text Search
Basic usage prints matching lines:
grep "pattern" file
Common options:
-o: output only the matching part
-v: invert match (output non‑matching lines)
-c: count matching lines
-n: show line numbers
-i: ignore case
-l: list only file names
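The options above can be tried on a small sample file. This is a minimal sketch; the temporary file and its four lines are invented for illustration.

```shell
# Sketch only: demo_file and its contents are made up.
demo_file=$(mktemp)
printf '%s\n' 'class Foo {' 'virtual void run();' 'Class Bar {' '};' > "$demo_file"

count=$(grep -c 'class' "$demo_file")              # 1 (case-sensitive)
count_i=$(grep -ci 'class' "$demo_file")           # 2 (-i also matches "Class")
numbered=$(grep -n 'virtual' "$demo_file")         # 2:virtual void run();
only=$(grep -oi 'class [A-Za-z]*' "$demo_file")    # matching parts only, one per line
inverted=$(grep -vc 'class' "$demo_file")          # 3 lines do not contain "class"

printf '%s\n' "$count" "$count_i" "$numbered" "$only" "$inverted"
rm -f "$demo_file"
```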
Recursive search in multiple directories (favorite for code search):
grep "class" . -R -n
Match multiple patterns:
grep -e "class" -e "virtual" file
Use null-terminated file names (-Z, combined with -l) for safe piping into xargs -0:
grep "test" file* -lZ | xargs -0 rm
3. xargs – Build Command Lines from Input
xargs converts input data into command‑line arguments, allowing combination with other commands such as grep or find.
Convert multiline output to a single line:
cat file.txt | xargs
Convert a single line to multiple lines (-n specifies the number of fields per line):
cat single.txt | xargs -n 3
Key options:
-d: define the input delimiter (GNU xargs; by default input is split on blanks and newlines)
-n: specify number of arguments per command line
-I {}: replace {} with the input item
-0: use null character as delimiter
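A short sketch of -n, -I and -0 in action. The input words, the scratch directory and the file names are hypothetical, and echo and ls stand in for real commands.

```shell
# -n 2: at most two arguments per echo invocation, so five items give three lines
pairs=$(printf '%s\n' a b c d e | xargs -n 2 echo)

# -I {}: run the command once per input line, substituting the line for {}
tagged=$(printf '%s\n' one two | xargs -I {} echo "item={}")

# -0: null-delimited input keeps a file name containing a space in one piece
demo_dir=$(mktemp -d)                       # made-up scratch directory
touch "$demo_dir/plain.cpp" "$demo_dir/with space.cpp"
counted=$(find "$demo_dir" -type f -name '*.cpp' -print0 | xargs -0 ls | wc -l)

printf '%s\n' "$pairs" "$tagged" "$counted"
rm -rf "$demo_dir"
```

Without -print0 and -0, the name "with space.cpp" would be split into two arguments and ls would report two missing files.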
Example – run a script for each line:
cat file.txt | xargs -I {} ./command.sh -p {} -1
Example – count lines of C++ source files:
find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l
4. sort – Sorting
Options:
-n: numeric sort (vs. -d dictionary order)
-r: reverse order
-k N: sort by the N‑th column
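The difference between text and numeric ordering shows up as soon as a column holds numbers of different lengths. The sketch below sorts three invented records by their second column; LC_ALL=C is set only to make the text ordering predictable.

```shell
# Sketch only: the three records are made up.
data='apple 10
pear 9
fig 100'

# Text comparison on column 2: "10" < "100" < "9"
by_text=$(printf '%s\n' "$data" | LC_ALL=C sort -b -k 2,2)
# Numeric comparison on column 2: 9 < 10 < 100
by_number=$(printf '%s\n' "$data" | sort -k 2,2n)
# Numeric, largest first
by_number_desc=$(printf '%s\n' "$data" | sort -k 2,2nr)

printf '%s\n' "$by_text" '---' "$by_number" '---' "$by_number_desc"
```

Writing the key as -k 2,2 limits the comparison to the second column; a bare -k 2 would compare from column 2 to the end of the line.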
Examples:
sort -nrk 1 data.txt
sort -bd data # ignore leading blanks, dictionary order
5. uniq – Remove Duplicate Lines
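Basic usage (uniq only collapses adjacent duplicates, so sort first):
sort unsort.txt | uniq
Count occurrences:
sort unsort.txt | uniq -c
Show only duplicate lines:
sort unsort.txt | uniq -d
Restrict the comparison (-f N skips the first N fields, -s N then skips N characters, -w N compares at most N characters):
sort unsort.txt | uniq -f 2 -s 5 -w 10
6. tr – Translate or Delete Characters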
General usage:
echo 12345 | tr '0-9' '9876543210' # simple substitution
cat text | tr '\t' ' ' # tab to space
Delete characters:
cat file | tr -d '0-9' # remove all digits
Complement set (-c), which must be combined with -d or with a second set:
cat file | tr -c '0-9\n' ' ' # replace everything except digits and newlines with spaces
cat file | tr -d -c '0-9\n' # delete everything except digits and newlines
Compress repeated characters (-s):
cat file | tr -s ' '
Character classes (e.g., [:lower:], [:upper:]):
tr '[:lower:]' '[:upper:]'
7. cut – Extract Columns
Extract columns 2 and 4:
cut -f2,4 filename
Exclude column 3 (--complement is a GNU extension):
cut -f3 --complement filename
Specify delimiter:
cut -d ";" -f2 filename
Field ranges:
N‑: from field N to end
M‑N: fields M through N
Units:
-b: bytes
-c: characters
-f: fields (delimiter‑based)
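The range and unit notations above can be checked on a single delimited record. This is a minimal sketch; the sample line is invented for illustration.

```shell
# Sketch only: the record below is made up.
line='alice;30;berlin;dev'

second=$(printf '%s\n' "$line" | cut -d ';' -f 2)        # 30
from_third=$(printf '%s\n' "$line" | cut -d ';' -f 3-)   # berlin;dev
first_two=$(printf '%s\n' "$line" | cut -d ';' -f 1-2)   # alice;30
chars=$(printf '%s\n' "$line" | cut -c 1-5)              # alice

printf '%s\n' "$second" "$from_third" "$first_two" "$chars"
```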
Examples:
cut -c1-5 file # first five characters
cut -c-2 file # first two characters
8. paste – Merge Files Linewise
Combine two files column-wise (default delimiter is tab):
paste file1 file2
Specify a different delimiter (e.g., comma):
paste file1 file2 -d ","
9. wc – Word, Line, and Byte Count
Count lines:
wc -l file
Count words:
wc -w file
Count bytes:
wc -c file
10. sed – Stream Editor for Text Substitution
Replace first occurrence on each line:
sed 's/text/replace_text/' file
Global replacement:
sed 's/text/replace_text/g' file
Edit file in place (-i):
sed -i 's/text/replace_text/g' file
Delete empty lines:
sed '/^$/d' file
Use captured groups:
sed 's/hello\([0-9]\)/\1/'
Variable substitution with double quotes (so that the shell expands $p and $r before sed runs):
p=pattern; r=replace; echo "a line with pattern" | sed "s/$p/$r/g"
11. awk – Powerful Text Processing Language
Basic script structure:
awk 'BEGIN{print "start"} {print} END{print "end"}' file
Key built-in variables:
NR – record number (line number)
NF – number of fields
$0 – entire line
$1, $2 … – individual fields
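A quick way to see these variables is to print them for every input line. The two input records below are invented for illustration.

```shell
# Sketch only: the two input lines are made up.
report=$(printf '%s\n' 'alice 30 berlin' 'bob 25' |
  awk '{ print NR ": " NF " fields, first=" $1 ", last=" $NF }')

printf '%s\n' "$report"
# 1: 3 fields, first=alice, last=berlin
# 2: 2 fields, first=bob, last=25
```

Because NF holds the field count, $NF always refers to the last field of the current line, however many fields it has.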
Print specific fields:
awk '{print $2, $3}' file
Count lines:
awk 'END{print NR}' file
Sum first column:
awk '{sum+=$1} END{print sum}' file
Filter by line number:
awk 'NR<5' file
Filter by pattern:
awk '/linux/' file
Set field delimiter (-F):
awk -F: '{print $NF}' /etc/passwd
Read command output with getline:
echo | awk '{"grep root /etc/passwd" | getline cmdout; print cmdout}'
Implement head (first 10 lines):
awk 'NR<=10{print}' filename
Implement tail (last 10 lines), using a 10-slot ring buffer that is read back in line order:
awk '{buf[NR%10]=$0} END{for(i=(NR>10?NR-9:1);i<=NR;i++) print buf[i%10]}' filename
Source: 大CC, http://www.cnblogs.com/me115/p/3427319.html (originally from the public account “民工哥技术之路”).
