
Common Linux Shell Text Processing Tools: find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, awk

This article provides a practical guide to the most frequently used Linux shell utilities for text processing, covering find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk with clear examples and essential options.


find can locate files by name, extension, regular expression, depth, type, time, size, permissions, or owner. Example usages:

find . \( -name "*.txt" -o -name "*.pdf" \) -print
find . -regex ".*\(\.txt\|\.pdf\)$"          # emacs-style regex by default: alternation is \|
find . ! -name "*.txt" -print                # negate a test
find . -maxdepth 1 -type f                   # regular files at the top level only
find . -type d -print                        # directories only
find . -type f -size +2k                     # files larger than 2 KiB
find . -type f -perm 644 -print              # exact permission bits
find . -type f -user weber -print            # owned by user weber
find . -type f -name "*.swp" -delete         # delete matches
find . -type f -user root -exec chown weber {} \;
find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;   # modified more than 10 days ago
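The two most common forms above can be tried end to end. This is a minimal sketch using a hypothetical scratch directory (/tmp/find_demo and its file names are made up for the demo):

```shell
# Build a small scratch tree (hypothetical paths, recreated from scratch).
rm -rf /tmp/find_demo
mkdir -p /tmp/find_demo/sub
touch /tmp/find_demo/a.txt /tmp/find_demo/b.pdf /tmp/find_demo/sub/c.txt /tmp/find_demo/d.log
cd /tmp/find_demo

# Match .txt OR .pdf anywhere below the current directory.
find . \( -name "*.txt" -o -name "*.pdf" \) -print

# Regular files at the top level only (sub/c.txt is excluded by -maxdepth 1).
find . -maxdepth 1 -type f -print
```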

grep searches for patterns in files. Common options include -c (count matching lines), -n (line numbers), -i (ignore case), -l (list filenames only), and -v (invert match).

grep "class" . -R -n                  # recursive search with line numbers
grep -e "class" -e "virtual" file     # match either of several patterns
grep "test" file* -lZ | xargs -0 rm   # zero-terminated filenames piped to xargs
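The counting and multi-pattern options are easy to verify on a small sample; the input file below is hypothetical:

```shell
# Hypothetical sample input.
printf 'class Foo\nvirtual void bar()\nplain line\n' > /tmp/grep_demo.txt

grep -c "class" /tmp/grep_demo.txt                # count of matching lines
grep -n "virtual" /tmp/grep_demo.txt              # show line numbers
grep -e "class" -e "virtual" /tmp/grep_demo.txt   # match either pattern
```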

xargs converts input data into command-line arguments, enabling powerful pipelines.

cat file.txt | xargs                               # collapse input onto a single line
cat single.txt | xargs -n 3                        # pass 3 items per command invocation
cat file.txt | xargs -I {} ./command.sh -p {} -1   # substitute each item at {}
find . -type f -print0 | xargs -0 wc -l            # NUL-terminated input, safe for unusual filenames
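A runnable sketch of the grouping behavior, on a hypothetical six-item input file:

```shell
# Hypothetical six-item input, one item per line.
printf 'a\nb\nc\nd\ne\nf\n' > /tmp/xargs_demo.txt

cat /tmp/xargs_demo.txt | xargs        # everything on one line: a b c d e f
cat /tmp/xargs_demo.txt | xargs -n 3   # three items per invocation -> two output lines
```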

sort orders lines. Useful flags: -n (numeric), -d (dictionary order), -r (reverse), -k N (sort key = column N), -b (ignore leading blanks).

sort -nrk 1 data.txt   # numeric, reversed, by the first column
sort -bd data          # dictionary order, ignoring leading blanks
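The difference -n makes is visible with multi-digit keys; the data file here is hypothetical:

```shell
# Hypothetical data: numeric key in column 1.
printf '3 c\n10 a\n2 b\n' > /tmp/sort_demo.txt

sort -nrk 1 /tmp/sort_demo.txt   # numeric + reverse on column 1: 10 a, 3 c, 2 b
```

Without -n, lexicographic comparison would place "10" before "2".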

uniq removes adjacent duplicate lines, which is why its input is usually sorted first.

sort unsort.txt | uniq      # deduplicate
sort unsort.txt | uniq -c   # count occurrences
sort unsort.txt | uniq -d   # show duplicated lines only
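All three forms on one hypothetical input with repeats:

```shell
# Hypothetical unsorted input with repeated lines.
printf 'b\na\nb\nc\na\n' > /tmp/uniq_demo.txt

sort /tmp/uniq_demo.txt | uniq      # a, b, c
sort /tmp/uniq_demo.txt | uniq -c   # occurrence count per line
sort /tmp/uniq_demo.txt | uniq -d   # only a and b occur more than once
```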

tr translates or deletes characters.

echo 12345 | tr '0-9' '9876543210'   # simple substitution cipher
cat text | tr '\t' ' '               # tabs to spaces
cat file | tr -d '0-9'               # remove digits
cat file | tr -cd '0-9'              # keep only digits (-c complements the set, -d deletes it)
cat file | tr -s ' '                 # squeeze repeated spaces
tr '[:lower:]' '[:upper:]'           # lowercase to uppercase
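These all run directly on echoed input, so the sketch needs no files:

```shell
echo 12345 | tr '0-9' '9876543210'          # each digit d maps to 9-d: 87654
echo 'a1b2c3' | tr -d '0-9'                 # digits removed: abc
echo 'a1b2c3' | tr -cd '0-9'                # everything BUT digits removed: 123
echo 'hello' | tr '[:lower:]' '[:upper:]'   # HELLO
```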

cut extracts columns or fields.

cut -f2,4 filename              # fields 2 and 4
cut -f3 --complement filename   # every field except 3
cut -f2 -d ";" filename         # use ';' as the field delimiter
cut -c1-5 file                  # characters 1 through 5
cut -c-2 file                   # first two characters
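A quick sketch of field and character selection; the tab-delimited sample file is hypothetical:

```shell
# Hypothetical tab-delimited sample line.
printf 'one\ttwo\tthree\tfour\n' > /tmp/cut_demo.txt

cut -f2,4 /tmp/cut_demo.txt     # two<TAB>four
echo 'a;b;c' | cut -d ';' -f2   # b
echo 'abcdef' | cut -c1-5       # abcde
```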

paste merges files line-wise into columns. The default delimiter is a tab; a custom delimiter can be set with -d. Given file1 containing the lines 1 and 2, and file2 containing colin and book:

paste file1 file2          # 1<TAB>colin / 2<TAB>book
paste file1 file2 -d ","   # 1,colin / 2,book
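The same example, reconstructed end to end (the /tmp file names are hypothetical):

```shell
# Recreate the two input files from the example.
printf '1\n2\n' > /tmp/paste1.txt
printf 'colin\nbook\n' > /tmp/paste2.txt

paste /tmp/paste1.txt /tmp/paste2.txt          # tab-separated columns
paste -d "," /tmp/paste1.txt /tmp/paste2.txt   # 1,colin and 2,book
```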

wc counts lines, words, or bytes:

wc -l file   # lines
wc -w file   # words
wc -c file   # bytes
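A two-line sketch on a hypothetical sample file:

```shell
# Two lines, three words (hypothetical sample).
printf 'one two\nthree\n' > /tmp/wc_demo.txt

wc -l < /tmp/wc_demo.txt   # 2  (reading from stdin omits the filename from the output)
wc -w < /tmp/wc_demo.txt   # 3
```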

sed performs stream editing.

sed 's/text/replace_text/' file       # replace the first occurrence on each line
sed 's/text/replace_text/g' file      # replace every occurrence
sed -i 's/text/replace_text/g' file   # edit the file in place
sed '/^$/d' file                      # delete empty lines
p=pattern; r=replace; echo "line with pattern" | sed "s/$p/$r/g"   # shell variables expand inside double quotes
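These forms can all be exercised on one hypothetical input:

```shell
# Hypothetical input: a repeated word plus an empty line.
printf 'foo bar foo\n\nbaz\n' > /tmp/sed_demo.txt

sed 's/foo/qux/' /tmp/sed_demo.txt    # only the first foo on each line is replaced
sed 's/foo/qux/g' /tmp/sed_demo.txt   # every foo is replaced
sed '/^$/d' /tmp/sed_demo.txt         # the empty line is dropped
p=foo; r=qux
echo "a foo here" | sed "s/$p/$r/g"   # variable substitution via double quotes
```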

awk is a powerful data-stream processor. Its script structure is awk 'BEGIN{...} {...} END{...}' file: BEGIN runs once before any input, the middle block runs for every line, and END runs once after the last line. Typical operations include printing fields, counting, summing, and pattern matching.

awk '{print NR":"$0"-"$1"-"$2}' file               # line number, whole line, first two fields
awk '{print $2, $3}' file                          # selected fields
awk 'END{print NR}' file                           # line count
awk 'BEGIN{sum=0} {sum+=$1} END{print sum}' file   # sum of column 1
awk '{"grep root /etc/passwd" | getline cmdout; print cmdout}'   # read command output with getline
awk 'NR<=10{print}' filename                       # head
awk '{buf[NR%10]=$0} END{for(i=NR%10+1;i<=NR%10+10;i++) print buf[i%10]}' filename   # tail, in order (files with at least 10 lines)
awk -F: '{print $NF}' /etc/passwd                  # last field, with ':' as delimiter
awk 'NR==4,NR==6{print}' file                      # line-number range
awk '/start_pattern/,/end_pattern/' file           # pattern range
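The field, counting, and summing forms in a runnable sketch (the data file is hypothetical, and the passwd line is echoed so the example does not depend on /etc/passwd):

```shell
# Hypothetical two-column data.
printf '1 a\n2 b\n3 c\n' > /tmp/awk_demo.txt

awk '{print $2}' /tmp/awk_demo.txt                              # a, b, c, one per line
awk 'END{print NR}' /tmp/awk_demo.txt                           # 3
awk 'BEGIN{sum=0} {sum+=$1} END{print sum}' /tmp/awk_demo.txt   # 1+2+3 = 6
echo 'root:x:0:0:root:/root:/bin/bash' | awk -F: '{print $NF}'  # /bin/bash
```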

The examples are taken from a tutorial originally published on cnblogs (http://www.cnblogs.com/me115/p/3427319.html).

Tags: linux, shell, text processing, grep, awk, find
Written by Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
