
Common Linux Shell Text Processing Tools: find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, awk

This article provides a practical guide to the most frequently used Linux shell utilities for text processing, covering find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk with clear examples and essential options.


find can locate files by name, extension, regular expression, depth, type, time, size, permissions, or owner. Example usages:

find . \( -name "*.txt" -o -name "*.pdf" \) -print
find . -regex ".*\(\.txt\|\.pdf\)$"          # emacs-style regex by default: alternation is \|
find . ! -name "*.txt" -print                # negate a test
find . -maxdepth 1 -type f                   # regular files at the top level only
find . -type d -print                        # directories only
find . -type f -size +2k                     # files larger than 2 KiB
find . -type f -perm 644 -print              # exact permission bits
find . -type f -user weber -print            # owned by user weber
find . -type f -name "*.swp" -delete         # delete matches
find . -type f -user root -exec chown weber {} \;
find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;   # modified more than 10 days ago
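The two most common forms above can be tried end to end. This is a minimal sketch using a hypothetical scratch directory (/tmp/find_demo and its file names are made up for the demo):

```shell
# Build a small scratch tree (hypothetical paths, recreated from scratch).
rm -rf /tmp/find_demo
mkdir -p /tmp/find_demo/sub
touch /tmp/find_demo/a.txt /tmp/find_demo/b.pdf /tmp/find_demo/sub/c.txt /tmp/find_demo/d.log
cd /tmp/find_demo

# Match .txt OR .pdf anywhere below the current directory.
find . \( -name "*.txt" -o -name "*.pdf" \) -print

# Regular files at the top level only (sub/c.txt is excluded by -maxdepth 1).
find . -maxdepth 1 -type f -print
```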

grep searches for patterns in files. Common options include -c (count matching lines), -n (line numbers), -i (ignore case), -l (list filenames only), and -v (invert match).

grep "class" . -R -n                  # recursive search with line numbers
grep -e "class" -e "virtual" file     # match either of several patterns
grep "test" file* -lZ | xargs -0 rm   # zero-terminated filenames piped to xargs
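The counting and multi-pattern options are easy to verify on a small sample; the input file below is hypothetical:

```shell
# Hypothetical sample input.
printf 'class Foo\nvirtual void bar()\nplain line\n' > /tmp/grep_demo.txt

grep -c "class" /tmp/grep_demo.txt                # count of matching lines
grep -n "virtual" /tmp/grep_demo.txt              # show line numbers
grep -e "class" -e "virtual" /tmp/grep_demo.txt   # match either pattern
```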

xargs converts input data into command-line arguments, enabling powerful pipelines.

cat file.txt | xargs                               # collapse input onto a single line
cat single.txt | xargs -n 3                        # pass 3 items per command invocation
cat file.txt | xargs -I {} ./command.sh -p {} -1   # substitute each item at {}
find . -type f -print0 | xargs -0 wc -l            # NUL-terminated input, safe for unusual filenames
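A runnable sketch of the grouping behavior, on a hypothetical six-item input file:

```shell
# Hypothetical six-item input, one item per line.
printf 'a\nb\nc\nd\ne\nf\n' > /tmp/xargs_demo.txt

cat /tmp/xargs_demo.txt | xargs        # everything on one line: a b c d e f
cat /tmp/xargs_demo.txt | xargs -n 3   # three items per invocation -> two output lines
```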

sort orders lines. Useful flags: -n (numeric), -d (dictionary order), -r (reverse), -k N (sort key = column N), -b (ignore leading blanks).

sort -nrk 1 data.txt   # numeric, reversed, by the first column
sort -bd data          # dictionary order, ignoring leading blanks
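The difference -n makes is visible with multi-digit keys; the data file here is hypothetical:

```shell
# Hypothetical data: numeric key in column 1.
printf '3 c\n10 a\n2 b\n' > /tmp/sort_demo.txt

sort -nrk 1 /tmp/sort_demo.txt   # numeric + reverse on column 1: 10 a, 3 c, 2 b
```

Without -n, lexicographic comparison would place "10" before "2".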

uniq removes adjacent duplicate lines, which is why its input is usually sorted first.

sort unsort.txt | uniq      # deduplicate
sort unsort.txt | uniq -c   # count occurrences
sort unsort.txt | uniq -d   # show duplicated lines only
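All three forms on one hypothetical input with repeats:

```shell
# Hypothetical unsorted input with repeated lines.
printf 'b\na\nb\nc\na\n' > /tmp/uniq_demo.txt

sort /tmp/uniq_demo.txt | uniq      # a, b, c
sort /tmp/uniq_demo.txt | uniq -c   # occurrence count per line
sort /tmp/uniq_demo.txt | uniq -d   # only a and b occur more than once
```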

tr translates or deletes characters.

echo 12345 | tr '0-9' '9876543210'   # simple substitution cipher
cat text | tr '\t' ' '               # tabs to spaces
cat file | tr -d '0-9'               # remove digits
cat file | tr -cd '0-9'              # keep only digits (-c complements the set, -d deletes it)
cat file | tr -s ' '                 # squeeze repeated spaces
tr '[:lower:]' '[:upper:]'           # lowercase to uppercase
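These all run directly on echoed input, so the sketch needs no files:

```shell
echo 12345 | tr '0-9' '9876543210'          # each digit d maps to 9-d: 87654
echo 'a1b2c3' | tr -d '0-9'                 # digits removed: abc
echo 'a1b2c3' | tr -cd '0-9'                # everything BUT digits removed: 123
echo 'hello' | tr '[:lower:]' '[:upper:]'   # HELLO
```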

cut extracts columns or fields.

cut -f2,4 filename              # fields 2 and 4
cut -f3 --complement filename   # every field except 3
cut -f2 -d ";" filename         # use ';' as the field delimiter
cut -c1-5 file                  # characters 1 through 5
cut -c-2 file                   # first two characters
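A quick sketch of field and character selection; the tab-delimited sample file is hypothetical:

```shell
# Hypothetical tab-delimited sample line.
printf 'one\ttwo\tthree\tfour\n' > /tmp/cut_demo.txt

cut -f2,4 /tmp/cut_demo.txt     # two<TAB>four
echo 'a;b;c' | cut -d ';' -f2   # b
echo 'abcdef' | cut -c1-5       # abcde
```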

paste merges files line-wise into columns. The default delimiter is a tab; a custom delimiter can be set with -d. Given file1 containing the lines 1 and 2, and file2 containing colin and book:

paste file1 file2          # 1<TAB>colin / 2<TAB>book
paste file1 file2 -d ","   # 1,colin / 2,book
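The same example, reconstructed end to end (the /tmp file names are hypothetical):

```shell
# Recreate the two input files from the example.
printf '1\n2\n' > /tmp/paste1.txt
printf 'colin\nbook\n' > /tmp/paste2.txt

paste /tmp/paste1.txt /tmp/paste2.txt          # tab-separated columns
paste -d "," /tmp/paste1.txt /tmp/paste2.txt   # 1,colin and 2,book
```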

wc counts lines, words, or bytes:

wc -l file   # lines
wc -w file   # words
wc -c file   # bytes
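A two-line sketch on a hypothetical sample file:

```shell
# Two lines, three words (hypothetical sample).
printf 'one two\nthree\n' > /tmp/wc_demo.txt

wc -l < /tmp/wc_demo.txt   # 2  (reading from stdin omits the filename from the output)
wc -w < /tmp/wc_demo.txt   # 3
```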

sed performs stream editing.

sed 's/text/replace_text/' file       # replace the first occurrence on each line
sed 's/text/replace_text/g' file      # replace every occurrence
sed -i 's/text/replace_text/g' file   # edit the file in place
sed '/^$/d' file                      # delete empty lines
p=pattern; r=replace; echo "line with pattern" | sed "s/$p/$r/g"   # shell variables expand inside double quotes
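These forms can all be exercised on one hypothetical input:

```shell
# Hypothetical input: a repeated word plus an empty line.
printf 'foo bar foo\n\nbaz\n' > /tmp/sed_demo.txt

sed 's/foo/qux/' /tmp/sed_demo.txt    # only the first foo on each line is replaced
sed 's/foo/qux/g' /tmp/sed_demo.txt   # every foo is replaced
sed '/^$/d' /tmp/sed_demo.txt         # the empty line is dropped
p=foo; r=qux
echo "a foo here" | sed "s/$p/$r/g"   # variable substitution via double quotes
```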

awk is a powerful data-stream processor. Its script structure is awk 'BEGIN{...} {...} END{...}' file: BEGIN runs once before any input, the middle block runs for every line, and END runs once after the last line. Typical operations include printing fields, counting, summing, and pattern matching.

awk '{print NR":"$0"-"$1"-"$2}' file               # line number, whole line, first two fields
awk '{print $2, $3}' file                          # selected fields
awk 'END{print NR}' file                           # line count
awk 'BEGIN{sum=0} {sum+=$1} END{print sum}' file   # sum of column 1
awk '{"grep root /etc/passwd" | getline cmdout; print cmdout}'   # read command output with getline
awk 'NR<=10{print}' filename                       # head
awk '{buf[NR%10]=$0} END{for(i=NR%10+1;i<=NR%10+10;i++) print buf[i%10]}' filename   # tail, in order (files with at least 10 lines)
awk -F: '{print $NF}' /etc/passwd                  # last field, with ':' as delimiter
awk 'NR==4,NR==6{print}' file                      # line-number range
awk '/start_pattern/,/end_pattern/' file           # pattern range
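The field, counting, and summing forms in a runnable sketch (the data file is hypothetical, and the passwd line is echoed so the example does not depend on /etc/passwd):

```shell
# Hypothetical two-column data.
printf '1 a\n2 b\n3 c\n' > /tmp/awk_demo.txt

awk '{print $2}' /tmp/awk_demo.txt                              # a, b, c, one per line
awk 'END{print NR}' /tmp/awk_demo.txt                           # 3
awk 'BEGIN{sum=0} {sum+=$1} END{print sum}' /tmp/awk_demo.txt   # 1+2+3 = 6
echo 'root:x:0:0:root:/root:/bin/bash' | awk -F: '{print $NF}'  # /bin/bash
```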

The examples are taken from a tutorial originally published on cnblogs (http://www.cnblogs.com/me115/p/3427319.html).

Tags: linux, shell, text processing, grep, awk, find
Written by Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
