
Master Essential Linux Shell Text Processing Tools: find, grep, awk, and More

This article provides a comprehensive guide to the most frequently used Linux shell text‑processing utilities—find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—offering practical examples, command‑line options, and tips for efficient one‑ or two‑line scripts.


The Linux shell is a fundamental skill. Its syntax can be quirky and hard to read, and for larger jobs it is often replaced by scripting languages such as Python; even so, it remains a core competency worth mastering, and learning shell scripting also reveals a great deal about how the Linux system works.

Not everyone needs to become a shell-scripting master, but everyone benefits from being able to accomplish common, basic tasks with simple shell commands.

Below is an introduction to the most commonly used text-processing tools in Linux: find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk. The examples and options shown are the most practical ones; the guiding principle is to keep each command to a single line, or at most two. For anything more complex, consider Python.

1. find – File Search

Search for txt and pdf files:

<code>find . \( -name "*.txt" -o -name "*.pdf" \) -print</code>

Search using regular expressions for .txt and .pdf:

<code>find . -regex ".*\(\.txt\|\.pdf\)$"</code>

Use -iregex to ignore case.

Negate pattern – find all non‑txt files:

<code>find . ! -name "*.txt" -print</code>

Specify search depth – list files in the current directory (depth 1):

<code>find . -maxdepth 1 -type f</code>

Custom Search

Search by type (list only directories):

<code>find . -type d -print</code>

Search by time:

-atime – access time (in days; use -amin for minutes)

-mtime – modification time

-ctime – change time (metadata or permission changes)

Files accessed within the last 7 days (a bare 7 means "exactly 7 days ago"; -7 means "fewer than 7 days"):

<code>find . -atime -7 -type f -print</code>

Search by size (k, M, G). Find files larger than 2 kB:

<code>find . -type f -size +2k</code>

Search by permission (e.g., find files with permission 644):

<code>find . -type f -perm 644 -print</code>

Search by user:

<code>find . -type f -user weber -print</code>

Post‑Search Actions

Delete all *.swp files in the current directory:

<code>find . -type f -name "*.swp" -delete</code>

Execute a command on each match (the powerful -exec):

<code>find . -type f -user root -exec chown weber {} \;</code>

Note: {} is replaced by the current file name. Example – copy the found files to another directory:

<code>find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;</code>

To combine multiple commands, write them into a script and invoke it with -exec:

<code>-exec ./commands.sh {} \;</code>

Print Delimiters

By default -print separates results with a newline; -print0 uses a null character instead, which makes filenames containing spaces safe to process.
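As a quick sketch (with hypothetical throwaway file names under a temp directory), the null-delimited form passes a space-containing filename through intact:

```shell
# Sketch: null-delimited output survives filenames with spaces.
# The directory and file names here are throwaway examples.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt"
# -print0 / xargs -0 pass each name as exactly one argument
count=$(find "$dir" -type f -name "*.txt" -print0 | xargs -0 -n1 echo | wc -l)
echo "$count"
```

With a newline-delimited pipeline, "with space.txt" would be split into two bogus arguments.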

2. grep – Text Search

Basic usage:

<code>grep "match_pattern" file</code>

Common options:

-o – output only the matching part

-v – invert the match: output lines that do NOT match

-c – count matching lines

-n – print line numbers

-i – ignore case

-l – print only the names of matching files

Recursive search in multi‑level directories (a programmer’s favorite):

<code>grep "class" . -R -n</code>

Match multiple patterns:

<code>grep -e "class" -e "virtual" file</code>

Use -Z to output file names terminated by a null character, then hand them to xargs -0 for deletion:

<code>grep "test" file* -lZ | xargs -0 rm</code>
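A small sketch of the -o option from the list above: because -o prints each match on its own line, piping into wc -l counts occurrences rather than matching lines (the file content here is made up for the demo):

```shell
# Sketch: count occurrences (not lines) of a pattern with grep -o.
f=$(mktemp)
printf 'one two two\ntwo three\n' > "$f"
matches=$(grep -o "two" "$f" | wc -l)
echo "$matches"
```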

xargs transforms its input into arguments for other commands, which makes it a natural companion to grep, find, etc.

Convert multi‑line output to a single line:

<code>cat file.txt | xargs</code>

Convert a single line to multiple lines (e.g., three arguments per line):

<code>cat single.txt | xargs -n 3</code>

xargs Options

-d – set the input delimiter (by default xargs splits on blanks and newlines)

-n – number of arguments passed per command invocation

-I {} – replace the {} placeholder with each input item

-0 – use the null character as the input delimiter

Example – count lines in C source files:

<code>find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l</code>
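The -I {} option from the list above deserves its own sketch: each input line is substituted wherever {} appears in the command (the input here is made up):

```shell
# Sketch of -I {}: every input item is placed where {} occurs.
out=$(printf 'a\nb\n' | xargs -I {} echo "item: {}")
first=$(printf '%s\n' "$out" | head -n1)
echo "$first"
```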

4. sort – Sorting

Key options:

-n – numeric sort (vs. -d for dictionary order)

-r – reverse order

-k N – sort by the N-th column

Example:

<code>sort -nrk 1 data.txt</code>

Ignore leading blanks:

<code>sort -bd data</code>
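A runnable sketch of -n and -k together, sorting made-up two-column data numerically by its second column:

```shell
# Sketch: numeric sort keyed on the 2nd column (sample data is invented).
f=$(mktemp)
printf 'apple 3\nbanana 1\ncherry 2\n' > "$f"
smallest=$(sort -n -k 2 "$f" | head -n1)
echo "$smallest"
```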

5. uniq – Remove Duplicate Lines

Remove duplicate lines:

<code>sort unsort.txt | uniq</code>

Count occurrences of each line:

<code>sort unsort.txt | uniq -c</code>

Show only duplicated lines:

<code>sort unsort.txt | uniq -d</code>

Specify the comparison start position and width with -s and -w.
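A sketch of -w (a GNU uniq option): only the first N characters are compared when deciding whether lines are duplicates, so below "ab 1" and "ab 2" collapse into one line (sample data is invented):

```shell
# Sketch of uniq -w (GNU): compare only the first 2 characters.
f=$(mktemp)
printf 'ab 1\nab 2\ncd 3\n' > "$f"
kept=$(uniq -w 2 "$f" | wc -l)
echo "$kept"
```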

6. tr – Translate or Delete Characters

General usage:

<code>echo 12345 | tr '0-9' '9876543210'   # simple substitution</code>
<code>cat text | tr '\t' ' '</code>

Delete characters:

<code>cat file | tr -d '0-9'   # delete all digits</code>

Complement set (-c) – operate on every character NOT in the set:

<code>cat file | tr -c '0-9' ' '      # replace every non-digit with a space</code>
<code>cat file | tr -d -c '0-9\n'    # delete everything except digits and newlines</code>

Compress repeated characters (useful for collapsing spaces):

<code>cat file | tr -s ' '</code>

Character classes (e.g., [:lower:], [:upper:], [:digit:]) can be used as:

<code>tr '[:lower:]' '[:upper:]'</code>
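Combining -s with a character-class translation gives a handy normalization one-liner (the input string is invented for the demo):

```shell
# Sketch: squeeze runs of spaces, then map upper case to lower case.
out=$(echo "Hello    World" | tr -s ' ' | tr '[:upper:]' '[:lower:]')
echo "$out"
```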

7. cut – Extract Columns

Extract the 2nd and 4th columns:

<code>cut -f2,4 filename</code>

Exclude the 3rd column:

<code>cut -f3 --complement filename</code>

Specify delimiter (e.g., semicolon):

<code>cut -f2 -d ";" filename</code>

Range specifications:

N- – from field N to the end

-M – from the first field through M

N-M – fields N through M

Units:

-b – bytes

-c – characters

-f – fields (split on the delimiter)

Example – print first five characters:

<code>cut -c1-5 file</code>

Example – print first two characters:

<code>cut -c-2 file</code>
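A self-contained sketch of -d with -f, pulling the login and shell fields from a passwd-style record (the record itself is a typical example, not read from a real /etc/passwd):

```shell
# Sketch: extract fields 1 and 7 from a colon-delimited record.
line='root:x:0:0:root:/root:/bin/bash'
fields=$(echo "$line" | cut -d ':' -f1,7)
echo "$fields"
```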

8. paste – Merge Files Column‑wise

Combine two files column‑wise:

<code>paste file1 file2</code>

The default delimiter is a tab; set a custom one with -d:

<code>paste file1 file2 -d ","</code>
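A runnable sketch of the comma-delimited paste, using two invented single-column files:

```shell
# Sketch: paste two single-column files with a comma delimiter.
f1=$(mktemp); f2=$(mktemp)
printf 'a\nb\n' > "$f1"
printf '1\n2\n' > "$f2"
joined=$(paste -d ',' "$f1" "$f2" | head -n1)
echo "$joined"
```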

9. wc – Word, Line, and Byte Count

Count lines:

<code>wc -l file</code>

Count words:

<code>wc -w file</code>

Count bytes:

<code>wc -c file</code>

10. sed – Stream Editor for Text Substitution

Replace the first occurrence on each line:

<code>sed 's/text/replace_text/' file</code>

Global replacement:

<code>sed 's/text/replace_text/g' file</code>

Edit file in place:

<code>sed -i 's/text/replace_text/g' file</code>

Delete empty lines:

<code>sed '/^$/d' file</code>

Use & to reference the matched string (GNU sed understands \w as a word character):

<code>echo "this is an example" | sed -E 's/\w+/[&]/g'   # [this] [is] [an] [example]</code>

Capture groups use escaped parentheses and are referenced as \1, \2, and so on:

<code>echo "hello 7" | sed 's/hello \([0-9]\)/\1/'   # prints 7</code>

Double‑quoted expressions allow variable expansion:

<code>p=pattern; r=replace; echo "line contains pattern" | sed "s/$p/$r/g"</code>
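A classic capture-group sketch: two groups swap a pair of words (the names are invented):

```shell
# Sketch: two capture groups swap the order of two words.
out=$(echo "john smith" | sed 's/\([a-z]*\) \([a-z]*\)/\2 \1/')
echo "$out"
```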

11. awk – Data‑Stream Processing Tool

Basic script structure:

<code>awk 'BEGIN{ statements } statements END{ statements }' file</code>

Print the current line:

<code>awk '{print}' file</code>

Print specific fields:

<code>awk '{print $2, $3}' file</code>

Count lines:

<code>awk 'END{print NR}' file</code>

Sum the first column:

<code>awk 'BEGIN{sum=0} {sum+=$1} END{print sum}' file</code>
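The sum idiom above works just as well on piped data; a minimal runnable check (input invented):

```shell
# Sketch: sum a column fed through a pipe.
total=$(printf '1\n2\n3\n' | awk '{sum+=$1} END{print sum}')
echo "$total"
```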

Pass in external shell variables with -v:

<code>var=1000; echo | awk -v vara="$var" '{print vara}'</code>

Filter by line number or pattern:

<code>awk 'NR<5' file</code>
<code>awk '/linux/' file</code>

Set the field delimiter with -F:

<code>awk -F: '{print $NF}' /etc/passwd</code>

Read the output of another command with getline (a rule needs at least one input line to fire, hence the echo):

<code>echo | awk '{"grep root /etc/passwd" | getline cmdout; print cmdout}'</code>

Loop constructs:

<code>for(i=0;i<10;i++){print i}</code>
<code>for(i in array){print array[i]}</code>
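As a sketch of the for(key in array) loop, here is a word-frequency count over invented input; sort makes the (otherwise arbitrary) key order deterministic:

```shell
# Sketch: associative-array iteration driving a word-frequency count.
freq=$(printf 'a b a\nb a\n' \
  | awk '{for(i=1;i<=NF;i++) c[$i]++} END{for(w in c) print w, c[w]}' \
  | sort)
echo "$freq"
```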

Implement

head

and

tail

:

<code>awk 'NR<=10{print}' filename   # head</code>
<code>awk '{b[NR%10]=$0} END{for(i=NR-9;i<=NR;i++) if(i>0) print b[i%10]}' filename   # tail (last 10 lines, in order)</code>

Print a specific column with awk or cut (ls pads columns with spaces, so squeeze them first for cut):

<code>ls -lrt | awk '{print $6}'</code>
<code>ls -lrt | tr -s ' ' | cut -d ' ' -f6</code>

12. Iterating Over Lines, Words, and Characters

Iterate Over Each Line

<code>while read -r line; do echo "$line"; done < file.txt</code>
<code>cat file.txt | while read -r line; do echo "$line"; done</code>
<code>cat file.txt | awk '{print}'</code>

Iterate Over Each Word in a Line

<code>for word in $line; do echo $word; done</code>

Iterate Over Each Character

<code>for ((i=0;i<${#word};i++)); do echo ${word:i:1}; done</code>
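The character loop above can do more than echo; a small sketch (using the same bash substring syntax, with an invented word) reverses a string:

```shell
# Sketch: reverse a word by walking its characters with ${word:i:1}.
word="shell"
rev=""
for ((i=0; i<${#word}; i++)); do rev="${word:i:1}$rev"; done
echo "$rev"
```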
Source: 大CC Link: http://www.cnblogs.com/me115/p/3427319.html
Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
