Master Essential Linux Shell Tools: find, grep, awk, and More
This guide presents a comprehensive overview of the most frequently used Linux shell utilities for text processing—such as find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—offering practical examples, key options, and best‑practice recommendations for efficient command‑line workflows.
The Linux shell is a fundamental skill. Despite its quirky syntax and low readability, and even though it is often replaced by scripting languages such as Python, mastering it is worthwhile: working with shell commands reveals a great deal about how a Linux system operates.
The most commonly used tools for text processing in Linux are: find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk.
1. find – File Search
Search for txt and pdf files:
<code>find . \( -name "*.txt" -o -name "*.pdf" \) -print</code>Regex search for .txt and .pdf:
<code>find . -regex ".*\(\.txt\|\.pdf\)$"</code>Case‑insensitive regex:
<code>find . -iregex ".*\.txt$"</code>Find all non‑txt files:
<code>find . ! -name "*.txt" -print</code>Limit search depth (depth 1):
<code>find . -maxdepth 1 -type f</code>Search by type (directories only):
<code>find . -type d -print</code>Search by time:
‑atime: access time (days)
‑mtime: modification time
‑ctime: change time (metadata)
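The numeric argument follows find's N / +N / ‑N convention; a short sketch (shown with ‑mtime, but the same rules apply to ‑atime and ‑ctime):

```shell
# N is measured in 24-hour periods:
#   -mtime -7  matches files modified less than 7 days ago
#   -mtime 7   matches files modified exactly 7 (rounded) days ago
#   -mtime +7  matches files modified more than 7 days ago
find . -type f -mtime -7 -print
```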
Files accessed within the last 7 days (note the leading minus: ‑7 means “less than 7 days ago”):
<code>find . -atime -7 -type f -print</code>Search by size (greater than 2 kB):
<code>find . -type f -size +2k</code>Search by permission (e.g., 644):
<code>find . -type f -perm 644 -print</code>Search by user:
<code>find . -type f -user weber -print</code>Delete all *.swp files in the current directory:
<code>find . -type f -name "*.swp" -delete</code>Execute a command on each matched file (change ownership to user weber):
<code>find . -type f -user root -exec chown weber {} \;</code>Note: {} is a placeholder that is replaced by the current file name for each match.
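When the command accepts many file arguments, terminating -exec with + instead of \; batches the matches into as few invocations as possible. A sketch (the demo directory is hypothetical):

```shell
mkdir -p /tmp/find_demo && touch /tmp/find_demo/a.log /tmp/find_demo/b.log
# "+" passes all matched names to a single wc invocation,
# which also prints a combined "total" line:
find /tmp/find_demo -type f -name "*.log" -exec wc -l {} +
```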
Copy found files to another directory:
<code>find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;</code>Combine multiple commands by invoking a script with -exec:
<code>find . -type f -print -exec ./commands.sh {} \;</code>2. grep – Text Search
Basic usage prints matching lines:
<code>grep "pattern" file</code>Common options:
-o: output only the matching part
-v: invert match (output non‑matching lines)
-c: count matching lines
-n: show line numbers
-i: ignore case
-l: list only file names
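A quick sketch of a few of these options on sample data (the file path is just an example):

```shell
printf 'Alpha\nbeta\nALPHA\n' > /tmp/grep_demo.txt
grep -in 'alpha' /tmp/grep_demo.txt   # -i + -n: prints "1:Alpha" and "3:ALPHA"
grep -c  'beta'  /tmp/grep_demo.txt   # prints "1"
grep -o  'lph'   /tmp/grep_demo.txt   # prints only the matched text: "lph"
```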
Recursive search in multiple directories (favorite for code search):
<code>grep "class" . -R -n</code>Match multiple patterns:
<code>grep -e "class" -e "virtual" file</code>Use null‑terminated output (‑z) for safe piping:
<code>grep "test" file* -lZ | xargs -0 rm</code>3. xargs – Build Command Lines from Input
xargs converts input data into command‑line arguments, allowing combination with other commands such as grep or find.
Convert multiline output to a single line:
<code>cat file.txt | xargs</code>Convert a single line to multiple lines (‑n specifies the number of items per line):
<code>cat single.txt | xargs -n 3</code>Key options:
-d: set the input delimiter (by default, xargs splits input on blanks and newlines)
-n: specify number of arguments per command line
-I {}: replace {} with the input item
-0: use null character as delimiter
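For instance, ‑d lets GNU xargs split comma‑separated input (a sketch on sample data):

```shell
# Split on commas, passing one item at a time to the default command (echo):
printf 'a,b,c' | xargs -d ',' -n 1
```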
Example – run a script for each line:
<code>cat file.txt | xargs -I {} ./command.sh -p {} -1</code>Example – count lines of C++ source files:
<code>find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l</code>4. sort – Sorting
Options:
-n: numeric sort (vs. -d dictionary order)
-r: reverse order
-k N: sort by the N‑th column
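With delimiter‑separated data, ‑t names the field separator so ‑k can address columns; a sketch on sample input:

```shell
printf 'bob:3\nann:1\ncarl:2\n' > /tmp/sort_demo.txt
# Sort numerically on the second colon-separated column:
sort -t ':' -k 2 -n /tmp/sort_demo.txt
```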
Examples:
<code>sort -nrk 1 data.txt</code> <code>sort -bd data # -b ignores leading blanks, -d sorts in dictionary order</code>5. uniq – Remove Duplicate Lines
Basic usage:
<code>sort unsort.txt | uniq</code>Count occurrences:
<code>sort unsort.txt | uniq -c</code>Show only duplicate lines:
<code>sort unsort.txt | uniq -d</code>Limit the comparison (‑f N skips the first N fields, ‑s N skips the first N characters, ‑w N compares at most N characters):
<code>sort unsort.txt | uniq -f 2 -s 5 -w 10</code>6. tr – Translate or Delete Characters
General usage:
<code>echo 12345 | tr '0-9' '9876543210' # simple substitution</code> <code>cat text | tr '\t' ' ' # tab to space</code>Delete characters:
<code>cat file | tr -d '0-9' # remove all digits</code>Complement set (‑c):
<code>cat file | tr -c '0-9' '\n' # replace every non‑digit with a newline</code> <code>cat file | tr -d -c '0-9 \n' # keep only digits, spaces, and newlines</code>Compress repeated characters (‑s):
<code>cat file | tr -s ' '</code>Character classes (e.g., [:lower:], [:upper:]):
<code>tr '[:lower:]' '[:upper:]'</code>7. cut – Extract Columns
Extract columns 2 and 4:
<code>cut -f2,4 filename</code>Exclude column 3:
<code>cut -f3 --complement filename</code>Specify delimiter:
<code>cut -d ";" -f2 filename</code>Field ranges:
N‑: from field N to end
M‑N: fields M through N
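A sketch of the range syntax on tab‑separated sample data:

```shell
printf 'a\tb\tc\td\n' > /tmp/cut_demo.txt
cut -f2-  /tmp/cut_demo.txt   # from field 2 to the end: "b<TAB>c<TAB>d"
cut -f2-3 /tmp/cut_demo.txt   # fields 2 through 3: "b<TAB>c"
```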
Units:
-b: bytes
-c: characters
-f: fields (delimiter‑based)
Examples:
<code>cut -c1-5 file # first five characters</code> <code>cut -c-2 file # first two characters</code>8. paste – Merge Files Linewise
Combine two files column‑wise (default delimiter is tab):
<code>paste file1 file2</code>Specify a different delimiter (e.g., comma):
<code>paste file1 file2 -d ","</code>9. wc – Word, Line, and Byte Count
Count lines:
<code>wc -l file</code>Count words:
<code>wc -w file</code>Count bytes:
<code>wc -c file</code>10. sed – Stream Editor for Text Substitution
Replace first occurrence on each line:
<code>sed 's/text/replace_text/' file</code>Global replacement:
<code>sed 's/text/replace_text/g' file</code>Edit file in place (‑i):
<code>sed -i 's/text/replace_text/g' file</code>Delete empty lines:
<code>sed '/^$/d' file</code>Use captured groups:
<code>sed 's/hello\([0-9]\)/\1/'</code>Variable substitution with double quotes:
<code>p=pattern; r=replace; echo "a line with pattern" | sed "s/$p/$r/g"</code>11. awk – Powerful Text Processing Language
Basic script structure:
<code>awk 'BEGIN{print "start"} {print} END{print "end"}' file</code>Key built‑in variables:
NR – record number (line number)
NF – number of fields
$0 – entire line
$1, $2 … – individual fields
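A short sketch showing these built‑in variables together:

```shell
# For each input line, print its line number, its field count, and its last field:
printf 'a b c\nd e\n' | awk '{print NR, NF, $NF}'
# prints "1 3 c" and "2 2 e"
```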
Print specific fields:
<code>awk '{print $2, $3}' file</code>Count lines:
<code>awk 'END{print NR}' file</code>Sum first column:
<code>awk '{sum+=$1} END{print sum}' file</code>Filter by line number:
<code>awk 'NR<5' file</code>Filter by pattern:
<code>awk '/linux/' file</code>Set field delimiter (‑F):
<code>awk -F: '{print $NF}' /etc/passwd</code>Read command output with getline:
<code>echo | awk '{"grep root /etc/passwd" | getline cmdout; print cmdout}'</code>Implement head (first 10 lines):
<code>awk 'NR<=10{print}' filename</code>Implement tail (last 10 lines):
<code>awk '{buf[NR%10]=$0} END{n=(NR<10?NR:10); for(i=NR-n+1;i<=NR;i++) print buf[i%10]}' filename</code>Note: buf is a circular buffer keyed by NR%10, so the END loop must start at the oldest stored line to print the last lines in order. Source: 大CC, http://www.cnblogs.com/me115/p/3427319.html (originally from the public account “民工哥技术之路”).