
Master Essential Linux Shell Text Tools: Find, Grep, Awk, Sed & More

This guide introduces the most frequently used Linux shell utilities for text processing—including find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—explaining their core options, practical examples, and how to combine them for powerful command‑line workflows.


The Linux shell is a fundamental skill: despite its quirky syntax and lower readability than languages like Python, mastering it reveals much about how Linux works and remains essential for everyday scripting.

Key tools for text handling in Linux include find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk. The examples and parameters shown here are the most common and practical ones. The author prefers one-line commands and recommends Python for more complex tasks.

1. find – File Search

Search for *.txt and *.pdf files:

<code>find . \( -name "*.txt" -o -name "*.pdf" \) -print</code>

Regex search:

<code>find . -regex ".*\(\.txt|\.pdf\)$"</code>

Case‑insensitive regex:

<code>find . -iregex ".*\.txt$"</code>

Negate pattern (exclude txt files):

<code>find . ! -name "*.txt" -print</code>

Limit search depth (depth = 1):

<code>find . -maxdepth 1 -type f</code>

Search by type:

<code>find . -type d -print   # directories only</code>
<code>find . -type f -print   # regular files</code>

Search by time:

atime – access time (days)

mtime – modification time

ctime – metadata change time

<code>find . -atime -7 -type f -print   # accessed within the last 7 days</code>
Note: -atime 7 matches files accessed exactly 7 days ago; -atime +7 matches more than 7 days ago.

Search by size (e.g., larger than 2 KB):

<code>find . -type f -size +2k</code>

Search by permission:

<code>find . -type f -perm 644 -print</code>

Search by owner:

<code>find . -type f -user weber -print</code>

Actions after finding:

Delete:

<code>find . -type f -name "*.swp" -delete</code>

Execute a command on each match with the powerful -exec:

<code>find . -type f -user root -exec chown weber {} \;</code>
Note: {} is replaced by each matched file name.
<code>find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;</code>

Combine multiple commands via a script:

<code>-exec ./commands.sh {} \;</code>

Output delimiters:

By default, find separates results with '\n'. Use -print0 to separate them with '\0' instead, which safely handles file names containing spaces.
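The find patterns above can be sketched in a throwaway directory (file names here are made up for illustration):

```shell
# Create a scratch directory with a few files to search.
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.pdf" "$tmp/c.log"
mkdir "$tmp/sub" && touch "$tmp/sub/d.txt"

# Match *.txt or *.pdf anywhere under $tmp (finds a.txt, b.pdf, sub/d.txt).
find "$tmp" \( -name "*.txt" -o -name "*.pdf" \) -print

# Limit to regular files at depth 1 (finds a.txt, b.pdf, c.log).
find "$tmp" -maxdepth 1 -type f -print

rm -rf "$tmp"
```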

2. grep – Text Search

Basic usage:

<code>grep match_pattern file</code>

Common options:

-o

output only matching part,

-v

invert match

-c

count matches

-n

show line numbers

-i

ignore case

-l

list matching file names

Recursive search (favorite for code):

<code>grep "class" . -R -n</code>

Match multiple patterns:

<code>grep -e "class" -e "virtual" file</code>

Use -Z (with -l) to output file names terminated by \0, which pairs with xargs -0:

<code>grep "test" file* -lZ | xargs -0 rm</code>
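A quick runnable sketch of the grep options above, using invented sample content:

```shell
# Sample input written to a temp file.
f=$(mktemp)
printf 'class Foo\nvirtual void bar();\nclass Baz\n' > "$f"

grep -c "class" "$f"                 # 2 matching lines
grep -n "virtual" "$f"               # 2:virtual void bar();
grep -e "class" -e "virtual" "$f"    # all three lines match one pattern or the other

rm "$f"
```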

3. xargs – Build Command Lines

xargs converts input lines into command arguments, and pairs naturally with grep and find.

Convert multi-line output to a single line:

<code>cat file.txt | xargs</code>

Convert a single line to multiple lines (e.g., three arguments per line):

<code>cat single.txt | xargs -n 3</code>

Key options:

-d – define the delimiter (default is whitespace; use -d '\n' for lines)
-n – number of arguments per command invocation
-I {} – replace the {} placeholder with each argument
-0 – use \0 as the input delimiter (pairs with find -print0)

<code>find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l</code>
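The -n and -I options can be sketched directly on the command line (input values are illustrative):

```shell
# -n groups arguments; -I substitutes each argument into a template.
printf '1 2 3 4 5 6\n' | xargs -n 3            # prints "1 2 3" then "4 5 6"
printf 'a\nb\n' | xargs -I {} echo "item: {}"  # prints "item: a" then "item: b"
```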

4. sort – Sorting

Options:

-n – numeric sort
-d – dictionary order
-r – reverse
-k N – sort by column N

<code>sort -nrk 1 data.txt</code>
<code>sort -bd data   # ignore leading blanks</code>
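A minimal sketch of numeric reverse sort on the first column (sample data is made up):

```shell
# -n numeric, -r reverse, -k 1 key on column 1: largest value first.
printf '3 apple\n10 pear\n2 fig\n' | sort -nrk 1
# 10 pear
# 3 apple
# 2 fig
```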

5. uniq – Remove Duplicate Lines

<code>sort unsort.txt | uniq</code>

Count occurrences:

<code>sort unsort.txt | uniq -c</code>

Show only duplicated lines:

<code>sort unsort.txt | uniq -d</code>

Compare only part of each line with -s N (skip the first N characters) and -w N (compare at most N characters).
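Counting and duplicate detection can be sketched in one pipeline (remember that uniq only collapses adjacent lines, so sort first):

```shell
# Count occurrences of each line, and list only the duplicated ones.
printf 'b\na\nb\nc\nb\n' | sort | uniq -c    # counts: 1 a, 3 b, 1 c
printf 'b\na\nb\nc\nb\n' | sort | uniq -d    # prints only "b"
```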

6. tr – Translate/Replace Characters

<code>echo 12345 | tr '0-9' '9876543210'   # simple cipher</code>
<code>cat text | tr '\t' ' '   # tabs to spaces</code>

Delete characters:

<code>cat file | tr -d '0-9'</code>

Complement set (-c) – operate on every character not in the set (note that -c by itself needs a replacement set, or -d to delete):

<code>cat file | tr -c '0-9' '-'      # replace all non-digits with '-'</code>
<code>cat file | tr -d -c '0-9 \n'   # delete everything except digits, spaces, and newlines</code>

Compress repeated characters (-s), often used to squeeze runs of spaces:

<code>cat file | tr -s ' '</code>

Character classes are also supported (e.g., [:lower:], [:digit:], [:space:]):

<code>tr '[:lower:]' '[:upper:]'</code>
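The tr variants above, sketched on throwaway input:

```shell
echo "Hello   World" | tr -s ' '             # squeeze spaces -> "Hello World"
echo "hello" | tr '[:lower:]' '[:upper:]'    # HELLO
echo "abc123def" | tr -d '0-9'               # abcdef
echo "abc123def" | tr -d -c '0-9\n'          # 123
```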

7. cut – Column Extraction

<code>cut -f2,4 filename</code>
<code>cut -f3 --complement filename   # all but column 3</code>

Specify the delimiter with -d:

<code>cut -f2 -d ";" filename</code>

Ranges:

N- – from field N to the end
-M – first M fields
N-M – fields N through M

Units:

-b – bytes
-c – characters
-f – fields (delimiter-based)

<code>cut -c1-5 file   # first 5 characters</code>
<code>cut -c-2 file   # first 2 characters</code>
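Field and character extraction can be sketched inline (delimiter and data are illustrative):

```shell
printf 'a;b;c\nd;e;f\n' | cut -d ';' -f2    # prints "b" then "e"
echo "abcdef" | cut -c1-3                   # abc
```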

8. paste – Merge Columns

<code>paste file1 file2</code>

Change the delimiter (default is a tab) with -d:

<code>paste file1 file2 -d ","</code>
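A minimal runnable sketch of paste with a custom delimiter, using temp files:

```shell
# Merge two files column-wise, joined by commas.
f1=$(mktemp); f2=$(mktemp)
printf '1\n2\n' > "$f1"
printf 'x\ny\n' > "$f2"
paste -d ',' "$f1" "$f2"    # prints "1,x" then "2,y"
rm "$f1" "$f2"
```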

9. wc – Count Lines, Words, Bytes

<code>wc -l file   # lines</code>
<code>wc -w file   # words</code>
<code>wc -c file   # bytes</code>

10. sed – Stream Editing

Replace first occurrence:

<code>sed 's/text/replace_text/' file</code>

Global replace:

<code>sed 's/text/replace_text/g' file</code>

Edit file in place:

<code>sed -i 's/text/replace_text/g' file</code>

Delete empty lines:

<code>sed '/^$/d' file</code>

Use variables and double quotes for evaluation:

<code>p=pattern; r=replace; echo "a line with $p" | sed "s/$p/$r/g"</code>
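The substitution variants above can be sketched on inline input:

```shell
echo "foo bar foo" | sed 's/foo/qux/'     # qux bar foo (first match only)
echo "foo bar foo" | sed 's/foo/qux/g'    # qux bar qux (all matches)
printf 'a\n\nb\n' | sed '/^$/d'           # drops the empty line, leaving "a" and "b"
```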

11. awk – Data‑Stream Processing

Structure:

<code>awk 'BEGIN{...} { ... } END{...}' file</code>

Built‑in variables:

NR

(record number),

NF

(field count),

$0

(whole line),

$1

,

$2

, …

<code>awk '{print $2, $3}' file</code>

Count lines:

<code>awk 'END{print NR}' file</code>

Sum first column:

<code>awk '{sum+=$1} END{print sum}' file</code>

Pass external variable:

<code>var=1000; awk -v v=$var '{print v}' file</code>

Set field separator:

<code>awk -F: '{print $NF}' /etc/passwd</code>

Read command output:

<code>awk 'BEGIN{"grep root /etc/passwd" | getline out; print out}'</code>
Note: the BEGIN block lets the command run without awk waiting for input on stdin.

Implement head and tail:

<code>awk 'NR<=10{print}' file   # head</code>
<code>awk '{buf[NR%10]=$0} END{for(i=NR%10+1;i<=NR%10+10;i++) print buf[i%10]}' file   # tail, for files with at least 10 lines</code>

Common functions: index(), sub(), match(), length(), printf().
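The awk idioms above, sketched on inline sample data:

```shell
printf '1 a\n2 b\n3 c\n' | awk '{sum+=$1} END{print sum}'   # 6
printf 'x\ny\n' | awk 'END{print NR}'                       # 2
printf 'a:b:c\n' | awk -F: '{print $NF}'                    # c (last field)
```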

12. Iterating Lines, Words, and Characters

While‑read loop:

<code>while read line; do echo "$line"; done < file.txt</code>

Awk alternative:

<code>cat file.txt | awk '{print}'</code>

Iterate words in a line:

<code>for word in $line; do echo $word; done</code>

Iterate characters using Bash substring syntax:

<code>for ((i=0;i<${#word};i++)); do echo ${word:i:1}; done</code>
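Putting the three loops together in one runnable sketch (the bash-only substring syntax requires bash, not plain sh; sample data is invented):

```shell
# Iterate lines, then words within each line.
f=$(mktemp)
printf 'one two\nthree\n' > "$f"
while read -r line; do
  for word in $line; do
    printf '%s\n' "$word"    # prints one, two, three
  done
done < "$f"
rm "$f"

# Iterate characters of a word with bash substring expansion.
word=abc
for ((i=0; i<${#word}; i++)); do
  printf '%s\n' "${word:i:1}"    # prints a, b, c
done
```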
Written by Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together.
