
Master Linux Text Processing: Essential Shell Tools and Practical Examples

This article provides a comprehensive guide to the most commonly used Linux shell utilities for text manipulation—find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, and awk—offering clear explanations, typical parameters, and real‑world command examples to help you handle files efficiently.

ITPUB

This guide introduces the essential Linux command‑line tools for processing text, presenting the most useful options and practical examples for each utility.

find – File Search

Search for txt and pdf files:
find . \( -name "*.txt" -o -name "*.pdf" \) -print

Regex search for .txt or .pdf (GNU find; its default regex syntax writes alternation as \|):
find . -regex ".*\(\.txt\|\.pdf\)$"

Case-insensitive regex:
find . -iregex ".*\.txt$"

Exclude txt files:
find . ! -name "*.txt" -print

Limit the search to the current directory (depth 1):
find . -maxdepth 1 -type f

Search by type (directories only):
find . -type d -print

Search by access time (accessed within the last 7 days; use -mtime for modification time):
find . -atime -7 -type f -print

Search by size (larger than 2 KiB):
find . -type f -size +2k

Search by permission (e.g., 644):
find . -type f -perm 644 -print

Search by owner:
find . -type f -user weber -print

Delete all .swp files under the current directory:
find . -type f -name "*.swp" -delete

Execute a command on each match (change ownership):
find . -type f -user root -exec chown weber {} \;

Copy .txt files last modified more than 10 days ago to the directory OLD:
find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;

Run a custom script on each match:
find . -type f -name "*.log" -exec ./process.sh {} \;

Separate file names with \0 (NUL) so that names containing spaces survive a pipeline:
find . -print0
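The name, depth, and negation tests above can be tried safely in a scratch directory. The following is a minimal sketch; the directory layout and file names are made up for illustration, and mktemp is assumed to be available.

```shell
# Build a small, hypothetical tree to search.
workdir=$(mktemp -d)
mkdir "$workdir/sub"
touch "$workdir/a.txt" "$workdir/b.pdf" "$workdir/c.log" "$workdir/sub/d.txt"

# Files ending in .txt or .pdf, at any depth: a.txt, b.pdf, sub/d.txt
docs=$(find "$workdir" -type f \( -name "*.txt" -o -name "*.pdf" \) | wc -l)

# Regular files directly inside workdir only: a.txt, b.pdf, c.log
top_level=$(find "$workdir" -maxdepth 1 -type f | wc -l)

# Regular files that are NOT .txt: b.pdf, c.log
non_txt=$(find "$workdir" -type f ! -name "*.txt" | wc -l)

echo "docs=$docs top_level=$top_level non_txt=$non_txt"
rm -r "$workdir"
```

Counting with wc -l is reliable here only because none of the sample names contain a newline.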

grep – Text Search

Count matching lines:
grep -c "text" filename

Show line numbers:
grep -n "pattern" file

Case-insensitive search:
grep -i "pattern" file

Print only the names of files that contain a match:
grep -l "pattern" *

Recursive search in directories:
grep -R -n "class" .

Match multiple patterns:
grep -e "class" -e "virtual" file

Delete files whose contents match a pattern (file names are separated by \0; -Z is a GNU grep option):
grep "test" file* -lZ | xargs -0 rm
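A quick way to see what -c, -l, and -Z do is to run them on throwaway files. This sketch assumes GNU grep (for -Z) and uses made-up file contents. Note that the files removed are the ones whose contents match, not the ones whose names match.

```shell
# Scratch files: only a.txt contains the patterns (hypothetical sample data).
workdir=$(mktemp -d)
printf 'class A\nvirtual f\nint x\n' > "$workdir/a.txt"
printf 'nothing here\n' > "$workdir/b.txt"

# -c counts matching lines; several -e patterns are OR-ed together.
count=$(grep -c -e "class" -e "virtual" "$workdir/a.txt")

# -l prints only the names of files with a match.
# -Z (GNU grep) ends each name with NUL so xargs -0 can split them safely.
grep -lZ "class" "$workdir"/*.txt | xargs -0 rm

# a.txt matched and was removed; b.txt is left.
remaining=$(ls "$workdir")
echo "count=$count remaining=$remaining"
rm -r "$workdir"
```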

xargs – Argument Builder

Convert multiline output to a single line:
cat file.txt | xargs

Convert a single line into multiple lines (3 arguments per line):
cat single.txt | xargs -n 3

Specify a custom input delimiter (by default, input is split on blanks and newlines; -d is a GNU xargs option):
xargs -d "," command

Replace the placeholder {} in a command:
cat file.txt | xargs -I {} ./command.sh -p {} -1

Use \0 as the input delimiter (useful with find -print0):
find source_dir/ -type f -name "*.cpp" -print0 | xargs -0 wc -l
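The sketch below shows why the \0 delimiter matters, using a made-up file name that contains a space. It also checks the line-joining and -n behaviour. The sample data is an assumption for illustration.

```shell
# Scratch directory with one awkward file name (hypothetical data).
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/plain.cpp"
printf 'c\n' > "$workdir/with space.cpp"

# With default splitting, "with space.cpp" would become two arguments.
# NUL-delimited names keep each path intact.
total=$(find "$workdir" -type f -name "*.cpp" -print0 | xargs -0 cat | wc -l)

# Without a command, xargs runs echo, which joins the input on one line.
one_line=$(printf '1\n2\n3\n4\n' | xargs)

# -n 3 starts a new echo for every 3 arguments: "1 2 3", "4 5 6", "7".
rows=$(printf '1 2 3 4 5 6 7\n' | xargs -n 3 | wc -l)

echo "total=$total one_line=$one_line rows=$rows"
rm -r "$workdir"
```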

sort – Sorting

Numeric reverse sort on column 1:
sort -nrk 1 data.txt

Ignore leading blanks (-b) and sort in dictionary order (-d):
sort -bd data
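The difference between the default (lexical) order and -n is easiest to see on numbers of different lengths. This is a small sketch with made-up data. LC_ALL=C is set so that the lexical result does not depend on the local collation rules.

```shell
# Three hypothetical records: a number followed by a label.
data=$(printf '10 a\n9 b\n100 c\n')

# Lexical order compares character by character, so "10 a" sorts first.
lexical_first=$(printf '%s\n' "$data" | LC_ALL=C sort | head -n 1)

# -n compares by numeric value and -r reverses, so 100 comes first.
numeric_first=$(printf '%s\n' "$data" | LC_ALL=C sort -nrk 1 | head -n 1)

echo "lexical_first=$lexical_first numeric_first=$numeric_first"
```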

uniq – Remove Duplicate Lines

Delete duplicate lines (uniq only collapses adjacent duplicates, so sort first):
sort unsort.txt | uniq

Count occurrences of each line:
sort unsort.txt | uniq -c

Show only duplicated lines:
sort unsort.txt | uniq -d

Compare only part of each line (skip the first 2 characters, then compare at most 5 characters):
uniq -s 2 -w 5 file
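Because uniq only compares neighbouring lines, it has no effect on unsorted input whose duplicates are not adjacent. The sketch below uses a made-up word list to show this, then builds a simple frequency count.

```shell
# Hypothetical input: duplicates exist, but none are adjacent.
lines='pear
apple
pear
fig
apple
pear'

# Without sorting, uniq removes nothing: still 6 lines.
unsorted=$(printf '%s\n' "$lines" | uniq | wc -l)

# After sorting, duplicates are adjacent: apple, fig, pear.
sorted=$(printf '%s\n' "$lines" | sort | uniq | wc -l)

# Frequency table, most common first; awk swaps "count word" to "word count".
top=$(printf '%s\n' "$lines" | sort | uniq -c | sort -nr | head -n 1 | awk '{print $2, $1}')

echo "unsorted=$unsorted sorted=$sorted top=$top"
```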

tr – Translate / Delete Characters

Simple character substitution (map each digit d to 9-d, so 12345 becomes 87654):
echo 12345 | tr '0-9' '9876543210'

Convert tabs to spaces:
cat text | tr '\t' ' '

Delete all digits:
cat file | tr -d '0-9'

Keep only digits and newlines (delete the complement of the set):
cat file | tr -d -c '0-9\n'

Compress repeated spaces:
cat file | tr -s ' '

Character classes (e.g., lower-to-upper):
tr '[:lower:]' '[:upper:]'
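The tr operations can be checked on short strings. This is a minimal sketch with made-up input. It uses -d together with -c to delete everything outside the set, because tr with -c alone and a single set is not a complete command.

```shell
# Positional mapping: 0->9, 1->8, ..., 9->0.
mapped=$(echo 12345 | tr '0-9' '9876543210')

# -c complements the set and -d deletes it: only digits and newlines remain.
digits=$(echo 'tel: 555-0199 ext 7' | tr -d -c '0-9\n')

# -s squeezes runs of the listed character into a single one.
squeezed=$(echo 'a    b' | tr -s ' ')

# Character classes map lower case to upper case.
upper=$(echo 'hello' | tr '[:lower:]' '[:upper:]')

echo "$mapped $digits $squeezed $upper"
```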

cut – Column Extraction

Extract fields 2 and 4 (fields are tab-delimited by default):
cut -f2,4 filename

Exclude field 3 (--complement is a GNU extension):
cut -f3 --complement filename

Specify the delimiter (semicolon):
cut -f2 -d ";" filename

Character-wise extraction (first 5 characters; use -b to select bytes):
cut -c1-5 file

First two characters:
cut -c-2 file

paste – Merge Columns

Combine two files side-by-side (default tab delimiter):
paste file1 file2

Use a custom delimiter (comma):
paste -d "," file1 file2
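cut and paste are complementary: one splits columns out, the other joins them back together. The sketch below runs a round trip on a made-up semicolon-delimited file. All file names and contents are assumptions for illustration.

```shell
# Hypothetical semicolon-delimited input.
workdir=$(mktemp -d)
printf 'id;name;role\n1;ana;dev\n2;bo;ops\n' > "$workdir/people.txt"

# Split out columns 2 and 3 into separate files.
cut -d ';' -f 2 "$workdir/people.txt" > "$workdir/names.txt"
cut -d ';' -f 3 "$workdir/people.txt" > "$workdir/roles.txt"

# Join them line by line, using a comma instead of the default tab.
merged=$(paste -d ',' "$workdir/names.txt" "$workdir/roles.txt")

printf '%s\n' "$merged"
rm -r "$workdir"
```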

wc – Count Lines, Words, Bytes

Line count:
wc -l file

Word count:
wc -w file

Byte count (use wc -m to count characters instead):
wc -c file

sed – Stream Editor

Replace the first occurrence on each line:
sed 's/text/replace_text/' file

Global replacement:
sed 's/text/replace_text/g' file

Edit the file in place:
sed -i 's/text/replace_text/g' file

Delete empty lines:
sed '/^$/d' file

Use captured groups (replace "hello" followed by a digit with just the digit):
sed 's/hello\([0-9]\)/\1/' file

Insert a character (here "/") after the first three characters; in basic regular expressions the interval braces must be escaped:
sed 's/^.\{3\}/&\//' file
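In sed's default basic regular expressions, an interval is written \{3\}; an unescaped {3} matches literal braces. With -E (extended regular expressions) the braces are written bare. The sketch below runs both forms on made-up input, together with the capture-group and empty-line examples.

```shell
# Basic regex: escaped braces form the interval; & is the matched text.
bre=$(echo 'abcdef' | sed 's/^.\{3\}/&\//')

# Extended regex (-E): same interval with bare braces.
ere=$(echo 'abcdef' | sed -E 's/^.{3}/&\//')

# \( \) captures the digit and \1 puts it back in place of "hello7".
group=$(echo 'hello7 world' | sed 's/hello\([0-9]\)/\1/')

# /^$/d drops empty lines: 3 input lines become 2.
non_empty=$(printf 'a\n\nb\n' | sed '/^$/d' | wc -l)

echo "$bre $ere $group $non_empty"
```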

awk – Data‑Stream Processing

Basic script structure:
awk 'BEGIN{...} { ... } END{...}' file

Print the current line:
awk '{print}' file

Print specific fields:
awk '{print $2, $3}' file

Count lines:
awk 'END{print NR}' file

Sum the values in the first column:
awk '{sum+=$1} END{print sum}' file

Pass external variables:
var=1000; awk -v var="$var" '{print var, $1}' file

Filter by line number or pattern:
awk 'NR<5' file
awk '/linux/' file

Set the field delimiter (e.g., colon):
awk -F: '{print $NF}' /etc/passwd

Read command output inside awk (a BEGIN block runs once and needs no input file):
awk 'BEGIN { "grep root /etc/passwd" | getline cmd; print cmd }'

Implement head/tail (the tail version keeps the last 10 lines in a ring buffer and prints them in their original order):
awk 'NR<=10{print}' file
awk '{buf[NR%10]=$0} END{for(k=NR-9;k<=NR;k++) if(k>0) print buf[k%10]}' file

Print a range of lines (4-6):
awk 'NR==4,NR==6' file

Print between two patterns:
awk '/start_pattern/,/end_pattern/' file

Common built-in functions: index, sub, match, length.
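A tail built on a ring buffer has to print the slots in line order, not in index order, because the slot holding the oldest remembered line moves as input arrives. The following is a minimal sketch of one way to do that. The buffer size n and the seq input are assumptions chosen to keep the output short.

```shell
# Keep the last n lines in buf[NR % n], then replay lines NR-n+1 .. NR.
# The k>0 guard handles inputs shorter than n.
last3=$(seq 1 12 | awk -v n=3 '
  { buf[NR % n] = $0 }
  END { for (k = NR - n + 1; k <= NR; k++) if (k > 0) print buf[k % n] }')

# Same program on a 2-line input: both lines, no blank padding.
short=$(seq 1 2 | awk -v n=3 '
  { buf[NR % n] = $0 }
  END { for (k = NR - n + 1; k <= NR; k++) if (k > 0) print buf[k % n] }')

# Range pattern and column sum from the list above, on the same kind of input.
range=$(seq 1 12 | awk 'NR==4,NR==6')
sum=$(printf '3\n4\n5\n' | awk '{sum+=$1} END{print sum}')

printf '%s\n' "$last3" "$short" "$range" "$sum"
```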

The article is a concise reference derived from the book “Linux Shell Script Guide”, offering ready‑to‑use command snippets for everyday text‑processing tasks.
