
Master the Linux ‘Three Musketeers’: grep, sed, and awk Explained

This guide introduces the Linux “three musketeers” – grep, sed, and awk – covering regular expression fundamentals, command syntax, options, and practical examples, enabling readers to efficiently search, edit, and process text files while mastering essential shell scripting techniques.

MaGe Linux Operations

Linux Three Musketeers Overview

The Linux “three musketeers” are grep, sed, and awk. Mastering them can greatly improve day-to-day operations work. All three rely on regular expressions, and Linux tools support both basic and extended regular expressions. This guide therefore covers regular expressions first and then explains how to use each tool.

1. Regular Expressions

Regular expressions (REGEXP) are pattern templates used to match specific text. Proficiency in regex is a prerequisite for using the Linux three musketeers.

Metacharacters

. : matches any single character

[] : matches any single character within the specified set

[^] : matches any single character not in the specified set

Character Classes

[[:digit:]] : matches a single digit

[[:lower:]] : matches a single lowercase letter

[[:upper:]] : matches a single uppercase letter

[[:punct:]] : matches a single punctuation character

[[:space:]] : matches a single whitespace character

[[:alpha:]] : matches a single alphabetic character

[[:alnum:]] : matches a single alphanumeric character
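The metacharacters and character classes above can be tried out with grep. This is a minimal sketch that assumes GNU grep; the sample input and the variable names are made up for illustration.

```shell
# Made-up sample input, one token per line.
sample=$(printf '%s\n' 'abc' 'ABC' '123' 'a1!')

# "." stands for exactly one character between "a" and "c".
dot=$(printf '%s\n' "$sample" | grep 'a.c')

# Lines containing at least one digit.
digits=$(printf '%s\n' "$sample" | grep '[[:digit:]]')

# Lines containing at least one uppercase letter.
upper=$(printf '%s\n' "$sample" | grep '[[:upper:]]')

# Negated set: lines containing a character that is not alphanumeric.
other=$(printf '%s\n' "$sample" | grep '[^[:alnum:]]')

printf 'dot:    %s\n' "$dot"      # abc
printf 'upper:  %s\n' "$upper"    # ABC
printf 'other:  %s\n' "$other"    # a1!
printf 'digits:\n%s\n' "$digits"  # 123 and a1!, one per line
```

Note that a class such as [:digit:] is only valid inside a bracket expression, which is why the brackets are doubled.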

Quantifiers (greedy mode)

* : matches the preceding element zero or more times

? : matches the preceding element zero or one time

+ : matches the preceding element one or more times

.* : matches any sequence of characters of any length

Anchors

^ : anchors the match to the start of a line

$ : anchors the match to the end of a line

^$ : matches an empty line
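A short sketch of the three anchor forms, assuming GNU grep and a throwaway file created with mktemp. The file contents are invented for the example, and -c (count matching lines) is a standard grep option not listed in this guide.

```shell
# Throwaway input file; the fifth printf argument produces a second empty line.
tmp=$(mktemp)
printf '%s\n' 'root:x:0' '' '# comment' 'end root' '' > "$tmp"

starts=$(grep '^root' "$tmp")   # lines that begin with "root"
ends=$(grep 'root$' "$tmp")     # lines that end with "root"
blank=$(grep -c '^$' "$tmp")    # number of empty lines

printf 'starts: %s\n' "$starts"  # root:x:0
printf 'ends:   %s\n' "$ends"    # end root
printf 'blank:  %s\n' "$blank"   # 2

rm -f "$tmp"
```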

Escaping in basic regular expressions

In basic regular expressions, ?, +, { and } are ordinary characters; they act as quantifiers only when preceded by a backslash, e.g., \?, \+, \{m,n\}, \{1,\}, \{0,3\}. (\? and \+ are GNU extensions to basic regex.)

Note: In \{m,n\} the lower bound should be written explicitly even when it is zero, as in \{0,3\}.

\< or \b : anchors the start of a word

\> or \b : anchors the end of a word
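The escaped quantifiers and word anchors can be sketched as follows. This assumes GNU grep, since \?, \+, \<, \> and \b are GNU extensions in basic regex; the word list is made up.

```shell
# Made-up word list, one word per line.
words=$(printf '%s\n' 'color' 'colour' 'colouur' 'goo' 'gooooo' 'cat' 'concat' 'cats')

# \? : the "u" is optional.
optional=$(printf '%s\n' "$words" | grep 'colou\?r')

# \+ : at least one "u" is required.
oneplus=$(printf '%s\n' "$words" | grep 'colou\+r')

# \{2,3\} : "g" followed by two or three "o", and nothing else on the line.
bounded=$(printf '%s\n' "$words" | grep '^go\{2,3\}$')

# \< and \> : "cat" only as a whole word.
whole=$(printf '%s\n' "$words" | grep '\<cat\>')

# \b on the left only: words that start with "cat".
prefix=$(printf '%s\n' "$words" | grep '\bcat')

printf '%s\n' "$optional"   # color, colour
printf '%s\n' "$oneplus"    # colour, colouur
printf '%s\n' "$bounded"    # goo
printf '%s\n' "$whole"      # cat
printf '%s\n' "$prefix"     # cat, cats
```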

Grouping and Backreferences

\(\) : defines a group

\1, \2, … : refer to the content captured by the corresponding group.
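As an illustration of groups and backreferences, the sketch below finds a repeated word with grep and swaps two captured words with sed. It assumes GNU grep and sed; the input strings are invented.

```shell
# \1 must repeat exactly what the first group captured: a doubled word.
repeated=$(printf '%s\n' 'hello hello world' 'abc abd' 'noon' |
  grep '\([a-z][a-z]*\) \1')

# Two groups, referenced in reverse order in the replacement.
swapped=$(echo 'John Smith' |
  sed 's/\([A-Za-z]*\) \([A-Za-z]*\)/\2, \1/')

printf '%s\n' "$repeated"   # hello hello world
printf '%s\n' "$swapped"    # Smith, John
```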

2. Extended Regular Expressions

Basic regex requires escaping many symbols, which is inconvenient. Extended regex removes most of that escaping, which is especially useful in sed scripts.

Character Matching

. : matches any single character

[abc] : matches any one of a, b, or c

[^abc] : matches any character except a, b, or c

Quantifiers (no extra escaping needed)

*, ?, +, and {m,n} work as described above.

Anchors

Use ^ and $ as before. For word boundaries, use escaped \< and \>.

Alternation

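| : matches either the expression on its left or the one on its right.

Note: alternation applies to the whole expression on each side, so C|cat matches “C” or “cat”, not “Cat” or “cat”. Write (C|c)at to match “Cat” or “cat”.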

Using extended regex simplifies sed commands and improves readability.
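The difference between C|cat and (C|c)at, and the unescaped {m,n} quantifier, can be sketched with grep -E. GNU grep is assumed and the sample words are made up.

```shell
# Made-up word list, one word per line.
words=$(printf '%s\n' 'C' 'cat' 'Cat' 'concat' 'goo' 'gooooo')

# Unanchored alternation: any line containing "C" or "cat".
any=$(printf '%s\n' "$words" | grep -E 'C|cat')

# Grouped and anchored: the line is exactly "C" or exactly "cat".
exact=$(printf '%s\n' "$words" | grep -E '^(C|cat)$')

# Alternation limited to the first letter: "Cat" or "cat".
first=$(printf '%s\n' "$words" | grep -E '^(C|c)at$')

# {2,3} needs no backslashes in extended regex.
bounded=$(printf '%s\n' "$words" | grep -E '^go{2,3}$')

printf '%s\n' "$any"       # C, cat, Cat, concat
printf '%s\n' "$exact"     # C, cat
printf '%s\n' "$first"     # cat, Cat
printf '%s\n' "$bounded"   # goo
```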

3. grep Family

3.1 grep Commands

grep, egrep, and fgrep are three variants of the same command for different scenarios.

grep : uses basic regex.

egrep : equivalent to grep -E, uses extended regex.

fgrep : equivalent to grep -F; treats the pattern as a fixed string rather than a regex, which makes it faster and lighter on resources.

3.2 Usage

Syntax

grep [options] PATTERN [FILE...]

Options

-i : ignore case

--color : highlight matches

-v : show lines that do not match the pattern

-o : show only the matching part

-E : use extended regex (same as egrep)

PATTERN : can be a plain string or a regex (basic or extended).

FILE : files to search.
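The options above can be sketched against a small log file. The file and its contents are hypothetical, created here only so the commands have something to search; GNU grep is assumed.

```shell
# Hypothetical log file for the example.
log=$(mktemp)
printf '%s\n' 'ERROR disk full' 'info: started' \
  'Error: timeout after 30s' 'warning: retry 3' > "$log"

errors=$(grep -i 'error' "$log")              # case-insensitive match
others=$(grep -v -i 'error' "$log")           # everything that does not match
numbers=$(grep -o '[0-9][0-9]*' "$log")       # only the matched text
either=$(grep -E -i 'error|warning' "$log")   # extended regex alternation

printf '%s\n' "$errors"    # ERROR disk full / Error: timeout after 30s
printf '%s\n' "$others"    # info: started / warning: retry 3
printf '%s\n' "$numbers"   # 30 / 3
printf '%s\n' "$either"    # every line except "info: started"

rm -f "$log"
```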

4. sed Command

4.1 Overview

sed (Stream Editor) is a powerful line‑oriented editor.

4.2 Basic Syntax

sed [option] 'script' [input file]...

1. Option Part

-n : suppress automatic printing of the pattern space, so only lines printed explicitly (for example with p) are shown.

-e : add a script expression; repeat -e to apply several.

-f : read the script from a file.

-r : enable extended regex.

-i : edit files in place.
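A minimal sketch of each option, assuming GNU sed (the -i form without a suffix argument and -r are GNU usage). The input file and script file are temporary and made up for the example.

```shell
# Temporary input file and script file.
f=$(mktemp)
script=$(mktemp)
printf '%s\n' 'alpha' 'beta' 'gamma' > "$f"
printf '%s\n' 's/alpha/A/' 's/gamma/G/' > "$script"

twice=$(sed '/beta/p' "$f")       # without -n, "beta" is printed twice
once=$(sed -n '/beta/p' "$f")     # with -n, only the explicit p prints
multi=$(sed -e 's/alpha/A/' -e 's/gamma/G/' "$f")   # two expressions
fromfile=$(sed -f "$script" "$f")                   # same script, read from a file
ere=$(sed -r 's/(al)(pha)/\2\1/' "$f")              # unescaped groups with -r

sed -i 's/beta/B/' "$f"           # modifies the file itself
edited=$(cat "$f")

printf '%s\n' "$once"     # beta
printf '%s\n' "$edited"   # alpha / B / gamma

rm -f "$f" "$script"
```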

2. Script Part

Script consists of an address (range) and an operation (e.g., substitute, insert, delete).

a) Address – empty (whole file)

Applies to the entire file.

b) Address – single line

n : operate on line n.

/pattern/ : operate on lines matching pattern (basic regex unless -r is used).

c) Address – range

n,m : from line n to line m inclusive.

n,+k : line n and the next k lines.

n,/pattern/ : from line n to the next line matching pattern.

/pattern1/,/pattern2/ : from a line matching pattern1 to the next line matching pattern2; the range can match again later in the file.

d) Address – step

1~2 : every odd line.

2~2 : every even line.
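The address forms can be compared side by side by printing the selected lines with -n and p. This sketch assumes GNU sed (n,+k and first~step are GNU extensions) and a made-up six-line file.

```shell
# Six-line sample file: line1 .. line6.
f=$(mktemp)
printf 'line%s\n' 1 2 3 4 5 6 > "$f"

single=$(sed -n '3p' "$f")                  # line3
range=$(sed -n '2,4p' "$f")                 # line2..line4
plus=$(sed -n '2,+1p' "$f")                 # line2 and the next one
to_pattern=$(sed -n '2,/line4/p' "$f")      # line2 up to the match
between=$(sed -n '/line3/,/line5/p' "$f")   # line3..line5
odd=$(sed -n '1~2p' "$f")                   # line1, line3, line5
even=$(sed -n '2~2p' "$f")                  # line2, line4, line6

printf '%s\n' "$plus"   # line2 / line3
printf '%s\n' "$odd"    # line1 / line3 / line5

rm -f "$f"
```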

e) Editing Operations

d : delete line.

p : print pattern space.

a : append text after the addressed line (use \n for multiple lines).

i : insert text before the addressed line.

c : replace addressed line with new text.

w : write matched lines to a file.

r : read a file and insert its contents after the addressed line.

! : apply command to lines that do NOT match the address.

s/// : substitute; the delimiter after s can be almost any single character (such as @ or #), which avoids escaping slashes in the pattern.

Replacement flags: g for global replace, p to display replaced lines.

Example: echo "/var/log/messages" | sed -r 's@[^/]+/?$@@' removes the filename, leaving the directory path /var/log/.
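A few of the editing operations in action. This is a sketch that assumes GNU sed, which accepts the one-line forms of a, i, and c used here and -r for extended regex; the three-line input is made up.

```shell
# Made-up three-line input.
lines=$(printf '%s\n' 'one' 'two' 'three')

deleted=$(printf '%s\n' "$lines" | sed '2d')              # drop line 2
kept=$(printf '%s\n' "$lines" | sed '2!d')                # drop all but line 2
appended=$(printf '%s\n' "$lines" | sed '/two/a added')   # new line after "two"
inserted=$(printf '%s\n' "$lines" | sed '2i before')      # new line before line 2
changed=$(printf '%s\n' "$lines" | sed '2c replaced')     # replace line 2

first=$(echo 'a-a-a' | sed 's/a/b/')                      # first match only
global=$(echo 'a-a-a' | sed 's/a/b/g')                    # every match
shown=$(printf '%s\n' "$lines" | sed -n 's/o/0/gp')       # print changed lines only

# "@" as the delimiter, so the slashes in the path need no escaping.
dir=$(echo "/var/log/messages" | sed -r 's@[^/]+/?$@@')

printf '%s\n' "$first"    # b-a-a
printf '%s\n' "$global"   # b-b-b
printf '%s\n' "$dir"      # /var/log/
```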

4.3 Advanced Usage

1. Pattern Space and Hold Space

The pattern space holds the current line; the hold space is a temporary buffer.

2. Related Commands

h : copy pattern space to hold space.

H : append pattern space to hold space.

g : copy hold space to pattern space.

G : append hold space to pattern space.

x : exchange pattern and hold spaces.

n : read next line into pattern space (overwrites).

N : append next line to pattern space.

d : delete pattern space.

D : delete up to first newline in pattern space.

3. Examples

sed -n 'n;p' FILE : display even lines.

sed 'n;d' FILE : display odd lines.

sed '1!G;h;$!d' FILE : reverse file content.

sed '$!d' FILE : print the last line.

sed '$!N;$!D' FILE : print the last two lines.

sed '/^$/d;G' FILE : delete blank lines and add a blank line after each non‑blank line.

sed 'G' FILE : add a blank line after every line.
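To see what these one-liners produce, the sketch below runs several of them on a five-line file containing the numbers 1 to 5. It assumes GNU sed; the file is temporary and made up for the example.

```shell
# Five-line sample file: 1 .. 5.
f=$(mktemp)
printf '%s\n' 1 2 3 4 5 > "$f"

even=$(sed -n 'n;p' "$f")            # n skips a line, p prints the next
odd=$(sed 'n;d' "$f")                # n prints a line, d drops the next
reversed=$(sed '1!G;h;$!d' "$f")     # build the reversed text in the hold space
last=$(sed '$!d' "$f")               # delete everything except the last line
last_two=$(sed '$!N;$!D' "$f")       # keep a sliding two-line window

printf '%s\n' "$even"       # 2 / 4
printf '%s\n' "$odd"        # 1 / 3 / 5
printf '%s\n' "$reversed"   # 5 / 4 / 3 / 2 / 1
printf '%s\n' "$last"       # 5
printf '%s\n' "$last_two"   # 4 / 5

rm -f "$f"
```

In the reverse example, G appends the lines accumulated so far after the current line, h saves the result back to the hold space, and $!d discards the output on every line except the last.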

5. awk Command

5.1 Overview

awk is a report generator named after its three authors. It processes input line by line, splits fields, and executes actions.

5.2 Basic Usage

1. Syntax

awk [option] 'PATTERN{ACTION}' FILE

$0 is the entire record; $1, $2, and so on are its individual fields.

2. Common Options

-F : input field separator.

-v : define a variable (var=value).

3. Patterns

/pattern/ : lines matching regex.

!/pattern/ : lines not matching.

NR>2 : line number condition.

BEGIN{...} : executed before processing.

END{...} : executed after processing.
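Each pattern type can be tried on a small table. A pattern without an action prints the matching record. The rows below are made up, and only POSIX awk features are used.

```shell
# Made-up table with a header row.
rows=$(printf '%s\n' 'id name' '1 ann' '2 bob' '3 cat')

matching=$(printf '%s\n' "$rows" | awk '/an/')       # regex match
header=$(printf '%s\n' "$rows" | awk '!/[0-9]/')     # negated regex
late=$(printf '%s\n' "$rows" | awk 'NR>2')           # record-number condition
framed=$(printf '%s\n' "$rows" |
  awk 'BEGIN{print "start"} END{print "rows:", NR}') # before and after input

printf '%s\n' "$matching"   # 1 ann
printf '%s\n' "$header"     # id name
printf '%s\n' "$late"       # 2 bob / 3 cat
printf '%s\n' "$framed"     # start / rows: 4
```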

4. Built‑in Variables

FS : input field separator (default whitespace).

OFS : output field separator.

RS : input record separator (default newline).

ORS : output record separator.

NF : number of fields in the current record.

NR : number of records read so far, across all input files.

FNR : record number in the current file.

FILENAME : name of the current file.

ARGC, ARGV : command‑line arguments.
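A brief sketch of how the most common variables behave, using a made-up colon-separated file. Passing the same file twice shows the difference between NR and FNR. Only POSIX awk features are used.

```shell
# Made-up colon-separated data.
data=$(mktemp)
printf '%s\n' 'alice:30:x' 'bob:25:y' > "$data"

# NR = record number, NF = number of fields in the record.
counts=$(awk -F: '{print NR, NF, $1}' "$data")

# OFS is placed between the items of a print statement.
joined=$(awk -F: -v OFS='-' '{print $1, $2}' "$data")

# Assigning to a field rebuilds $0 using OFS.
rebuilt=$(awk 'BEGIN{FS=":"; OFS=","} {$1=$1; print}' "$data")

# NR keeps counting across files; FNR restarts for each file.
numbering=$(awk '{print NR, FNR}' "$data" "$data")

printf '%s\n' "$counts"      # 1 3 alice / 2 3 bob
printf '%s\n' "$joined"      # alice-30 / bob-25
printf '%s\n' "$rebuilt"     # alice,30,x / bob,25,y
printf '%s\n' "$numbering"   # 1 1 / 2 2 / 3 1 / 4 2

rm -f "$data"
```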

5. Common Actions

print : output items separated by OFS.

printf : formatted output.

Control statements: if, while, for, break, continue.

Arrays for counting, e.g., ip[$1]++.
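The actions above can be combined in one short program. This sketch counts requests per address with an array, uses if to count error responses, and formats the result with printf. The input lines are invented, and the output is piped through sort because the order of a for (i in array) loop is not defined.

```shell
report=$(printf '%s\n' '10.0.0.1 200' '10.0.0.2 404' '10.0.0.1 500' |
  awk '{
         ip[$1]++                  # count requests per address
         if ($2 >= 400) err++      # count responses with status 400 or above
       }
       END {
         for (i in ip) printf "%s=%d\n", i, ip[i]
         printf "errors=%d\n", err
       }' | LC_ALL=C sort)

# A for loop with no input at all: sum the numbers 1 to 3.
total=$(awk 'BEGIN { for (i = 1; i <= 3; i++) s += i; print s }')

printf '%s\n' "$report"   # 10.0.0.1=2 / 10.0.0.2=1 / errors=2
printf '%s\n' "$total"    # 6
```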

6. Examples

Print users whose shell is /bin/bash:

awk -F: '$NF=="/bin/bash" {print $1, $NF}' /etc/passwd

Count occurrences of the first column:

awk '{ip[$1]++} END{for(i in ip) print i, ip[i]}' access.log

Sum the third field, grouped by the first field, for rows where the second field is between 30 and 90:

awk -F: '$2>=30 && $2<=90 {dic[$1]+=$3} END{for(i in dic) print i, dic[i]}' data
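To check these commands without touching real system files, they can be run against small stand-in files. The sketch below uses a made-up passwd-style file and a made-up colon-separated data file; the contents are assumptions chosen only to make the results easy to verify by hand.

```shell
# Stand-in for /etc/passwd with two invented accounts.
passwd=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' > "$passwd"

# Stand-in data file: key:score:amount.
data=$(mktemp)
printf '%s\n' 'a:50:10' 'b:20:5' 'a:90:7' 'b:60:1' 'c:95:100' > "$data"

bash_users=$(awk -F: '$NF=="/bin/bash" {print $1, $NF}' "$passwd")

# Only rows with a score from 30 to 90 are summed: a gets 10+7, b gets 1.
sums=$(awk -F: '$2>=30 && $2<=90 {dic[$1]+=$3} END{for(i in dic) print i, dic[i]}' "$data" |
  LC_ALL=C sort)

printf '%s\n' "$bash_users"   # root /bin/bash
printf '%s\n' "$sums"         # a 17 / b 1

rm -f "$passwd" "$data"
```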

These tools form the core of Linux text processing and are essential for efficient system administration and data manipulation.
