Master awk: A Complete Guide to Powerful Text Processing

This article provides a comprehensive tutorial on the awk command, covering its common syntax, field printing, pattern matching, BEGIN/END blocks, separators, built‑in variables, external/internal variables, printf formatting, conditional statements, loops, arrays, and built‑in functions, with practical examples for system administration and log analysis.

MaGe Linux Operations

This article offers a detailed introduction to the awk command, a versatile shell tool for processing text files, extracting fields, matching patterns, and performing calculations, making it ideal for system administration and log analysis.

awk Common Syntax

1. Print fields

2. Print fields after pattern match

3. BEGIN and END statements

4. Field separators

5. Variable NR

6. Use external variables and define internal variables

7. printf function

8. awk if statements

9. awk while statements

10. awk for statements

11. Arrays

12. awk built‑in functions

1. Print Fields

Reorder and print specific fields from a file.

awk '/^[^#]/{print $2, $1, $3}' /etc/fstab
/ /dev/mapper/rhel-root xfs
/boot UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b xfs
/u01 /dev/mapper/rhel-home xfs
swap /dev/mapper/rhel-swap swap
/dev/shm tmpfs tmpfs

Print the number of fields in each record.

awk '/^[^#]/{print "Record:", NR, "has", NF, "fields."}' /etc/fstab
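
As a self-contained illustration of NR, NF, and $NF (the last field of a record), the sketch below feeds a few made-up fstab-style lines to awk through a here-document instead of reading /etc/fstab. The sample lines and the variable name fields_report are assumptions for demonstration only. Note that NR still counts the skipped comment line.

```shell
# Hypothetical input that only mimics the shape of /etc/fstab.
fields_report=$(awk '/^[^#]/ {print NR ": " NF " fields, last = " $NF}' <<'EOF'
# a comment line that the pattern skips
/dev/sda1 / xfs defaults 0 1
tmpfs /dev/shm tmpfs defaults,size=1G 0 0
EOF
)
echo "$fields_report"
```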

2. Print Fields After Pattern Match

Print fields from lines that match a specific pattern.

awk '/swap/{print $2, $1, $3}' /etc/fstab
swap /dev/mapper/rhel-swap swap

Match a case number pattern and print the line.

echo "1-2345598" | awk '/^[0-9]+$/ || /^[0-9]-[0-9]+$/ {print}'

Match any of several connection states using regex alternation (|).

netstat -an | awk '/CLOSE_WAIT|ESTABLISHED|TIME_WAIT|LISTEN/ {print $0}'
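
A pattern can also be tested against a single field rather than the whole line, which avoids accidental matches in other columns. The following is a minimal sketch that assumes a simplified, made-up four-column layout (protocol, local address, remote address, state); real netstat output has more columns.

```shell
# Hypothetical netstat-style lines: proto, local address, remote address, state.
states=$(awk '$4 ~ /^(ESTABLISHED|TIME_WAIT)$/ {print $2, $4}' <<'EOF'
tcp 10.0.0.5:22 10.0.0.9:51000 ESTABLISHED
tcp 0.0.0.0:80 0.0.0.0:* LISTEN
tcp 10.0.0.5:80 10.0.0.7:40222 TIME_WAIT
EOF
)
echo "$states"
```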

3. BEGIN and END Statements

BEGIN runs before processing any input, often to print headers. END runs after all input has been processed, useful for summaries.

awk 'BEGIN {print "Begin..."} /#[H-L]o/ {print $0} /#P[e-o]/ {print $0} END {print "The End"}' /etc/ssh/sshd_config
Begin...
#Port 22
#HostKey /etc/ssh/ssh_host_dsa_key
#LogLevel INFO
#LoginGraceTime 2m
#HostbasedAuthentication no
#PermitEmptyPasswords no
#PermitTTY yes
#PermitUserEnvironment no
#PidFile /var/run/sshd.pid
#PermitTunnel no
The End
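
The header-and-summary use of BEGIN and END can be sketched on a tiny inline input. The two sample lines and the counter name n are assumptions, not taken from any real file.

```shell
# BEGIN prints a header once, the main block runs per line, END prints a summary.
summary=$(awk 'BEGIN {n = 0; print "user shell"} {print $1, $2; n++} END {print n, "users listed"}' <<'EOF'
root /bin/sh
grid /bin/bash
EOF
)
echo "$summary"
```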

4. Field Separators

1) Input Separator

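The variable FS defines the input field separator. Its default value is a single space, which awk treats specially: fields are separated by runs of spaces and tabs, and leading and trailing whitespace is ignored. Set it with -F on the command line or by assigning FS in a BEGIN block.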

awk -F: '{print $3}' /etc/passwd | sort -n | tail -1

Example with multiple separators:

cat file2
root /:root:/bin/sh
grid /u01/app:oinstall:/bin/bash
oracle /u01/app/oracle:dba:/bin/ksh
awk -F"[ :]" '{print $1, $3}' file2
root root
grid oinstall
oracle dba

2) Output Separator

The variable OFS defines the output field separator; the default is a space.

awk 'BEGIN{FS="[ :]"; OFS="\t"} {print $1,$2,$3,$4}' file2
root	/	root	/bin/sh
grid	/u01/app	oinstall	/bin/bash
oracle	/u01/app/oracle	dba	/bin/ksh
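
One detail worth knowing: OFS is applied when print joins comma-separated arguments, or when awk rebuilds $0 after a field is assigned. A bare print of an unmodified record keeps the original separators. The sketch below shows both cases on the first line of file2; the variable names rebuilt and untouched are only for illustration.

```shell
# Assigning a field (even to itself) forces awk to rebuild $0 using OFS.
rebuilt=$(echo "root /:root:/bin/sh" | awk 'BEGIN {FS="[ :]"; OFS=","} {$1 = $1; print}')
# Without any field assignment, $0 is printed exactly as it was read.
untouched=$(echo "root /:root:/bin/sh" | awk 'BEGIN {FS="[ :]"; OFS=","} {print}')
echo "$rebuilt"
echo "$untouched"
```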

5. Variable NR

NR holds the current record number. Example counting lines in /etc/fstab:

grep -E -v "^#|^[ ]*$" /etc/fstab | awk '{print $2, $1} END {print "The number of lines of fstab is " NR}'
/ /dev/mapper/rhel-root
/boot UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b
/u01 /dev/mapper/rhel-home
swap /dev/mapper/rhel-swap
/dev/shm tmpfs
The number of lines of fstab is 5
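
NR can also serve as a pattern, for example to select a range of lines. A minimal sketch with made-up input:

```shell
# Print only records 2 through 3, prefixed with their record number.
middle=$(printf 'line one\nline two\nline three\nline four\n' | awk 'NR >= 2 && NR <= 3 {print NR ": " $0}')
echo "$middle"
```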

6. Use External and Internal Variables

Reference an external shell variable inside awk:

var="priv1"
awk '/'"$var"'/{print $3}' /etc/hosts
rac1-priv1
rac2-priv1
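
Splicing a shell variable into the program text works, but it breaks if the value contains a slash or a quote. The standard -v option passes the value as an awk variable instead. The sketch below assumes made-up hosts entries and a hypothetical awk variable name pat; note that -v still interprets backslash escapes in the value.

```shell
var="priv1"   # same shell variable as in the example above
# pat is an awk variable set from the shell; $0 ~ pat uses it as a dynamic regex.
matches=$(awk -v pat="$var" '$0 ~ pat {print $3}' <<'EOF'
192.168.1.11 rac1.example rac1
10.0.0.11 rac1-priv1.example rac1-priv1
10.0.0.12 rac2-priv1.example rac2-priv1
EOF
)
echo "$matches"
```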

Define an internal counter variable:

awk '/^[^#]/{counter=counter+1;print} END {print "The number of lines of fstab is " counter "."}' /etc/fstab
... (output lines) ...
The number of lines of fstab is 5.

Accumulate values from a column:

awk '{total=total+$2; print "Field 2 = " $2} END {print "Total=" total}' file3
Field 2 = 100000
Field 2 = 120000
... (other lines) ...
Total=450000
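
The same accumulation pattern extends naturally to an average by dividing by NR in the END block. This is a sketch on made-up numbers; the NR > 0 guard avoids a division by zero on empty input.

```shell
# Sum column 2, then report the total and the mean over all records.
stats=$(printf 'east 100\nnorth 250\nwest 50\n' | awk '{total += $2} END {if (NR > 0) printf "total=%d avg=%.2f\n", total, total / NR}')
echo "$stats"
```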

7. printf Function

Formatted output using printf.

awk '/^[^#]/{printf("%-45s %-19s %-9s %-20s %-1d %1d\n", $1,$2,$3,$4,$5,$6)}' /etc/fstab
/dev/mapper/rhel-root           /               xfs       defaults               0 0
UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b /boot           xfs       defaults               0 0
/dev/mapper/rhel-home           /u01            xfs       defaults               0 0
/dev/mapper/rhel-swap           swap            swap      defaults               0 0
tmpfs                           /dev/shm        tmpfs     defaults,size=21G      0 0
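
To see how the width and precision specifiers behave in isolation, the sketch below formats the first two rows shown for file3 with a left-aligned string (%-8s), a right-aligned integer (%8d), and a one-decimal float (%6.1f). The | characters are only there to make the column edges visible.

```shell
# %-8s pads on the right, %8d pads on the left, %6.1f rounds to one decimal.
table=$(printf 'east 100000 10000\nnorth 120000 9000\n' | awk '{printf "%-8s|%8d|%6.1f\n", $1, $2, $2 / $3}')
echo "$table"
```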

8. awk if Statements

Conditional processing with if, else, and nested conditions.

1) Numeric Comparison

awk '{num= $2 / $3; if(num>5) print $1,num}' file3
east 10
north 13.3333
west 7.14286
south 10

2) String Matching (Regular Expression)

awk '{if($1 ~ /north/) print $1, $3}' file3
north 9000
northwest 6000
northeast 5000

3) Logical Operations

awk '{if($1 ~ /north/ && $3 > 5000) print $1, $3}' file3
north 9000
northwest 6000
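
The examples above use a bare if. A chained if / else if / else can be sketched as follows, with made-up input and hypothetical labels:

```shell
# Classify each record by the value in column 2.
grades=$(printf 'east 10\nwest 5\nsouth 3\n' | awk '{
    if ($2 > 5)       label = "high"
    else if ($2 == 5) label = "medium"
    else              label = "low"
    print $1, label
}')
echo "$grades"
```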

9. awk while Loop

Iterate over fields using a while loop.

awk '{i=1}; {while (i <= NF) {print $i; i++}}' file4
sdb
sdc
sde
sdf
g
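
A while loop is also handy for computing something across the fields of each record. In this sketch (made-up numbers), the counter and the sum are reset at the start of every record so that values do not carry over from the previous line.

```shell
row_sums=$(printf '1 2 3\n10 20\n' | awk '{
    i = 1; sum = 0                      # reset per record
    while (i <= NF) { sum += $i; i++ }  # add up every field
    print "sum=" sum
}')
echo "$row_sums"
```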

10. awk for Loop

Iterate over fields using a for loop and print lines in reverse order.

awk '{for(i=1;i<=NF;i++)print $i}' file4
sdb
sdc
sde
sdf
g
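
To print lines in reverse order, store each line in an array indexed by NR, then walk the array backwards in the END block:

cat file3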
east 100000 10000
north 120000 9000
... (other lines) ...
awk '{line[NR]=$0}; END{for(c=NR;c>0;c--)print line[c]}' file3
southwest 20000 4000
southeast 30000 6000
south 80000 8000
west 50000 7000
... (remaining lines) ...
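
The same countdown idea works within a single record. This sketch reverses the order of the fields on one made-up line by looping from NF down to 1; the variable name out is arbitrary.

```shell
reversed=$(echo "sdb sdc sde" | awk '{
    out = $NF                                       # start with the last field
    for (i = NF - 1; i >= 1; i--) out = out " " $i  # append the rest, right to left
    print out
}')
echo "$reversed"
```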

11. Arrays

Arrays can be indexed by strings and do not need prior declaration.

Count occurrences of values in a column:

awk '{a[$3]++} END{for(i in a) print i, a[i]}' file3
10000 1
7000 1
8000 1
4000 1
9000 1
5000 2
6000 2
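
The order in which for (i in a) visits keys is not specified, which is why the output above is not sorted. When a stable order matters, one simple approach is to pipe the result through sort, as in this sketch with made-up data:

```shell
# Count occurrences of column 2, then sort numerically for a stable order.
counts=$(printf 'east 5000\nwest 6000\nnorth 5000\n' | awk '{seen[$2]++} END {for (k in seen) print k, seen[k]}' | sort -n)
echo "$counts"
```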

12. awk Built‑in Functions

1) String Functions

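String functions include gsub, sub, index, length, match, split, substr, tolower, and toupper. awk strings are indexed from 1; a start position of 0 is handled differently across implementations, so use 1 to take the first five characters:

awk '{print toupper(substr($1,1,5))}' file3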
EAST
NORTH
NORTH
NORTH
WEST
SOUTH
SOUTH
SOUTH
CENTE

Strip the /bin/ prefix from the shell path with sub:

awk -F "[ :]" '{sub("/bin/","",$4); print $4}' file2
sh
bash
ksh
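
Several of the other string functions can be sketched on the last line of file2. split returns the number of pieces, gsub returns the number of replacements it made, and index returns a 1-based position (0 when the substring is absent). The array name part and the variable names are arbitrary.

```shell
strings=$(echo "oracle /u01/app/oracle:dba:/bin/ksh" | awk '{
    n = split($2, part, ":")           # part[1]="/u01/app/oracle", part[2]="dba", part[3]="/bin/ksh"
    slashes = gsub("/", "/", part[1])  # replace "/" with itself just to count occurrences
    print n, length(part[2]), index(part[3], "ksh"), slashes
}')
echo "$strings"
```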

2) I/O Functions

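awk can close files and pipes (close), read lines (getline), redirect output (>, >>, |), and run shell commands (system). The following example runs grep once for each input line: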

awk -F: '{cmd="grep "$1" /etc/group"; system(cmd)}' oracleuser | sort -u
asmadmin:x:504:grid
asmdba:x:506:grid,oracle
asmoper:x:507:grid
dba:x:502:oracle,grid
oracle:x:1000:
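
Where the command's output is needed inside awk rather than just printed, a pipe into getline can capture it, and close releases the pipe afterwards. This is a minimal sketch with a made-up per-record command. As with system, building a shell command from input data is only safe when the input is trusted.

```shell
io_demo=$(printf 'alpha\nbeta\n' | awk '{
    cmd = "printf %s " $1 " | wc -c"  # hypothetical command: count the characters of field 1
    cmd | getline n                   # read the first line of the command output into n
    close(cmd)                        # release the pipe before the next record
    print $1, n + 0                   # n + 0 strips any padding that wc adds
}')
echo "$io_demo"
```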

3) Math Functions

Math functions include atan2, cos, exp, int, log, rand, sin, sqrt, and srand. For example, atan2(y, x) returns the arctangent of y/x in radians:

awk '{print atan2($2,$1)}' file5
1.10715
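
A few of the other math functions can be tried without any input file by placing the calls in a BEGIN block. This sketch uses arbitrary constants; rand and srand are left out because their output varies between runs and implementations.

```shell
math=$(awk 'BEGIN {
    pi = atan2(0, -1)   # atan2(y, x) with y = 0 and x = -1 yields pi
    printf "%.5f %d %d %.3f\n", pi, int(7.9), sqrt(144), exp(log(10))
}')
echo "$math"
```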
