Master awk: A Complete Guide to Powerful Text Processing
awk is a versatile command-line tool for processing text files. It extracts fields, matches patterns, and performs calculations, which makes it well suited to system administration and log analysis. This tutorial covers its common syntax, field printing, pattern matching, BEGIN/END blocks, separators, built‑in variables, external and internal variables, printf formatting, conditional statements, loops, arrays, and built‑in functions, with a practical example for each.
awk Common Syntax
1. Print fields
2. Print fields after pattern match
3. BEGIN and END statements
4. Field separators
5. Variable NR
6. Use external variables and define internal variables
7. printf function
8. awk if statements
9. awk while statements
10. awk for statements
11. Arrays
12. awk built‑in functions
1. Print Fields
Reorder and print specific fields from a file.
awk '/^[^#]/{print $2, $1, $3}' /etc/fstab
/dev/mapper/rhel-root / xfs
/boot UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b xfs
/u01 /dev/mapper/rhel-home xfs
swap /dev/mapper/rhel-swap swap
/dev/shm tmpfs tmpfs
Print the number of fields in each record.
awk '/^[^#]/{print "Record:", NR, "has", NF, "fields."}' /etc/fstab
2. Print Fields After Pattern Match
Print fields from lines that match a specific pattern.
awk '/swap/{print $2, $1, $3}' /etc/fstab
swap /dev/mapper/rhel-swap swap
Match a case number pattern and print the line.
echo "1-2345598" | awk '/^[0-9]+$/ || /^[0-9]-[0-9]+$/ {print}'
Match any of several patterns using regex alternation (|).
netstat -an | awk '/CLOSE_WAIT|ESTABLISHED|TIME_WAIT|LISTEN/ {print $0}'
3. BEGIN and END Statements
BEGIN runs before processing any input, often to print headers. END runs after all input has been processed, useful for summaries.
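The header-and-summary use described above can be sketched on inline data. The region names, the amounts, and the function name sales_report are made up for illustration and are not part of the original examples.

```shell
# Hypothetical sample data, piped in so the sketch does not depend on any file.
sales_report() {
  printf '%s\n' 'east 100' 'north 250' 'west 50' |
    awk '
      BEGIN { print "Region Amount" }    # runs once, before any input is read
      { total += $2; print $1, $2 }      # runs for every input record
      END { print "Total", total }       # runs once, after the last record
    '
}

sales_report
```

The BEGIN block prints the header line, the unnamed block echoes each record while accumulating the second field, and the END block prints the accumulated total.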
awk 'BEGIN {print "Begin..."} /#[H-L]o/ {print $0} /#P[e-o]/ {print $0} END {print "The End"}' /etc/ssh/sshd_config
Begin...
#Port 22
#HostKey /etc/ssh/ssh_host_dsa_key
#LogLevel INFO
#LoginGraceTime 2m
#HostbasedAuthentication no
#PermitEmptyPasswords no
#PermitTTY yes
#PermitUserEnvironment no
#PidFile /var/run/sshd.pid
#PermitTunnel no
The End
4. Field Separators
1) Input Separator
The variable FS defines the input field separator. By default awk splits each record on runs of spaces and tabs. Set a different separator with the -F option or by assigning FS in a BEGIN block.
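As a minimal sketch, the two ways of setting the separator can be compared on one inline record. The passwd-style line and the variable names below are assumptions for illustration only.

```shell
# Hypothetical passwd-style record, supplied inline instead of reading /etc/passwd.
record='oracle:x:1000:1001::/home/oracle:/bin/bash'

# The same split done two ways: the -F option and an FS assignment in BEGIN.
uid_with_flag=$(printf '%s\n' "$record" | awk -F: '{ print $3 }')
uid_with_begin=$(printf '%s\n' "$record" | awk 'BEGIN { FS = ":" } { print $3 }')

# With the default separator, leading blanks and mixed spaces/tabs are ignored.
default_nf=$(printf '  a \t  b\n' | awk '{ print NF }')

echo "$uid_with_flag $uid_with_begin $default_nf"
```

Both forms should give the same third field. The BEGIN form is handy when the awk program lives in a script file and should carry its own separator.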
awk -F: '{print $3}' /etc/passwd | sort -n | tail -1
Example with multiple separators:
cat file2
root /:root:/bin/sh
grid /u01/app:oinstall:/bin/bash
oracle /u01/app/oracle:dba:/bin/ksh
awk -F"[ :]" '{print $1, $3}' file2
root root
grid oinstall
oracle dba
2) Output Separator
The variable OFS defines the output field separator; the default is a space.
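One detail worth sketching: OFS is applied when print receives comma-separated arguments, or when a field assignment forces awk to rebuild the record. The sample line mirrors the file2 format shown earlier; the function names ofs_print and ofs_rebuild are hypothetical.

```shell
line='root /:root:/bin/sh'

# print with commas joins its arguments with OFS.
ofs_print() {
  printf '%s\n' "$line" | awk 'BEGIN { FS = "[ :]"; OFS = "," } { print $1, $3 }'
}

# Assigning to any field (even to itself) rebuilds $0 using OFS.
ofs_rebuild() {
  printf '%s\n' "$line" | awk 'BEGIN { FS = "[ :]"; OFS = "," } { $1 = $1; print }'
}

ofs_print
ofs_rebuild
```

Without the `$1 = $1` assignment, a bare `print` would emit the original record unchanged, because $0 is only rebuilt when a field is modified.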
awk 'BEGIN{FS="[ :]";OFS="\t"} {print $1,$2,$3,$4}' file2
root / root /bin/sh
grid /u01/app oinstall /bin/bash
oracle /u01/app/oracle dba /bin/ksh
5. Variable NR
NR holds the current record number. Example counting lines in /etc/fstab:
grep -E -v "^#|^[ ]*$" /etc/fstab | awk '{print $2, $1} END {print "The number of lines of fstab is " NR}'
/ /dev/mapper/rhel-root
/boot UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b
/u01 /dev/mapper/rhel-home
swap /dev/mapper/rhel-swap
/dev/shm tmpfs
The number of lines of fstab is 5
6. Use External and Internal Variables
Reference an external shell variable inside awk:
var="priv1"
awk '/'"$var"'/{print $3}' /etc/hosts
rac1-priv1
rac2-priv1
Define an internal counter variable:
awk '/^[^#]/{counter=counter+1;print} END {print "The number of lines of fstab is " counter "."}' /etc/fstab
... (output lines) ...
The number of lines of fstab is 5.
Accumulate values from a column:
awk '{total=total+$2; print "Field 2 = " $2} END {print "Total=" total}' file3
Field 2 = 100000
Field 2 = 120000
... (other lines) ...
Total=450000
7. printf Function
Formatted output using printf.
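As a rough illustration of the width and precision specifiers, the sketch below formats one made-up record. The data and the function name format_row are assumptions, and the `|` characters only serve to make the column edges visible.

```shell
format_row() {
  # %-10s left-justifies a string in 10 columns, %8d right-justifies an
  # integer in 8 columns, and %6.2f prints a number with two decimals.
  printf '%s\n' 'east 100000 10000' |
    awk '{ printf("%-10s|%8d|%6.2f\n", $1, $2, $2 / $3) }'
}

format_row
```

Unlike print, printf does not append a newline on its own, so the format string has to end with `\n`.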
awk '/^[^#]/{printf("%-45s %-19s %-9s %-20s %-1d %1d\n", $1,$2,$3,$4,$5,$6)}' /etc/fstab
/dev/mapper/rhel-root / xfs defaults 0 0
UUID=722b5e8f-6da6-4eba-abfe-8c6e7fddd67b /boot xfs defaults 0 0
/dev/mapper/rhel-home /u01 xfs defaults 0 0
/dev/mapper/rhel-swap swap swap defaults 0 0
tmpfs /dev/shm tmpfs defaults,size=21G 0 0
8. awk if Statements
Conditional processing with if, else, and nested conditions.
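A minimal sketch of an if / else if / else chain, assuming two-column input of a name and a number. The function name classify, the sample rows, and the variable name limit are hypothetical. The threshold is passed in with the -v option.

```shell
# classify LIMIT: label each input row relative to the numeric limit.
classify() {
  awk -v limit="$1" '{
    if ($2 > limit)
      print $1, "above"
    else if ($2 == limit)
      print $1, "equal"
    else
      print $1, "below"
  }'
}

printf '%s\n' 'east 10' 'north 5' 'west 3' | classify 5
```

Passing the threshold with -v keeps the awk program inside a single pair of quotes, which avoids splicing shell variables into the program text.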
1) Numeric Comparison
awk '{num= $2 / $3; if(num>5) print $1,num}' file3
east 10
north 13.3333
west 7.14286
south 10
2) String Comparison
awk '{if($1 ~ /north/) print $1, $3}' file3
north 9000
northwest 6000
northeast 5000
3) Logical Operations
awk '{if($1 ~ /north/ && $3 > 5000) print $1, $3}' file3
north 9000
northwest 6000
9. awk while Loop
Iterate over fields using a while loop.
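A while loop over fields can also accumulate a value instead of printing each field. The sketch below sums the numeric fields of every line. The input rows and the function name sum_fields are made up for illustration.

```shell
sum_fields() {
  awk '{
    i = 1; sum = 0               # reset the index and the total for every record
    while (i <= NF) {
      sum += $i
      i++
    }
    print "line " NR ": " sum
  }'
}

printf '%s\n' '1 2 3' '10 20' | sum_fields
```

Resetting the loop variable inside the same action as the loop matters, because awk variables keep their values from one record to the next.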
awk '{i=1}; {while (i <= NF) {print $i; i++}}' file4
sdb
sdc
sde
sdf
g
10. awk for Loop
Iterate over fields using a for loop and print lines in reverse order.
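The same counting-down idea also works within a single line. As a small sketch, this reverses the order of the fields in each record. The device names and the function name reverse_fields are assumptions.

```shell
reverse_fields() {
  awk '{
    out = $NF                                        # start from the last field
    for (i = NF - 1; i >= 1; i--) out = out " " $i   # append the rest, right to left
    print out
  }'
}

printf '%s\n' 'sdb sdc sde' | reverse_fields
```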
awk '{for(i=1;i<=NF;i++)print $i}' file4
sdb
sdc
sde
sdf
g
cat file3
east 100000 10000
north 120000 9000
... (other lines) ...
awk '{line[NR]=$0}; END{for(c=NR;c>0;c--)print line[c]}' file3
southwest 20000 4000
southeast 30000 6000
south 80000 8000
west 50000 7000
... (remaining lines) ...
11. Arrays
Arrays can be indexed by strings and do not need prior declaration.
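A minimal sketch of a string-indexed array, assuming colon-separated input whose last field is a login shell. The sample records and the function name count_shells are hypothetical. The traversal order of `for (key in array)` is unspecified, so the output is piped through sort to make it stable.

```shell
count_shells() {
  awk -F: '
    { count[$NF]++ }                                   # the element is created on first use
    END { for (shell in count) print shell, count[shell] }
  ' | sort
}

printf '%s\n' 'root:/bin/sh' 'grid:/bin/bash' 'oracle:/bin/bash' | count_shells
```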
Count occurrences of values in a column:
awk '{a[$3]++} END{for(i in a) print i, a[i]}' file3
10000 1
7000 1
8000 1
4000 1
9000 1
5000 2
6000 2
12. awk Built‑in Functions
1) String Functions
Examples of gsub, sub, index, length, match, split, substr, tolower, toupper:
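As a rough sketch, several of these functions can be tried on one inline record. The sample line follows the file2 format used earlier, and the function name string_demo is made up for illustration. awk string positions start at 1.

```shell
string_demo() {
  printf '%s\n' 'oracle /u01/app/oracle:dba:/bin/ksh' |
    awk '{
      print length($1)                   # number of characters in the first field
      print index($2, "app")             # 1-based position of a substring, 0 if absent
      print toupper(substr($1, 1, 3))    # first three characters, upper-cased
      n = split($2, part, ":")           # split into an array, returns the piece count
      print n, part[2]
      path = $2
      gsub("/", "_", path)               # replace every match (sub replaces only the first)
      print path
      if (match($2, /:[a-z]+:/))         # match sets RSTART and RLENGTH
        print RSTART, RLENGTH
    }'
}

string_demo
```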
awk '{print toupper(substr($1,1,5))}' file3
EAST
NORTH
NORTH
NORTH
WEST
SOUTH
SOUTH
SOUTH
CENTE
awk -F "[ :]" '{sub("/bin/","",$4); print $4}' file2
sh
bash
ksh
2) I/O Functions
Close files, read lines, redirect output, and execute system commands.
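The sketch below strings these operations together: it redirects output to a file, closes it, reads it back with getline, and reads one line from a command. It assumes mktemp is available for the scratch file; the function name io_demo and the sample input are hypothetical.

```shell
io_demo() {
  tmp=$(mktemp)                          # scratch file, removed at the end
  printf '%s\n' 'alpha' 'beta' |
    awk -v out="$tmp" '
      { print toupper($0) > out }        # redirect output to a file
      END {
        close(out)                       # flush and close before reading it back
        while ((getline line < out) > 0) n++
        close(out)
        cmd = "echo done"
        cmd | getline status             # read one line from a command
        close(cmd)
        print n, status
      }'
  rm -f "$tmp"
}

io_demo
```

The parentheses around `getline line < out` make the precedence explicit, and comparing the result with `> 0` stops the loop on end of file as well as on a read error.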
awk -F: '{cmd="grep "$1" /etc/group"; system(cmd)}' oracleuser | sort -u
asmadmin:x:504:grid
asmdba:x:506:grid,oracle
asmoper:x:507:grid
dba:x:502:oracle,grid
oracle:x:1000:
3) Math Functions
Examples of atan2, cos, exp, int, log, rand, sin, sqrt, srand:
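As a minimal sketch, a few of these functions can be evaluated in a BEGIN block, which needs no input file. The function name math_demo and the chosen arguments are assumptions for illustration.

```shell
math_demo() {
  awk 'BEGIN {
    printf("%.5f\n", atan2(2, 1))     # angle in radians for y=2, x=1
    print int(7.9), int(-7.9)         # int truncates toward zero
    print sqrt(144), exp(0), log(1)   # log is the natural logarithm
    srand(42); r = rand()             # seed, then draw a value in [0, 1)
    print (r >= 0 && r < 1)
  }'
}

math_demo
```

The value returned by rand() depends on the awk implementation even with a fixed seed, so the sketch only checks that it falls in the documented range.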
awk '{print atan2($2,$1)}' file5
1.10715
MaGe Linux Operations
