Mastering Awk: Powerful Text Processing for Linux with Real‑World Examples
This tutorial introduces awk as a versatile Linux text‑analysis tool, explains its execution model (BEGIN, body, END), demonstrates practical commands for reporting, filtering, formatting, and advanced scripting, and provides numerous code snippets and visual examples to help readers quickly apply awk in real‑world scenarios.
What Awk Can Do
Awk is a text‑analysis utility well suited to generating reports, parsing system logs, counting data such as website visits, and aggregating system information. It also provides loops, conditionals, and arrays for more complex data processing.
Awk Execution Model
Awk processes input in three stages:
BEGIN: runs once, before any input is read.
body: runs once for each line (record) of input.
END: runs once, after all input has been processed.
Each line is split into fields (columns) using a delimiter (default whitespace). The record separator is \n.
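The three stages can be seen in a minimal sketch like the following. The three-line input and the variable name result are made up for illustration; any text input would behave the same way.

```shell
# Hypothetical three-line input piped into awk.
result=$(printf 'alpha\nbeta\ngamma\n' | awk '
BEGIN { print "start" }          # runs once, before any input is read
      { print "line: " $0 }      # runs once per input record
END   { print "done" }           # runs once, after the last record
')
printf '%s\n' "$result"
```

The BEGIN and END blocks each print once, while the middle block prints once per input line.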
Basic Command Syntax
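The basic form of an awk command is awk 'pattern { action }' file. The pattern selects the lines the action applies to; if the pattern is omitted the action runs on every line, and if the action is omitted the matching lines are printed.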
Awk scripts are enclosed in single quotes. $1..$N refers to specific columns, while $0 represents the entire line.
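As a small sketch of the quoting and field rules above, the snippet below writes two made-up lines to a scratch file.txt and prints first one column, then the whole line. The scratch directory and the variable names are assumptions for the demo.

```shell
dir=$(mktemp -d)                               # scratch directory (assumption)
printf 'one two three\nfour five six\n' > "$dir/file.txt"

# Single quotes stop the shell from expanding $1 and $0 before awk sees them.
first=$(awk '{ print $1 }' "$dir/file.txt")    # first column of every line
whole=$(awk '{ print $0 }' "$dir/file.txt")    # every line, unchanged
printf '%s\n' "$first" "$whole"
rm -r "$dir"
```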
Practical – Beginner
Save sample data to file.txt and run a simple awk command to print columns 1, 4, and 8.
Fields are accessed with $1, $4, etc. Awk’s printf supports C‑style formatting (e.g., %s for a string, %-4s for a string left‑aligned in a field of width 4).
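The original sample data is not reproduced here, so the sketch below assumes a listing layout that fits the column numbers used in this article: permissions, link count, owner, size, month, day, time, name. The file contents are hypothetical.

```shell
dir=$(mktemp -d)                               # scratch directory (assumption)
cat > "$dir/file.txt" <<'EOF'
-rw-r--r-- 1 root 1024 Aug 10 09:15 main.c
-rw-r--r-- 1 alice 2048 Dec 3 14:02 util.h
-rwxr-xr-x 1 root 512 Aug 21 18:40 run.sh
-rw-r--r-- 1 bob 4096 Oct 10 07:30 notes.txt
EOF

# print separates its arguments with a single space by default.
plain=$(awk '{ print $1, $4, $8 }' "$dir/file.txt")
# %-12s left-aligns in 12 columns; %6s right-aligns in 6 columns.
table=$(awk '{ printf "%-12s %6s %s\n", $1, $4, $8 }' "$dir/file.txt")
printf '%s\n' "$plain" "$table"
rm -r "$dir"
```

Unlike print, printf adds no newline of its own, which is why the format string ends in \n.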
Practical – Intermediate
Filtering Records – output only lines where column 3 equals root and column 6 equals 10.
Awk supports the comparison operators ==, !=, >, <, >=, <=. $0 denotes the whole line.
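A filter of this kind might look like the sketch below. The listing is hypothetical, with the owner assumed to be in column 3 and the day of the month in column 6.

```shell
dir=$(mktemp -d)                               # scratch directory (assumption)
cat > "$dir/file.txt" <<'EOF'
-rw-r--r-- 1 root 1024 Aug 10 09:15 main.c
-rw-r--r-- 1 alice 2048 Dec 3 14:02 util.h
-rwxr-xr-x 1 root 512 Aug 21 18:40 run.sh
-rw-r--r-- 1 bob 4096 Oct 10 07:30 notes.txt
EOF

# Strings need double quotes; a bare 10 is compared as a number.
matched=$(awk '$3 == "root" && $6 == 10 { print $0 }' "$dir/file.txt")
# The same idea with != : names of files not owned by root.
others=$(awk '$3 != "root" { print $8 }' "$dir/file.txt")
printf '%s\n' "$matched" "$others"
rm -r "$dir"
```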
Built‑in Variables – NR (current record number) and NF (field count) are useful for tracking line numbers and column counts.
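A quick way to see NR and NF in action is the sketch below, run on made-up input with a different number of fields on each line.

```shell
result=$(printf 'a b c\nd e\nf\n' | awk '
    { print NR ": " NF " fields, last is " $NF }   # $NF is the last field
END { print "total lines: " NR }                   # NR keeps its final value in END
')
printf '%s\n' "$result"
```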
Specifying Delimiters – change the input field separator with FS or the -F option, and set the output field separator with OFS.
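Both ways of setting the separator are sketched below on two colon-separated lines in the style of /etc/passwd. The data is hypothetical and embedded in the script so the example does not depend on the local system.

```shell
data='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# -F sets the input separator from the command line.
with_flag=$(printf '%s\n' "$data" | awk -F: '{ print $1, $7 }')
# FS and OFS set in BEGIN: split on ":" and join printed fields with a tab.
with_vars=$(printf '%s\n' "$data" | awk 'BEGIN { FS = ":"; OFS = "\t" } { print $1, $7 }')
printf '%s\n' "$with_flag" "$with_vars"
```

OFS only takes effect where print is given comma-separated arguments; fields concatenated without commas are joined with nothing in between.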
Practical – Advanced
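Conditional Matching – select lines with regular expressions: /root/ matches every line containing root, and alternation such as /Aug|Dec/ matches lines containing either word. To list only files owned by root, test the owner field itself with the ~ operator instead of matching the whole line.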
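The matching forms described here can be sketched as follows, again on a hypothetical listing with the owner assumed to be in column 3 and the file name in column 8.

```shell
dir=$(mktemp -d)                               # scratch directory (assumption)
cat > "$dir/file.txt" <<'EOF'
-rw-r--r-- 1 root 1024 Aug 10 09:15 main.c
-rw-r--r-- 1 alice 2048 Dec 3 14:02 util.h
-rwxr-xr-x 1 root 512 Aug 21 18:40 run.sh
-rw-r--r-- 1 bob 4096 Oct 10 07:30 notes.txt
EOF

any_root=$(awk '/root/ { print $8 }' "$dir/file.txt")       # root anywhere in the line
owner_root=$(awk '$3 ~ /root/ { print $8 }' "$dir/file.txt") # root in the owner field only
aug_or_dec=$(awk '/Aug|Dec/ { print $8 }' "$dir/file.txt")   # either alternative matches
printf '%s\n' "$any_root" "$owner_root" "$aug_or_dec"
rm -r "$dir"
```

On this sample the first two commands give the same result, but a file named root.txt owned by another user would match only the first.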
Splitting Files – redirect output to separate files based on a field (e.g., month in column 5) using the > operator.
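A minimal sketch of splitting by month follows. It assumes the month is in column 5 of a hypothetical listing and writes one output file per month into a scratch directory; the dir variable passed with -v is an addition for this demo.

```shell
dir=$(mktemp -d)                               # scratch directory (assumption)
cat > "$dir/file.txt" <<'EOF'
-rw-r--r-- 1 root 1024 Aug 10 09:15 main.c
-rw-r--r-- 1 alice 2048 Dec 3 14:02 util.h
-rwxr-xr-x 1 root 512 Aug 21 18:40 run.sh
-rw-r--r-- 1 bob 4096 Oct 10 07:30 notes.txt
EOF

# Parentheses around the file name expression keep the redirection portable.
awk -v dir="$dir" '{ print > (dir "/" $5 ".txt") }' "$dir/file.txt"

aug=$(cat "$dir/Aug.txt")                      # both August lines
oct=$(cat "$dir/Oct.txt")                      # the single October line
printf '%s\n' "$aug" "$oct"
rm -r "$dir"
```

Inside awk, > truncates a file only the first time it is opened during a run; later writes to the same name append, so each month file collects all of its lines.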
If Statements – use if/else for conditions that go beyond a simple pattern. An if statement is part of an action, so it must appear inside the { } block rather than in the pattern position.
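For instance, an if/else chain inside the action block could classify each file, as in the sketch below. The listing, the size threshold, and the labels are all made up for illustration.

```shell
dir=$(mktemp -d)                               # scratch directory (assumption)
cat > "$dir/file.txt" <<'EOF'
-rw-r--r-- 1 root 1024 Aug 10 09:15 main.c
-rw-r--r-- 1 alice 2048 Dec 3 14:02 util.h
-rwxr-xr-x 1 root 512 Aug 21 18:40 run.sh
-rw-r--r-- 1 bob 4096 Oct 10 07:30 notes.txt
EOF

result=$(awk '{
    if ($4 > 2048)                 # size assumed to be in column 4
        print $8, "large"
    else if ($3 == "root")         # owner assumed to be in column 3
        print $8, "root-owned"
    else
        print $8, "other"
}' "$dir/file.txt")
printf '%s\n' "$result"
rm -r "$dir"
```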
Statistics – sum file sizes of *.c and *.h files, or compute per‑user memory usage from the RSS column using arrays and for loops.
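Both statistics can be sketched on small, hypothetical inputs: a listing with the size assumed in column 4 (in real ls -l output, which includes a group column, the size is column 5), and a table shaped like ps aux output with the user in column 1 and RSS in column 6.

```shell
dir=$(mktemp -d)                               # scratch directory (assumption)
cat > "$dir/file.txt" <<'EOF'
-rw-r--r-- 1 root 1024 Aug 10 09:15 main.c
-rw-r--r-- 1 alice 2048 Dec 3 14:02 util.h
-rwxr-xr-x 1 root 512 Aug 21 18:40 run.sh
EOF
cat > "$dir/ps.txt" <<'EOF'
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 1000 300 ? Ss 09:00 0:01 init
alice 210 0.2 0.5 5000 1200 pts/0 S 09:05 0:03 bash
root 45 0.0 0.2 2000 700 ? S 09:00 0:00 cron
alice 311 1.0 1.5 9000 800 pts/1 R 09:10 0:10 vim
EOF

# Sum the sizes of files whose names end in .c or .h.
total=$(awk '$8 ~ /\.[ch]$/ { sum += $4 } END { print sum }' "$dir/file.txt")

# Skip the header line, add up RSS per user in an array, then walk the array.
mem=$(awk 'NR > 1 { rss[$1] += $6 }
           END    { for (u in rss) print u, rss[u] " KB" }' "$dir/ps.txt" | sort)
printf '%s\n' "$total" "$mem"
rm -r "$dir"
```

The order in which for (u in rss) visits keys is unspecified, so the output is piped through sort to make it stable.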
Comprehensive Example – Student Grades
A full‑featured awk script (cal.awk) processes a grade file, using BEGIN to print headers, the body to accumulate scores, and END to output totals and averages.
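The original cal.awk is not reproduced here. The sketch below is one possible script of that shape, written under assumptions: the grade file name score.txt, its layout (name, then math, English, and computer scores), and the report columns are all hypothetical.

```shell
dir=$(mktemp -d)                               # scratch directory (assumption)
cat > "$dir/score.txt" <<'EOF'
Ann 90 80 70
Bob 60 75 90
Cid 75 85 80
EOF

cat > "$dir/cal.awk" <<'EOF'
BEGIN {   # header, printed before any input is read
    printf "%-6s %5s %8s %9s %6s\n", "NAME", "MATH", "ENGLISH", "COMPUTER", "TOTAL"
}
{         # per student: print the row and accumulate per-subject sums
    total = $2 + $3 + $4
    math += $2; english += $3; computer += $4
    printf "%-6s %5d %8d %9d %6d\n", $1, $2, $3, $4, total
}
END {     # totals and averages; the guard avoids dividing by zero on empty input
    printf "%-6s %5d %8d %9d\n", "TOTAL", math, english, computer
    if (NR > 0)
        printf "%-6s %5.1f %8.1f %9.1f\n", "AVG", math / NR, english / NR, computer / NR
}
EOF

report=$(awk -f "$dir/cal.awk" "$dir/score.txt")
printf '%s\n' "$report"
rm -r "$dir"
```

Keeping the program in a file and running it with awk -f avoids shell quoting issues once a script grows beyond a line or two.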
Key Built‑in Variables
NR: current line number.
NF: number of fields in the current line.
RS: record separator (default newline).
FS: field separator (default space/tab).
OFS: output field separator (default space).
ORS: output record separator (default newline).
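The separator variables are easiest to understand by changing them. In the sketch below, with made-up input, RS turns commas into record boundaries, while OFS and ORS change what print places between fields and after each record.

```shell
# RS = ",": each comma-separated item becomes its own record.
records=$(printf 'a,b,c' | awk 'BEGIN { RS = "," } { print NR ": " $0 }')

# OFS joins the printed fields; ORS replaces the newline after each record.
joined=$(printf 'x 1\ny 2\n' | awk 'BEGIN { OFS = "="; ORS = ";" } { print $1, $2 }')
printf '%s\n' "$records" "$joined"
```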
Formatting Output
Use printf with familiar C format specifiers (%d, %u, %f, %s, %c, %e, %x, %g) and escape sequences such as \n and \t.
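A few of these specifiers are shown together in the sketch below. The values are arbitrary and the | characters are only there to make the field boundaries visible.

```shell
result=$(awk 'BEGIN {
    # %d truncates to an integer, %5.2f pads to width 5 with 2 decimals,
    # %c prints the first character of a string, %x prints hexadecimal.
    printf "%d|%5.2f|%s|%c|%x|%e\n", 42.9, 3.14159, "awk", "A", 255, 12345.678
}')
printf '%s\n' "$result"
```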
Programming Constructs
Conditional statements (if/else).
Loops (while, for).
Arrays (associative, similar to maps).
Functions (built‑in and user‑defined).
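These constructs can be combined in a short sketch like the one below. The function name fact and the array name sq are hypothetical; the program has only a BEGIN block, so it reads no input.

```shell
result=$(awk '
# User-defined function: factorial computed with a while loop.
function fact(n,    f) {         # f is local: extra parameters act as locals
    f = 1
    while (n > 1) { f *= n; n-- }
    return f
}
BEGIN {
    for (i = 1; i <= 5; i++)     # C-style for loop
        sq[i] = i * i            # associative array keyed by i
    if (5 in sq)                 # membership test on the array
        print "sq[5] =", sq[5]
    else
        print "no entry for 5"
    print "fact(5) =", fact(5)
}')
printf '%s\n' "$result"
```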
Common String Functions
index(s, t): position of substring t in s, counting from 1 (0 if t does not occur).
length(s): length of s.
split(s, a, sep): split s into array a using sep; returns the number of elements.
substr(s, p, n): substring of s starting at position p (counting from 1) with length n.
tolower(s) / toupper(s): case conversion.
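The string functions listed above can be tried out in a BEGIN-only program, as in this sketch. The sample strings are arbitrary.

```shell
result=$(awk 'BEGIN {
    s = "Hello, Awk"
    print index(s, "Awk")            # position of "Awk", counting from 1
    print length(s)                  # number of characters
    print substr(s, 1, 5)            # five characters starting at position 1
    print toupper(s)                 # upper-case copy; s itself is unchanged
    n = split("a:b:c", parts, ":")   # fills parts[1..n] and returns n
    print n, parts[1], parts[3]
}')
printf '%s\n' "$result"
```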
This guide provides a concise yet comprehensive overview of awk’s core concepts, syntax, and practical usage, enabling readers to harness awk for efficient text processing and data analysis on Linux.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
