Master AWK: A Quick Guide to Text Processing and Scripting

This tutorial introduces AWK, covering its origins, variants, typical use cases, workflow, program structure, command-line options, operators, regular expressions, arrays, control flow, functions, output redirection, and how to execute shell commands.

MaGe Linux Operations

Overview

AWK is an interpreted programming language designed for text processing. Its name comes from the surnames of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. GNU AWK (gawk) is maintained as part of the GNU Project by the Free Software Foundation and is the default awk on many GNU/Linux distributions, although some (Debian and Ubuntu, for example) ship mawk by default.

AWK Variants

AWK – the original version from AT&T labs

NAWK – an upgraded version from AT&T

GAWK – GNU AWK, fully compatible with both AWK and NAWK

Typical Uses

Text processing

Generating formatted reports

Arithmetic calculations

String manipulation

Workflow

AWK follows a simple workflow: Read, Execute, Repeat. It reads one line at a time from the input stream (file, pipe, or standard input), executes the program's commands against that line, and repeats until the input is exhausted.

Program Structure

An AWK program may contain three optional blocks: BEGIN, BODY, and END.

BEGIN block

BEGIN { awk-commands }

Executed once before any input is processed; useful for initializing variables.

BODY block

/pattern/ { awk-commands }

Executed for each input line that matches pattern. If no pattern is given, the commands run for every line.

END block

END { awk-commands }

Executed after all input has been processed.
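As a minimal sketch of how the three blocks fit together, the following counts the lines that contain "error" in a few sample lines. The sample input and the variable name count are illustrative assumptions, not part of the original article.

```shell
# Sample input is made up for illustration.
result=$(printf 'ok\nerror: disk full\nok\nerror: timeout\n' | awk '
BEGIN   { count = 0 }                        # runs once, before any input is read
/error/ { count++ }                          # runs for every line matching /error/
END     { print count " matching lines" }    # runs once, after the last line
')
echo "$result"
```

With this input the END block reports two matching lines.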

Basic Syntax

AWK commands can be run directly on the command line using single quotes or placed in a script file.
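The two invocation styles can be sketched as follows. The temporary script path and the sample input are assumptions made for the example; any file name works with -f.

```shell
# Inline program: single quotes stop the shell from expanding $1.
inline=$(printf 'alice 30\nbob 25\n' | awk '{ print $1 }')

# The same program stored in a script file and run with -f.
script=$(mktemp)
printf '%s\n' '{ print $1 }' > "$script"
from_file=$(printf 'alice 30\nbob 25\n' | awk -f "$script")
rm -f "$script"

echo "$inline"
echo "$from_file"
```

Both runs print the first field of each line, so the two results should be identical.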

Standard Options

-v var=value – assign a variable before execution

--dump-variables[=file] – write sorted global variables to a file (default awkvars.out)

--lint[=fatal] – warn about ambiguous or non-portable code; fatal turns warnings into errors

--posix – enforce strict POSIX compatibility

--profile[=file] – write a formatted version of the program to a file (default awkprof.out)

--traditional – disable all gawk extensions

--version – display version information
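A short sketch of the -v option, which is the one most often used in everyday scripts. The variable name threshold and the input numbers are illustrative only.

```shell
# -v assigns the variable before the program starts, so it is
# already set when the first line is read.
above=$(printf '5\n12\n8\n20\n' | awk -v threshold=10 '$1 > threshold { print $1 }')
echo "$above"
```

Only the values greater than the threshold (12 and 20) are printed.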

Operators

AWK provides arithmetic, increment/decrement, assignment, relational, logical, ternary, unary, exponentiation, string-concatenation, array-member, and regular-expression operators.
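A few of these operators can be sketched in a single BEGIN block. The variable names and values are arbitrary choices for the example.

```shell
ops=$(awk 'BEGIN {
    a = 7; b = 2
    print a + b, a % b, a ^ b         # arithmetic and exponentiation: 9 1 49
    print (a > b ? "a" : "b")         # ternary; the parentheses keep > from meaning redirection
    print "x" "y"                     # concatenation by juxtaposition: xy
    m = ("ab" ~ /^a/)                 # regex match yields 1 or 0
    both = (a > 5 && b > 5)           # logical AND yields 1 or 0
    print m, both
}')
echo "$ops"
```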

Regular Expressions

AWK uses ~ for match and !~ for non‑match. Regular expressions are powerful for complex text processing.
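As a small illustration of the two operators, the following selects lines by testing the second field against a regular expression. The user and role data are invented for the example.

```shell
input='alice admin
bob user
carol admin'

# ~ selects lines whose second field matches the pattern.
matches=$(printf '%s\n' "$input" | awk '$2 ~ /^admin$/ { print $1 }')

# !~ selects lines whose second field does not match.
others=$(printf '%s\n' "$input" | awk '$2 !~ /^admin$/ { print $1 }')

echo "$matches"
echo "$others"
```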

Arrays

AWK supports associative (hash) arrays with string or numeric indices. Arrays are one‑dimensional, but multidimensional structures can be simulated.
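The sketch below shows a string-indexed array used as a counter, and a two-subscript access that AWK stores under a single combined key. The array names hits and grid are hypothetical.

```shell
counts=$(printf 'get\npost\nget\nget\n' | awk '
{ hits[$1]++ }                        # string index, created on first use
END { print "get=" hits["get"], "post=" hits["post"] }
')

cell=$(awk 'BEGIN {
    grid[1, 2] = "x"                  # stored under the single key 1 SUBSEP 2
    if ((1, 2) in grid) print grid[1, 2]
}')

echo "$counts"
echo "$cell"
```

The example reads specific keys rather than looping with for (key in array), because the iteration order of that loop is not defined.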

Control Flow

Control-flow statements (if, else, and so on) follow the same syntax as in C-like languages. The switch statement is a gawk extension and is not available in every AWK implementation.
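A minimal if / else if / else chain might look like this. The thresholds and labels are made up for illustration, and the example avoids switch so that it runs on any POSIX awk.

```shell
grades=$(printf '95\n72\n40\n' | awk '{
    if ($1 >= 90)      label = "high"
    else if ($1 >= 60) label = "mid"
    else               label = "low"
    print $1, label
}')
echo "$grades"
```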

Loops

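Supported loops include for, while, and do...while. Inside a loop, break leaves the loop and continue skips to the next iteration. The exit statement stops reading input and ends the program, after running the END block if one is present.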
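The three loop forms, together with break and continue, can be sketched as follows. The bounds are arbitrary.

```shell
loops=$(awk 'BEGIN {
    for (i = 1; i <= 10; i++) {
        if (i % 2 == 0) continue      # skip even numbers
        if (i > 7) break              # stop once i passes 7
        sum += i                      # adds 1, 3, 5 and 7
    }
    n = 3
    while (n > 0) n--                 # counts down to 0
    do { n++ } while (n < 2)          # body runs before the test is checked
    print sum, n
}')
echo "$loops"
```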

Functions

Built‑in Functions

AWK provides a rich set of built-in functions for mathematics, strings, time, and bitwise operations. Examples:

Mathematics – atan2(y, x), cos(expr), exp(expr), int(expr), log(expr), rand(), sin(expr), sqrt(expr), srand([expr])

Strings – asort(arr[, d[, how]]), asorti(arr[, d[, how]]), gsub(regex, sub, string), index(str, sub), length(str), match(str, regex), split(str, arr, regex), sprintf(format, ...), strtonum(str), sub(regex, sub, string), substr(str, start, len), tolower(str), toupper(str)

Time – systime(), mktime(datespec), strftime([format[, timestamp[, utc-flag]]])

Bitwise – and, compl, lshift, rshift, or, xor
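A brief sketch of some of the string and math functions listed above. It deliberately sticks to functions defined by POSIX, on the assumption that the reader may not be running gawk. The sample strings are arbitrary.

```shell
builtins=$(awk 'BEGIN {
    s = "hello world"
    print length(s)                   # number of characters: 11
    print toupper(substr(s, 1, 5))    # first five characters, upper-cased: HELLO
    print index(s, "world")           # 1-based position of the substring: 7
    n = split("a:b:c", parts, ":")    # n is 3 and parts[2] is "b"
    print n, parts[2]
    gsub(/o/, "0", s)                 # replaces every "o" in s, modifying s in place
    print s
    print int(7.9), sqrt(16)          # truncation and square root: 7 4
}')
echo "$builtins"
```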

User‑Defined Functions

Functions can be defined to encapsulate reusable logic.

function name(arg1, arg2,   local1, local2) {
    # function body
}
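As a concrete instance of the template above, here is a hypothetical max function. AWK has no local-variable declaration, so the extra parameter result (separated by additional spaces purely by convention) serves as a local variable.

```shell
maxval=$(printf '3 9\n7 2\n' | awk '
# Extra parameters after the wide gap (here: result) act as local variables.
function max(a, b,   result) {
    result = (a > b) ? a : b
    return result
}
{ print max($1, $2) }
')
echo "$maxval"
```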

Output Redirection

AWK's print and printf can redirect output to a file with > (which truncates the file when it is first opened) or >> (which appends), and can send output to another program with the | operator.
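The sketch below writes to a temporary file and then pipes output to sort. The file path comes from mktemp and the variable name file is an assumption for the example. Each redirection runs in a separate awk invocation to keep the behaviour of > and >> easy to see.

```shell
out=$(mktemp)                                           # temporary file, path is arbitrary
awk -v file="$out" 'BEGIN { print "first" > file }'     # > truncates the file
awk -v file="$out" 'BEGIN { print "second" >> file }'   # >> appends to it
saved=$(cat "$out")
rm -f "$out"

# print | "command" sends output to the standard input of another program.
sorted=$(printf 'b\nc\na\n' | awk '{ print $0 | "sort" }')

echo "$saved"
echo "$sorted"
```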

Pipelines

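Using the |& operator (a gawk extension), AWK can open a two-way pipe to another program, write to its standard input, and read from its standard output with getline. Many programs emit nothing until they see end of input, so close the write end with close(cmd, "to") once all data has been sent; otherwise both sides can end up waiting on each other and deadlock. close(cmd, "from") closes the read end, and close(cmd) closes both.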
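A minimal sketch of a two-way pipe, using sort as the other program. It requires gawk, so the example checks for it first and does nothing if gawk is not installed. The words being sorted are arbitrary.

```shell
if command -v gawk >/dev/null 2>&1; then
    sorted_pair=$(gawk 'BEGIN {
        cmd = "sort"
        print "pear"  |& cmd          # write to the standard input of sort
        print "apple" |& cmd
        close(cmd, "to")              # send end of input so sort can produce output
        while ((cmd |& getline line) > 0)
            out = (out == "" ? line : out " " line)
        close(cmd)                    # close the remaining read end
        print out
    }')
    echo "$sorted_pair"
fi
```

Without the close(cmd, "to") call, sort would keep waiting for more input while gawk waits for output.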

Beautifying Output

The printf statement, borrowed from C, formats output using specifiers such as %c, %d, and %s. Unlike print, it does not add a trailing newline automatically.
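A small sketch of column alignment with printf. The field widths and the sample data are arbitrary choices for the example.

```shell
# %-6s left-aligns a string in 6 columns, %3d right-aligns an integer
# in 3 columns, and %8.2f prints a number with 2 decimals in 8 columns.
table=$(printf 'alice 3.14159\nbob 42\n' | awk '{ printf "%-6s|%3d|%8.2f\n", $1, NR, $2 }')
echo "$table"
```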

Executing Shell Commands

Shell commands can be run from AWK using the system() function or by opening a pipe.
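Both approaches can be sketched in a few lines. The commands used here (exit 3 and echo hello) are placeholders chosen so that the result is predictable.

```shell
sys=$(awk 'BEGIN {
    status = system("exit 3")         # runs the command through the shell
    print "status=" status            # system() returns the exit status
    "echo hello" | getline line       # read one line of the command output
    close("echo hello")               # close the pipe once finished with it
    print "line=" line
}')
echo "$sys"
```

system() is the simpler choice when only the exit status matters, while a pipe with getline is the way to capture what the command prints.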
