
Master awk: From Basics to Advanced Text Processing with Real Examples

This article provides a comprehensive guide to awk, covering its origins, syntax, options, keywords, operators, built‑in variables, regular‑expression meta‑characters, functions, control flow, system interaction, and practical examples that help readers efficiently process and analyze text data on the command line.

Liangxu Linux

Introduction

awk is a powerful text‑analysis tool whose name comes from the initials of its creators: Alfred Aho, Peter Weinberger and Brian Kernighan. It provides its own programming language for scanning and processing data: it reads input one record at a time, splits each record into fields, and lets you filter records, compute over fields, and generate reports.

Syntax Structure

The basic command‑line form is awk [options] 'script' file. Important options include:

-F fs: set the input field separator.

-v var=value: assign a variable before the program starts.

-f progfile: read the program from a file instead of the command line.
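As a small illustration of -F, the sketch below splits colon‑separated input and prints the fields in reverse order. The input lines are invented for this example and do not come from the article's mi_info file.

```shell
# Hypothetical colon-separated input, made up for this sketch.
# -F: makes ":" the field separator, so $1 is the name and $2 the number.
out=$(printf 'alice:30\nbob:25\n' | awk -F: '{ print $2, $1 }')
echo "$out"
```

The comma in print inserts the output field separator, which is a single space by default.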

Keywords

Two special pattern blocks are BEGIN (executed once before processing) and END (executed after all input has been processed).
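A minimal sketch of how BEGIN and END bracket the per‑record work, assuming three made‑up numeric input lines:

```shell
# Hypothetical input: three numbers, one per line.
out=$(printf '10\n20\n30\n' | awk '
BEGIN { print "start" }          # runs once, before the first record is read
      { sum += $1 }              # runs for every input record
END   { print "total:", sum }    # runs once, after the last record
')
echo "$out"
```

Because sum is never initialised, awk treats it as 0 on first use, which is why no explicit setup is needed in BEGIN here.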

Test Data

A sample data file, mi_info, is used throughout the article; in its listing, lines beginning with # are annotations and are not part of the file that awk reads.

Basic Examples

Simple one‑liner commands demonstrate pattern matching, field selection and printing, for example:

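awk '/2499/' mi_info                # print records that contain 2499 anywhere
awk '$5 == "256G"' mi_info          # print records whose fifth field is exactly 256G
awk '$1 ~ /note/ {print}' mi_info   # print records whose first field matches note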
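Since the contents of mi_info are not reproduced here, the sketch below runs the same three one‑liners against a stand‑in file. The column layout (model, edition, color, RAM, storage, price) and every value in it are assumptions made purely for illustration.

```shell
# Hypothetical stand-in for mi_info; all rows are invented for this sketch.
# Assumed columns: model edition color RAM storage price
tmp=$(mktemp -d)
cat > "$tmp/mi_info" <<'EOF'
mi9 standard black 6G 128G 2499
mi9 pro white 8G 256G 3299
note8 standard blue 4G 64G 999
note8 pro green 6G 128G 1399
EOF

by_regex=$(awk '/2499/' "$tmp/mi_info")                  # record contains 2499
by_field=$(awk '$5 == "256G"' "$tmp/mi_info")            # fifth field equals 256G
by_match=$(awk '$1 ~ /note/ { print }' "$tmp/mi_info")   # first field matches note

printf '%s\n---\n%s\n---\n%s\n' "$by_regex" "$by_field" "$by_match"
rm -rf "$tmp"
```

A pattern with no action prints the matching record, so the first two commands behave as if { print } had been written.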

Using -f and -v

Complex scripts can be stored in a file and invoked with -f. Variables can be passed from the command line with -v:

awk -v test="price is" '/note/ {print $1, test, $NF}' mi_info
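The same command can be moved into a program file and run with -f. In this sketch the file name note.awk and the data rows are hypothetical; only the awk program itself mirrors the one‑liner above.

```shell
tmp=$(mktemp -d)

# Hypothetical data rows standing in for mi_info.
cat > "$tmp/mi_info" <<'EOF'
mi9 standard black 6G 128G 2499
note8 standard blue 4G 64G 999
note8 pro green 6G 128G 1399
EOF

# Program file (the name note.awk is an assumption for this example).
cat > "$tmp/note.awk" <<'EOF'
# Print the first field, the label passed in with -v, and the last field.
/note/ { print $1, test, $NF }
EOF

out=$(awk -v test="price is" -f "$tmp/note.awk" "$tmp/mi_info")
echo "$out"
rm -rf "$tmp"
```

Keeping the program in a file also removes one layer of shell quoting, since the awk source no longer has to sit inside single quotes.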

Operators (precedence high to low)

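() (grouping)

$ (field reference)

++ -- (increment/decrement)

^ (exponentiation, right‑associative; ** is a common non‑POSIX synonym, accepted for example by gawk)

! + - (logical NOT, unary plus/minus)

* / % (multiply, divide, modulo)

+ - (addition, subtraction)

string concatenation (written by placing two expressions next to each other)

< <= == != > >= (comparisons)

~ !~ (regular‑expression match, non‑match)

in (array membership)

&& (logical AND)

|| (logical OR)

?: (ternary conditional, right‑associative)

= += -= *= /= %= ^= (assignment, right‑associative; **= where ** is supported)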
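A few expressions make the ordering concrete. This is an illustrative sketch that needs no input file; it sticks to ^ rather than ** so that it does not depend on a particular awk implementation.

```shell
out=$(awk 'BEGIN {
    print 2 ^ 3 ^ 2          # right-associative: 2 ^ (3 ^ 2)
    print 1 + 2 * 3          # multiplication binds tighter than addition
    print 1 " " 2 + 3        # addition binds tighter than concatenation
    x = 5; y = x++ * 2       # post-increment yields the old value of x
    print x, y
    r = (3 > 2 && 0 || 1) ? "yes" : "no"   # && before ||, then ?:
    print r
}')
echo "$out"
```

The ternary result is assigned to a variable before printing because a bare > inside a print statement would be read as output redirection.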

Built‑in Variables

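$n – the nth field; $0 – the whole record.

ARGC, ARGV – command‑line argument count and array.

FILENAME – name of the current input file.

FS, OFS – input and output field separators.

RS, ORS – input and output record separators.

NF, NR, FNR – number of fields in the current record, number of records read so far, and number of records read from the current file.

IGNORECASE – gawk only; a non‑zero value makes matching case‑insensitive.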
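The difference between NR and FNR only shows up with more than one input file. The sketch below uses two small files whose names and contents are invented for this example.

```shell
tmp=$(mktemp -d)
# Hypothetical input files.
printf 'a b c\nd e\n' > "$tmp/one.txt"
printf 'f\n'          > "$tmp/two.txt"

# OFS is set in BEGIN so that the commas in print emit "|" between values.
out=$(cd "$tmp" && awk 'BEGIN { OFS = "|" }
    { print FILENAME, NR, FNR, NF, $NF }' one.txt two.txt)
echo "$out"
rm -rf "$tmp"
```

NR keeps counting across files while FNR restarts at 1 for two.txt, and $NF picks the last field whatever the field count is.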

Regular‑Expression Metacharacters

^ (start of the string being matched, usually the record), $ (end of that string), . (any single character), * + ? (quantifiers: zero or more, one or more, zero or one), [] (character class), [^] (negated class), | (alternation), () (grouping).
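The sketch below exercises several of these metacharacters against a handful of made‑up words; the labels printed before each match are arbitrary names chosen for this example.

```shell
# Hypothetical word list, one word per record.
out=$(printf 'cat\ncart\ncoat\ndog\nconcat\n' | awk '
/^c.t$/        { print "dot:", $0 }        # c, any one character, t
/^ca+r?t$/     { print "quant:", $0 }      # one or more a, optional r
/^(dog|coat)$/ { print "alt:", $0 }        # grouping with alternation
/[^c]at$/      { print "negclass:", $0 }   # "at" at the end, not preceded by c
')
echo "$out"
```

Every rule is tested against every record in order, so a word such as cat can be reported by more than one rule, while concat matches none of them.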

Built‑in Functions

sub(), gsub() – substitution. index(), length(), substr(), match(), split() – string handling.

Arithmetic functions: atan2(), cos(), exp(), log(), sin(), sqrt(), int(), rand(), srand().
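A short sketch of the string functions and two of the arithmetic ones, run from a BEGIN block so that no input is needed. The sample string is arbitrary.

```shell
out=$(awk 'BEGIN {
    s = "hello world hello"
    t = s; sub(/hello/, "HI", t)           # replace the first match only
    u = s; n = gsub(/hello/, "HI", u)      # replace every match, return the count
    print t
    print u, n
    print index(s, "world"), length(s), substr(s, 7, 5)
    if (match(s, /wor+ld/)) print RSTART, RLENGTH   # position and length of match
    k = split("a,b,c", parts, ",")         # fills parts[1..k], returns k
    print k, parts[1], parts[3]
    print int(7 / 2), sqrt(16)
}')
echo "$out"
```

sub() and gsub() modify their third argument in place, which is why the string is copied into t and u first.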

Control Flow

Awk supports if … else, while, for, break, continue, next (skip to next record) and exit (terminate processing, optional status).
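These statements can be combined in one short program. The sketch below assumes a made‑up list of integers as input: it skips negative values with next, stops at the first zero with exit, and uses if, for and break on the rest.

```shell
# Hypothetical input: one integer per line.
out=$(printf '5\n-1\n12\n0\n7\n' | awk '
$1 < 0  { next }                  # skip this record, move to the next one
$1 == 0 { exit }                  # stop reading input; END still runs
{
    if ($1 > 10) size = "big"; else size = "small"
    stars = ""
    for (i = 1; i <= $1; i++) {
        if (i > 3) break          # cap the bar at three stars
        stars = stars "*"
    }
    print $1, size, stars
    count++
}
END { print "processed", count }
')
echo "$out"
```

The final 7 is never processed because exit fires on the 0 before it.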

System Interaction

Output redirection (> and >>), pipes (|) and the system() function let awk scripts interact with files and the shell, while printf formats output.
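A minimal sketch that combines these pieces: printf output is piped to sort, each name is appended to a log file, and system() runs a shell test on that file. The input rows and the log file name seen.log are assumptions for this example.

```shell
tmp=$(mktemp -d)

# Hypothetical input: a name and a count per line.
out=$(printf 'pear 3\napple 10\nfig 7\n' | awk -v log_file="$tmp/seen.log" '
{
    printf "%-6s %3d\n", $1, $2 | "sort"      # pipe formatted lines to sort
    print $1 >> log_file                       # append the name to a file
}
END {
    close("sort")                              # let sort finish and print
    close(log_file)                            # flush the log before testing it
    rc = system("test -s \"" log_file "\"")   # exit status of the shell command
    print "log non-empty:", (rc == 0 ? "yes" : "no")
}')
echo "$out"

logged=$(cat "$tmp/seen.log")
rm -rf "$tmp"
```

Calling close() on the pipe before the final print is what makes the sorted lines appear ahead of the status line.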

Summary and Best Practices

For complex logic, place the script in a file and invoke it with -f to avoid typing errors.

Leverage built‑in variables and functions, and call system() only where a shell command is really needed, to keep scripts concise.

When processing large files, review the script for avoidable work: for example, exit as soon as the needed record has been found, and avoid starting a shell command for every record.
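One possible reading of the performance advice, offered as an assumption rather than as the article's own recipe, is to stop reading as soon as the answer is known. The sketch below generates a hypothetical 100000‑line file and exits at the first matching record.

```shell
tmp=$(mktemp -d)

# Generate a hypothetical large input file: "row 1" .. "row 100000".
awk 'BEGIN { for (i = 1; i <= 100000; i++) print "row", i }' > "$tmp/big.txt"

# Print the first record whose number is a multiple of 1000, then stop;
# the remaining 99000 lines are never read.
first=$(awk '$2 % 1000 == 0 { print NR ": " $0; exit }' "$tmp/big.txt")
echo "$first"
rm -rf "$tmp"
```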

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

