Master Batch Text Processing with awk and sed: A Practical Guide for Sysadmins
This article walks through the fundamentals and advanced techniques of using awk and sed on Linux for batch text manipulation, covering field handling, custom delimiters, BEGIN/END blocks, conditional filtering, arrays, built‑in functions, real‑world Nginx log analysis, script creation, performance tips, common pitfalls, debugging tricks, and how to combine both tools for powerful pipelines.
Background and Applicable Scenarios
System administration frequently involves processing text files such as logs, configuration files, and CSV data. Manual line‑by‑line edits are time‑consuming and error‑prone. The GNU versions of awk and sed, which are bundled with most Linux distributions, provide a powerful way to automate the majority of batch text operations.
Extract specific fields (e.g., IP address, error code, response time) from large log files.
Perform bulk replacements across many servers (e.g., change a configuration parameter).
Convert formats, such as turning CSV into an HTML table.
Statistically analyse logs, for example counting requests per IP.
The article assumes a GNU environment; BSD/macOS versions have slight syntax differences.
awk Practical Guide
Understanding the Core Mechanics
awk reads a file line by line, splitting each line into fields on whitespace by default. The special variables are: $0 – the entire line. $1, $2, … – the first, second, etc., fields. NF – the number of fields, so $NF is the last field.
# Print the first and second columns of /etc/hosts
awk '{print $1, $2}' /etc/hosts
All later awk examples build on this rule.
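As a quick check of these variables, the sketch below feeds two made-up lines through awk and prints the field count, the first field, and the last field of each. The sample text is an assumption chosen only for illustration.

```shell
# Sample input is hypothetical; any whitespace-separated text works.
sample='alpha beta gamma
one two'

# NF is the number of fields on the current line; $NF is therefore the last field.
fields_out=$(printf '%s\n' "$sample" | awk '{print NF, $1, $NF}')
printf '%s\n' "$fields_out"
```

The second line has only two fields, so $NF there is the same as $2.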
Custom Field Separators
Use -F to specify a delimiter. Examples:
# Count calls per API in a CSV log (fields: time,api,status,response_time)
awk -F',' '{print $2}' access.log | sort | uniq -c | sort -rn
# Use a pipe as delimiter
awk -F'|' '{print $1, $3}' config.txt
# Accept either comma or semicolon
awk -F'[,;]' '{print $2}' mixed.txt
# Change output separator with OFS (set it in BEGIN so it applies from the first line)
awk -F':' 'BEGIN {OFS="->"} {print $1, $NF}' /etc/passwd | head -5
The output field separator OFS defaults to a single space.
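One detail worth seeing in action: OFS is used between comma-separated print arguments, and also when awk rebuilds $0 after a field assignment. The following is a minimal sketch on a made-up passwd-style record; the variable names are illustrative only.

```shell
line='root:x:0:0:root:/root:/bin/bash'   # hypothetical passwd-style record

# Commas in print insert OFS between the printed values.
with_commas=$(printf '%s\n' "$line" | awk -F':' 'BEGIN {OFS="->"} {print $1, $NF}')

# Assigning to a field (even to itself) forces awk to rebuild $0 using OFS.
rebuilt=$(printf '%s\n' "$line" | awk -F':' 'BEGIN {OFS="|"} {$1=$1; print}')

printf '%s\n%s\n' "$with_commas" "$rebuilt"
```

Without the `$1=$1` assignment, a bare `print` would emit the original line with its colons untouched.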
BEGIN and END Blocks
BEGIN runs before any input is read (useful for initialising variables or printing headers). END runs after all lines are processed (ideal for summarising results).
# Count failed passwords per IP, then add a header (printed after sorting so it stays on top)
awk '/Failed password/ {ip[$11]++} END {for (i in ip) print i "\t" ip[i]}' /var/log/auth.log | sort -k2 -rn | awk 'BEGIN {print "IP\tCount"} {print}'
# Print total number of lines
awk 'END {print NR}' access.log
# Initialise accumulator and sum response times
awk 'BEGIN {sum=0} {sum+=$5} END {print "Total response time:", sum}' access.log
NR is the number of lines read so far across all input; FNR is the line number within the current file, which matters when processing multiple files.
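The difference between NR and FNR only shows up with more than one input file. The sketch below creates two small temporary files (names are made up) and prints both counters for every line.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\nd\ne\n' > "$tmpdir/second.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
nr_fnr=$(cd "$tmpdir" && awk '{print FILENAME ":" NR ":" FNR}' first.txt second.txt)
printf '%s\n' "$nr_fnr"

rm -rf "$tmpdir"
```

This is also why the idiom `NR==FNR` is true only while the first file is being read.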
Conditional Filtering
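Place a condition (pattern) in front of the action block; only lines that satisfy it are processed. A pattern can be a comparison, a regular expression written between slashes (/regex/), or several of these joined with && and ||.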
# Process only status code 500
awk '$9 == 500 {print $1, $7, $9}' access.log
# Requests with response time > 5 seconds (assuming $NF is response time)
awk '$NF > 5 && $NF ~ /^[0-9.]+$/ {print $1, $7, $NF}' access.log
# Combine conditions: status 500 and response time > 10 seconds
awk '$9 == 500 && $NF > 10 && $NF ~ /^[0-9.]+$/ {print $0}' access.log
# Regex match: URL contains /api/login
awk '/\/api\/login/ {print $1, $7}' access.log
# Field regex match: second field contains "error"
awk -F',' '$2 ~ /error/ {print $1, $3}' error.log
# Negate: second field does NOT contain "error"
awk -F',' '$2 !~ /error/ {print $1, $3}' error.log
Arrays and Loops
awk associative arrays need no prior declaration, which makes them extremely useful for counting by key.
# Count requests per IP
awk '{ip[$1]++} END {for (k in ip) print ip[k], k}' access.log | sort -rn | head -20
# Average response time per API
awk -F',' '{api[$2]++; total[$2]+=$4} END {for (k in api) {avg=total[k]/api[k]; printf "%s: calls %d, avg %.2fms\n", k, api[k], avg}}' access.log
# Two‑dimensional array: daily calls per API
awk -F'[, ]' '{date=$4; api=$7; stats[date,api]++} END {for (key in stats) {split(key, parts, SUBSEP); print parts[1], parts[2], stats[key]}}' access.log
Multi‑dimensional subscripts are joined into one string key with SUBSEP, which defaults to \034.
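To make the SUBSEP mechanism concrete, here is a small sketch on invented date,api records. It counts per (date, api) pair and then splits each combined key back into its two parts; the output is piped through sort because `for (key in array)` iterates in no guaranteed order.

```shell
# Hypothetical records in the form date,api
records='2024-01-01,/login
2024-01-01,/login
2024-01-02,/search'

# stats[$1,$2] is stored under the single string key $1 SUBSEP $2.
per_day=$(printf '%s\n' "$records" | awk -F',' '
  {stats[$1,$2]++}
  END {
    for (key in stats) {
      split(key, parts, SUBSEP)      # recover the two key components
      print parts[1], parts[2], stats[key]
    }
  }' | sort)
printf '%s\n' "$per_day"
```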
Built‑in Functions
Common string and math functions:
# Upper‑/lower‑case conversion
awk '{print toupper($1), tolower($2)}' file.txt
# Substring extraction (first 10 characters)
awk '{print substr($1,1,10)}' file.txt
# Global substitution (similar to sed's s///g)
awk '{gsub(/error/, "ERROR"); print}' error.log
# Split a field into an array
awk '{n=split($0, parts, "/"); print "Fields:", n; for(i=1;i<=n;i++) print i, parts[i]}' <<< "a/b/c/d"
# Index of a substring
awk 'BEGIN {print index("hello world", "world")}'
# Regex match with position
awk 'BEGIN {if (match("error 1234 at line 50", /[0-9]+/)) print RSTART, RLENGTH}'
# Formatted output
awk '{printf "%s\t%.2f\t%06d\n", $1, $2, $3}' data.txt
# Math functions (srand() is called first so that rand() differs between runs)
awk 'BEGIN {srand(); print sqrt(144); print int(3.7); print rand(); print sin(3.14159/2); print log(exp(1))}'
Real‑World Example: Extracting Key Metrics from Nginx Access Logs
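The standard combined log format records $remote_addr, $remote_user, $time_local, $request, $status, $body_bytes_sent, $http_referer, and $http_user_agent. Because awk splits on whitespace by default, they land in the following awk fields:
$1 = $remote_addr # client IP
$3 = $remote_user # authenticated user ($2 is the literal "-")
$4, $5 = [$time_local] # timestamp and timezone offset
$6, $7, $8 = "$request" # method, URL, protocol
$9 = $status # HTTP status code
$10 = $body_bytes_sent
$11 = "$http_referer"
$12 … = "$http_user_agent" # may span several fields
If the log format also appends $request_time, it appears as the last field ($NF).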
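To confirm which awk field positions the commands in this section rely on, the following sketch splits one made-up log line on whitespace. The line itself and the appended request time are assumptions; a real log_format may differ.

```shell
# A made-up combined-format line with $request_time appended (assumption).
log_line='203.0.113.7 - - [10/Oct/2023:13:55:36 +0800] "GET /api/login?x=1 HTTP/1.1" 200 512 "-" "curl/8.0" 0.123'

# $1 client IP, $4 timestamp (with leading "["), $7 URL, $9 status,
# $10 bytes sent, $NF request time.
parsed=$(printf '%s\n' "$log_line" | awk '{print $1, $4, $7, $9, $10, $NF}')
printf '%s\n' "$parsed"
```

Because the quoted request and the bracketed timestamp each contain spaces, the status code ends up in $9 and the byte count in $10.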
# 1. Requests per minute (peak traffic); substr skips the leading "[" and keeps dd/Mon/yyyy:HH:MM
awk '{print substr($4, 2, 17)}' access.log | sort | uniq -c | sort -rn | head -20
# 2. HTTP status distribution, most frequent first
awk '{status[$9]++} END {for (s in status) printf "Status %s: %d (%.1f%%)\n", s, status[s], status[s]*100/NR}' access.log | sort -k3,3 -rn
# 3. Slow requests (>3 s), slowest first
awk 'NF>=9 && $NF ~ /^[0-9.]+$/ && $NF>3 {time=substr($4,2,21); url=$7; dur=$NF; ip=$1; printf "[%s] %s %s response %.1fs\n", time, ip, url, dur}' access.log | sort -k5,5 -rn | head -30
# 4. Per‑IP request count and bandwidth
awk '{ip[$1]++; bytes[$1]+=$10} END {for (i in ip) printf "%s: %d requests, %.2f MB\n", i, ip[i], bytes[i]/1024/1024}' access.log | sort -k2 -rn | head -20
# 5. Calls per endpoint and average response time ($request_time is in seconds); header printed separately so sort cannot move it
printf "%-40s %10s %12s\n" "Endpoint" "Calls" "Avg (s)"
awk '{url=$7; sub(/\?.*/,"",url); count[url]++; if ($NF ~ /^[0-9.]+$/) resp[url]+=$NF} END {for (u in count) {avg=(u in resp)?resp[u]/count[u]:0; printf "%-40s %10d %12.3f\n", u, count[u], avg}}' access.log | sort -k3 -rn | head -20
Replace access.log with the actual log path when using these commands.
awk Script Files
For complex logic, store the program in a file to simplify debugging and reuse.
#!/usr/bin/awk -f
# File: analyze_nginx.awk
# Usage: awk -f analyze_nginx.awk access.log
BEGIN {FS = "[ ]+"; print "Starting analysis..."}
# Only count lines whose 9th field looks like an HTTP status code
$9 ~ /^[0-9]{3}$/ {
total++; ip[$1]++; status[$9]++; bytes[$1]+=$10
url=$7; sub(/\?.*/,"",url); page[url]++
if ($NF ~ /^[0-9.]+$/) resp_sum[url]+=$NF
}
END {
print "\n========== Access Statistics =========="
print "Total requests:", total+0
print "\n=== Top 10 IPs ==="
fflush() # flush our own output before sort writes to the same stream
cmd = "sort -k1,1nr | head -10"
for (i in ip) printf "%d %s\n", ip[i], i | cmd
close(cmd)
print "\n=== Status Distribution ==="
for (s in status) printf " %s: %d (%.1f%%)\n", s, status[s], status[s]*100/total
print "\n=== Top 10 Slow Endpoints (avg seconds) ==="
fflush()
for (u in page) if ((u in resp_sum) && resp_sum[u]>0) printf "%.3f %s (%d calls)\n", resp_sum[u]/page[u], u, page[u] | cmd
close(cmd)
}
Make the script executable (chmod +x analyze_nginx.awk) and run it with the log file.
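The script-file workflow can be tried end to end without a real log. The sketch below writes a tiny, hypothetical script named count_ips.awk to a temporary directory and runs it with `awk -f` on three invented lines; sorting is delegated to the external sort command from inside END.

```shell
tmpdir=$(mktemp -d)

# count_ips.awk is a hypothetical, minimal script file.
cat > "$tmpdir/count_ips.awk" <<'EOF'
# Count occurrences of the first field and print them, highest first.
{ hits[$1]++ }
END {
    cmd = "sort -k2,2nr -k1,1"
    for (ip in hits) print ip, hits[ip] | cmd
    close(cmd)    # flush and wait for sort before awk exits
}
EOF

top_ips=$(printf '10.0.0.1 /a\n10.0.0.2 /b\n10.0.0.1 /c\n' | awk -f "$tmpdir/count_ips.awk")
printf '%s\n' "$top_ips"

rm -rf "$tmpdir"
```

Keeping the program in a file avoids shell-quoting problems and lets comments sit next to the logic they describe.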
Performance Optimisation Tips
When processing large files, consider:
Initialising variables in a BEGIN block to avoid per‑line overhead.
Using next to skip irrelevant lines early.
Avoiding repeated string concatenation inside loops; use arrays instead.
Splitting massive files into chunks and processing them in parallel (e.g., split -l 100000 + parallel).
# Initialise in BEGIN
awk 'BEGIN {FS=","} $3>100 {count++} END {print count}' large.csv
# Skip header line
awk 'NR==1 {next} $3>100 {count++} END {print count}' large.csv
# Use array instead of repeated concatenation
awk '{arr[NR]=$1}' bigfile
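# Chunked processing example: split, process each chunk, then merge the partial results
# (as written the loop runs sequentially; GNU parallel or xargs -P can run the chunks concurrently)
split -l 100000 big.log part_
for f in part_*; do awk '...' "$f"; done | awk '...'
sed Practical Guide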
How It Works
sed is a stream editor: it reads one line into the pattern space, applies the commands, and then prints the result. Core concepts:
Pattern space : the current line being processed.
Hold space : a secondary buffer for multi‑line operations.
Address : selects which lines to operate on (line numbers, regex, or conditions).
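These three concepts can be seen together in a short experiment. The sketch below uses a four-line made-up input: the first command relies on an address to select lines, and the second uses the hold space to print the lines in reverse order.

```shell
input=$(printf 'one\ntwo\nthree\nfour\n')

# Address: -n suppresses auto-print, so only lines 2-3 are printed.
addressed=$(printf '%s\n' "$input" | sed -n '2,3p')

# Hold space: 1!G appends the hold space to the pattern space (except on line 1),
# h copies the result back to the hold space, and $p prints it on the last line.
reversed=$(printf '%s\n' "$input" | sed -n '1!G;h;$p')

printf '%s\n---\n%s\n' "$addressed" "$reversed"
```

On each line the pattern space holds the new line followed by everything saved so far, which is why the final print comes out reversed.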
# Basic syntax
sed [options] 'command' file
# Common options
-n # suppress automatic printing (use with p)
-i # edit file in‑place (dangerous, use -i.bak for backup)
-e # execute multiple commands
-f # read commands from a file
-r # enable extended regular expressions (-E is equivalent)
Substitution Command s
Syntax: s/pattern/replacement/flags.
# Replace first occurrence of "error" on each line
sed 's/error/ERROR/' error.log
# Global replacement
sed 's/error/ERROR/g' error.log
# Case‑insensitive replacement (the I flag is a GNU extension; it is unrelated to the -i option)
sed 's/error/ERROR/gI' error.log
# Replace only on line 3
sed '3s/error/ERROR/' error.log
# Replace lines 3‑7
sed '3,7s/error/ERROR/g' error.log
# Replace only lines matching "error"
sed '/error/s/error/ERROR/g' error.log
# Delete matching lines
sed '/error/d' error.log
# Preview replacement without modifying file
sed -n 's/old/new/p' file.txt
# Backup before in‑place edit
sed -i.bak 's/old/new/g' file.txt
# Use alternative delimiter to avoid escaping slashes
sed 's#/etc/nginx#/opt/nginx#g' config.conf
# Extended regex: collapse multiple spaces
sed -r 's/ +/ /g' file.txt
# Capture group example: mask IP address
sed -E 's/(192\.168\.1\.)[0-9]+/\1XXX/' config.txt
# Multiple replacements in one command
sed -e 's/old1/new1/g' -e 's/old2/new2/g' file.txt
# Apply commands from a file
sed -f replace.txt error.log
Addresses and Ranges
Specify which lines to act on:
# Single line
sed '5s/old/new/' file.txt
# Range of lines
sed '1,10s/old/new/g' file.txt
# From line 5 to end
sed '5,$s/old/new/g' file.txt
# Regex‑matched lines
sed '/error/s/old/new/g' file.txt
# From "start" to "end"
sed '/start/,/end/s/old/new/g' file.txt
# Inverse address (exclamation mark)
sed '5!d' file.txt # keep only line 5
sed '/error/!s/old/new/g' file.txt # replace only non‑error lines
# Step address (GNU): line 1 and every 5th line after it (1, 6, 11, …)
sed '1~5s/old/new/g' file.txt
Delete Command d
Deletion is risky; preview with -n and p first.
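A minimal sketch of that preview-then-delete habit, run on a temporary file with invented content: the same address is first used with `-n` and `p` to show what would go, and only then with `d`.

```shell
tmpfile=$(mktemp)
printf 'keep 1\n# comment\nkeep 2\n' > "$tmpfile"

# Step 1: preview exactly which lines the address selects (nothing is changed).
would_delete=$(sed -n '/^#/p' "$tmpfile")

# Step 2: apply the same address with d; output goes to stdout, the file is untouched.
after_delete=$(sed '/^#/d' "$tmpfile")

printf 'would delete: %s\nresult:\n%s\n' "$would_delete" "$after_delete"
rm -f "$tmpfile"
```

Only once both outputs look right would it make sense to add `-i.bak` and modify the file itself.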
# Delete empty lines
sed '/^$/d' file.txt
# Delete lines containing only whitespace
sed '/^[[:space:]]*$/d' file.txt
# Delete comment lines starting with '#'
sed '/^#/d' config.conf
# Trim leading spaces/tabs
sed 's/^[ \t]*//' file.txt
# Trim trailing spaces/tabs
sed 's/[ \t]*$//' file.txt
# Delete a specific line
sed '1d' file.txt
# Delete the last line
sed '$d' file.txt
# Delete a range of lines
sed '1,10d' file.txt
# Delete a line and the next two lines after a match
sed '/error/,+2d' file.txt
Insert, Append, and Change
# Insert before line 10 (GNU one-line form)
sed '10i new line content' file.txt
# Append after line 10
sed '10a new line content' file.txt
# Insert before a pattern
sed '/pattern/i new line before pattern' file.txt
# Append after a pattern
sed '/pattern/a new line after pattern' file.txt
# Replace entire line
sed '10c new entire line content' file.txt
# Insert at file start
sed '1i\Header line' file.txt
# Append at file end
sed '$a\Footer line' file.txt
# Insert a blank line after each line
sed 'G' file.txt
# Insert a blank line only after matching lines
sed '/pattern/G' file.txt
Multi‑Line Processing
By default sed works line‑by‑line; the N command can join the next line to the pattern space.
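As a small illustration of N, the sketch below joins every pair of input lines into one. The key/value input is made up; an even number of lines is assumed so that every N has a following line to read.

```shell
# N appends the next input line to the pattern space, separated by "\n";
# the substitution then replaces that embedded newline with a space.
paired=$(printf 'key1\nvalue1\nkey2\nvalue2\n' | sed 'N;s/\n/ /')
printf '%s\n' "$paired"
```

After N the pattern space holds two lines at once, which is what lets a single `s` command see and replace the newline between them.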
# Delete from "error start" to "error end" inclusive (a plain range address is enough)
sed '/error start/,/error end/d' file.txt
# Collapse consecutive empty lines into a single empty line
sed '/^$/N;/^\n$/D' file.txt
# Convert Unix line endings to Windows
sed 's/$/\r/' unix.txt > windows.txt
# Convert Windows line endings to Unix
sed -i 's/\r$//' file.txt
File I/O and Pipelines
# Insert file content after line 5
sed '5r /etc/hosts' file.txt
# Write lines 5‑10 to a new file
sed '5,10w /tmp/extracted.txt' file.txt
# Write a section delimited by markers to a file
sed '/START/,/END/w /tmp/section.txt' file.txt
# Pipe input through sed
cat file.txt | sed 's/old/new/g'
# Process multiple files and redirect output
sed 's/old/new/g' file1.txt file2.txt > output.txt
# In‑place edit of multiple files
sed -i 's/old/new/g' file1.txt file2.txt file3.txt
Real‑World Example: Bulk Configuration Modification
Common sysadmin task: modify Nginx configuration across many servers.
# Change all "listen 80;" to "listen 8080;"
sed -i 's/listen\s*80;/listen 8080;/g' /etc/nginx/conf.d/*.conf
# Replace old domain with new domain (dots escaped so they match literally)
sed -i 's/server_name\s*old-domain\.com;/server_name new-domain.com;/g' /etc/nginx/conf.d/*.conf
# Add a custom header at the beginning of each server block
sed -i '/server {/a\ add_header X-Server "nginx-1.24" always;' /etc/nginx/nginx.conf
# Delete all empty lines
sed -i '/^$/d' /etc/nginx/nginx.conf
# Append a timeout directive after line 25
sed -i '25a\proxy_read_timeout 300;' /etc/nginx/nginx.conf
Similar patterns apply to MySQL, application property files, Docker configuration, etc.
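For instance, the same approach carries over to a key=value properties file. The sketch below is an assumption-laden illustration: the key names and values are invented, and the result is written to stdout for review rather than edited in place.

```shell
tmpfile=$(mktemp)
# Properties-style content; key names are made up for illustration.
printf 'server.port=8080\n# server.timeout=30\ndb.host=localhost\n' > "$tmpfile"

# Change the port and uncomment the timeout, writing to stdout for review first.
updated=$(sed -e 's/^server\.port=.*/server.port=9090/' \
              -e 's/^#[[:space:]]*\(server\.timeout=\)/\1/' "$tmpfile")
printf '%s\n' "$updated"

rm -f "$tmpfile"
```

Anchoring each pattern with `^` keeps the substitution from touching other keys that merely contain the same text.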
sed Script Files
For complex replacements, store commands in a script file.
#!/bin/bash
# batch_modify.sh – bulk Nginx config changes
OLD_DOMAIN="old.example.com"
NEW_DOMAIN="new.example.com"
CONFIG_DIR="/etc/nginx/conf.d"
BACKUP_DIR="/tmp/nginx_backup_$(date +%Y%m%d%H%M%S)"
mkdir -p "$BACKUP_DIR"
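for conf in "$CONFIG_DIR"/*.conf; do
[ -f "$conf" ] || continue
cp "$conf" "$BACKUP_DIR/"
echo "Backed up: $conf"
done
# Domain replacement (escape the dots in the old domain so they match literally)
OLD_DOMAIN_RE=$(printf '%s' "$OLD_DOMAIN" | sed 's/\./\\./g')
sed -i "s/server_name\s*$OLD_DOMAIN_RE;/server_name $NEW_DOMAIN;/g" "$CONFIG_DIR"/*.conf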
# Add security headers
sed -i '/server {/a\ add_header X-Content-Type-Options "nosniff" always;' "$CONFIG_DIR"/*.conf
sed -i '/server {/a\ add_header X-Frame-Options "SAMEORIGIN" always;' "$CONFIG_DIR"/*.conf
sed -i '/server {/a\ add_header X-XSS-Protection "1; mode=block" always;' "$CONFIG_DIR"/*.conf
# Verify syntax
if nginx -t; then echo "Configuration updated successfully"; else echo "Nginx test failed, restoring backup"; cp "$BACKUP_DIR"/*.conf "$CONFIG_DIR"/; exit 1; fi
Combining awk and sed
Using both tools in a pipeline enables sophisticated processing.
# Pre‑process with sed (map Chinese log keywords to English), then analyse with awk
sed 's/错误/error/g; s/警告/warning/g; s/成功/success/g' app.log | \
awk '/error/ {count++} END {print "Total errors:", count+0}'
# Extract key‑value pairs with awk, format as table rows with sed (assumes keys contain no spaces)
awk -F'=' '{print $1, $2}' config.txt | \
sed 's/ / | /; s/^/| /; s/$/ |/'
# Use sed to normalise delimiters, then awk for statistics
sed 's/|/,/g' app.log | \
awk -F',' '{status[$3]++; module[$4]++} END {print "=== By Status ==="; for (s in status) print s, status[s]; print "=== By Module ==="; for (m in module) print m, module[m]}'
# Extract URLs from HTML with awk
awk -F'href="' '{n=split($0, parts, "href=\""); for(i=2;i<=n;i++){split(parts[i], url, "\""); print url[1]}}' page.html | grep -v '^$'
# Flexible field handling: print last and second‑last fields
awk '{print $NF; n=NF; print $(n-1)}' file.txt
# Join two files (like SQL JOIN) using awk
awk -F',' 'NR==FNR {user[$1]=$2; next} $2 in user {print user[$2], $1, $3}' users.txt orders.txt
Common Pitfalls and Troubleshooting
awk and sed Traps
# Trap 1: -i modifies files directly – always backup first
sed -i.bak 's/old/new/g' file.txt
# Trap 2: Escape special characters in regex
sed 's/192\.168\.1\.1/192.168.1.100/g' file.txt
# Trap 3: Shell variables inside single quotes are not expanded
NEW_VALUE="new"
sed "s/old/$NEW_VALUE/g" file.txt
# Correct: use double quotes for sed; pass values to awk with -v
# Trap 4: $0, $1, … inside an awk program are awk fields, not shell variables
VAR=100
awk -v var="$VAR" '$1 == var' file.txt
# Trap 5: awk streams its input, but arrays that accumulate every line can exhaust memory – split very large files
split -l 100000 bigfile chunk_
for f in chunk_*; do awk '...' "$f"; done > result.txt
# Trap 6: Hidden spaces/tabs – use cat -A to visualise
cat -A file.txt
# Trap 7: Locale issues with Unicode – set LC_ALL=C for pure byte processing
LC_ALL=C awk '{print $1}' chinese.txt
# Trap 8: Multi‑line records – adjust RS (record separator); RS="" treats blank-line-separated paragraphs as records
awk 'BEGIN{RS=""; FS="\n"} {for(i=1;i<=NF;i++) print $i}' app.log
Debugging Techniques
# View intermediate sed output
sed 's/old/new/g' file.txt | head
# Simulate changes without writing back
sed 's/old/new/g' file.txt > new_file.txt
# Print debugging info in awk
awk '{print "Processing line:", NR; print "First field:", $1; print "Last field:", $NF}' file.txt
# Send debug messages to stderr
awk '{if (DEBUG) print "DEBUG:", $0 > "/dev/stderr"}' DEBUG=1 file.txt
# Watch a log file and re-run the analysis on every modification (requires inotify-tools)
inotifywait -m -e modify /var/log/app.log | \
while read -r _; do awk '...' /var/log/app.log; done
Comprehensive Real‑World Script: Nginx Log Analyzer
A complete bash script that combines awk and sed to generate a detailed report.
#!/bin/bash
# nginx_log_analyzer.sh – generate Nginx access log report
set -e
LOG_FILE="${1:-/var/log/nginx/access.log}"
REPORT_FILE="${2:-nginx_report_$(date +%Y%m%d_%H%M%S).txt}"
# Validate input file
if [ ! -f "$LOG_FILE" ]; then echo "Error: $LOG_FILE does not exist"; exit 1; fi
if [ ! -s "$LOG_FILE" ]; then echo "Warning: $LOG_FILE is empty"; exit 0; fi
# Header
echo "========================================" | tee "$REPORT_FILE"
echo "Nginx Access Log Analysis Report" | tee -a "$REPORT_FILE"
echo "Log file: $LOG_FILE" | tee -a "$REPORT_FILE"
echo "Generated at: $(date '+%Y-%m-%d %H:%M:%S')" | tee -a "$REPORT_FILE"
echo "========================================" | tee -a "$REPORT_FILE"
# Total requests
TOTAL_REQ=$(awk 'END {print NR}' "$LOG_FILE")
echo "[Total Requests] $TOTAL_REQ" | tee -a "$REPORT_FILE"
# HTTP status distribution
echo "[HTTP Status Distribution]" | tee -a "$REPORT_FILE"
awk '{status[$9]++} END {for (s in status) {pct=status[s]*100/NR; printf " %s: %d (%.2f%%)\n", s, status[s], pct}}' "$LOG_FILE" | sort | tee -a "$REPORT_FILE"
# Top 20 IPs
echo "[Top 20 IPs]" | tee -a "$REPORT_FILE"
awk '{ip[$1]++} END {for (i in ip) print ip[i], i | "sort -rn | head -20"}' "$LOG_FILE" | while read -r count ip; do printf " %s: %d requests\n" "$ip" "$count"; done | tee -a "$REPORT_FILE"
# Top 20 slow requests (>1 s)
echo "[Top 20 Slow Requests (>1 s)]" | tee -a "$REPORT_FILE"
awk 'NF>=9 && $NF ~ /^[0-9.]+$/ && $NF>1 {printf " %s %s response %.2fs\n", $4, $7, $NF}' "$LOG_FILE" | sort -k4,4 -rn | head -20 | tee -a "$REPORT_FILE"
# Top 20 endpoints by call count and average response time
echo "[Top 20 Endpoints (by calls)]" | tee -a "$REPORT_FILE"
awk '{url=$7; sub(/\?.*/,"",url); count[url]++; if ($NF ~ /^[0-9.]+$/) resp[url]+=$NF} END {for (u in count) {avg=(u in resp)?resp[u]/count[u]:0; print count[u], avg, u | "sort -rn | head -20"}}' "$LOG_FILE" | while read -r cnt avg url; do printf " %d: %s (avg %.3f s)\n" "$cnt" "$url" "$avg"; done | tee -a "$REPORT_FILE"
# Bandwidth consumption (field $10 is body_bytes_sent)
echo "[Bandwidth Consumption]" | tee -a "$REPORT_FILE"
awk '{bytes+=$10} END {mb=bytes/1024/1024; gb=mb/1024; printf " Total traffic: %.2f MB (%.4f GB)\n", mb, gb}' "$LOG_FILE" | tee -a "$REPORT_FILE"
# Hourly request distribution
echo "[Hourly Request Distribution]" | tee -a "$REPORT_FILE"
awk '{hour=substr($4,14,2); hourly[hour]++} END {for (h=0; h<=23; h++) {hh=sprintf("%02d",h); printf " %s:00‑%s:59: %d requests\n", hh, hh, hourly[hh]+0}}' "$LOG_FILE" | tee -a "$REPORT_FILE"
# Top 10 User‑Agent statistics
echo "[Top 10 User‑Agents]" | tee -a "$REPORT_FILE"
awk -F'"' '{ua=$6; if (ua!="") {sub(/^ */,"",ua); count[ua]++}} END {for (u in count) print count[u], u | "sort -rn | head -10"}' "$LOG_FILE" | while read -r cnt ua; do printf " %d: %s\n" "$cnt" "$ua"; done | tee -a "$REPORT_FILE"
# Error request statistics
echo "[Error Request Statistics]" | tee -a "$REPORT_FILE"
awk '{if ($9>=500) e5xx++; if ($9==404) e404++; if ($9==403) e403++; if ($9==400) e400++} END {printf " 5xx errors: %d\n", e5xx+0; printf " 404 errors: %d\n", e404+0; printf " 403 forbidden: %d\n", e403+0; printf " 400 errors: %d\n", e400+0}' "$LOG_FILE" | tee -a "$REPORT_FILE"
# Footer
echo "========================================" | tee -a "$REPORT_FILE"
echo "Report generated: $REPORT_FILE" | tee -a "$REPORT_FILE"
echo "========================================" | tee -a "$REPORT_FILE"
Performance Comparison and Tool Selection
awk vs sed vs grep vs cut
Field extraction : awk, cut – awk offers richer functionality.
Simple substitution : sed – most concise syntax.
Complex statistics : awk – powerful arrays and functions.
Line filtering : grep, sed – grep is straightforward; sed can modify files directly.
Formatted output : awk – strong printf capabilities.
Conditional processing : awk – natural condition syntax.
Cross‑line handling : awk – native RS support.
Large file handling : awk, grep, sed – all three stream their input; memory only becomes a concern when awk arrays or sed's hold space accumulate data.
Performance Benchmarks (100 MB log file)
# Field extraction (cut vs awk)
time cut -d' ' -f1 access.log | head -100000 > /dev/null # ~0.5 s
time awk '{print $1}' access.log | head -100000 > /dev/null # ~0.8 s
# Global substitution
time sed 's/error/ERROR/g' access.log > /dev/null # ~1.2 s
time awk '{gsub(/error/,"ERROR"); print}' access.log > /dev/null # ~2.5 s
# Count unique IPs
time awk '{print $1}' access.log | sort | uniq -c | sort -rn > /dev/null # ~8 s
Conclusion
awk and sed are the Swiss‑army knives for sysadmins handling text. Use awk for structured data analysis, aggregations, and formatted reports; use sed for straightforward line‑oriented edits and bulk configuration changes. In practice, combine them in pipelines (grep | awk | sed) to leverage each tool’s strengths.
