
Master Apache Log Analysis with 20 Essential Linux Commands

This guide presents a curated collection of 20 practical Linux one‑liners—using awk, grep, netstat, and other shell tools—to extract IP counts, page views, bandwidth, error rates, concurrency, and other key metrics from Apache access logs, enabling quick and thorough server traffic analysis.

Liangxu Linux

Overview

The article compiles a set of useful shell commands for analyzing Apache HTTP server logs. Each command focuses on a specific metric such as unique visitor IPs, page request frequencies, bandwidth usage, error codes, or concurrent connections.

Key Commands

Count distinct IP addresses :

awk '{print $1}' log_file | sort | uniq | wc -l

Count accesses to a specific page :

grep "/index.php" log_file | wc -l

Show how many pages each IP accessed :

awk '{++S[$1]} END {for (a in S) print a,S[a]}' log_file | sort -n -t ' ' -k 2

Sort IPs by number of pages visited (ascending) :

awk '{++S[$1]} END {for (a in S) print S[a],a}' log_file | sort -n
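
The counting commands above can be tried safely on a small sample file before pointing them at a production log. The sketch below is only an illustration: the three log lines are made-up data in Apache's combined log format, and the function names distinct_ips and requests_per_ip are hypothetical helpers, not part of any tool.

```shell
# Hypothetical sample data in combined log format (not from a real server).
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 1024 "http://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0)"
10.0.0.1 - - [16/Aug/2015:15:10:00 +0800] "GET /missing HTTP/1.1" 404 512 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF

# Distinct client IPs; sort -u does the same job as sort | uniq.
distinct_ips() {
  awk '{print $1}' "$1" | sort -u | wc -l | tr -d ' '
}

# Requests per IP as "count ip" lines, busiest client first.
requests_per_ip() {
  awk '{++S[$1]} END {for (a in S) print S[a], a}' "$1" | sort -nr
}

distinct_ips "$log_file"
requests_per_ip "$log_file"
```

Putting the count first keeps the later sort a plain numeric sort, with no need for the -t and -k options.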

List pages visited by a given IP :

grep '^111\.111\.111\.111 ' log_file | awk '{print $1,$7}'

Count distinct IPs, excluding search‑engine crawlers :

awk '{print $12,$1}' log_file | grep '^"Mozilla' | awk '{print $2}' | sort | uniq | wc -l
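
Many crawlers also send a user agent that begins with Mozilla, so matching on the first token alone may still count them. One possible alternative, sketched below, splits each line on double quotes so that the whole user-agent string is available as a single field, then drops agents that contain common crawler keywords. The keyword list (bot, crawl, spider), the sample lines, and the function name human_ips are assumptions made for illustration, and the quote-based split will misparse lines whose request or referer contains an escaped quote.

```shell
# Hypothetical sample data in combined log format.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64)"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
10.0.0.3 - - [16/Aug/2015:14:00:03 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 10.0)"
10.0.0.1 - - [16/Aug/2015:14:00:04 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (X11; Linux x86_64)"
EOF

# With '"' as the separator: $1 = text before the request, $2 = request line,
# $4 = referer, $6 = full user-agent string.
human_ips() {
  awk -F'"' 'tolower($6) !~ /bot|crawl|spider/ {
               split($1, parts, " ")   # parts[1] is the client IP
               print parts[1]
             }' "$1" | sort -u | wc -l | tr -d ' '
}

human_ips "$log_file"
```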

IP count for a specific hour :

awk '{print $4,$1}' log_file | grep 16/Aug/2015:14 | awk '{print $2}' | sort | uniq | wc -l
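
The same hourly count can be done in a single awk pass, which avoids matching the date string in some other part of the line. This is a sketch under the assumption that $4 holds the timestamp with its leading "[" as in the combined log format; the sample data and the function name ips_in_hour are hypothetical.

```shell
# Hypothetical sample data in combined log format.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:30:00 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.1 - - [16/Aug/2015:14:45:10 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:15:00:00 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
EOF

# $1 = log file, $2 = hour in dd/Mon/yyyy:HH form.
# index(...) == 1 requires the timestamp field to start with "[<hour>:".
ips_in_hour() {
  awk -v hour="$2" 'index($4, "[" hour ":") == 1 {print $1}' "$1" |
    sort -u | wc -l | tr -d ' '
}

ips_in_hour "$log_file" "16/Aug/2015:14"
```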

Top 10 IP addresses by request count :

awk '{print $1}' access_log | sort | uniq -c | sort -nr | head -10

Top 10 most requested URLs :

awk '{print $7}' log_file | sort | uniq -c | sort -nr | head -10

Top 20 referring domains (from the Referer field) :

cat access.log | awk '{print $11}' | sed -e 's/http:\/\///' -e 's/\/.*//' | sort | uniq -c | sort -rn | head -20
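
The referer command above strips only the http:// prefix and leaves the field's opening quote in place. A variant that also handles https:// and skips empty referers might look like the sketch below. It assumes the combined log format, where splitting on double quotes puts the referer in the fourth field; the sample data and the function name top_referers are hypothetical.

```shell
# Hypothetical sample data in combined log format.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 1024 "http://example.com/a" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /index.php HTTP/1.1" 200 1024 "https://example.com/b?x=1" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:14:00:03 +0800] "GET /about.php HTTP/1.1" 200 2048 "http://blog.example.org/post" "Mozilla/5.0"
10.0.0.4 - - [16/Aug/2015:14:00:04 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
EOF

# Print the referer ($4 when split on '"'), drop "-" entries, then
# strip the scheme and everything from the first "/" onward.
top_referers() {
  awk -F'"' '$4 != "-" && $4 != "" {print $4}' "$1" |
    sed -E 's#^https?://##; s#/.*##' |
    sort | uniq -c | sort -rn | head -n 20
}

top_referers "$log_file"
```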

Largest PHP responses (by bytes transferred) :

cat www.access.log | awk '($7~/\.php/){print $10 " " $1 " " $4 " " $7}' | sort -nr | head -100

Pages larger than 200 KB and their hit counts :

cat www.access.log | awk '($10 > 200000 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100

Slowest PHP pages (response time > 60 s) :

cat www.access.log | awk '($NF > 60 && $7~/\.php/){print $7}' | sort -n | uniq -c | sort -nr | head -100

Process counts by command name (Apache workers show up as httpd or apache2) :

ps -ef | awk -F ' ' '{print $8 " " $9}' | sort | uniq -c | sort -nr | head -20

Active connections (ESTABLISHED) :

netstat -an | grep ESTABLISHED | wc -l

Total port‑80 connections :

netstat -nat | grep ":80 " | wc -l

IP‑wise connection statistics :

netstat -n | awk '/^tcp/ {n=split($(NF-1),array,":");if(n<=2)++S[array[(1)]];else++S[array[(4)]];++s[$NF];++N} END {for(a in S){printf("%-20s %s\n", a, S[a]);++I}printf("%-20s %s\n","TOTAL_IP",I);for(a in s) printf("%-20s %s\n",a, s[a]);printf("%-20s %s\n","TOTAL_LINK",N);}'

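The per-IP part of that one-liner is easier to follow when written out. The sketch below reads netstat-style text from standard input, so it can be checked against canned lines first. It assumes the Linux column layout, where the foreign address is the second-to-last field, and it keeps the one-liner's treatment of IPv4-mapped addresses such as ::ffff:1.2.3.4:port; other IPv6 addresses will not be reported correctly. The sample lines and the function name per_ip_connections are hypothetical.

```shell
# Canned sample in the layout printed by "netstat -n" on Linux (made-up data).
sample=$(mktemp)
cat > "$sample" <<'EOF'
tcp        0      0 192.168.1.10:80         203.0.113.5:51000       ESTABLISHED
tcp        0      0 192.168.1.10:80         203.0.113.5:51001       TIME_WAIT
tcp        0      0 192.168.1.10:80         198.51.100.7:40000      ESTABLISHED
tcp6       0      0 ::ffff:192.168.1.10:80  ::ffff:203.0.113.9:1234 ESTABLISHED
EOF

# Count connections per remote IP, printed as "count ip", busiest first.
per_ip_connections() {
  awk '/^tcp/ {
         n = split($(NF-1), part, ":")        # foreign address, split on ":"
         ip = (n <= 2) ? part[1] : part[4]    # plain IPv4, or ::ffff:a.b.c.d:port
         count[ip]++
       }
       END { for (ip in count) print count[ip], ip }' | sort -nr
}

per_ip_connections < "$sample"
# Against live data: netstat -n | per_ip_connections
```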
Bandwidth (GB) used :

cat access.log | awk '{sum+=$10} END {print sum/1024/1024/1024}'

List 404 responses with their URLs :

awk '($9 ~/404/)' access.log | awk '{print $9,$7}' | sort

HTTP status distribution :

cat access.log | awk '{counts[$(9)]+=1}; END {for(code in counts) print code, counts[code]}'
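
A raw count per status code is often easier to read with a percentage next to it, since that gives a rough error rate. The sketch below extends the same counting idea; the sample data and the function name status_distribution are hypothetical, and $9 is assumed to be the status code as in the combined log format.

```shell
# Hypothetical sample data in combined log format.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:02 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:14:00:03 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.1 - - [16/Aug/2015:14:00:04 +0800] "GET /missing HTTP/1.1" 404 512 "-" "Mozilla/5.0"
EOF

# Print "status count percent" lines, most frequent status first.
status_distribution() {
  awk '{counts[$9]++; total++}
       END {
         for (code in counts)
           printf "%s %d %.1f%%\n", code, counts[code], 100 * counts[code] / total
       }' "$1" | sort -k2,2nr
}

status_distribution "$log_file"
```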

Requests per second (watch) :

watch "awk '{if(\$9~/200|30|404/)COUNT[\$4]++}END{for(a in COUNT) print a,COUNT[a]}' log_file | sort -k 2 -nr | head -n10"
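
Per-second counts can be noisy on a quiet server. Grouping by minute instead is a small change: cut the seconds off the timestamp before counting. The sketch below assumes $4 looks like [16/Aug/2015:14:00:01, so the 17 characters after the bracket are the date, hour, and minute. The sample data and the function name requests_per_minute are hypothetical.

```shell
# Hypothetical sample data in combined log format.
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
10.0.0.2 - - [16/Aug/2015:14:00:30 +0800] "GET /about.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"
10.0.0.3 - - [16/Aug/2015:14:01:05 +0800] "GET /index.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
EOF

# Ten busiest minutes as "dd/Mon/yyyy:HH:MM count" lines.
requests_per_minute() {
  awk '{
         minute = substr($4, 2, 17)   # skip "[" and keep up to the minute
         count[minute]++
       }
       END { for (m in count) print m, count[m] }' "$1" |
    sort -k2,2nr | head -n 10
}

requests_per_minute "$log_file"
```

Because the awk program sits inside a shell function rather than a double-quoted string, the $4 reference needs no backslash escaping here.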

Additional Insights

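The commands assume Apache's combined log format split on whitespace: $1 (client IP), $4 (timestamp, with a leading "["), $7 (requested URL), $9 (HTTP status), $10 (bytes transferred), $11 (referer, including its quotes), and $12 (the first token of the user agent, with a leading quote). $NF is the last field; it holds the request time only when the server's LogFormat appends it (for example with %T), which the default formats do not. By chaining awk, grep, sort, uniq, and head, administrators can quickly generate reports without installing extra tools.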
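
To check which field numbers apply to a particular log, it can help to print the fields of a single line with labels. The sketch below does that for one made-up line in the combined log format; the line itself and the function name show_fields are hypothetical.

```shell
# One hypothetical line in combined log format.
line='10.0.0.1 - - [16/Aug/2015:14:00:01 +0800] "GET /index.php HTTP/1.1" 200 1024 "http://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"'

# Print the whitespace-separated fields used throughout this article.
show_fields() {
  printf '%s\n' "$1" | awk '{
    print "ip=" $1
    print "time=" $4
    print "url=" $7
    print "status=" $9
    print "bytes=" $10
    print "referer=" $11
    print "agent=" $12
    print "last=" $NF
  }'
}

show_fields "$line"
```

On this line the last field is the tail of the user agent, not a response time, which is why the slow-page command only makes sense once the log format has been extended.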

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
