
Master Log Analysis with Grep, AWK, Cut, and Log Management Tools

Learn essential log analysis techniques using command‑line tools like grep, cut, and awk, explore advanced filtering with rsyslog and Grok, and discover how log management systems can automate parsing and accelerate troubleshooting of Linux syslog data.

MaGe Linux Operations

Logs contain a massive amount of information, and extracting the useful parts isn’t always as easy as it sounds. This guide starts with basic log‑analysis examples you can try right now, then covers more advanced techniques that take some setup but can save you a lot of time.

Using Grep for Search

Searching text is the most basic way to find information. The most common tool for text search is grep, available on most Linux distributions, which lets you search logs with regular expressions.

Regular Expressions

Example: find "user hoover" in Ubuntu authentication logs.

$ grep "user hoover" /var/log/auth.log
Accepted password for hoover from 10.0.2.2 port 4792 ssh2
pam_unix(sshd:session): session opened for user hoover by (uid=0)
pam_unix(sshd:session): session closed for user hoover

Building precise regular expressions can be tricky. For instance, searching for the number "4792" may also match timestamps, URLs, or other unwanted data.

$ grep "4792" /var/log/auth.log
Accepted password for hoover from 10.0.2.2 port 4792 ssh2
74.91.21.46 - - [31/Mar/2015:19:44:32 +0000] "GET /scripts/samples/search?q=4792 HTTP/1.0" 404 545 "-" "-"
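
One way to cut down on such false positives is to anchor the number to the text around it. The following is a minimal sketch, not a general recipe: it writes two sample lines to a temporary file (the file and its contents are assumptions made for the demo) and matches 4792 only when it appears as an SSH port.

```shell
# Build a small sample log in a temp file (contents are illustrative only).
log=$(mktemp)
cat > "$log" <<'EOF'
Accepted password for hoover from 10.0.2.2 port 4792 ssh2
74.91.21.46 - - [31/Mar/2015:19:44:32 +0000] "GET /scripts/samples/search?q=4792 HTTP/1.0" 404 545 "-" "-"
EOF

# A bare number matches both lines; -c prints the count of matching lines.
loose=$(grep -c '4792' "$log")

# Anchoring the number to its context ("port <n>" followed by a space or
# the end of the line) matches only the SSH login.
strict=$(grep -E 'port 4792( |$)' "$log")

echo "loose matches: $loose"
echo "strict match:  $strict"
rm -f "$log"
```

The more of the surrounding message you include in the pattern, the less likely it is to match unrelated log formats.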

Surround Search

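Grep’s -B (before) and -A (after) options print a given number of lines preceding and following each match, so you can see what happened around an error. The example below asks for up to three lines before and up to two lines after each "Invalid user" message.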

$ grep -B3 -A2 'Invalid user' /var/log/auth.log
Apr 28 17:06:20 ip-172-31-11-241 sshd[12545]: reverse mapping checking getaddrinfo for 216.19.2.8.commspeed.net [216.19.2.8] failed - POSSIBLE BREAK-IN ATTEMPT!
Apr 28 17:06:20 ip-172-31-11-241 sshd[12545]: Received disconnect from 216.19.2.8: 11: Bye Bye [preauth]
Apr 28 17:06:20 ip-172-31-11-241 sshd[12547]: Invalid user admin from 216.19.2.8
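
To see exactly which lines the context options select, it can help to try them on a tiny file first. The sketch below uses a made-up six-line file (its contents are assumptions for the demo) and requests one line before and two lines after the match.

```shell
# Sample file with numbered lines so the selected context is easy to see.
log=$(mktemp)
cat > "$log" <<'EOF'
line 1: connection opened
line 2: reverse mapping failed
line 3: Invalid user admin
line 4: invalid user admin [preauth]
line 5: Received disconnect
line 6: connection closed
EOF

# grep is case-sensitive by default, so only line 3 matches.
# -B1 adds the one line before it, -A2 adds the two lines after it.
context=$(grep -B1 -A2 'Invalid user' "$log")
echo "$context"
rm -f "$log"
```

Here grep prints lines 2 through 5: the match on line 3, one line of leading context, and two lines of trailing context.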

Tail

You can combine grep with tail -f to watch a file in real time and filter for specific patterns.

$ tail -f /var/log/auth.log | grep 'Invalid user'
Apr 30 19:49:48 ip-172-31-11-241 sshd[6512]: Invalid user ubnt from 219.140.64.136
Apr 30 19:49:49 ip-172-31-11-241 sshd[6514]: Invalid user admin from 219.140.64.136

For a deeper dive into grep and regular expressions, see Ryan’s Tutorials.

Parsing with Cut, AWK, and Grok

Cut

The cut command extracts fields from delimited logs. The example below splits each authentication‑failure line on the "=" character and prints the eighth field, which holds the user name.

$ grep "authentication failure" /var/log/auth.log | cut -d '=' -f 8
root
hoover
root
nagios
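
A common next step is to count how often each user appears. The sketch below pipes the same cut output through sort and uniq; the three sample log lines are assumptions modeled on a typical pam_unix failure message, so check the field number against your own logs before relying on it.

```shell
# Sample auth log (illustrative lines, not real data).
log=$(mktemp)
cat > "$log" <<'EOF'
Mar 11 18:18:00 host sshd[101]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.2.2  user=root
Mar 11 18:19:10 host sshd[102]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.2.2  user=hoover
Mar 11 18:20:30 host sshd[103]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.2.3  user=root
EOF

# Field 8 (split on '=') is the user name. sort groups identical names,
# uniq -c counts them, and sort -rn lists the most frequent user first.
counts=$(grep "authentication failure" "$log" | cut -d '=' -f 8 | sort | uniq -c | sort -rn)
echo "$counts"
rm -f "$log"
```

Because cut counts delimiters rather than understanding the message, the field number changes if the line contains more or fewer "=" characters.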

AWK

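AWK provides a more powerful scripting language for field extraction. By default it splits each line on whitespace, so the example below prints the ninth field of every sshd line that contains "invalid user"; in those lines, that field is the user name.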

$ awk '/sshd.*invalid user/ { print $9 }' /var/log/auth.log
guest
admin
info
test
ubnt

Read the AWK user guide for more details on regular expressions and output fields.
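
AWK can also aggregate while it extracts. As a rough sketch, the script below counts "Invalid user" attempts per source IP with an associative array. It assumes the "Invalid user <name> from <ip>" message format shown earlier, where the IP is the last field on the line; the array name hits is just an illustrative choice.

```shell
# Sample lines in the "Invalid user <name> from <ip>" format.
log=$(mktemp)
cat > "$log" <<'EOF'
Apr 30 19:49:48 ip-172-31-11-241 sshd[6512]: Invalid user ubnt from 219.140.64.136
Apr 30 19:49:49 ip-172-31-11-241 sshd[6514]: Invalid user admin from 219.140.64.136
Apr 28 17:06:20 ip-172-31-11-241 sshd[12547]: Invalid user admin from 216.19.2.8
EOF

# $NF is the last whitespace-separated field (the IP in this format).
# hits[ip] accumulates one count per matching line; the END block prints
# the totals, and sort -rn orders them from most to fewest attempts.
report=$(awk '/sshd.*Invalid user/ { hits[$NF]++ }
              END { for (ip in hits) print hits[ip], ip }' "$log" | sort -rn)
echo "$report"
rm -f "$log"
```

Note that AWK patterns are case-sensitive, so /Invalid user/ and /invalid user/ select different sshd messages.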

Log Management Systems

Log management platforms simplify parsing by automatically handling common log formats (e.g., Linux syslog, web server logs) and allowing custom parsing for non‑standard formats using tools like Grok.

[Screenshot: Loggly, a cloud‑based log management service]

Grok uses a library of regular expressions to turn raw text into structured JSON. Example Logstash Grok configuration for kernel logs:

filter {
  grok {
    match => { "message" => "%{CISCOTIMESTAMP:timestamp} %{HOST:host} %{WORD:program}%{NOTSPACE} %{NOTSPACE}%{NUMBER:duration}%{NOTSPACE} %{GREEDYDATA:kernel_logs}" }
  }
}
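
To get a feel for what such a pattern does without running Logstash, the following sketch imitates the idea with sed: one regular expression with capture groups turns a syslog line into a JSON object. The sample line, the field names (timestamp, host, tag, message), and the expression itself are assumptions chosen for the demo. This is not Grok, and it does not escape quotes or backslashes inside the message.

```shell
# One line in the traditional syslog file format (illustrative sample).
line='Mar 11 18:18:00 hoover-VirtualBox su[5026]: pam_authenticate: Authentication failure'

# Capture groups: 1 = timestamp, 2 = host, 3 = tag (program[pid]), 4 = message.
# -n with the trailing "p" prints only lines that matched the pattern.
json=$(printf '%s\n' "$line" | sed -n -E \
  's/^([A-Z][a-z]{2} +[0-9]+ [0-9:]{8}) ([^ ]+) ([^ ]+): (.*)$/{"timestamp":"\1","host":"\2","tag":"\3","message":"\4"}/p')

echo "$json"
```

Grok's advantage over a hand-written expression like this one is its library of named, reusable patterns, which keeps the configuration readable as formats get more complex.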

[Screenshot: resulting parsed output]

Filtering with Rsyslog and AWK

Filtering lets you search on specific field values instead of running a full‑text search, which makes your analysis more accurate because matches elsewhere in the line are ignored.

Filtering by Application

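Use rsyslog to direct logs from a particular application (e.g., sshd) to a dedicated file. The first line below writes every message whose program name is exactly "sshd" to /var/log/sshd-messages; the second line discards those messages afterwards so they are not also written to other log files. The "~" discard action is legacy syntax that newer rsyslog versions deprecate in favor of "& stop".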

:programname, isequal, "sshd" /var/log/sshd-messages
&~

Or use AWK to extract a field, such as the sshd username:

$ awk '/sshd.*invalid user/ { print $9 }' /var/log/auth.log

Log management systems such as Loggly can also filter by application name with a single click.
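
If you only need a quick per‑application view of an existing file, AWK can approximate the same filter by testing the program field instead of the whole line. The sketch below assumes the traditional syslog file format, in which the fifth whitespace‑separated field is the tag "program[pid]:"; the sample lines are illustrative.

```shell
# Sample log with messages from three different programs.
log=$(mktemp)
cat > "$log" <<'EOF'
Apr 30 19:49:48 ip-172-31-11-241 sshd[6512]: Invalid user ubnt from 219.140.64.136
Apr 30 19:50:01 ip-172-31-11-241 CRON[6520]: pam_unix(cron:session): session opened for user root by (uid=0)
Apr 30 19:50:05 ip-172-31-11-241 su[6530]: pam_authenticate: Authentication failure
EOF

# Match only when field 5 is "sshd:" or "sshd[<pid>]:". Testing the field
# rather than the whole line avoids hits where "sshd" merely appears
# somewhere in another program's message text.
sshd_only=$(awk '$5 ~ /^sshd(\[[0-9]+\])?:$/' "$log")
echo "$sshd_only"
rm -f "$log"
```

Unlike the rsyslog rule, this does not change where messages are written; it only filters a file that already exists.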

Filtering Errors

The default rsyslog file format does not write each message’s severity to the log. To make it visible, define an rsyslog template that includes the priority text (%pri-text%), for example with this format string:

"<%pri-text%> : %timegenerated%,%HOSTNAME%,%syslogtag%,%msg%\n"

Example output shows the err severity:

<authpriv.err> : Mar 11 18:18:00,hoover-VirtualBox,su[5026]:, pam_authenticate: Authentication failure

You can then grep for ".err>" to find error messages, or use a log management system to filter by severity. Escape the dot so that it matches a literal period rather than any character.

$ grep '\.err>' /var/log/auth.log
<authpriv.err> : Mar 11 18:18:00,hoover-VirtualBox,su[5026]:, pam_authenticate: Authentication failure
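
Once the severity is part of every line, you can also tally messages per severity. The sketch below is one possible approach, assuming each line starts with a "<facility.severity>" tag as produced by the format string above; the sample lines (including the info‑level one) are made up for the demo.

```shell
# Sample log in the "<facility.severity> : ..." format (illustrative lines).
log=$(mktemp)
cat > "$log" <<'EOF'
<authpriv.err> : Mar 11 18:18:00,hoover-VirtualBox,su[5026]:, pam_authenticate: Authentication failure
<authpriv.info> : Mar 11 18:18:05,hoover-VirtualBox,sshd[5030]:, Accepted password for hoover from 10.0.2.2 port 4792 ssh2
<authpriv.err> : Mar 11 18:19:00,hoover-VirtualBox,su[5041]:, pam_authenticate: Authentication failure
EOF

# Count error lines: anchor at the start of the line and escape the dot.
errors=$(grep -c '^<[a-z0-9]*\.err>' "$log")

# Tally every severity: sed keeps only the text between "." and ">" in the
# leading tag, then sort | uniq -c counts it and awk prints "severity=count".
tally=$(sed -n 's/^<[^.>]*\.\([a-z]*\)>.*/\1/p' "$log" | sort | uniq -c | awk '{ print $2 "=" $1 }')

echo "errors: $errors"
echo "$tally"
rm -f "$log"
```

Anchoring the pattern to the start of the line keeps a stray ".err>" inside a message body from being counted.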

[Screenshot: Loggly showing logs filtered by error severity]

Source: http://www.loggly.com/ultimate-guide/logging/analyzing-linux-logs/
