Master Log Analysis with Grep, AWK, Cut, and Log Management Tools
Learn essential log analysis techniques using command‑line tools like grep, cut, and awk, explore advanced filtering with rsyslog and Grok, and discover how log management systems can automate parsing and accelerate troubleshooting of Linux syslog data.
Logs contain a massive amount of information, and extracting the useful parts isn’t always as easy as it sounds. This guide introduces basic log‑analysis examples you can try right now and covers more advanced techniques that, once set up, can save you a lot of time.
Using Grep for Search
Searching text is the most basic way to find information. The most common tool for text search is grep, available on most Linux distributions, which lets you search logs with regular expressions.
Regular Expressions
Example: find "user hoover" in Ubuntu authentication logs.
$ grep "user hoover" /var/log/auth.log
pam_unix(sshd:session): session opened for user hoover by (uid=0)
pam_unix(sshd:session): session closed for user hoover
Building precise regular expressions can be tricky. For instance, searching for the number "4792" may also match timestamps, URLs, or other unwanted data.
$ grep "4792" /var/log/auth.log
Accepted password for hoover from 10.0.2.2 port 4792 ssh2
74.91.21.46 - - [31/Mar/2015:19:44:32 +0000] "GET /scripts/samples/search?q=4792 HTTP/1.0" 404 545 "-" "-"
Surround Search
Grep’s -B (before) and -A (after) options let you view lines surrounding a match, which helps debug errors.
$ grep -B3 -A2 'Invalid user' /var/log/auth.log
Apr 28 17:06:20 ip-172-31-11-241 sshd[12545]: reverse mapping checking getaddrinfo for 216.19.2.8.commspeed.net [216.19.2.8] failed - POSSIBLE BREAK-IN ATTEMPT!
Apr 28 17:06:20 ip-172-31-11-241 sshd[12545]: Received disconnect from 216.19.2.8:11: Bye Bye [preauth]
Apr 28 17:06:20 ip-172-31-11-241 sshd[12547]: Invalid user admin from 216.19.2.8
Tail
You can combine grep with tail -f to watch a file in real time and filter for specific patterns.
$ tail -f /var/log/auth.log | grep 'Invalid user'
Apr 30 19:49:48 ip-172-31-11-241 sshd[6512]: Invalid user ubnt from 219.140.64.136
Apr 30 19:49:49 ip-172-31-11-241 sshd[6514]: Invalid user admin from 219.140.64.136
For a deeper dive into grep and regular expressions, see Ryan’s Tutorials.
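To try the grep techniques from this section without touching real logs, the sketch below builds a small throwaway file and runs a loose search, a tighter search, and a context search against it. The sample lines, the host name host1, and the app[4792] entry are invented for illustration; only the grep options themselves (-c, -E, -B, -A) are standard.

```shell
# Build a small throwaway log; these lines are made up for the example.
log=$(mktemp)
cat > "$log" <<'EOF'
Apr 28 17:06:19 host1 sshd[12540]: Accepted password for hoover from 10.0.2.2 port 4792 ssh2
Apr 28 17:06:20 host1 sshd[12545]: Received disconnect from 216.19.2.8: 11: Bye Bye [preauth]
Apr 28 17:06:20 host1 sshd[12547]: Invalid user admin from 216.19.2.8
Apr 28 17:06:21 host1 sshd[12547]: input_userauth_request: invalid user admin [preauth]
Apr 28 17:06:22 host1 app[4792]: GET /scripts/samples/search?q=4792
EOF

# Loose search: counts every line containing the digits, including the
# line where 4792 is a process ID and a query-string value.
loose=$(grep -c '4792' "$log")

# Tighter search: require the word "port" before the number and a space
# or end of line after it.
tight=$(grep -cE 'port 4792( |$)' "$log")

# One line of context before and after each "Invalid user" match.
context=$(grep -B1 -A1 'Invalid user' "$log")

echo "loose=$loose tight=$tight"
echo "$context"
rm -f "$log"
```

Anchoring the number to the word that precedes it is usually enough to drop unrelated matches; how much context to include depends on how regular the log format is.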
Parsing with Cut, AWK, and Grok
Cut
The cut command extracts fields from delimited logs. Example: split each authentication failure line on '=' and print the 8th field, which holds the user name.
$ grep "authentication failure" /var/log/auth.log | cut -d '=' -f 8
root
hoover
root
nagios
AWK
AWK provides a more powerful scripting language for field extraction.
$ awk '/sshd.*invalid user/ { print $9 }' /var/log/auth.log
guest
admin
info
test
ubnt
Read the AWK user guide for more details on regular expressions and output fields.
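Both tools can be tried on a throwaway sample. The sketch below uses the same cut and awk expressions as above, with one addition: instead of printing every user name, awk tallies them in an array. All sample lines and the host name host1 are invented, and the field positions (8 for cut, 9 for awk) only hold for lines shaped like these.

```shell
# Throwaway sample; every line here is invented for illustration.
log=$(mktemp)
cat > "$log" <<'EOF'
Apr 30 19:49:48 host1 sshd[6512]: input_userauth_request: invalid user ubnt [preauth]
Apr 30 19:49:49 host1 sshd[6514]: input_userauth_request: invalid user admin [preauth]
Apr 30 19:49:50 host1 sshd[6516]: input_userauth_request: invalid user admin [preauth]
Apr 30 19:49:52 host1 sshd[6518]: pam_unix(sshd:auth): authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=10.0.2.2  user=root
EOF

# cut: split on '=' and take the 8th piece, which is the value of user=.
failed_user=$(grep 'authentication failure' "$log" | cut -d '=' -f 8)

# awk: tally field 9 (the user name) instead of printing each occurrence,
# then sort so the most frequently tried name comes first.
summary=$(awk '/sshd.*invalid user/ { count[$9]++ }
               END { for (u in count) print count[u], u }' "$log" | sort -rn)

echo "failed login for: $failed_user"
echo "$summary"
rm -f "$log"
```

Positional extraction like this is brittle: if a line gains or loses a key=value pair, field 8 is no longer the user, which is one reason to prefer a parser that names its fields.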
Log Management Systems
Log management platforms simplify parsing by automatically handling common log formats (e.g., Linux syslog, web server logs) and allowing custom parsing for non‑standard formats using tools like Grok.
[Screenshot: Loggly, a cloud‑based log management service]
Grok uses a library of regular expressions to turn raw text into structured JSON. Example Logstash Grok configuration for kernel logs:
filter {
  grok {
    match => { "message" => "%{CISCOTIMESTAMP:timestamp} %{HOST:host} %{WORD:program}%{NOTSPACE} %{NOTSPACE}%{NUMBER:duration}%{NOTSPACE} %{GREEDYDATA:kernel_logs}" }
  }
}
[Screenshot: resulting parsed output]
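To get a feel for what this pattern extracts without running Logstash, the same field boundaries can be approximated with a sed expression. This is only a rough sketch: the sample kernel line is hypothetical, and the regular expression is a hand-written stand-in for the Grok patterns, not what Logstash actually executes. The field names are taken from the configuration above.

```shell
# Hypothetical kernel log line, used only to illustrate the field boundaries.
line='Mar 11 18:18:00 hoover-VirtualBox kernel: [12345.678901] usb 1-1: new high-speed USB device'

# Hand-written approximation of the Grok pattern: timestamp, host, program,
# a number wrapped in non-space characters, then the rest of the message.
# Fields are joined with '|' and then split onto separate lines.
parsed=$(printf '%s\n' "$line" | sed -E \
  's/^([A-Z][a-z]{2} +[0-9]{1,2} [0-9:]{8}) ([^ ]+) ([A-Za-z0-9_]+)[^ ]* [^ 0-9]*([0-9.]+)[^ ]* (.*)$/timestamp=\1|host=\2|program=\3|duration=\4|kernel_logs=\5/' \
  | tr '|' '\n')

echo "$parsed"
```

Each name=value line corresponds to one key in the JSON document Grok would emit; a message that itself contains a '|' would need a different separator in this sketch.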
Filtering with Rsyslog and AWK
Filtering lets you retrieve specific field values instead of performing full‑text searches, making analysis more accurate.
Filtering by Application
Use rsyslog to direct logs from a particular application (e.g., sshd) to a dedicated file:
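:programname, isequal, "sshd" /var/log/sshd-messages
&~
The &~ line discards each matching message once it has been written, so it does not also land in the default log files. Newer rsyslog releases spell this action & stop.
Or use AWK to extract a field, such as the sshd username: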
$ awk '/sshd.*invalid user/ { print $9 }' /var/log/auth.log
Log management systems can also filter by application name with a click, as shown in Loggly screenshots.
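The difference between filtering on a field and searching the full text can be seen on a small sample. In the sketch below, the log lines and the host name host1 are invented, and it assumes the traditional syslog layout in which the fifth whitespace-separated field is the program tag ("program[pid]:").

```shell
# Throwaway sample; the cron line mentions sshd but was not written by it.
log=$(mktemp)
cat > "$log" <<'EOF'
Mar 11 18:17:58 host1 sshd[6512]: Invalid user ubnt from 219.140.64.136
Mar 11 18:18:00 host1 su[5026]: pam_authenticate: Authentication failure
Mar 11 18:18:02 host1 cron[7001]: (root) CMD (echo checking sshd status)
Mar 11 18:18:05 host1 sshd[6514]: Invalid user admin from 219.140.64.136
EOF

# Full-text search: also counts the cron line that merely mentions sshd.
fulltext=$(grep -c 'sshd' "$log")

# Field filter: keep a line only when field 5, the program tag, is sshd
# with an optional [pid] and a trailing colon.
byfield=$(awk '$5 ~ /^sshd(\[[0-9]+\])?:$/' "$log")

echo "full-text matches: $fulltext"
echo "$byfield"
rm -f "$log"
```

The field filter returns only the two lines sshd actually emitted, which is the same distinction the rsyslog programname rule makes at write time.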
Filtering Errors
Syslog’s default file format doesn’t record message severity. Modify the rsyslog template to include the priority text:
"<%pri-text%> : %timegenerated%,%HOSTNAME%,%syslogtag%,%msg%\n"
Example output shows the err severity:
<authpriv.err> : Mar 11 18:18:00,hoover-VirtualBox,su[5026]:, pam_authenticate: Authentication failure
You can then grep for '.err>' or use a log management system to filter on severity.
$ grep '.err>' /var/log/auth.log
<authpriv.err> : Mar 11 18:18:00,hoover-VirtualBox,su[5026]:, pam_authenticate: Authentication failure
[Screenshot: Loggly showing filtered error severity]
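Once the priority text is part of each line, severity becomes just another field to filter or count. The sketch below assumes the template shown earlier in this section; the first sample line comes from the example output above, and the other three are invented for illustration.

```shell
# Sample in the "<facility.severity> : ..." layout; only the first line
# comes from the article, the rest are made up.
log=$(mktemp)
cat > "$log" <<'EOF'
<authpriv.err> : Mar 11 18:18:00,hoover-VirtualBox,su[5026]:, pam_authenticate: Authentication failure
<authpriv.info> : Mar 11 18:18:05,hoover-VirtualBox,sshd[6514]:, Accepted password for hoover from 10.0.2.2 port 4792 ssh2
<cron.info> : Mar 11 18:20:01,hoover-VirtualBox,CRON[7001]:, (root) CMD (run-parts /etc/cron.hourly)
<daemon.err> : Mar 11 18:21:13,hoover-VirtualBox,myapp[812]:, upstream connection refused
EOF

# -F treats the pattern as a fixed string, so the dot matches only a dot.
errors=$(grep -cF '.err>' "$log")

# Split the leading "<facility.severity>" on '<', '.' and '>' so that
# field 3 is the severity, then tally each severity.
by_severity=$(awk -F'[<.>]' '/^</ { n[$3]++ }
                             END { for (s in n) print s, n[s] }' "$log" | sort)

echo "error lines: $errors"
echo "$by_severity"
rm -f "$log"
```

Counting by severity gives a quick health summary; the same awk split also exposes the facility in field 2 if a per-subsystem breakdown is more useful.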
Source: http://www.loggly.com/ultimate-guide/logging/analyzing-linux-logs/
MaGe Linux Operations
