@rlunaro
Created April 27, 2024 07:04
A simple log parser for Apache log analysis

Simple log parser for Apache

Sometimes it's useful to do it yourself: examining the Apache logs to search for certain patterns. On those occasions, the most cumbersome part is getting started with parsing the log file. This script comes in handy as a starting point that you can adapt to perform just the analysis you need.

So, given this log configuration:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

This awk script parses it correctly (it needs GNU awk, gawk, because the three-argument form of match() is a gawk extension):

#
# log_parser.awk - processes the website's access logs
#
#

BEGIN 	{
        }
        
/.*/	{
            ip = ""; 
            second = ""; 
            third = ""; 
            date = ""; 
            url = ""; 
            quantity1 = ""; 
            quantity2 = ""; 
            referrer = ""; 
            useragent = ""; 
            remainder = $0;
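            # Every pattern is anchored with ^ so that each step consumes the
            # front of the remainder instead of matching somewhere further on.
            # %h: client host (IPv4, IPv6 or a hostname)
            if( match( remainder, /^([^ ]+) (.*)/, a ) ){
                ip = a[1];
                remainder = a[2];
            }
            # %l: remote logname, usually "-"
            if( match( remainder, /^([^ ]+) (.*)/, a ) ){
                second = a[1]; 
                remainder = a[2]; 
            }
            # %u: authenticated user, "-" when there is none
            if( match( remainder, /^([^ ]+) (.*)/, a ) ){
                third = a[1]; 
                remainder = a[2]; 
            }
            # %t: timestamp, logged between square brackets
            if( match( remainder, /^\[([^\]]+)\] (.*)/, a ) ) {
                date = a[1]; 
                remainder = a[2]; 
            }
            # %r: full request line (method, URL and protocol)
            if( match( remainder, /^"([^"]*)" (.*)/, a ) ) {
                url = a[1]; 
                remainder = a[2]; 
            }
            # %>s: final HTTP status code
            if( match( remainder, /^([0-9]+) (.*)/, a ) ) {
                quantity1 = a[1]; 
                remainder = a[2];
            }
            # %b: response size in bytes, "-" when no body was sent
            if( match( remainder, /^([0-9]+|-) (.*)/, a ) ) {
                quantity2 = a[1]; 
                remainder = a[2];
            }
            # Referer header
            if( match( remainder, /^"([^"]*)" (.*)/, a ) ) {
                referrer = a[1];
                remainder = a[2];
            }
            # User-Agent header
            if( match( remainder, /^"([^"]*)"(.*)/, a ) ) {
                useragent = a[1];
                remainder = a[2];
            }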
            #print $0;
            #print ip;
            #print second;
            #print third;
            #print date; 
            #print url; 
            #print quantity1;
            #print quantity2;
            #print referrer; 
            print useragent; 
        }

As you can see, the script is easily customizable: each element of the log line ends up in its own variable, ready for whatever further processing you need.
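
As an illustration of such a customization, here is a minimal sketch that counts requests per HTTP status code instead of printing the user agent. It is not the original script: to stay self-contained and to run on any POSIX awk (not only gawk), it re-implements the same front-to-back parsing with the two-argument match() and RLENGTH. The names count_by_status, take, unwrap, hits and bad are assumptions made for this example, and the sample log lines are invented.

```sh
# count_by_status: hypothetical helper, not part of the original script.
# Reads combined-format log lines (files given as arguments, or stdin)
# and prints "status count" pairs, sorted by status code.
count_by_status() {
    awk '
    # Peel the text matching `re` off the front of `rest`, plus the space
    # after it. Sets `bad` when the line does not have the expected shape.
    function take(re,    tok) {
        if (!match(rest, re)) { bad = 1; return ""; }
        tok  = substr(rest, 1, RLENGTH);
        rest = substr(rest, RLENGTH + 2);
        return tok;
    }
    # Drop the surrounding [] or "" from a token.
    function unwrap(s) { return substr(s, 2, length(s) - 2); }
    {
        rest = $0; bad = 0;
        ip        = take("^[^ ]+");                 # %h
        second    = take("^[^ ]+");                 # %l
        third     = take("^[^ ]+");                 # %u
        date      = unwrap(take("^\\[[^]]*\\]"));   # %t
        url       = unwrap(take("^\"[^\"]*\""));    # %r
        quantity1 = take("^[0-9]+");                # %>s (status code)
        quantity2 = take("^([0-9]+|-)");            # %b
        referrer  = unwrap(take("^\"[^\"]*\""));    # Referer
        useragent = unwrap(take("^\"[^\"]*\""));    # User-Agent
        if (bad) next;                              # skip malformed lines
        hits[quantity1]++;
    }
    END { for (s in hits) print s, hits[s]; }
    ' "$@" | sort
}

# Hypothetical sample lines (documentation addresses, made-up requests).
count_by_status <<'EOF'
192.0.2.10 - - [27/Apr/2024:07:04:00 +0000] "GET /index.html HTTP/1.1" 200 1043 "-" "Mozilla/5.0"
192.0.2.11 - - [27/Apr/2024:07:04:02 +0000] "GET /feed.xml HTTP/1.1" 304 - "-" "curl/8.0"
2001:db8::1 - - [27/Apr/2024:07:04:05 +0000] "GET /about.html HTTP/1.1" 200 512 "http://example.com/" "Mozilla/5.0"
EOF
```

With these three sample lines it prints "200 2" and "304 1". To tally something else, change the key of the counter: for example, hits[useragent]++ counts requests per user agent, and hits[ip]++ per client.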
