How to get the number of unique IP addresses and errors in shell scripts using uniq or awk?

I am running nslookup on several URLs over multiple iterations from a shell script. I need to count how many times each IP address was returned for each URL.

In the output file, the results are stored as

URL
IP address

Using the uniq -c command I get the correct count when identical IP addresses are on adjacent lines, but not when they are on non-adjacent lines.

The command is
cat file.log | awk '{print $1}' | uniq -c

Here is the sample output:

1 url
3 72.51.46.230

Now suppose multiple IP addresses are returned for a particular URL, and identical addresses end up on non-adjacent lines because I ran a number of iterations. In that case the uniq -c command does not give the right counts. If I sort first, the counts are right, but I need to display the output as above for each URL, i.e. the URL followed by one line per IP address with its count.

For example, if I run nslookup on google.com it returns multiple addresses, and with uniq -c I get the following output. As you can see, some IP addresses appear more than once but every count is 1, because uniq -c only counts runs of adjacent identical lines.

  1 74.125.236.64
  1 74.125.236.78
  1 74.125.236.67
  1 74.125.236.72
  1 74.125.236.65
  1 74.125.236.73
  1 74.125.236.70
  1 74.125.236.66
  1 74.125.236.68
  1 74.125.236.71
  1 74.125.236.69
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 nslookup: can't resolv 'google.com'
  1 74.125.236.70
  1 74.125.236.66
  1 74.125.236.68
  1 74.125.236.71
  1 74.125.236.69
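
To make the adjacency behaviour concrete, here is a minimal sketch using three of the addresses above. The helper name sample is only for illustration and is not part of my script.

```shell
# Three lines in which the two copies of 74.125.236.70 are NOT adjacent.
sample() {
    printf '%s\n' 74.125.236.70 74.125.236.66 74.125.236.70
}

# uniq -c only merges runs of identical adjacent lines,
# so this prints three lines, each with a count of 1.
sample | uniq -c

# Sorting first makes the duplicates adjacent, so the counts are right,
# but the original line order (and any grouping by URL) is lost.
sample | sort | uniq -c
```

The second pipeline reports 74.125.236.70 with a count of 2, which is why sorting fixes the counts but not the layout I need.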

I tried awk as well, but its output is not formatted the way I need.

The awk command:

awk '{a[$0]++}END{for (i in a) printf "%-2d -> %s \n", a[i], i}' file.log

Can you suggest a better way to get the counts and display them in the format described above?

The desired output format is

URL
Count IP address

Sample input file:

URL1
72.51.46.230
72.51.46.230
google.com
74.125.236.64
74.125.236.78
(null)
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'
nslookup: can't resolv 'google.com'

Sample Output required as

URL1
2 72.51.46.230
google.com
1 74.125.236.64
1 74.125.236.78
1 null
5 nslookup: can't resolv 'google.com'

Thank you.


The following awk script does the job:

$1~/[a-z]+[.].*/{         # First field has letters followed by a dot: treat the line as a URL
    for(i in ip)          # Print the counts and IPs of the previous URL (nothing the first time)
         print ip[i],i
    delete ip             # Clear the array for the next set of IPs
    print                 # Print the URL
    next                  # Skip to the next line
}
{
    ip[$0]++              # Any other line is an IP or an error message: increment its count
}
END{                      # At end of file, print the last set of IPs
    for(i in ip)
        print ip[i],i
}

Save it as script.awk and run it like this:

$ awk -f script.awk file
creativecommons.org
2 72.51.46.230
google.com
5 nslookup: can't resolv 'google.com'
1 (null)
1 74.125.236.64
1 74.125.236.78
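
Note that awk does not define the order in which for (i in ip) visits the keys, which is why the lines under google.com above are not in input order. If the order matters, one possible variation is to remember each line the first time it is seen. This is only a sketch: the names count_per_url, dump, order and n are made up for illustration, it keeps the same URL test as the script above (so a placeholder such as URL1 with no dot would not be recognised), and unlike the script above it skips blank lines.

```shell
# Hypothetical wrapper: reads the named files, or stdin if none are given.
count_per_url() {
    awk '
    # Print the current group in first-seen order, then reset it.
    function dump(    k) {
        for (k = 1; k <= n; k++) {
            print count[order[k]], order[k]
            delete count[order[k]]
        }
        n = 0
    }
    # Same URL test as the script above: letters followed by a dot.
    $1 ~ /[a-z]+[.]/ { dump(); print; next }
    # Any other non-blank line is an IP or an error message.
    NF {
        if (!($0 in count))
            order[++n] = $0     # remember where this line first appeared
        count[$0]++
    }
    END { dump() }
    ' "$@"
}
```

Run it as count_per_url file. With the same input as above, the lines under each URL should come out in the order they first appear in the file rather than in an arbitrary order.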