Finding the top N clients in a standard Apache HTTP Server access log is quite easy.
First we let awk remove everything but the first column, which is the client IP address.
awk '{print $1}' /path/to/access.log | sort | uniq -c | sort -n | tail
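Then we pipe the addresses to sort, which places identical addresses on adjacent lines. This matters because uniq only collapses duplicates that are next to each other.
Then we use uniq -c to count how many times each address occurs and to collapse the duplicates into a single line per address.
Then we use sort -n to sort the lines, which are now prefixed with the number of occurrences of each address, numerically in ascending order.
Finally, we use tail to keep only the last 10 lines, which are the ten busiest clients.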
Example:
$ awk '{print $1}' log/access.log | sort | uniq -c | sort -n | tail
356 10.5.5.144
480 10.9.4.203
490 10.0.15.244
1180 10.8.33.10
1430 10.8.66.82
1472 10.3.19.162
1897 10.3.13.3
1908 10.3.13.6
2020 10.3.13.4
26171 10.0.9.5
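
The pipeline above always shows ten clients because tail prints ten lines by default. To make the N in "top N" adjustable, the same steps can be wrapped in a small shell function. This is only a sketch: the function name top_clients and its arguments are made up for illustration, and it assumes the client address is the first whitespace-separated field of every log line.

```shell
# Hypothetical helper (not part of any standard tool):
# print the N busiest client addresses in an access log.
# Usage: top_clients /path/to/access.log [N]   (N defaults to 10)
top_clients() {
    log_file=$1
    count=${2:-10}
    # Same pipeline as above; only the number of lines kept by tail varies.
    awk '{print $1}' "$log_file" | sort | uniq -c | sort -n | tail -n "$count"
}
```

Calling top_clients log/access.log 3 would then list only the three busiest addresses, with the busiest one last.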