I spend a lot of time defending against spam and SQL injection attacks. I analyze the httpd logs with the following command:
grep schem ./access_log* | cut -d ' ' -f 2 | sort | uniq -c | sort -n
- The 'grep' command searches for the string 'schem', which matches 'schema' as in information_schema. No legitimate request contains this string, so a match is almost certainly an SQL injection attempt.
- The files being searched are 'access_log*', which means all the access logs on hand. For me, that is usually around 4 months of data, which is a fairly good data set.
- The 'cut' command splits each line into fields. The '-d' option sets the delimiter, here a space character. The '-f 2' option selects which field to keep: the second one on each line.
- The 'sort' before 'uniq' groups identical items together, because 'uniq' only compares adjacent lines.
- The 'uniq -c' command collapses each group of identical items into one line prefixed with its count.
- The 'sort -n' command sorts the result numerically by that count, from least to greatest.
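To see what each stage does, the pipeline can be tried on a small throwaway data set first. The sketch below is only an illustration: the two log files, the 192.0.2.x addresses, and the log layout (client address in the second space-separated field) are made up, and your own LogFormat may put the address in a different field. It sorts before 'uniq -c', since 'uniq' only merges adjacent identical lines, and the trailing 'awk' is there only to strip the padding that 'uniq -c' puts in front of each count.

```shell
#!/bin/sh
# Build a scratch directory with two made-up access logs (hypothetical data).
tmpdir=$(mktemp -d)

cat > "$tmpdir/access_log" <<'EOF'
www.example.com 192.0.2.5 "GET /item.php?id=1+union+select+table_name+from+information_schema.tables HTTP/1.1" 200
www.example.com 192.0.2.7 "GET /index.html HTTP/1.1" 200
www.example.com 192.0.2.9 "GET /item.php?id=2+union+select+1+from+information_schema.columns HTTP/1.1" 200
EOF

cat > "$tmpdir/access_log.1" <<'EOF'
www.example.com 192.0.2.5 "GET /item.php?id=3+union+select+1+from+information_schema.tables HTTP/1.1" 200
EOF

# Same pipeline as above, run inside the scratch directory.
# Assumption: field 2 is the client address in this sample layout.
result=$(cd "$tmpdir" &&
    grep schem ./access_log* |   # keep only lines containing "schem"
    cut -d ' ' -f 2 |            # keep the second space-separated field
    sort |                       # group identical values together
    uniq -c |                    # count each group
    sort -n |                    # order by count, least to greatest
    awk '{print $1, $2}')        # strip the padding uniq -c adds

printf '%s\n' "$result"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

With this sample the script prints "1 192.0.2.9" followed by "2 192.0.2.5". Dropping the first 'sort' would report 192.0.2.5 twice with a count of 1 each, because its two hits are not on adjacent lines.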