Except for one little remaining chore: monitoring your log files. [insert horrible alarming music of your choice here.] You're conscientious, so you know you can't just ignore the logs until there's a problem, especially for public services like Web and mail. Somewhere up in the pointy-haired suites, they may even be plotting to require you to track and analyze all sorts of server statistics.
Not to worry, for there are many ways to implement data reduction, which is what log parsing is all about. You want to slice and dice your logs to present only the data you're interested in viewing, unless you wish to devote your entire life to manually analyzing log files. Even if you only pay attention to log files when you're debugging a problem, having some tools to weed out the noise is helpful.
The simplest method is a keyword search. Suppose you want to separate out the 404 errors in your Apache log, and see if you have any missing files:
$ grep 404 bratgrrl.com-Aug-2004
22.214.171.124 - - [30/Aug/2004:02:25:13 -0700] "GET /robots.txt HTTP/1.0" 404 - "-"
126.96.36.199 - - [30/Aug/2004:10:32:26 -0700] "GET /robots.txt HTTP/1.0" 404 - "-" "msnbot/0.11 ( +https://search.msn.com/msnbot.htm)"
188.8.131.52 - - [12/Aug/2004:06:49:11 -0700] "GET /favicon.ico HTTP/1.1" 404 - "-" "Opera/7.21 (X11; Linux i686; U) [en]"
These entries are typical. This site has no robots.txt or favicon, so any requests for these files generate a 404 error. The first two are Web bots. The third entry is probably some random surfer. You can ignore these entries, so let's screen them out.
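One simple way to do that is to pipe the same search through a second grep with the -v (invert match) option, which drops any line matching the given patterns:

$ grep 404 bratgrrl.com-Aug-2004 | grep -v -e 'robots\.txt' -e 'favicon\.ico'

Whatever is left after this pass is the 404 errors actually worth investigating, such as broken links to pages that should exist.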