Log monitoring with Censor
Keeping logs is standard practice. Paying attention to them isn't. Even with one machine, consistently monitoring log files becomes tedious, which means it doesn't get done, or you miss important details while your eyes glaze over. This is where reporting systems come in. Many of them generate alerts based on keywords, but if I could predict my system errors I would be making a lot more money. This system takes the opposite approach: by default its alerts contain every message, and you filter out the ones you know are harmless. It consists of two parts:
- A syslog daemon. I will give examples for both syslogd, which comes with FreeBSD, and the more powerful syslog-ng, which is in the ports tree. I've tested both successfully.
- Censor, a regular-expression filter and email reporting script that I wrote in Ruby. It was originally written using shell scripts, grep and FIFOs, but that method proved unreliable and very fragile.
I've used this system on a live log server monitoring a few dozen machines, which processed an average of 12 messages/second. The hardware was a Celeron 800 MHz CPU with 128 MB of RAM and two IDE disks in a gvinum mirror. All messages were received over TCP through stunnel. The load average was nearly always 0.00. I mention this because Ruby isn't known for its blazing speed, but it seems to do fairly well in this case.
The first thing we need to do is configure syslog to pipe its messages to Censor. This is in addition to any other logging you're doing (don't replace your log files). If you use syslogd, add this to your /etc/syslog.conf file:
*.* |/usr/local/sbin/censor
If you use syslog-ng, add this to /usr/local/etc/syslog-ng/syslog-ng.conf (or wherever your configuration file lives):
destination censor { program("/usr/local/sbin/censor"); };
log { source(src); destination(censor); };
Next, set up an entry in /etc/crontab to periodically email the alerts (this one runs every five minutes):
*/5 * * * * root /bin/kill -SIGUSR1 `cat /var/run/censor`
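If you don't want to wait for cron, the same signal can be sent by hand. Here is a minimal Ruby sketch of what the crontab line does, assuming the PID file path used above; signal_censor is a hypothetical helper, not part of Censor:

```ruby
# Hypothetical helper: read a PID from pid_file and send that process a
# signal, like the crontab entry does with kill and cat.
def signal_censor(pid_file = '/var/run/censor', signal = 'USR1')
  pid = Integer(File.read(pid_file).strip)
  Process.kill(signal, pid)
  pid
end

# Usage (as root on the log server):
#   signal_censor
```

It raises an Errno::ENOENT exception if the PID file doesn't exist, which is a quick way to notice that Censor isn't running.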
Censor reads its patterns from the /usr/local/etc/censor file when it starts, and rechecks the file for changes whenever it is sent a SIGUSR1 signal (which also causes it to send out the email). Any message that matches one of the patterns is left out of the alert. The format of this file is simple: blank lines and lines that begin with "#" are ignored. Lines that begin and end with "/" are treated as regular expressions and used as written. Anything else is treated as a plain string: its special characters are escaped, so it matches literally. The file must end with a newline character. For example:
# Localhost sendmail MTA.
sm-mta
# Syslog-ng statistics.
/syslog-ng\[[0-9]+\]: STATS: dropped 0/
# Common normal messages.
/\/usr\/sbin\/cron\[[0-9]+\]: \(operator\) CMD \(\/usr\/libexec\/save-entropy\)/
/\/usr\/sbin\/cron\[[0-9]+\]: \(root\) CMD \(\/bin\/kill -SIGUSR1 `cat \/var\/run\/censor`\)/
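To make the rules concrete, here is a small Ruby sketch of how lines in this format can be turned into one filter. It is an illustration under assumptions, not a replacement for the script: build_filter and alert? are hypothetical names, the log messages are made up, and only the two line-matching expressions are taken from Censor.

```ruby
# Build one combined Regexp from pattern-file lines.
def build_filter(lines)
  patterns = lines.map do |raw|
    line = raw.chomp
    if line =~ /^\/.+\/$/
      Regexp.new(line[1..-2])   # /.../ line: used as a regular expression as written
    elsif line =~ /^[^#].+/
      line                      # plain string: Regexp.union escapes it
    end                         # comments and blank lines yield nil
  end
  Regexp.union(*patterns.compact)
end

# True when no pattern matches, i.e. the message would appear in the alert.
def alert?(filter, message)
  (message =~ filter).nil?
end

SAMPLE_PATTERNS = <<'EOF'.lines
# Localhost sendmail MTA.
sm-mta

/syslog-ng\[[0-9]+\]: STATS: dropped 0/
EOF

# Made-up log messages for illustration.
SAMPLE_MESSAGES = [
  'host sm-mta[811]: starting daemon',
  'host syslog-ng[402]: STATS: dropped 0',
  'host syslog-ng[402]: STATS: dropped 17'
]

filter = build_filter(SAMPLE_PATTERNS)
SAMPLE_MESSAGES.each do |message|
  puts "#{alert?(filter, message) ? 'ALERT ' : 'filter'}  #{message}"
end
```

Only the third message should end up in an alert, because the statistics pattern matches "dropped 0" but not "dropped 17".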
Last but not least, the Censor script itself. The only lines you should have to change in the script are the "from" and "to" addresses, which set the sender and recipient of the alert email. Save it as /usr/local/sbin/censor (or whatever path your syslog configuration references above) and make it executable:
#!/usr/local/bin/ruby

# Where the alert email is sent from and to.
from = 'censor@host.domain.com'
to = 'admin@domain.com'

class Censor
  def initialize(file = '/usr/local/etc/censor', pid_file = '/var/run/censor')
    @file = file
    @pid_file = pid_file
    @output = String.new
    load_regex
  end

  # Build one combined pattern from the pattern file, rereading it only
  # when it has changed since the last load.
  def load_regex
    current_mtime = File.mtime(@file)
    @file_mtime = current_mtime if @file_mtime == nil
    if (@regex == nil || current_mtime > @file_mtime) then
      patterns = File.readlines(@file).map { |line|
        line.chomp!
        if line =~ /^\/.+\/$/ then Regexp.new(line[1, line.length - 2])
        elsif line =~ /^[^#].+/ then line
        end
      }
      # compact, not compact!, which returns nil when nothing was removed.
      @regex = Regexp.union(*patterns.compact)
      @file_mtime = current_mtime
    end
  end

  # Write the PID file, then collect every input line that no pattern matches.
  def run
    begin
      File.open(@pid_file, 'w') { |pid_file|
        pid_file.write(Process.pid)
        pid_file.flush
        @output << $_ if ($_ =~ @regex) == nil while gets
      }
    ensure
      File.delete(@pid_file)
    end
  end

  # Print the collected messages instead of mailing them.
  def dump
    if @output.length > 0 then
      puts @output
      @output = ''
    end
    load_regex
  end

  # Mail the collected messages, if any, then check the pattern file for changes.
  def email(from, to)
    if @output.length > 0 then
      require 'net/smtp'
      Net::SMTP.start('localhost') do |smtp|
        smtp.send_message("Subject: censor\n\n" << @output, from, to)
      end
      @output = ''
    end
    load_regex
  end
end

censor = Censor.new()
trap('SIGUSR1') { censor.email(from, to) }
censor.run
Start syslog up as normal. You should see Censor in your process list and a PID file in /var/run. If so, then everything should be working and you should start receiving tons of email pretty soon. Tweak your /usr/local/etc/censor file to match your servers. Don't forget to send the SIGUSR1 signal to reload the file. Try to be specific so you don't cause the script to filter out more than you wanted, possibly missing important messages.
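One way to check that a pattern isn't too broad is to count how many lines of an existing log each pattern would suppress before you deploy it. The following Ruby sketch does that under a few assumptions: suppression_counts is a hypothetical helper, the sample patterns and messages are made up, and each pattern is tested on its own rather than as part of the combined filter.

```ruby
# For each pattern line, count how many log lines it would suppress.
# Comments and blank lines are skipped, as in the pattern file format.
def suppression_counts(pattern_lines, log_lines)
  counts = {}
  pattern_lines.each do |raw|
    line = raw.chomp
    pattern =
      if line =~ /^\/.+\/$/ then Regexp.new(line[1..-2])
      elsif line =~ /^[^#].+/ then Regexp.new(Regexp.escape(line))
      end
    next if pattern.nil?
    counts[line] = log_lines.count { |message| message =~ pattern }
  end
  counts
end

# Made-up patterns and messages for illustration.
AUDIT_PATTERNS = ["# Too broad?\n", "kernel\n", "/cron\\[[0-9]+\\]: \\(root\\) CMD/\n"]
AUDIT_MESSAGES = [
  'host kernel: em0: link state changed to UP',
  'host kernel: ad0: FAILURE - READ_DMA status=51',
  'host /usr/sbin/cron[702]: (root) CMD (newsyslog)'
]

suppression_counts(AUDIT_PATTERNS, AUDIT_MESSAGES).each do |pattern, count|
  puts "#{count}\t#{pattern}"
end
```

Here the plain string "kernel" would hide both kernel messages, including the disk failure, which is exactly the kind of message you don't want filtered.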
I know there are programs that accomplish the same thing in a similar way; I've used a couple of them. I initially was working on something that could be done with only utilities available in base, could be done in real-time (or close), and didn't care about log rotation. This led me to using FIFOs, grep and eventually a tiny Ruby script (for non-blocking reads). I was going to reimplement it in C eventually, but I continually had problems with the FIFOs, blocking I/O and grep. It worked most of the time, but that wasn't good enough. Maybe some day I (or someone else) will rewrite the current script in C, but it works great for now!
Possible future ideas:
- Clean up some problematic areas (make it more user-friendly and tolerant)
- Add some timing information to the alert so the user knows how long it takes to run. One wouldn't want the report to take ten minutes if it runs every five minutes.
- Have two classes of patterns: one normal class that is alerted periodically, and a critical class that is alerted immediately.
- Rewrite in C to get rid of the Ruby dependency.
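The two-class idea could look something like the Ruby sketch below. This is only one possible reading of it, not something Censor does today: classify, CRITICAL and IGNORED are hypothetical names, the patterns are made up, and the choice to check critical patterns before the filter is an assumption.

```ruby
# Hypothetical pattern sets: critical messages are alerted immediately,
# ignored messages are filtered out, everything else waits for the
# periodic report.
CRITICAL = Regexp.union(/FAILURE/, /panic/)
IGNORED  = Regexp.union('sm-mta', /cron\[[0-9]+\]/)

def classify(message, critical = CRITICAL, ignored = IGNORED)
  return :critical if message =~ critical   # checked first so a filter can't hide it
  return :ignored  if message =~ ignored
  :normal
end

# Made-up messages for illustration.
[
  'host kernel: ad0: FAILURE - READ_DMA status=51',
  'host sm-mta[811]: starting daemon',
  'host sshd[99]: Accepted publickey for operator'
].each { |message| puts "#{classify(message)}\t#{message}" }
```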
FzZzT 05:50, 26 November 2007 (UTC)
