Introduction
This article is really an extension to setting up squid as a reverse proxy, and describes my method for blending the squid log into my apache log. I must point out that my method is rather crude. I think a better solution than my shell scripts exists, but I couldn’t find it.
Reverse Proxy
I use squid as an http accelerator for my websites. It sits between the browser and apache as a reverse proxy and reduces the load (sometimes significantly) on the webserver.
If you don’t know about setting this up I suggest this white paper.
Converting Squid Logs to Apache Format
If you have squid 3, then you can just specify a custom log format in the config.
If you have an earlier version of squid, you may be able to install the patch that adds the custom log functionality. After a lot of effort, I failed to install this patch on my debian system, so I fell back on the mime headers log option and a perl script.
Add the following option to squid.conf:
log_mime_hdrs on
This records enough info for us to find the necessary data for the usual apache log details.
This is our convert script:
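#!/usr/bin/perl -w
use IO::File;

$| = 1;    # unbuffered output, so each record reaches the apache log at once

my $failures = 0;
my $pid      = 0;

# read squid's pid so it can be appended to every record
my $fh = IO::File->new("/var/run/squid.pid", "r");
if (defined $fh) {
    my $line = <$fh>;
    if (defined $line) {
        chomp $line;    # without this every record is followed by a blank line
        $pid = $line;
    }
    undef $fh;    # automatically closes the file
}
else {
    print STDERR "squid not running?\n";
}

while (<>) {
    /^\s*([^ ]+) +([^ ]+) +([^ ]+) +(\[[^]]+\]) +("[^"]*") +([^ ]+) +([^ ]+) +([^ ]*) +\[([^]]*)\]/ && do {
        # $8 is skipped; $9 is the bracketed block of request headers
        my ($ip, $user, $vhost, $date, $req, $resp, $bytes, $header) = ($1, $2, $3, $4, $5, $6, $7, $9);
        my ($refer, $uagent) = qw(- -);
        $header =~ /Referr?er:\s*(.*?)\\r\\n/ && do {
            $refer = $1;
        };
        $header =~ /User-Agent:\s*(.*?)\\r\\n/ && do {
            $uagent = $1;
        };
        # squid escapes square brackets inside the header block
        $uagent =~ s/%5b/[/g;
        $uagent =~ s/%5d/]/g;
        # squid logs the full URL of the backend (apache on port 81):
        # take the host name as the site and leave only the path in the request
        my $site = $req;
        $site =~ s/^.*http:\/\/([^:]*):81.*$/$1/g;
        # $1 already ends with the space after the method, so no extra space is added
        $req =~ s/^(.*)http:\/\/[^:]*:81(\/.*)$/$1$2/g;
        # let's set user to proxy as we don't use http auth for anything else
        $user = 'proxy' if $user eq '-';
        # remote logname?
        my $logname = '-';
        print "$site $ip $logname $user $date $req $resp $bytes \"$refer\" \"$uagent\" $pid\n";
        next;
    };
    $failures++;
}

if ($failures) {
    print STDERR "$failures records ignored\n";
}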
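To make the request rewrite in the script easier to follow, here is a rough sed sketch of the same idea on a made-up request. It assumes, as the script does, that the backend apache listens on port 81. The function names `site_of` and `request_of` and the sample host are illustrative only and are not part of the convert script.

```shell
#!/bin/bash
# Sketch only: mirrors the idea of the two substitutions on the request
# field, assuming the backend apache listens on port 81.

# Extract the host name from a quoted request field.
site_of() {
    printf '%s\n' "$1" | sed 's|^.*http://\([^:]*\):81.*$|\1|'
}

# Strip scheme, host and port, leaving an apache-style request field.
request_of() {
    printf '%s\n' "$1" | sed 's|^\(.*\)http://[^:]*:81\(/.*\)$|\1\2|'
}

# Hypothetical request as squid would log it for a reverse-proxied site.
req='"GET http://www.example.com:81/index.html HTTP/1.0"'
site_of "$req"      # prints: www.example.com
request_of "$req"   # prints: "GET /index.html HTTP/1.0"
```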
NB: I set the http user to proxy. This is a crude way for me to identify the percentage of proxy requests in the log.
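As a rough illustration of how that marker can be used, the following sketch counts the share of records whose user field is proxy. It assumes every line in the log, including the ones apache writes itself, follows the field order printed by the convert script (site, ip, logname, user, ...), so the user is the fourth field. The function name `proxy_share` and the sample records are made up.

```shell
#!/bin/bash
# Sketch only: print the percentage of records marked with user "proxy".
# Assumes the field order: site ip logname user [date] ...
proxy_share() {
    awk '
        $4 == "proxy" { proxied++ }
        END {
            if (NR > 0) {
                printf "%.1f%%\n", 100 * proxied / NR
            } else {
                print "no records"
            }
        }
    ' "$@"
}

# Two made-up records: one served through squid, one served by apache.
printf '%s\n' \
  'www.example.com 192.0.2.1 - proxy [01/Jan/2008:10:00:00 +0000] "GET / HTTP/1.0" 200 512 "-" "-" 1234' \
  'www.example.com 192.0.2.2 - - [01/Jan/2008:10:00:01 +0000] "GET /a HTTP/1.0" 200 100 "-" "-"' \
  | proxy_share     # prints: 50.0%
```

Called with a file name instead of a pipe, e.g. `proxy_share /var/log/apache/access.log`, it reads the log directly.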
You can simply pass the squid log through this script and you’ll get an apache format log e.g.
cat access.log | squid-convert.pl
However, in order for the web logs to be useful, the squid log needs to be interwoven with the apache log. If you just append the converted squid log to the apache log, you’ll lose continuity in the dates, which can upset stats analysis.
The method I use is to create a named pipe (fifo) and have squid log its output to this, whilst another process simultaneously reads the output and pipes it through the converter script and into the apache log. The log is updated in real time, thus preserving continuity.
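The named-pipe mechanism can be tried out in isolation first. The sketch below uses throwaway paths in a temporary directory, with `printf` standing in for squid and a trivial `sed` standing in for the converter script. All of the names in it are hypothetical.

```shell
#!/bin/bash
# Sketch only: a writer and a reader connected through a named pipe.
fifo_demo() {
    local workdir fifo merged reader
    workdir=$(mktemp -d)
    fifo=$workdir/fifo
    merged=$workdir/merged.log

    mkfifo "$fifo"

    # Reader: blocks until a writer opens the pipe, transforms each line
    # and appends it to the merged log.
    sed 's/^/converted: /' < "$fifo" >> "$merged" &
    reader=$!

    # Writer: stands in for squid writing its access log to the pipe.
    printf '%s\n' 'request one' 'request two' > "$fifo"

    # The reader sees end-of-file as soon as the writer closes the pipe.
    wait "$reader"
    cat "$merged"
    rm -r "$workdir"
}

fifo_demo
```

Because the reader gets end-of-file whenever the writer closes the pipe, a long-running reader has to reopen the pipe in a loop if it is to survive the writer being restarted.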
So the following needs to be in squid.conf:
cache_access_log /var/log/squid/fifo
And the following script must be started:
#!/bin/bash
squid_convert_log=/usr/local/sbin/squid-convert.pl

# variables
SQUID_LOG_DIR=/var/log/squid
SQUID_LOG_FILE=$SQUID_LOG_DIR/access.log
APACHE_LOG_FILE=/var/log/apache/access.log
FIFO=$SQUID_LOG_DIR/fifo

# create the pipe only if it does not already exist
[ -p "$FIFO" ] || mkfifo "$FIFO"

if [ -r "$FIFO" ]
then
    while true
    do
        # the read ends whenever squid closes the pipe, so reopen it each time;
        # append (>>) so that the existing apache log is not truncated
        tee -a "$SQUID_LOG_FILE" < "$FIFO" | "$squid_convert_log" >> "$APACHE_LOG_FILE"
    done
else
    echo "can't read $FIFO exiting" >&2
    exit 192
fi

echo "unexpected end of $0" >&2
exit 192
I use a cron job to run a version of this script.
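The cron version is not shown here. As one possible shape for it, and purely as an assumption, a wrapper could record the pid of the running reader in a file and start a new one only when that process has gone. The function names, the pid file and the script paths below are all hypothetical.

```shell
#!/bin/bash
# Sketch only: start a command from cron unless a previous instance,
# recorded in a pid file, is still alive.

# Succeeds if the pid file names a process that still exists.
already_running() {
    local pidfile=$1 pid
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Start the given command in the background and record its pid.
start_once() {
    local pidfile=$1
    shift
    if already_running "$pidfile"; then
        return 0
    fi
    "$@" &
    echo $! > "$pidfile"
}

# Hypothetical use, in a wrapper script that cron runs every few minutes:
#   start_once /var/run/squid-log-pipe.pid /usr/local/sbin/squid-log-pipe.sh
```

Note that `kill -0` only tests whether the process exists and can be signalled by the caller, so the wrapper should run as the same user as the reader (or as root).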