Thursday, September 2, 2010

SpamBayes: Unix/Linux platform

Spambayes on Unix or Linux

There is no direct mail client integration on Unix and Linux systems [1]. First make sure a recent enough version of Python is installed, then install the SpamBayes source, either as a bundled package or from Subversion, and finally choose the SpamBayes application that best fits your mail setup.
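If you want to script the "recent enough Python" check, here is a minimal sketch. The helper name version_at_least is made up for this example, and the minimum version is deliberately left as a parameter: this page does not say which Python release SpamBayes needs, so take that number from the README of the release you install.

```shell
#!/bin/sh
# Sketch only: compare a dotted version string against a minimum MAJOR.MINOR.
# version_at_least is a hypothetical helper, not part of SpamBayes.

# version_at_least VERSION MAJOR MINOR
# Succeeds (exit status 0) when VERSION is MAJOR.MINOR or newer.
# Assumes VERSION looks like X.Y or X.Y.Z with numeric X and Y.
version_at_least() {
    have_major=${1%%.*}        # text before the first dot
    rest=${1#*.}               # text after the first dot
    have_minor=${rest%%.*}     # text before the next dot, if any
    [ "$have_major" -gt "$2" ] && return 0
    [ "$have_major" -eq "$2" ] && [ "$have_minor" -ge "$3" ]
}
```

You could feed it the interpreter's own version, for example version_at_least "$(python -c 'import platform; print(platform.python_version())')" 2 4, where the "2 4" is only a placeholder for whatever minimum the README states.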

Procmail

If you use procmail as your local delivery agent and your email package picks up your primary mail from a local spool file (e.g. /var/spool/mail), you will probably find sb_filter.py the easiest application to integrate into your mail environment.

An example setup is as follows (thanks to Alister Forbes for contributing this). Note that the sb_filter script is installed under the same prefix you used when installing Python: probably /usr if you installed Python using your OS's package management software, and more likely /usr/local if you built your own. You can refer to the output of "setup.py install" to find the location.

  1. Install spambayes with the usual
    setup.py install
  2. Create the database that spambayes will use to test your incoming mail:
    /usr/local/bin/sb_filter.py -d $HOME/.hammie.db -n
  3. Train it on your existing mail. This is optional, but a good idea. -g is the flag for known good mail, and -s is the flag for known spam:
    /usr/local/bin/sb_mboxtrain.py -d $HOME/.hammie.db -g $HOME/Mail/inbox -s $HOME/Mail/spam
  4. Adding the following recipes to the top of your .procmailrc will get the spam and unsure messages out of the way, allowing everything else to be filtered as per your normal procmail recipes:
    :0fw:hamlock
    | /usr/local/bin/sb_filter.py -d $HOME/.hammie.db

    :0
    * ^X-Spambayes-Classification: spam
    ${MAILDIR}/spam

    :0
    * ^X-Spambayes-Classification: unsure
    ${MAILDIR}/unsure
  5. For ongoing training, a small cron job can retrain the database at 2:21 am every morning, using good mail from Inbox and spam from the spam folder. Just add the following to your crontab:
    21 2 * * * /usr/local/bin/sb_mboxtrain.py -d $HOME/.hammie.db -g $HOME/Mail/Inbox -s $HOME/Mail/spam
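
The procmail recipes in step 4 sort mail purely on the X-Spambayes-Classification header that sb_filter.py adds. As a rough way to see that header for a single message before trusting the recipes, the sketch below pulls the verdict out of a message on standard input. The function name sb_classification is invented for this example, and the matching is case-sensitive, which is stricter than procmail's own matching.

```shell
#!/bin/sh
# Sketch only: print the verdict stored in a message's
# X-Spambayes-Classification header. sb_classification is a hypothetical
# helper name, not a SpamBayes command.

# Reads one mail message on stdin. Prints the first word of the header
# (for example "spam" or "unsure"), or nothing if the header is absent.
sb_classification() {
    awk '
        /^$/ { exit }                      # blank line ends the headers
        /^X-Spambayes-Classification:/ {
            verdict = $2
            sub(/;.*/, "", verdict)        # drop anything after a semicolon
            print verdict
            exit
        }
    '
}
```

For a dry run, something like /usr/local/bin/sb_filter.py -d $HOME/.hammie.db < some-message | sb_classification should print the verdict, on the assumption (which the :0fw recipe above also relies on) that sb_filter.py writes the filtered message to standard output. Here some-message stands for any file holding a single mail message.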

Additional details are available in the README file.
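
On the earlier point about the install prefix: if you are not sure whether the scripts landed in /usr/local/bin or /usr/bin, a small helper along these lines can look for you. This is only a sketch; find_sb_script is a made-up name, and the two default directories are simply the two prefixes mentioned above, not an exhaustive list.

```shell
#!/bin/sh
# Sketch only: locate an installed SpamBayes script by checking candidate
# directories in order. find_sb_script is a hypothetical helper.

# find_sb_script NAME [DIR...]
# Prints the first DIR/NAME that is executable and returns 0.
# With no DIR arguments it tries /usr/local/bin and then /usr/bin.
# Returns 1 if nothing is found.
find_sb_script() {
    name=$1
    shift
    [ $# -gt 0 ] || set -- /usr/local/bin /usr/bin
    for dir in "$@"; do
        if [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

For example, find_sb_script sb_filter.py prints the full path to use in your .procmailrc and crontab, or fails quietly so you can fall back to reading the "setup.py install" output.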

POP3

If your mail program fetches mail using POP3, then you should run sb_server.py.

You might wish to set sb_server.py to run as a daemon - Fernando Nino and Dave Handley have provided these scripts (respectively) which will allow you to do this:

#!/bin/bash
#
# spambayes:    Starts the spam filter as a pop3 proxy
#
# Version:      @(#) /etc/init.d/spambayes 1.0
#
# chkconfig: - 95 21
# description: This shell script takes care of starting and stopping \
#              spambayes pop3 proxy
# processname: sb_server.py
#
# Source function library.
. /etc/init.d/functions

SBPROXY=/opt/bin/sb_server.py
SBLOG=/var/log/spam.log
SBDIR=/opt/sb_data

[ -x $SBPROXY ] || exit 0

RETVAL=0

start () {
    date >> $SBLOG
    echo -n "Starting SpamBayes POP3 proxy: "
    if [ ! -d $SBDIR ] ; then
      echo "Repertoire $SBDIR non present" >> $SBLOG
      RETVAL=1
    else
      cd $SBDIR
      ($SBPROXY 2>&1 >> $SBLOG) &
      RETVAL=$?
    fi
    action "" [ $RETVAL = 0 ]
    return $RETVAL
}

stop () {
    # stop daemon
    date >> $SBLOG
    echo -n "Stopping SpamBayes POP3 proxy: "
    killproc $SBPROXY 1
    RETVAL=$?
    echo
    [ $RETVAL = 0 ]
    return $RETVAL
}

restart () {
    stop
    start
    RETVAL=$?
    return $RETVAL
}

# See how we were called.
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    status)
        status $SBPROXY
        RETVAL=$?
        ;;
    restart)
        restart
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|status}"
        RETVAL=1
esac

exit $RETVAL

#!/bin/sh
#
# spamd         This shell script takes care of starting and stopping
#               the spambayes daemon.
#
# Author:       Dave Handley
# Date:         11 Oct 03
#

# Source function library.
. /etc/rc.d/init.d/functions

# Source networking configuration.
. /etc/sysconfig/network

RETVAL=0

# See how we were called.
case "$1" in
  start)
        # Start daemons.
        echo -n "Starting spamd: "
        cd /etc/spamd/
        daemon /usr/local/bin/sb_server.py &
        RETVAL=$?
        echo
        ;;
  stop)
        # Stop daemons.
        echo -n "Shutting down spamd: "
        killproc sb_server.py
        RETVAL=$?
        echo
        ;;
  restart|reload)
        $0 stop
        $0 start
        RETVAL=$?
        ;;
  status)
        status sb_server.py
        RETVAL=$?
        ;;
  *)
        echo "Usage: spamd {start|stop|restart|status}"
        exit 1
esac

exit $RETVAL

Thunderbird

Thunderbird users might find the ThunderBayes extension useful. It provides tighter integration between Thunderbird and the SpamBayes POP3 proxy.

KMail

Toby Dickenson has written a description of his SpamBayes and KMail setup (using sb_bnfilter.py), which is an effective guide to setting up your system if you are a KMail user.

IMAP

If your mail program fetches mail using IMAP, then you should run imapfilter.py.

Training

See the README file for a detailed discussion of the many training options on Unix systems.

Notes

  1. If you're a Unix weenie using a Mac OS X system, this page is probably more appropriate than the Mac page.

exmh

The following short guide will help you set up a new message menu on exmh - this adds a menu containing "Train as Spam" and "Train as Ham" options.

  1. First of all, create the directory ~/.tk/exmh if you haven't already done so. Put the following file (I call mine spambayes.tcl) in there:
    proc SB_SpamTrain { } {
        global exmh msg mhProfile

        Ftoc_Iterate line {
            set msgid [ Ftoc_MsgNumber $line ]
            eval {MhExec sb_filter.py -s $mhProfile(path)/$exmh(folder)/$msgid }
        }
    }

    proc SB_HamTrain { } {
        global exmh msg mhProfile

        Ftoc_Iterate line {
            set msgid [ Ftoc_MsgNumber $line ]
            eval {MhExec sb_filter.py -g $mhProfile(path)/$exmh(folder)/$msgid }
        }
    }
  2. Then run 'wish' or 'tclsh', and enter the following command:
    auto_mkindex ~/.tk/exmh *.tcl
  3. Next, we hook up the commands that we just created. Shut down your exmh and edit ~/.exmh/exmh-defaults. Add the following entries:
    *Mops.umenulist: spam
    *Mops.spam.text: S-B
    *Mops.spam.m.entrylist: trainspam trainham

    *Mops.spam.m.l_trainspam: Train as Spam
    *Mops.spam.m.c_trainspam: SB_SpamTrain
    *Mops.spam.m.l_trainham: Train as Ham
    *Mops.spam.m.c_trainham: SB_HamTrain
  4. Restart exmh, and you're done.

Go there...
http://spambayes.sourceforge.net/unix.html

Don
