| How to enable bayes Autolearning |
| Written by Bill Olson |
| Sunday, 12 April 2009 01:50 |
Updated 2/9/10: Fixed the links so they work correctly. Removed the references to maildrop and procmail spam filtering, as they are now obsolete. Once you have trained bayes with 200 hams and spams, you then need to
There are a few things you need to train SpamAssassin to do before Bayes can start learning to tell the difference between spam and non-spam. The more you train Bayes, the better the learning algorithm performs.

Before continuing, I want to let you know about one thing. If you are running the freebsdrocks spamd service, you do not have to change spamd to a non-root user; the service is already configured to run as user qscand. Please skip down to the section that starts "First, to make sure Bayes can be turned on".

Before starting with Bayes, I suggest running SpamAssassin as a non-root account. You can do this by adding an option to the spamd.sh startup script. Edit your SpamAssassin startup script and look for the following line. I give 2 different options, depending on which version of SpamAssassin you're running.

Option 1:

    spamd_flags=${spamd_flags:-"-d -x -r ${spamd_pidfile} "}

Modify it by adding -u qscand to make SpamAssassin run as user qscand:

Option 1:

    spamd_flags=${spamd_flags:-"-u qscand -d -x -r ${spamd_pidfile} "}

The path or flags file may vary from system to system. When you are done, save and exit, then restart spamd. All your spamd processes should now run as qscand.

First, to make sure Bayes can be turned on, Bayes needs to be trained with 200 hams and 200 spams. Run the following command:

    # sa-learn --dump magic
    0.000 0 5752 0 non-token data: nspam

As you can see from the above example, I have 5752 spams and 1702 hams. The nspam total is the total number of spams Bayes has learned; the nham line of the same output gives the matching total for hams.

Here is how to train SpamAssassin with hams and spams. There are a few ways to feed sa-learn spams and hams.
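Before feeding sa-learn more mail, it can help to check the current counts against the 200-message threshold automatically. The following is only a minimal sketch: bayes_ready is a hypothetical helper name, and it assumes the message count is the third column of the nspam and nham lines, as in the dump output shown earlier.

```shell
#!/bin/sh
# Hypothetical helper (not part of SpamAssassin): reads the output of
# `sa-learn --dump magic` on stdin, prints the learned counts, and returns
# success only if both counts reach the minimum (default 200).
# Assumption: the count is the third whitespace-separated field.
bayes_ready() {
    min="${1:-200}"
    awk -v min="$min" '
        /non-token data: nspam/ { nspam = $3 }
        /non-token data: nham/  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", nspam, nham
            # awk exit status 0 means "ready", 1 means "keep training"
            exit !((nspam + 0 >= min + 0) && (nham + 0 >= min + 0))
        }'
}

# Usage (run as the user that owns the Bayes database):
#   sa-learn --dump magic | bayes_ready && echo "enough mail learned"
```

If the helper reports too few hams or spams, keep training with sa-learn before turning autolearning on.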
The easiest way is

    # sa-learn --spam ~vpopmail/domains/domain.ext/user/Maildir/.Spam/new

To learn hams in ~vpopmail/domains/domain.ext/user/Maildir/new, run

    # sa-learn --ham ~vpopmail/domains/domain.ext/user/Maildir/new

You'll get output similar to the following in either case. Actual message numbers may vary.

    Learned from 30 message(s) (30 message(s) examined).

This tells you that out of 30 messages in the new folder, 30 were learned. If you run sa-learn --dump magic after learning spam, your nspam total will have gone up by 30.

You need at least 200 hams and 200 spams before you can enable Bayes autolearning. Once you have done that, add the following lines to your local.cf:

    # The line below needs to point to the bayes_path of the user that SpamAssassin runs as. In this case, the qscand home folder is /tmp

The first line sets the Bayes path, which tells Bayes where to store its database. The next line enables Bayes. The line after that enables autolearning, and the last line forces a chmod of 770 on the Bayes database for security reasons.

Restart spamd, and within a day or so you will see autolearn appear in your headers. I am not sure why it takes so long to show up in the email headers; it just does.
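The four local.cf settings described above could look roughly like the output of the sketch below. It assumes the standard SpamAssassin option names bayes_path, use_bayes, bayes_auto_learn and bayes_file_mode. The helper name bayes_settings and the default directory /tmp/.spamassassin are hypothetical, chosen only because the qscand home folder is /tmp; adjust them to your own setup.

```shell
#!/bin/sh
# Hypothetical helper: prints a possible set of Bayes lines for local.cf.
# ASSUMPTION: the database lives in a directory under the home folder of
# the user spamd runs as (here /tmp/.spamassassin for qscand).
bayes_settings() {
    bayes_dir="${1:-/tmp/.spamassassin}"
    cat <<EOF
bayes_path ${bayes_dir}/bayes
use_bayes 1
bayes_auto_learn 1
bayes_file_mode 0770
EOF
}

# Usage: review the output first, then append it to your local.cf, e.g.
#   bayes_settings /tmp/.spamassassin >> /path/to/local.cf
```

Printing the lines instead of editing local.cf directly keeps the step reviewable, since the location of local.cf varies from system to system.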
| Last Updated on Friday, 23 April 2010 19:19 |