Discussion:
x-spam-status bayes tests
Timo Proescholdt
2004-01-10 14:31:37 UTC
Hi,


I recently enabled the Bayesian classifier in amavisd.
Reviewing the results now, I notice that the BAYES test
results are not showing up in the X-Spam-Status header
of some spam messages.

The X-Spam-Status header itself is inserted correctly,
and I can see other tests being hit.

Am I misunderstanding the Bayes filtering principle,
or should the BAYES tests produce a result (positive or
negative) in every case?

Best Regards and
many thanks
--
Timo



-------------------------------------------------------
This SF.net email is sponsored by: Perforce Software.
Perforce is the Fast Software Configuration Management System offering
advanced branching capabilities and atomic changes on 50+ platforms.
Free Eval! http://www.perforce.com/perforce/loadprog.html
_______________________________________________
AMaViS-user mailing list
AMaViS-***@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/amavis-user
AMaViS-FAQ:http://www.amavis.org/amavis-faq.php3
AMaViS-HowTos:http://www.amavis.org/howto/
Steve
2004-01-10 17:42:12 UTC
Post by Timo Proescholdt
Hi,
I recently enabled the Bayesian classifier in amavisd.
Reviewing the results now, I notice that the BAYES test
results are not showing up in the X-Spam-Status header
of some spam messages.
The X-Spam-Status header itself is inserted correctly,
and I can see other tests being hit.
Am I misunderstanding the Bayes filtering principle,
or should the BAYES tests produce a result (positive or
negative) in every case?
Bayes has the disadvantage that it must be trained for a period of time
before it is used at all, let alone effective.

If I recall the defaults correctly, SA will automatically train its
Bayes database with any message that scores above 12 or below -2. (These
thresholds can be adjusted in the SA config file.) Once it has X number of
messages in its database, it will start comparing incoming messages
against the database and adjusting the spam score accordingly.

If you run 'amavisd debug-sa', the output will report the number of
messages in the database, both ham and spam, giving you an idea of when
Bayes will start to play a larger part in your spam results.

You will find that a LOT of the messages you receive will score between
-2 and 12, so they never make it into your database. This is done to
keep the database as free of false positives and false negatives as
possible. The drawback is that it can take several days, weeks,
or maybe even months (depending on your mail volume and the scores the
messages generate) before the Bayes tests actually start being used.
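
To make the threshold behaviour described above concrete, here is a minimal sketch in Python. It is not SpamAssassin's actual code: the function name and parameter names are made up for illustration, and the -2 and 12 values are simply the ones recalled in this post, so check your own SA configuration before relying on them.

```python
def autolearn_decision(score, ham_below=-2.0, spam_above=12.0):
    """Return 'spam', 'ham', or None for a message with the given score.

    Simplified illustration only: the thresholds are the values recalled
    in this thread, not verified defaults, and real auto-learning applies
    further conditions that are not modelled here.
    """
    if score > spam_above:
        return "spam"   # confident spam: would be learned as spam
    if score < ham_below:
        return "ham"    # confident ham: would be learned as ham
    return None         # in between: not learned automatically


if __name__ == "__main__":
    for score in (15.3, 5.0, -3.0):
        print(score, autolearn_decision(score))
```

Anything for which this returns None is the kind of message that would have to be fed to sa-learn by hand.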

You can speed this up by manually feeding original spam and ham messages
into SA via sa-learn. Of course, you should only need to do that with
messages that score between the -2 and 12 thresholds, as all the others
should be learned automatically.

I hope this helps, but the docs on www.spamassassin.org will probably
cover it better than I can.

Steve



Timo Proescholdt
2004-01-10 18:28:49 UTC
Hi,
Post by Steve
Bayes has the disadvantage that it must be trained for a period of time
before it is used at all, let alone effective.
If I recall the defaults correctly, SA will automatically train its
Bayes database with any message that scores above 12 or below -2. (These
thresholds can be adjusted in the SA config file.) Once it has X number of
messages in its database, it will start comparing incoming messages
against the database and adjusting the spam score accordingly.
Thanks for your answers.
But I am afraid I did not make my point clear.
I trained the filter correctly, and there *are* messages that get a
BAYES_XX entry in the header.
I was wondering why *some* messages have no BAYES_XX entry even though
the X-Spam-Status header was inserted (only the BAYES_XX entry is missing).

Any hints?

timo


Thomas (maillists)
2004-01-10 18:31:55 UTC
Post by Timo Proescholdt
I trained the filter correctly, and there *are* messages that get a
BAYES_XX entry in the header.
I was wondering why *some* messages have no BAYES_XX entry even though
the X-Spam-Status header was inserted (only the BAYES_XX entry is missing).
Because they don't receive a high enough BAYES-rating?


thomas



Steve
2004-01-10 21:04:09 UTC
Post by Timo Proescholdt
Thanks for your answers.
But I am afraid I did not make my point clear.
I trained the filter correctly, and there *are* messages that get a
BAYES_XX entry in the header.
I was wondering why *some* messages have no BAYES_XX entry even though
the X-Spam-Status header was inserted (only the BAYES_XX entry is missing).
Any hints?
Timo
Eww, that is a good question. I would guess that the message didn't have
any (or enough) tokens in the database to generate a Bayesian score.

That's just a guess, however. You will probably get a better response
from the spamassassin list, since they are no doubt more familiar with
its workings.



Sven Schuster
2004-01-10 17:47:06 UTC
Hi Timo,

The Bayes database first has to learn what is spam and what is ham.
You can do this by using sa-learn --spam / sa-learn --ham.
I think Bayes filtering results are only used once you have fed at least
200 spam messages to the Bayes database. You can check this by running

amavisd debug-sa

Maybe this is your problem?
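
As a rough illustration of that minimum-count rule, the sketch below checks learned message counts against a threshold. The function and parameter names are hypothetical, the 200-spam figure is the one mentioned above, and applying the same minimum to ham is an assumption on my part that you should verify against your SpamAssassin settings.

```python
def bayes_is_active(nspam, nham, min_spam=200, min_ham=200):
    """Return True if enough messages have been learned for Bayes to be used.

    Illustration only: min_spam follows the figure quoted in this thread;
    min_ham mirrors it as an assumption, not a confirmed default.
    """
    return nspam >= min_spam and nham >= min_ham


if __name__ == "__main__":
    # Example counts as you might read them from the debug output.
    print(bayes_is_active(nspam=350, nham=120))
    print(bayes_is_active(nspam=350, nham=240))
```

If this kind of check fails, no message would get a BAYES_xx result at all, which is a different symptom from only *some* messages lacking one.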

HTH

Sven
Post by Timo Proescholdt
Hi,
I recently enabled the Bayesian classifier in amavisd.
Reviewing the results now, I notice that the BAYES test
results are not showing up in the X-Spam-Status header
of some spam messages.
The X-Spam-Status header itself is inserted correctly,
and I can see other tests being hit.
Am I misunderstanding the Bayes filtering principle,
or should the BAYES tests produce a result (positive or
negative) in every case?
Best Regards and
many thanks
--
Timo
--
Linux zion 2.6.1-rc3 #1 Thu Jan 8 23:34:25 CET 2004 i686 athlon i386 GNU/Linux
18:46:06 up 1 day, 18:58, 3 users, load average: 0.00, 0.00, 0.00
Mark Martinec
2004-01-12 16:06:26 UTC
Timo,

| I recently enabled the Bayesian classifier in amavisd.
| Reviewing the results now, I notice that the BAYES test
| results are not showing up in the X-Spam-Status header
| of some spam messages.
|
| The X-Spam-Status header itself is inserted correctly,
| and I can see other tests being hit.
|
| Am I misunderstanding the Bayes filtering principle,
| or should the BAYES tests produce a result (positive or
| negative) in every case?

The X-Spam-Status header includes (in 'tests=...') the list of tests that
were triggered, as reported by $per_msg_status->get_names_of_tests_hit,
and I'm regularly seeing log entries containing BAYES_xx:

spam_scan: hits=9.877 tests=BAYES_99,CLICK_BELOW,...

which should also end up in the mail header of the passing mail.

I don't know what is happening in your case. Are Bayes tests
reported by SA when you run 'amavisd debug-sa'?
Are they included in the list when the command-line spamassassin
is run (as user amavis) on the same message?

Mark


