
Welcome to SpamAssassin!
------------------------

SpamAssassin is a mail filter which attempts to identify spam using text
analysis and several internet-based realtime blacklists.

It applies a wide range of heuristic tests from its rule base to mail
headers and body text to identify "spam", also known as unsolicited
commercial email.
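
As a rough illustration of the scoring idea described above, here is a
minimal sketch in Python.  It is not SpamAssassin's actual code (which is
written in Perl): the rule names, patterns, and scores are hypothetical,
and the threshold of 5 is an assumption made for the example.

```python
import re

# Hypothetical rules: (name, pattern, score).  These are NOT the rules
# shipped with SpamAssassin; they only illustrate the technique.
RULES = [
    ("SUBJECT_SHOUTING", re.compile(r"^Subject: [A-Z !]+$", re.M), 2.0),
    ("MENTIONS_FREE", re.compile(r"\bfree\b", re.I), 1.5),
    ("CLICK_HERE", re.compile(r"click here", re.I), 2.5),
]

REQUIRED_HITS = 5.0  # assumed threshold for this sketch


def score_message(text):
    """Return (total score, names of the rules that matched)."""
    matched = [(name, score) for name, pattern, score in RULES
               if pattern.search(text)]
    total = sum(score for _, score in matched)
    return total, [name for name, _ in matched]


def is_spam(text):
    """A message counts as spam once its total reaches the threshold."""
    total, _ = score_message(text)
    return total >= REQUIRED_HITS
```

Each rule contributes its score only when its pattern matches, so no
single weak indicator is enough to tag a message on its own.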

Once identified, the mail can then be optionally tagged as spam for later
filtering using the user's own mail user-agent application.

In its most recent test, SpamAssassin differentiated between spam and
non-spam mail correctly in 99.94% of cases.  Since then, it's just been
getting better and better!

SpamAssassin also includes support for reporting spam messages
automatically, and/or manually, to collaborative filtering databases such
as Vipul's Razor [1].

	[1]: http://razor.sourceforge.net/

The distribution provides "spamassassin", a command line tool to perform
filtering, along with "Mail::SpamAssassin", a set of perl modules which
implement a Mail::Audit plugin, allowing SpamAssassin to be used in a
Mail::Audit filter, spam-protection proxy SMTP or POP/IMAP server, or a
variety of different spam-blocking scenarios.

In addition, Craig Hughes has contributed "spamd", a daemonized version of
SpamAssassin, which runs persistently.  Using "spamc", a lightweight C
client, this allows an MTA to process large volumes of mail through
SpamAssassin without having to fork/exec a perl interpreter for each one.

Ian R. Justman has contributed "spamproxy", a spam-filtering SMTP proxy
server.  This lives in the "spamproxy" directory.

SpamAssassin lives at http://spamassassin.org/ or in CPAN, and is
distributed under the same license as Perl itself.

This module owes a lot of inspiration to Mark Jeftovic's filter.plx [2],
which I used for a long time, and contributed some code to.  However,
SpamAssassin is a ground-up rewrite with a new, greatly improved ruleset,
a different code model and installation system, and hopefully will be easy
to adapt for a multitude of applications.

	[2]: http://AntiSpam.shmOOze.net/filter/

Questions regarding SpamAssassin should be sent to the mailing list:
<spamassassin-talk /at/ lists /dot/ sourceforge /dot/ net>.



Installing SpamAssassin
-----------------------

The easiest way to do this is using CPAN.pm, like so:

	perl -MCPAN -e shell
	o conf prerequisites_policy ask
	install Mail::SpamAssassin
	quit

On Debian, you can apt-get it from unstable, thanks to Duncan Findlay.

Alternatively, download the tarfile, zipfile or Red Hat RPM from
http://spamassassin.org/ , and install that, like so:

	[unzip/untar the archive]
	cd Mail-SpamAssassin-*
	perl Makefile.PL
	make
	make install				[as root]

SunOS Note
----------

Under SunOS, snprintf is not defined.  A library containing a SunOS
version of snprintf is included in contrib/snp.tar.gz, and that should be
usable for building SA.  If you're a SunOS user and do this, and would
like to document the process for others, please send the docs to the SA
maintainer to amend these basic directions.


Installing SpamAssassin for Personal Use (Not System-Wide)
----------------------------------------------------------

These steps assume the following, so substitute as necessary:
  - Your UNIX login is "user"
  - Your home directory is /home/user
  - The location of the procmail executable is /usr/bin/procmail

1. Uncompress the SpamAssassin archive

2. Move/rename the created SpamAssassin directory to wherever you want to
keep it permanently in your home directory:
    mv Mail-SpamAssassin-2.1 ~/bin/SpamAssassin

3. Make SpamAssassin as normal ("perl Makefile.PL", "make")

4. If you already use procmail, skip to step 6.  If not, ensure procmail
is installed using "which procmail" or install it from www.procmail.org.

5. Create a .forward file in your home directory containing the following
line:

"|IFS=' ' && exec /usr/bin/procmail -f- || exit 75 #user"

6. Edit or create a .procmailrc file in your home directory containing
the following lines.  If you already have a .procmailrc file, add them
to the top of it:

:0fw
| /home/user/bin/SpamAssassin/spamassassin -P -c /home/user/bin/SpamAssassin/rules

  The above line filters all incoming mail through SpamAssassin and tags
probable spam with a unique header.  If you would prefer to have spam
blocked and saved to a file called caughtspam in your home directory
instead of passed through and tagged, append this directly below the
above lines:

:0:
* ^X-Spam-Status: Yes
caughtspam

7. Now you should be ready to send some test emails and make sure
everything works as expected.  First, send yourself a test email that
doesn't contain anything suspicious.  You should receive it normally, but
it will carry a header containing X-Spam-Status: No.  If you are only
tagging your spam, send yourself an obvious spam mail and check that it is
marked as spam.  If your test emails don't get through to you, immediately
rename your .forward file until you figure out the cause of the problem,
so you don't lose incoming email.
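
If you would rather check the header with a script than by eye, the
following Python sketch shows one way to do it.  It assumes only what is
stated above, namely that the X-Spam-Status header value begins with
"Yes" or "No"; the function name is made up for the example.

```python
from email import message_from_string


def spam_status(raw_message):
    """Return True for 'Yes', False for 'No', None if the header is absent.

    Assumes the X-Spam-Status value starts with "Yes" or "No"; anything
    that does not start with "Yes" is treated as not spam.
    """
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status")
    if status is None:
        # The message never went through the filter.
        return None
    return status.strip().lower().startswith("yes")
```

A result of None for a delivered test message suggests the procmail
recipe is not being run at all, which is worth ruling out first.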

Optional Additional Modules
---------------------------

In addition, the following modules will be used for some checks, if
available.  If they are not available, SpamAssassin will still work, just
not as effectively -- some of the spam-detection tests will have to be
skipped.


  - Net::DNS	(from CPAN)

    Used to check the RBL, RSS, DUL etc. and perform MX checks.
    Recommended.

	perl -MCPAN -e shell
	o conf prerequisites_policy ask
	install Net::DNS
	quit


  - Razor	http://razor.sourceforge.net/

    Used to check message signatures against Vipul's Razor collaborative
    filtering network.  Razor is not available from CPAN -- you have to
    download it from the URL above.

    Razor has a large number of dependencies on CPAN modules.  Feel free
    to skip installing it, if this makes you nervous; SpamAssassin will
    still work well without it.


  - Mail::Audit, Mail::Internet, Net::SMTP	(from CPAN)

    If you want to use SpamAssassin for local delivery to a qmail or
    MailDir spool, and you do *not* want to use procmail for some reason,
    you will need to install the Mail::Audit module, and any modules it
    requires (there are lots of them, unfortunately).  This is no longer
    recommended.

    If you use procmail, KMail, or you plan to use 'spamd', you will *not*
    need these.

    Here's how to install them using CPAN.pm:

	perl -MCPAN -e shell
	o conf prerequisites_policy ask
	install Mail::Audit
	quit



Using SpamAssassin
------------------

Steps to take for every installation:

  - Install Mail::SpamAssassin on your mail server, as above.

  - Test it:

      spamassassin -t < sample-nonspam.txt > nonspam.out
      spamassassin -t < sample-spam.txt > spam.out

    Verify (using a text viewer, e.g. "less" or "notepad") that
    nonspam.out has not been tagged as spam, and that spam.out has.  The
    files should contain the full text and headers of the messages.  The
    "spam.out" message should be annotated with "*****SPAM*****" in the
    subject line and carry a report from SpamAssassin.  There should be no
    errors when you run the commands.

    Even though sample-nonspam.txt is not spam, nonspam.out will contain a
    SpamAssassin report anyway.  This is a side-effect of the "-t" (test)
    switch.  However, there should be less than 5 hits accumulated; when
    the "-t" switch is not in use, the report text would not be added.

    If the commands do not work, DO NOT PROCEED TO THE NEXT STEP, as you
    will lose mail!
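
    The check on the two output files can also be scripted.  This Python
    sketch is one possible way, assuming the subject tag is the word SPAM
    wrapped in asterisks; the function names are hypothetical.

```python
import re
from email import message_from_string

# Matches SPAM surrounded by one or more asterisks on each side.
SUBJECT_TAG = re.compile(r"\*+SPAM\*+")


def subject_is_tagged(raw_message):
    """True if the Subject header carries the asterisk-wrapped SPAM tag."""
    subject = message_from_string(raw_message).get("Subject", "")
    return SUBJECT_TAG.search(subject) is not None


def outputs_look_right(nonspam_path="nonspam.out", spam_path="spam.out"):
    """True if spam.out is tagged and nonspam.out is not."""
    with open(nonspam_path, errors="replace") as handle:
        nonspam = handle.read()
    with open(spam_path, errors="replace") as handle:
        spam = handle.read()
    return subject_is_tagged(spam) and not subject_is_tagged(nonspam)
```

    Only the Subject header is inspected, so the report text that "-t"
    adds to the body of nonspam.out does not affect the result.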



If you use Mail::Audit already:

  - Run "perldoc Mail::SpamAssassin" and take a look at the synopsis; it
    outlines what you need to add to your audit script.

  - Copy the configuration files (see CUSTOMISING, below) to a known
    location, so your script can set the appropriate options for the
    Mail::SpamAssassin constructor to load them.



If you use KMail:

  - http://kmail.kde.org/download.html mentions:

    The filter setup is the work of five minutes (if that!) if you have a
    working spamassassin set up.

    The filter in question is "<any header><matches regexp> ."

    The action is "<pipe through> spamassassin -P"

    Then, in the advanced options, uncheck the "If this filter matches,
    stop processing here" box. If you keep this filter at the top, it will
    analyze any incoming mail, decide whether it's spam or not, and flag
    it accordingly.

    [Then add] a second filter behind it, which searches for the added
    spam-flags and diverts them into a specific spam folder. [...]



If you're using procmail:

  - Make a backup of your .procmailrc (if it exists).

      cp ~/.procmailrc ~/.procmailrc.bak

  - add the line from procmailrc.example to ~/.procmailrc, at the top of
    the file before any existing recipes.

    That'll process all mail through SA, and refile spam messages to
    a folder called "caughtspam" in your home directory.

  - Send yourself a mail message, and ensure it gets to you.  If it does
    not, copy your old backed-up .procmailrc file back into place and ask
    your sysadmin for help!  Here are the commands to do that:

      cp ~/.procmailrc.bak ~/.procmailrc
      echo "Help!" | mail root



If you want to use SpamAssassin site-wide:

  - take a look at the notes on the website, at
    http://spamassassin.org/sitewide.html .  You may want to use
    'spamd' (see below).

  - another option is Ian R. Justman's "spamproxy", a spam-filtering SMTP
    proxy server.  This lives in the "spamproxy" directory.



If you don't use any mail filter just yet, and want to let SpamAssassin
handle local mail delivery:

  - **BEFORE YOU DO THIS** consider using procmail instead (see above).
    It's recommended for this situation, and fewer bugs have been
    reported with it.  Notably, use of SpamAssassin with an NFS-mounted
    mail spool is **NOT SAFE**.  Please use procmail!

  - Make a backup of your .forward (if it exists).

      cp ~/.forward ~/.forward.bak

  - Change your ~/.forward file so it reads like this:

      "| spamassassin || exit 75"

  - Send yourself a mail message, and ensure it gets to you.  If it does
    not, copy your old backed-up .forward file back into place and ask
    your sysadmin for help!  Here are the commands to do that:

      cp ~/.forward.bak ~/.forward
      echo "Help!" | mail root

  - Send yourself sample-spam.txt and make sure it gets tagged:

      /usr/sbin/sendmail yourusername < sample-spam.txt




Other installation notes:

  - If you get spammed, it is helpful to everyone else if you re-run
    spamassassin with the "-r" option to report the message in question as
    "verified spam".  This will add it to Vipul's Razor
    (http://razor.sourceforge.net/), a collaborative spam filtering
    network, if you've installed the Razor modules.

      spamassassin -r < spam-message

    If you use mutt as your mail reader, this macro will bind the X key to
    report a spam message.

      macro index X "| spamassassin -r"

    This is, of course, optional -- but you'll get lots of good-netizen
    karma. ;)


  - Quite often, if you've been on the internet for a while, you'll have
    accumulated a few old email accounts that nowadays get nothing but
    spam.  You can set these up as spam traps using SpamAssassin; see the
    "SPAM TRAPPING" section of the spamassassin manual page for details.

    If you don't want to go to the bother of setting up a system yourself
    to do this, feel free to set up a simple alias to forward any mails to
    <someaddress@spamtraps.taint.org> -- replace "someaddress" with
    something to identify you, such as your email address or website with
    non-alphanumeric chars replaced by underscores, or similar.  This
    will feed it into my spam-trapping system running on taint.org, where
    it will be fed into Razor.


  - The distribution now includes 'spamd', a daemonized version of the
    perl script, and 'spamc', a low-overhead C client for this,
    contributed by Craig R. Hughes.  This greatly reduces the overhead of
    checking large volumes of mail with SpamAssassin.  Take a look in the
    'spamd' directory for more details.


  - There's also Ian R. Justman's "spamproxy", a spam-filtering SMTP
    proxy server.  This lives in the "spamproxy" directory.


  - Scores and other user preferences can now be loaded from an SQL
    database; see the 'sql' subdirectory for more details.


  - Edward Fang <edfang /at/ visi.net> has contributed the
    'communigate.sh' script for CommunigatePro (see the 'contrib'
    directory).

  - James Henstridge <james /at/ daa.com.au> has contributed an LMTP proxy
    server (designed for Cyrus, but it will probably work fine with
    others).  Again, it's in the 'contrib' directory.
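
For the spam-trap forwarding address mentioned in the notes above, the
"replace non-alphanumeric chars with underscores" step can be sketched in
Python as follows.  The function name is made up for the example, and
treating only ASCII letters and digits as alphanumeric is an assumption.

```python
import re


def spamtrap_address(identifier):
    """Build a forwarding address from an email address or website name.

    Every character that is not an ASCII letter or digit becomes an
    underscore, and the result is used as the local part.
    """
    local_part = re.sub(r"[^A-Za-z0-9]", "_", identifier)
    return local_part + "@spamtraps.taint.org"
```

For example, an identifier of "jane@example.com" would give the local
part "jane_example_com".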



Customising
-----------

These are the configuration files installed by SpamAssassin.  The commands
that can be used therein are listed in the POD documentation for the
Mail::SpamAssassin::Conf class (see the 'doc' directory).

  - /usr/share/spamassassin/*.cf:

	Distributed configuration files, with all defaults.  Do not modify
	these, as they are overwritten when you upgrade.

  - /etc/mail/spamassassin/*.cf:

  	Site config files, for system admins to create, modify, and
	add local rules and scores to.  Modifications here will be
	appended to the config loaded from the above directory.

  - /usr/share/spamassassin/user_prefs.template:

	Distributed default user preferences. Do not modify this, as it is
	overwritten when you upgrade.

  - /etc/mail/spamassassin/user_prefs.template:

	Default user preferences, for system admins to create, modify, and
	set defaults for users' preferences files.  Takes precedence over
	the above prefs file, if it exists.

	Do not put system-wide settings in here; put them in the
	/etc/mail/spamassassin directory.  This file is just a template,
	which will be copied to a user's home directory for them to
	change.

  - $USER_HOME/.spamassassin:

  	User state directory.  Used to hold spamassassin state, such
	as a per-user automatic whitelist, and the user's preferences
	file.

  - $USER_HOME/.spamassassin/user_prefs:

  	User preferences file.  If it does not exist, one of the
	default prefs files from above will be copied here for the
	user to edit later, if they wish.

	Unless you're using spamd, there is no difference in
	interpretation between the rules file and the preferences file, so
	users can add new rules for their own use in the
	"~/.spamassassin/user_prefs" file, if they like.  (spamd disables
	this for security and increased speed.)

Take a look at the "Mail_SpamAssassin_Conf.txt" file in the "doc"
directory to see what can be set.  Common first-time tweaks include:

  - required_hits

	Set this higher to make SpamAssassin less sensitive.

  - rewrite_subject

  	Turn off Subject-line rewriting with this.

  - subject_tag

        When rewrite_subject is on, the subject stamp is *****SPAM*****.
        This can be used to change it.

  - ok_locales

	If you expect to receive mail in non-ISO-8859 character sets (e.g.
	Chinese, Cyrillic, Japanese, Korean, or Thai) then set this.

  - defang_mime

	By default, SpamAssassin will 'de-fang' MIME messages, turning
	them into content-type text/plain.  This will turn that behaviour
	off.


Locali[sz]ation
---------------

All text displayed to users is taken from the configuration files.  This
means that you can translate messages, test descriptions, and templates
into other languages.

If you do so, I would *really* appreciate it if you could send back a copy
of the updated messages; mail them to
<spamassassin-talk@lists.sourceforge.net> .  Hopefully, if it takes off, I
can add them to the distribution as "official" translations and build in
support for this.  You will, of course, get credited for this work ;)



Help With SpamAssassin
----------------------

There's a mailing list for support or discussion of SpamAssassin.  It
lives at <spamassassin-talk@lists.sourceforge.net>.  See
http://spamassassin.org/lists.html for the sign-up address and a
link to the archive of past messages.



Commercial Tests
----------------

There are several tests in the spamassassin configuration file which are
turned off by default, namely the mail-abuse.org and bl.spamcop.net tests.
The mail-abuse.org tests are RCVD_IN_RBL, RCVD_IN_RSS, and RCVD_IN_DUL;
the bl.spamcop.net test is called RCVD_IN_BL_SPAMCOP_NET.

These are commercial services, so you need to pay money to use them.
Having said that, the bl.spamcop.net service gets my recommendation as the
most useful blacklisting DNS service I've found.  More information on it
can be found at http://spamcop.net/bl.shtml .

The mail-abuse.org tests are free for personal use, for now -- so if
you're using SpamAssassin as a personal mail filter you may turn them on.
More information on the mail-abuse.org services can be found here:
http://mail-abuse.org/rbl+/ and
http://www.mail-abuse.org/feestructure.html .

To turn on the tests, simply assign them a non-zero score, e.g. by adding
these lines to your ~/.spamassassin/user_prefs file:

    score RCVD_IN_RBL               10
    score RCVD_IN_RSS               1
    score RCVD_IN_DUL               1
    score RCVD_IN_BL_SPAMCOP_NET    4
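
To make the "non-zero score means enabled" convention concrete, here is a
small Python sketch that reads "score" lines like the ones above and lists
which tests would be switched on.  It is not SpamAssassin's own config
parser; the function names are hypothetical, and treating "#" as a comment
marker is an assumption of the sketch.

```python
def parse_scores(config_text):
    """Collect 'score NAME VALUE' lines into a {name: value} dict."""
    scores = {}
    for line in config_text.splitlines():
        # Assumption: anything after '#' is a comment.
        line = line.split("#", 1)[0].strip()
        parts = line.split()
        if len(parts) == 3 and parts[0] == "score":
            scores[parts[1]] = float(parts[2])
    return scores


def enabled_tests(scores):
    """A test is considered on when its score is non-zero."""
    return sorted(name for name, value in scores.items() if value != 0)
```

Setting a test's score back to 0 is then enough to switch it off again
without deleting the line.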



Automatic Whitelist System
--------------------------

SpamAssassin includes automatic whitelisting.  The current iteration is
considerably more complex than the original version.  It works by
tracking, for each sender address, the average score of the messages seen
so far from that address.  It then combines this long-term average score
for the sender with the score for the particular message being evaluated,
after all other rules have been applied.
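
The following Python sketch illustrates the idea of a per-sender running
average.  It is not the real implementation: this README does not say how
the two scores are combined, so the sketch simply assumes the final score
is the midpoint between the message's own score and the sender's average,
and that the history records the unadjusted score.  The class and method
names are hypothetical.

```python
class AutoWhitelist:
    """Toy per-sender score history, kept in memory only."""

    def __init__(self):
        # sender address -> (sum of past scores, number of messages)
        self._history = {}

    def adjust(self, sender, score):
        """Return the adjusted score and record the raw one."""
        total, count = self._history.get(sender, (0.0, 0))
        if count:
            average = total / count
            # Assumed combination rule: midpoint of the two values.
            final = (score + average) / 2.0
        else:
            # No history yet, so the score is left unchanged.
            final = score
        self._history[sender] = (total + score, count + 1)
        return final
```

Under these assumptions, a sender with a long record of low-scoring mail
has an occasional high-scoring message pulled back toward their average,
and a habitual spammer's borderline message is pushed upward.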

This functionality is off by default, and is enabled with the "-a" flag to
either spamassassin or spamd.

A system-wide auto-whitelist can be used, by setting the
auto_whitelist_path and auto_whitelist_file_mode configuration commands
appropriately, e.g.

    auto_whitelist_path        /var/spool/spamassassin/auto-whitelist
    auto_whitelist_file_mode   0666

The spamassassin -W and -R command line flags provide an API to add and
remove entries 'manually', if you so desire.  They operate based on an
input mail message, to allow them to be set up as aliases which users can
simply forward their mails to.  See the spamassassin manual page for more
details.

The default address-list implementation,
Mail::SpamAssassin::DBBasedAddrList, uses Berkeley DB files to store the
addresses.  There may be synchronization issues with this implementation
in an NFS environment.  Reasonable attempts have been made to ensure
proper locking of the DB file, but it may yet be somewhat flaky.



(end of README)

// vim:tw=74:
