(told me I'd won $10.5m)! Now I don't know anyone in Japan (.jp addresses are from Japan) and don't want any email from the place either, so we can add a rule to the SA blacklist that is simply *.jp. This, amazingly, will now block ALL mail from any email address ending in .jp - just what we want. Do this for *.in (India), *.ru (Russia) and a few more and your spam drops massively. Be careful and don't get carried away though, or you will block wanted mail.
A typical blacklist might be:
Bear in mind that ANY email sourced from a blacklisted address, even if wanted, will be tagged. So a *.in rule will tag mail from anyone@anyco.in.
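To get a feel for how a wildcard rule such as *.in catches an address, here is a minimal Python sketch. It is not SpamAssassin's own code. It assumes the blacklist patterns behave like simple file-style wildcards, and the function name is_blacklisted and the pattern list are made up for illustration.

```python
from fnmatch import fnmatchcase

# Illustrative patterns taken from the text; a real blacklist would differ.
BLACKLIST = ["*.jp", "*.in", "*.ru"]

def is_blacklisted(address, patterns=BLACKLIST):
    """Return True if the sender address matches any wildcard pattern."""
    address = address.lower()
    return any(fnmatchcase(address, p.lower()) for p in patterns)

print(is_blacklisted("anyone@anyco.in"))     # True
print(is_blacklisted("friend@example.com"))  # False
```

Trying a handful of addresses you actually want mail from against your patterns is a quick way to spot a rule that is too greedy.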
A typical SA config page for this might look like:
Fine Tuning
It is possible to refine this to achieve near 100% spam detection. In addition to the blacklist, you should be able to invoke a function called use_bayes. Set this to on.
Take a look at the pictures and in particular the bits above use_bayes. Notice there are some lines such as score RDNS_NONE and a box containing (1). Each of the score boxes contains what we will call a Token (such as RDNS_NONE); SA itself lists them as tests. There are hundreds of these, and they are what SA uses to test various elements of each email, in essence to see if it is kosher and does not contain adverts or other noxious stuff. Each token is assigned a score by SA. RDNS_NONE scores 0.8 under normal circumstances, but by entering the value (1) we add 1 to this, making the total 1.8. What this means is that any incoming email that does not have a proper Reverse DNS address associated with it (and is therefore highly likely to be bogus) has its score inflated and is more likely to trigger the spam trap. In our case, we set the threshold to 3.15 or thereabouts.
The obvious question therefore is: where did RDNS_NONE come from, and how did we know to inflate its value? SA contains hundreds of tokens that are applied to every email received. Any that match and contribute a score are listed in the email header information. So in the example above (from Japan), the header contained the following:
To: undisclosed-recipients:;
Subject: **Spam** NOTIFICATION OF PAYMENT US10.5MILLION APPROVED ON YOUR NAME.
Date: Thu, 22 Nov 2012 23:58:27 -0900
Message-Id: <20121123085748.186CA44386@mv-osn-hcb004.ocn.ad.jp>
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16)
X-Spam-Flag: YES
X-Spam-Level: **************************************************
X-Spam-Status: Yes, score=201.4 required=3.1 tests=ADVANCE_FEE_2_NEW_FORM,
ADVANCE_FEE_2_NEW_FRM_MNY,ADVANCE_FEE_2_NEW_MONEY,ADVANCE_FEE_3_NEW,
ADVANCE_FEE_3_NEW_FORM,ADVANCE_FEE_3_NEW_FRM_MNY,ADVANCE_FEE_3_NEW_MONEY,
ADVANCE_FEE_4_NEW,ADVANCE_FEE_4_NEW_FORM,ADVANCE_FEE_4_NEW_FRM_MNY,
ADVANCE_FEE_4_NEW_MONEY,ADVANCE_FEE_5_NEW,ADVANCE_FEE_5_NEW_FORM,
ADVANCE_FEE_5_NEW_FRM_MNY,ADVANCE_FEE_5_NEW_MONEY,
AXB_XMAILER_MIMEOLE_OL_024C2,BAYES_99,DEAR_BENEFICIARY,FILL_THIS_FORM,
FILL_THIS_FORM_FRAUD_PHISH,FILL_THIS_FORM_LOAN,FM_LOTTO_MONEY,
FORGED_MUA_OUTLOOK,FORGED_OUTLOOK_HTML,FORGED_OUTLOOK_TAGS,FORM_FRAUD_5,
FREEMAIL_FORGED_REPLYTO,FREEMAIL_REPLYTO_END_DIGIT,FROM_MISSPACED,
FROM_MISSP_EH_MATCH,FROM_MISSP_MSFT,FROM_MISSP_TO_UNDISC,FROM_MISSP_USER,
FSL_CTYPE_WIN1251,FSL_MISSP_REPLYTO,FSL_NEW_HELO_USER,HTML_MESSAGE,
LOTS_OF_MONEY,MIME_HTML_ONLY,MONEY_FORM,MONEY_FRAUD_3,MONEY_FRAUD_5,
MONEY_FROM_MISSP,MSOE_MID_WRONG_CASE,NA_DOLLARS,NSL_RCVD_FROM_USER,RDNS_NONE,
SPF_PASS,SUBJ_ALL_CAPS,URG_BIZ,USER_IN_BLACKLIST shortcircuit=no
autolearn=spam version=3.3.1
So our spam threshold was set to 3.1 and the message scored 201.4 - a massive YES THIS IS SPAM!! Notice how there are dozens of tokens involved in scoring the message, each one testing a different aspect. Note that our friend RDNS_NONE was triggered.
Even emails that are delivered normally will likely have a couple of tokens (look in the header). If you are still getting unwanted emails, either lower your threshold score or choose a token and increase its value. In a spare score box, put the token (careful to copy it exactly) followed by (1) or (2) etc. to increase its effect. Note that if you put a number such as 1.5 (no brackets), this will replace the default scoring value with your new one. If you put the 1.5 in brackets, it gets added to the predefined score. Then keep a close watch to make sure good emails are still being received.
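If you want to pull the score and the list of triggered tokens out of a header like the one above without reading it by eye, something along these lines should do. It is only a sketch: the function name parse_spam_status is our own, and the sample is a shortened copy of the header shown earlier.

```python
import re

def parse_spam_status(value):
    """Split an X-Spam-Status header value into verdict, scores and test names."""
    # Undo header line folding, then close up the gaps left after commas.
    flat = re.sub(r",\s+", ",", " ".join(value.split()))
    score = float(re.search(r"score=(-?\d+(?:\.\d+)?)", flat).group(1))
    required = float(re.search(r"required=(-?\d+(?:\.\d+)?)", flat).group(1))
    tests_match = re.search(r"tests=(\S*)", flat)
    tests = [t for t in tests_match.group(1).split(",") if t] if tests_match else []
    return {
        "is_spam": flat.lower().startswith("yes"),
        "score": score,
        "required": required,
        "tests": tests,
    }

# Shortened version of the header from the example above.
sample = """Yes, score=201.4 required=3.1 tests=BAYES_99,
 LOTS_OF_MONEY,RDNS_NONE,USER_IN_BLACKLIST shortcircuit=no
 autolearn=spam version=3.3.1"""

result = parse_spam_status(sample)
print(result["score"], result["required"])  # 201.4 3.1
print("RDNS_NONE" in result["tests"])       # True
```

Running this over a few weeks of unwanted mail and counting which tokens turn up most often is one way to decide which ones are worth inflating.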
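The difference between a bracketed and an unbracketed value is easy to get wrong, so here is the arithmetic as a small Python sketch. The function effective_score is hypothetical and simply mirrors the behaviour described above; the 0.8 default for RDNS_NONE is the figure quoted earlier.

```python
def effective_score(default, setting):
    """Apply a score-box entry: '(n)' adds n to the default, a bare 'n' replaces it."""
    setting = setting.strip()
    if setting.startswith("(") and setting.endswith(")"):
        return default + float(setting[1:-1])
    return float(setting)

# RDNS_NONE with its usual 0.8 and a box value of (1): added to the default.
print(round(effective_score(0.8, "(1)"), 2))  # 1.8
# The same token with a bare 1.5: the default is replaced outright.
print(effective_score(0.8, "1.5"))            # 1.5
```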
Advanced
Note that in the setup pictures, Spam Auto Delete and Spam Box are both Disabled. This may seem a bit counter-intuitive (took us a while to figure it). If disabled, email with a score higher than your setting will have **Spam** prefixed before the message subject. It will then be delivered AS NORMAL to your inbox. If you set Spam Box to Enabled, all your spam messages go to a new mailbox called Spam located somewhere on the server of your web hosting company. You may not even know where this is until, over time, it fills up with junk and you exceed your quota - web host lingo for suspension of service. No emails at all! To get at this junk folder, you will need to use a webmail application - good luck with that!
A far better option is to leave Spam Auto Delete and Spam Box set to Disabled. In Outlook Express, set up a mail rule to divert all mail with **Spam** as part of the subject line into a new folder called whatever you like (ours is called spam). Then, ALL the spam mail from all mail accounts ends up in a single folder, which you can easily empty periodically after checking that all the mails really are junk. If you DO get any mails incorrectly tagged, either add the specific address to the Whitelist, or find out why it was tagged incorrectly and get it changed. Eventually, when you are confident of the filter accuracy, just dump spam emails automatically.
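The mail rule itself is set up through your mail program's menus, but the logic behind it is no more than the following Python sketch. The folder names and the function target_folder are our own inventions for illustration; only the **Spam** subject tag comes from the setup described above.

```python
from email import message_from_string

SPAM_TAG = "**Spam**"  # subject prefix added by the filter, as described above

def target_folder(raw_message, spam_folder="spam", inbox="Inbox"):
    """Pick a folder based on whether the subject carries the spam tag."""
    subject = message_from_string(raw_message).get("Subject", "")
    return spam_folder if SPAM_TAG in subject else inbox

tagged = "Subject: **Spam** NOTIFICATION OF PAYMENT\n\nYou have won..."
normal = "Subject: Lunch on Friday?\n\nAre you free?"
print(target_folder(tagged))  # spam
print(target_folder(normal))  # Inbox
```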
You will find that all your settings for SA are held in a special file called user_prefs somewhere on your web host server, possibly under the directory .spamassassin - well I never did! You can also use the include function to refer out to another file (somemorerules.cf) and write your very own tokens and rules. Take a look at our user_prefs file (edit with Notepad) and our add-on .cf file. If you search the internet, you can find other examples of .cf files that might be of value. Having said all that, not all web hosts (ours included) allow you to add additional .cf files.
Even More Advanced
It is possible to make this work even better. Doing all the above achieves around 98% success in tagging spam emails. I wanted to take this a stage further though. Two issues remain: Tagged emails get left on the server and over time, add up to a huge mess that needs to be manually deleted and this is a pain. I wanted to automate this but was a little afraid that good emails might get lost as well. The second issue: Large emails are not checked for spam at all!
With the above setup, good emails with a score of 3.1 and below are delivered as normal and most others are tagged as **Spam** BUT, very large emails or those with big attachments are NOT put through SpamAssassin (at least not by our web host) to save server processing time. It turns out that our web host does not filter any message larger than 500KB, which works out to an average email with a 350KB attachment (attachments grow by roughly a third when encoded for email). These get straight through. To overcome this, and assuming your web host uses cPanel, set up an email User Level Filter for each email account you wish to clean. Add a new rule, called say Dump Large Emails, and test for Any Header. Then set the test to Does Not Contain, enter the match phrase X-Spam-Level: and set the action to whatever you want, be it to trash the mail or return a message such as "Email too big". With the return action, any email NOT tested by SpamAssassin will be returned to the sender. Doing it this way allows you to keep one email address that WILL accept large files: nominate an address and don't add the filter to it.
Having done that, we then wanted to split emails tagged as **Spam** into two piles: those with a score up to 10 that might be valid and that we wanted delivered to check manually, and the rest, scoring over 10, that we simply wanted to trash. Sounds easy but... It turns out that the email filters in cPanel don't seem to work properly. The obvious test of Score does not work at all (at least not for us). So set up a new test as above, but this time use an Account Level Filter (works on ALL email addresses on your account) and call it Score Over 10 or similar. Test for Any Header and then Contains, and set the search phrase to X-Spam-Level: ********** then select an action such as delete. Putting 10 *'s will find a spam score of 10 and above and delete them all - done. Note also that in some versions of cPanel, the ability to forward emails after they have been through the filter does not work (come on lads get this fixed)!
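For anyone who prefers to see the Does Not Contain test spelled out, here is a rough Python equivalent. It assumes, as on our host, that every message SpamAssassin has checked carries an X-Spam-Level header; the function names and the "return"/"deliver" labels are hypothetical.

```python
from email import message_from_string

def was_scanned(raw_message):
    """True if the message carries an X-Spam-Level header (i.e. SA looked at it)."""
    return message_from_string(raw_message)["X-Spam-Level"] is not None

def action_for(raw_message):
    """Mirror the Dump Large Emails rule: bounce anything SA never checked."""
    return "deliver" if was_scanned(raw_message) else "return"

scanned = "Subject: Hello\nX-Spam-Level: *\n\nShort note."
unscanned = "Subject: Holiday photos\n\n(very large attachment here)"
print(action_for(scanned))    # deliver
print(action_for(unscanned))  # return
```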
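Why does a search for ten asterisks catch every score of 10 and above? Because a longer run of asterisks always contains a run of ten. The Python sketch below mimics the Contains test; should_delete is a made-up name and the cutoff of 10 is just the value we chose.

```python
from email import message_from_string

def should_delete(raw_message, cutoff=10):
    """Mimic the Contains test: look for a run of `cutoff` asterisks in X-Spam-Level."""
    level = message_from_string(raw_message).get("X-Spam-Level", "")
    return ("*" * cutoff) in level

borderline = "Subject: **Spam** Offer\nX-Spam-Level: ****\n\nMaybe genuine."
blatant = "Subject: **Spam** WINNER\nX-Spam-Level: ************\n\nJunk."
print(should_delete(borderline))  # False
print(should_delete(blatant))     # True
```

Changing the number of asterisks in the search phrase moves the cut-off up or down in whole points.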
Avoiding Spam
Of course the best strategy is to avoid getting on spam lists in the first place but this may be harder than you think. Be wary of entering your main email address on ANY website, regardless of how much they say they will 'protect' your data. Many operators will cheerfully sign you up for newsletters, gifts or surveys but, what they are really after, is your email address. This is worth real money to them as lists of working email addresses are sold to spammers with little comeback on the seller. So what can be done to avoid this?
To sign up for trial software or newsletters etc, use a throw away email address, one that you can shut down permanently. Use it for six months or until it starts appearing regularly in spam emails and then ditch it for a new address. Having your own domain makes this a piece of cake (easy).
Be very careful of clicking Unsubscribe links unless the firm is genuine. This can be a route to verifying to a spammer that you are a real person = more spam!
Finally and often repeated - NEVER EVER OPEN A LINK FROM WITHIN AN EMAIL. Copy the address and paste it into your web browser. Be particularly careful with any strings of information after the real web address as these can, in all probability, be traced to you. So as a precaution, copy just the address bit ending with .com or .us etc. Be EXTREMELY wary of emails that have base64 content. If you examine the email source and find a file in base64, copy it and paste it into an online base64 decoder. You might be very surprised at what is in there. Whatever you do though, do not run such files.
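Both precautions can also be done on your own machine with a few lines of Python, which some may prefer to pasting content into a website. This is only a sketch: the function names bare_site and peek_base64 are our own, and the decoded content is only ever displayed as text, never run.

```python
import base64
from urllib.parse import urlsplit

def bare_site(url):
    """Keep only the scheme and host, dropping path, query string and fragment."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}/"

def peek_base64(encoded, limit=200):
    """Decode base64 text and return the start of it as readable text (nothing is executed)."""
    raw = base64.b64decode(encoded)
    return raw[:limit].decode("utf-8", errors="replace")

print(bare_site("http://example.com/offer?id=abc123&user=you"))  # http://example.com/
print(peek_base64("aGVsbG8="))                                    # hello
```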