How do spam filters work…?

Many spam filters work by analysing the content of a message: the message subject, body, and attachments. Spammers today expend significant resources on developing content which will evade content filters.
 
Originally, spam was simple: identical messages were sent to everyone on a mailing list. These emails were laughably easy to filter out due to the quantity of identical texts. Spammers then began to include a greeting based on the recipient’s address. Since every message now contained a personalised greeting, filters which blocked identical messages did not detect this type of spam. Security experts developed filters that identified unchanging lines, which would then be added to filtration rules. They also developed fuzzy signature matching, which would detect text which only had minor changes, and statistic based self-modifying filtration technologies. Spammers now often place either text strings from legitimate business emails, or random text strings at the beginning or end of emails in order to evade content filters. Another method used to evade filters is to include invisible text in HTML-format emails: the text is either too tiny to see or the font colour matches the background.

Both methods are fairly successful against content and statistical filters. Analysts responded by developing search engines that scanned emails for such typical texts, which also conducted detailed HTML analysis and sophisticated content analysis. Many antispam solutions were able to detect such tricks without even analysing the content of individual emails in detail.

Sending spam in graphics format makes it very hard to detect. Analysts are developing methods for extracting and analyzing text contained in graphics files.

A single advertisement can be endlessly rephrased, making each individual message appear to be a legitimate email. As a result, antispam filters have to be configured using a large number of samples before such messages can be detected as spam.

Currently, spammers usually use the last three methods in a variety of combinations. Many antispam solutions are incapable of detecting all three. As long as spamming remains profitable, users with poor-quality antispam software will continue to find their mailboxes clogged with advertising.