
Stagnant email address arms race

I like arms races. But in an interesting arms race there’s frequent movement on both sides.

So I’m often surprised that the measures people and programs take to obscure email addresses haven’t changed much over the last 5(?) years.

There are still many software packages and web sites that do the bare minimum to obscure email addresses. For example, here’s a recent interesting posting from Vaughan Pratt on Simple Turing machines, Universality, Encodings, etc. The mailing list software is the extremely popular Mailman system. Vaughan’s email is “obscured” as pratt at cs.stanford.edu. That approach is so old that it is hardly more challenging for a spammer to harvest than if Mailman had simply included the actual address.

And Mailman is just one example. People do it too, using extremely transparent and repetitive schemes, like joe AT xyz DOT com.
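To make that concrete, here is a rough sketch of how little code it takes to undo both schemes. The pattern and the function name `harvest` are illustrative assumptions, not taken from any real harvesting tool, and the regular expression only covers the two forms quoted above.

```python
import re

# Hypothetical pattern: matches "user at host.tld" and "user AT host DOT tld".
# The host part must contain at least one "." or " dot " separator.
OBSCURED = re.compile(
    r"\b([\w.+-]+)\s+at\s+([\w-]+(?:(?:\.|\s+dot\s+)[\w-]+)+)\b",
    re.IGNORECASE,
)


def harvest(text):
    """Return plain addresses recovered from trivially obscured ones."""
    found = []
    for user, host in OBSCURED.findall(text):
        # Turn any spelled-out " dot " back into a real period.
        host = re.sub(r"\s+dot\s+", ".", host, flags=re.IGNORECASE)
        found.append(f"{user}@{host}")
    return found


print(harvest("pratt at cs.stanford.edu"))
print(harvest("joe AT xyz DOT com"))
```

A pattern this loose will also pick up the occasional ordinary phrase that happens to look like an address, but a harvester has little reason to care about a few false positives.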

Given how much people dislike spam, how easy the above examples are to extract, and how creative humans can be, I find it amazing that the practice of obscuring email addresses has barely moved in recent years. Do you suppose the spammers are standing still? Well, maybe they are, given the lack of advance on the obscuring side.

There is some ingenuity, like using terryblah@flahmydomain.com accompanied by an instruction (in English) to remove all instances of blah and flah to get the real address. Given that humans are so creative with language and that NLP doesn’t stand a snowball’s chance in hell, you’d think the humans would have little trouble staying ahead in this race. But right now I expect the address harvesters have the upper hand.
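As a small sketch of why this scheme is a step up: the mechanical part is trivial once you know which words to strip. The function name `strip_decoys` is made up for illustration, and the decoy words are hard-coded from the example above. The part a harvester cannot easily automate is reading the English instruction to find out what the decoys are.

```python
def strip_decoys(address, decoys=("blah", "flah")):
    """Remove every occurrence of each decoy word from the address.

    The decoys are supplied by a human who has read the instruction;
    nothing here tries to parse that instruction automatically.
    """
    for decoy in decoys:
        address = address.replace(decoy, "")
    return address


print(strip_decoys("terryblah@flahmydomain.com"))  # prints terry@mydomain.com
```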

Here’s another example.

To get my personal email address, join the second to the last four letters of strawberry, add an at sign, add the tenth letter of the alphabet, then put on “on”, then a period. You get the final part by dropping the last letter of the acronym for Eastern Standard Time.

I.e., that’s “terry” plus “@” plus “j” plus “on” plus “.” plus “es”. Yes, this is overkill, but it illustrates how easy it is to create highly personalized but simple instructions for a human to follow that no program is ever going to handle. Even if an attack on the above could be automated, it’s clearly not worth the cost just to get one email address.
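For what it’s worth, here is a sketch of those instructions translated by hand into code. Every line is a human reading of one step (the function name `assemble` is invented for illustration), which is exactly the point: each step is a one-liner once someone has understood the sentence, and getting from the English to these lines is the part a harvester would have to automate.

```python
import string


def assemble():
    """Follow the written instructions one step at a time."""
    word = "strawberry"
    # "join the second to the last four letters of strawberry"
    local = word[1] + word[-4:]
    # "add the tenth letter, then put on 'on'" (tenth letter of the alphabet)
    host = string.ascii_lowercase[9] + "on"
    # "dropping the last letter of the acronym for Eastern Standard Time"
    suffix = "EST"[:-1].lower()
    return local + "@" + host + "." + suffix


address = assemble()
```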

Surely it’s time to move on.



2 Responses to “Stagnant email address arms race”

  1. You are right, it’s one of those cases where individual creativity is much better than standard approaches.

    I’ve been using this address on Usenet for years now:

    nico-NoSp@am-teknico.net.invalid

    However, the address on my website is automatically mangled by the docutils reStructuredText generation code:

    http://teknico.net/about/author/index.en.html

    It may be clever, but if a browser can show the address correctly, surely spammer tools can understand it too.

    I’ll change it soon.
