Stagnant email address arms race
Wednesday, October 31st, 2007
I like arms races. But in an interesting arms race there’s frequent movement on both sides.
So I’m often surprised that the measures people and programs take to obscure email addresses haven’t changed much over the last 5(?) years.
There are still many software packages and web sites that do the bare minimum to obscure email addresses. For example, here’s a recent interesting posting from Vaughan Pratt on Simple Turing machines, Universality, Encodings, etc. The mailing list software is the extremely popular Mailman system. Vaughan’s email is “obscured” as pratt at cs.stanford.edu. That approach is so old that it can hardly be any harder for a spammer to harvest than if Mailman had simply included the actual address.
And Mailman is just one example. People do it too, using extremely transparent and repetitive schemes, like joe AT xyz DOT com.
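To give a feel for how little work these schemes demand of a harvester, here is a minimal Python sketch. The pattern and the `harvest` function are my own guesses for illustration, not taken from any real harvesting tool, and a real one would presumably cover many more variants.

```python
import re

# Matches "user at host.tld" and "user AT host DOT tld" style obfuscations.
# This pattern is an illustrative assumption, not a real harvester's rule.
OBFUSCATED = re.compile(
    r"\b([\w.+-]+)\s+at\s+([\w-]+(?:(?:\.|\s+dot\s+)[\w-]+)+)\b",
    re.IGNORECASE,
)


def harvest(text):
    """Return plain addresses recovered from lightly obscured ones."""
    found = []
    for user, host in OBFUSCATED.findall(text):
        # Turn any spelled-out " dot " separators back into real dots.
        host = re.sub(r"\s+dot\s+", ".", host, flags=re.IGNORECASE)
        found.append(f"{user}@{host}")
    return found


print(harvest("pratt at cs.stanford.edu"))
print(harvest("write to joe AT xyz DOT com today"))
```

A pattern this loose will also pick up false positives such as “look at example.com”, but that hardly matters to someone sending mail in bulk.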
Given how much people dislike spam, how easy the above examples are to extract, and how creative humans can be, I find it amazing that the practice of obscuring email addresses has barely moved in the last few years. Do you suppose the spammers are standing still? Well, maybe they are, given the lack of advance on the obscuring side.
There is some ingenuity, like using terryblah@flahmydomain.com accompanied by an instruction (in English) to remove all instances of blah and flah to get the real address. Given that humans are so creative with language and that NLP doesn’t stand a snowball’s chance in hell, you’d think the humans would have little trouble staying ahead in this race. But right now I expect the address harvesters have the upper hand.
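The asymmetry in that scheme can be sketched in a few lines of Python. The names `EMAIL` and `strip_decoys` are hypothetical, and the regular expression is only a rough stand-in for whatever a harvester actually uses. A pattern match on the page finds the decoy address as posted; getting the real one requires knowing which words to remove, and that information lives only in the English instruction.

```python
import re

# A rough pattern for literal addresses (an assumption, not a full validator).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def strip_decoys(address, decoys):
    """Apply the human-readable instruction: delete each decoy word."""
    for decoy in decoys:
        address = address.replace(decoy, "")
    return address


posted = "Mail me at terryblah@flahmydomain.com"

# What a naive harvester collects: the decoy, which is not a real address.
scraped = EMAIL.findall(posted)
print(scraped)

# What a human gets after reading "remove all instances of blah and flah".
print(strip_decoys(scraped[0], ["blah", "flah"]))
```

The decoy list has to be typed in by someone who read the instruction, which is exactly the step a harvester can’t do cheaply.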
Here’s another example.
To get my personal email address, join the second letter of strawberry to its last four letters, add an at sign, add the tenth letter of the alphabet, then put on “on”, then a period. You get the final part by dropping the last letter of the acronym for Eastern Standard Time.
I.e., that’s “terry” plus “@” plus “j” plus “on” plus “.” plus “es”. Yes, this is overkill, but it illustrates how easy it is to create highly personalized but simple instructions for a human to follow that no program is ever going to handle. Even if an attack on the above could be automated, it’s clearly not worth the cost just to get one email address.
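For anyone who wants to check the arithmetic, here are those steps transcribed by hand into Python. The function name `follow_instructions` is made up for this sketch, and I’m assuming “the tenth letter” means the tenth letter of the alphabet and that everything is lowercased. Note that each line is a human reading of the English; nothing here parses the instructions themselves.

```python
import string


def follow_instructions():
    """Hand-translate the English instructions, one step per line."""
    word = "strawberry"
    local_part = word[1] + word[-4:]           # second letter + last four letters
    tenth_letter = string.ascii_lowercase[9]   # tenth letter of the alphabet
    domain = tenth_letter + "on"               # then put on "on"
    suffix = "EST"[:-1].lower()                # drop the last letter of EST
    return local_part + "@" + domain + "." + suffix


print(follow_instructions())  # terry@jon.es
```

Writing this took a person who understood the sentence, and it works for exactly one address, which is the cost argument in miniature.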
Surely it’s time to move on.