Acorns

Marcel's blog

Obfuscating Email Addresses

Email addresses are all over the Web, but unfortunately they have become targets for automated harvesters that collect them for spamming.

This problem is so rampant that authors on the Web now mangle their email addresses: somebody at example dot com. This is easy for a savvy reader to decode, but difficult for a harvester looking for "@" ... ".com". The trouble is that this form has become so common that harvesters could now just as easily look for "at" ... "dot com".
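To show how little protection the mangled form offers, here is a minimal sketch of the pattern matching involved. The pattern and the function name `demangle` are illustrative assumptions, not taken from any real harvester, and the pattern only covers the plain "at" ... "dot" spelling.

```python
import re

# Illustrative pattern (an assumption): matches "somebody at example dot com".
MANGLED_RE = re.compile(
    r"\b([A-Za-z0-9._%+-]+)\s+at\s+"
    r"([A-Za-z0-9-]+(?:\s+dot\s+[A-Za-z0-9-]+)+)\b",
    re.IGNORECASE,
)


def demangle(text: str) -> list[str]:
    """Return addresses rebuilt from 'name at host dot tld' phrases in text."""
    found = []
    for local, domain in MANGLED_RE.findall(text):
        # Turn each " dot " separator back into a literal period.
        host = re.sub(r"\s+dot\s+", ".", domain, flags=re.IGNORECASE)
        found.append(f"{local}@{host}")
    return found
```

A pattern this loose also matches ordinary phrases such as "look at example dot com", but a harvester can afford false positives.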

My solution is to do what Yahoo! does when you sign up for a new account: require a human to interpret an image as text. Modern web servers allow postprocessing of a web page to programmatically replace text that looks like an email address with an image that contains the rendered text. Even without Yahoo!'s fancy waviness, OCR on an image is orders of magnitude more difficult than finding email addresses in HTML.

My postprocessor generates these images and replaces email addresses with <img> tags on the fly, mapping them to cryptographic hashes so that, if this became common practice, harvesters could not decode the email address from the image tag.
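The replacement step might look something like the following sketch. It is not the actual postprocessor: the regular expression, the `/email-images/` URL prefix, the `SECRET_KEY` value, and the function names are all assumptions made for illustration. It also uses a keyed hash (HMAC-SHA256) as one possible choice of cryptographic hash.

```python
import hashlib
import hmac
import re

# Hypothetical values; neither appears in the original post.
SECRET_KEY = b"change-me"
IMAGE_URL_PREFIX = "/email-images/"

# Deliberately simple pattern; it will not match every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def email_token(address: str) -> str:
    """Return a hex token that identifies the address without revealing it."""
    digest = hmac.new(SECRET_KEY, address.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def obfuscate_emails(html: str, table: dict[str, str]) -> str:
    """Replace each address with an <img> tag and record token -> address."""

    def replace(match: re.Match) -> str:
        address = match.group(0)
        token = email_token(address)
        table[token] = address  # the image handler looks the address up here
        return f'<img src="{IMAGE_URL_PREFIX}{token}.png" alt="email address">'

    return EMAIL_RE.sub(replace, html)
```

The `table` dictionary stands in for whatever store the server uses to find the address when the image is requested. The address itself never appears in the page source.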

The images are generated by ImageMagick and cached, so high-traffic pages need not generate new images every time.
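A rough sketch of the generate-and-cache step follows. It assumes the ImageMagick `convert` command is on the PATH (ImageMagick 7 installs name it `magick`). The font size, colors, cache layout, and function names are hypothetical choices, not the ones actually used here.

```python
import subprocess
from pathlib import Path


def render_command(text: str, out_path) -> list[str]:
    """Build an ImageMagick command that renders text into an image file."""
    # Note: "label:" gives % and \ special meaning, so addresses containing
    # those characters would need escaping first (not handled in this sketch).
    return [
        "convert",
        "-background", "white",
        "-fill", "black",
        "-pointsize", "14",
        f"label:{text}",
        str(out_path),
    ]


def cached_image(token: str, address: str, cache_dir, render=None) -> Path:
    """Return the image path for token, rendering only on a cache miss."""
    path = Path(cache_dir) / f"{token}.png"
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        if render is None:
            # Default renderer: shell out to ImageMagick.
            subprocess.run(render_command(address, path), check=True)
        else:
            # Injectable renderer, handy for testing without ImageMagick.
            render(address, path)
    return path
```

Because the file name is derived from the hash token, a repeat request for the same address finds the existing file and skips rendering entirely.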

With a little waviness, this approach would be very solid.
