Acorns

Marcel's blog

Captcha the Dog Exploit

Care2 has an interest in animal-themed captchas, so I evaluated Captcha the Dog.  I think I have found a vulnerability, at least in the image recognition component, which I believe is the meat of the puzzle.

I noticed that the captcha draws from a relatively small set of images.  The URLs are different each time, so you can't tell that two images are the same just by looking at the HTML.  In fact, even if you download two images that look the same, their file contents differ slightly.  Most of the time, you can get around that just by converting to a pixmap format; I think the difference is in a JPEG comment (or something not quite so valid -- the conversion warns "N extraneous bytes before marker 0xee").  Occasionally, even the pixmaps differ.  In that case, a simple similarity algorithm does the trick.  The comparison algorithm I used reports a 0.00-0.13% difference for identical-looking pictures, versus a 20-40% difference for completely different-looking pictures.  (I only tried a handful of each.)
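To make the idea of a "percent difference" concrete, here is a minimal sketch of one possible metric: the mean absolute per-pixel difference, scaled to a percentage of full brightness.  This is not necessarily the comparison algorithm I used, and the tiny 2x2 grayscale "images" are made-up stand-ins for decoded pixmaps.

```python
def percent_difference(a, b):
    """Mean absolute per-pixel difference as a percentage of full scale (255).

    `a` and `b` are grayscale images given as lists of rows of 0-255 ints.
    This is an illustrative metric, not the one from any particular tool.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same dimensions")
    count = sum(len(row) for row in a)
    if count == 0:
        raise ValueError("images must not be empty")
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return 100.0 * total / (255 * count)


# Hypothetical sample data: two near-identical images and one unrelated image.
same = [[10, 200], [30, 40]]
near = [[10, 201], [30, 40]]
other = [[250, 5], [200, 220]]

print(f"near:  {percent_difference(same, near):.2f}%")
print(f"other: {percent_difference(same, other):.2f}%")
```

With a metric like this, a one-level change in a single pixel barely registers, while an unrelated picture scores far higher, which is the kind of gap that makes a simple threshold workable.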

So I believe an attacker could easily download several sample images and classify them by hand.  Then the attacker could write a program using readily available tools to download each image in a captcha and compare it to the pre-classified samples.  If it's a dog image, the program would invoke the DogNowCat function as if a person clicked the image.  If no dogs remain, the captcha is defeated and the form can successfully be submitted.
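The classification step could look roughly like the sketch below.  Everything in it is hypothetical: the flat pixel lists stand in for downloaded and decoded images, the labels are the attacker's hand classification, and the 5% threshold is just a value I picked between the two ranges I measured.  Downloading the images and triggering the click are left out.

```python
def percent_difference(a, b):
    """Mean absolute difference between two equal-length pixel lists, in percent."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    return 100.0 * sum(abs(x - y) for x, y in zip(a, b)) / (255 * len(a))


def classify(image, samples, threshold=5.0):
    """Return the label of the closest sample, or None if none is within threshold."""
    best_label, best_diff = None, threshold
    for label, sample in samples:
        diff = percent_difference(image, sample)
        if diff < best_diff:
            best_label, best_diff = label, diff
    return best_label


def dogs_to_click(captcha_images, samples):
    """Indexes of the captcha images that match a hand-labelled dog sample."""
    return [i for i, img in enumerate(captcha_images)
            if classify(img, samples) == "dog"]


# Hypothetical hand-classified samples and one captcha's worth of images.
samples = [("dog", [10, 200, 30, 40]), ("cat", [250, 5, 200, 220])]
captcha = [[10, 201, 30, 40], [250, 5, 199, 220], [128, 128, 128, 128]]

print(dogs_to_click(captcha, samples))
```

Returning None for an image that matches no sample lets the program notice pictures it has not seen before, rather than guessing.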

The captcha could be improved by making the copies of the same picture differ more from one another.  Some changes that would elude my image comparison algorithm:

  • invert colors in the image
  • add a substantial amount of random speckling
  • offset the image by a random amount

But I think the algorithm could be enhanced to cope with each of these.
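As a rough illustration of two of those enhancements, the sketch below compares an image against both the candidate and its color inverse, at every small offset, and keeps the lowest difference.  The wraparound shift and the `max_shift` value are simplifying assumptions of mine; a real offset would crop rather than wrap, and speckling would need something extra, such as blurring both images first, which is not shown here.

```python
def percent_difference(a, b):
    """Mean absolute per-pixel difference (lists of rows, 0-255), in percent."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return 100.0 * total / (255 * sum(len(row) for row in a))


def invert(img):
    """Color-invert a grayscale image."""
    return [[255 - p for p in row] for row in img]


def shift(img, dx, dy):
    """Shift an image by (dx, dy) pixels, wrapping around the edges (a simplification)."""
    h, w = len(img), len(img[0])
    return [[img[(y - dy) % h][(x - dx) % w] for x in range(w)] for y in range(h)]


def robust_difference(a, b, max_shift=2):
    """Lowest difference between `a` and any inverted and/or shifted version of `b`."""
    offsets = range(-max_shift, max_shift + 1)
    return min(
        percent_difference(a, shift(candidate, dx, dy))
        for candidate in (b, invert(b))
        for dx in offsets
        for dy in offsets
    )


# Hypothetical 4x4 image and a disguised copy: inverted, then moved one pixel right.
original = [[0, 0, 255, 255], [0, 0, 255, 255], [10, 20, 30, 40], [200, 100, 50, 25]]
disguised = shift(invert(original), 1, 0)

print(f"naive:  {percent_difference(original, disguised):.1f}%")
print(f"robust: {robust_difference(original, disguised):.1f}%")
```

The brute-force search costs more comparisons per image, but with a small sample set and small offsets that is unlikely to matter.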

The accuracy of Google's more sophisticated facial recognition casts a shadow on the future of image recognition puzzles as an effective captcha strategy, at least if we want to keep them easy for humans.
