A useful CAPTCHA from reCAPTCHA

January 21, 2008

Topics: Accessibility.

Just wanted to add the comment, since I didn’t specify it explicitly, that I’m not trying to claim that the accessibility of this particular CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is all that fantastic. It’s pretty good, but there are serious problems. I’m just saying that it’s a neat idea. 😉

In case you don’t already know, “CAPTCHA” is an abbreviation for “Completely Automated Public Turing test to tell Computers and Humans Apart.” From an accessibility perspective, CAPTCHAs tend to have significant problems, and I’m not going to try to claim that this one is perfect. However, it is very thoughtfully done, and it has a very interesting additional feature which I appreciated.

I ran across this via Stumbleupon. Unusually, rather than finding it because I was busily stumbling around, I actually became aware of it because I was trying to create a new account. The interesting CAPTCHA is called “reCAPTCHA.”

Specifically, the concept behind it (explained thoroughly on the reCAPTCHA site) is to gain value from the text users type when solving CAPTCHAs.

Most spam protection systems are based on nonsense words, random strings of letters, or obscured text. Anything, fundamentally, which might be difficult for a computer to identify.

What the folks at reCAPTCHA observed was that scanning old books provides a wealth of obscured text which can’t easily be understood by computers. To solve this problem, they combined the needs of a CAPTCHA with their scanning process to create a service which helps them identify these unknown texts.

Obviously, there’s an immediate problem: if the computer has already failed to identify the text, how do you test whether a human has read it correctly? Simply speaking, you don’t.

Instead, reCAPTCHA provides two words for the user: one the system already knows, and one it doesn’t. The known word is the Turing test; the user’s reading of the unknown word gives the system a candidate answer for the word the computer couldn’t identify.

From reCAPTCHA.com:

About 60 million CAPTCHAs are solved by humans around the world every day. In each case, roughly ten seconds of human time are being spent. Individually, that’s not a lot of time, but in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort? reCAPTCHA does exactly that by channeling the effort spent solving CAPTCHAs online into “reading” books.


But if a computer can’t read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here’s how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.
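The scheme quoted above can be sketched roughly in Python. This is only my own illustration, not reCAPTCHA's actual code: the names `UnknownWord` and `check_challenge`, the case-insensitive comparison, and the three-vote threshold are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UnknownWord:
    """A scanned word OCR could not read; collects human guesses (hypothetical type)."""
    image_id: str
    guesses: Counter = field(default_factory=Counter)

    def consensus(self, min_votes: int = 3) -> Optional[str]:
        """Return the leading guess once it has at least min_votes votes.

        The threshold of 3 is an assumption; the source only says the image
        is shown to "a number of other people".
        """
        if not self.guesses:
            return None
        word, votes = self.guesses.most_common(1)[0]
        return word if votes >= min_votes else None


def normalize(text: str) -> str:
    """Compare answers ignoring case and surrounding whitespace (an assumption)."""
    return text.strip().lower()


def check_challenge(control_answer: str, control_reply: str,
                    unknown: UnknownWord, unknown_reply: str) -> bool:
    """Pass the user only if the known (control) word matches.

    The guess for the unknown word is recorded only when the control word
    was read correctly, so failed attempts never count as votes.
    """
    if normalize(control_reply) != normalize(control_answer):
        return False
    unknown.guesses[normalize(unknown_reply)] += 1
    return True
```

The point of the design is that the user is only ever graded on the known word; the unknown word simply accumulates votes until enough readers agree.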

The CAPTCHA itself is delivered via JavaScript or an iframe. When JavaScript is unavailable, a perfectly usable fallback is provided. reCAPTCHA also provides an audio alternative, which, I’ll confess, I found very difficult. I’d need to see some kind of user test results, however, to really know how difficult the audio version is overall. In general, as CAPTCHA technology goes, this is an admirable project: not only because they have taken a reasonably conscientious path in preparing the interface, but simply because it’s a very good idea.

It’s unlikely I’ll implement it, I’ll confess. The fact that it’s delivered via an iframe and the simple nature of a CAPTCHA go against my general preferences in web development. However, should I be in a situation where I need to implement one, this will certainly be a strong candidate! (And even stronger if they fix their accessibility issues.)

More Information

  • reCAPTCHA.net (sadly, a table-based layout with no DOCTYPE.)
  • reCAPTCHA on Tech Crunch
  • Community MX: reCAPTCHA – Simple and Accessible
  • reCAPTCHA Mailhide (Another product from reCAPTCHA)
  • reCAPTCHA WordPress Plugin
  • PHP (PHP: Hypertext Preprocessor) Library for reCAPTCHA
  • Other Application and API (Application Programming Interface) resources for reCAPTCHA

11 Comments on “A useful CAPTCHA from reCAPTCHA”

  1. Despite the fact that this subject matter can be extremely touchy for most folks, my opinion is that there has to be a middle or common ground that we all can find. I do enjoy that you’ve added related and intelligent commentary right here though. Thank you!

  2. I’m way late in joining this discussion, but oh well. While I like reCAPTCHA overall, I am thoroughly disappointed with how it functions when JavaScript is disabled. I’ve worked very hard to deliver a solid and consistent experience on my site that degrades gracefully for users without JS enabled. Then along comes reCAPTCHA with its “We need to make sure you are a human” and “next time use JavaScript” [paraphrasing here]. And the whole “copy this and paste it here and see how it goes” thing is just so very out of line with the overall voice and branding of the site.

  3. In fact, I’d say that there are a lot of accessibility flaws. It still comes in as one of the more accessible CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) options out there — but only by merit of a lot of sucking elsewhere.

    As to their email protection — well, it seems like a lot of trouble to go to when there are plenty of perfectly valid other options. The ellipsis as link text — pretty problematic, accessibility wise. Launching a new window? Well, it’s questionable — but probably the most reasonable choice given the context.

    However, it seems to me that simply using secure contact forms for all email contacts is really the better choice.

  4. Joe, if you get a chance, what do you think of the reCaptcha mailhide? When I tried it out on their site, I had to click on a link in the middle of a word – the text of the link was an ellipsis – and it launched a new window where I had to solve the Captcha. I didn’t think much of it. They heavily promote the accessibility of the reCaptcha, but the mailhide implementation seems seriously flawed.

  5. we tested a number of captchas for our sites (check out Traxtuff for instance), and eventually made a small project out of it, to let you play around with different free captcha classes and libraries.
    hope it helps find one that will fit your site nicely, you can find it at http://www.trycaptcha.com
    we’re still setting it up but it’s already useful…
    personally I really like recaptcha btw,
    for the great idea of getting people to perform a constructive action as part of a mundane task 🙂

  6. It would not work, even using that term in its broadest possible sense. Additionally, for either option there are very serious accessibility consequences.

    First of all, it wouldn’t work as described simply because of the brute force attack method of many spam bots: if they encounter 50 submit buttons, they’ll submit ’em all. Now, this is easy to get around: simply make it so that the submission of any button other than the active one invalidates the submission.

    But you still have the accessibility and usability problems:

    With the first choice, you’re seriously disenfranchising users with learning disabilities, dyslexia, or users who aren’t using a browser with a standard display: screen readers, mobile devices, etc.

    In the second choice, you’re again disenfranchising users with learning disabilities, as well as causing problems for users who are color blind.

    And, of course, on a usability front, you’re making the submission process extremely difficult for everybody. Seriously, do you want a contact form with 40 buttons, each a different size and shape? Aesthetically a lot less than appealing, and damned difficult to take in mentally.

  7. Here is the idea: Have six to ten (or even 100) “submit” buttons. Only one works, but it is a different one each time. Above the buttons you get a message (or a distorted image) saying: “To submit, please press the third button from the left in the second row from the top”.
    Less accessibility but similar: Buttons have different colours or sizes or texts and then the message could read “press green button” or even “press the smallest green button in the second row from the bottom that does not have an x in its text”.

    Would that work?

  8. Well, it’s certainly a seal of approval on the stability of the code — I’m not sure that Facebook is what I’d consider at the forefront of high quality code, however…

  9. Last time I checked, Facebook were using reCaptcha during their registration process – and that’s a registration process which handles 250,000 new users every day. That’s a decent seal of approval for any snippet of code!

  10. Unfortunately, this is true with the default (scripted) version. Ironically, the backup version (available without Javascript) is not only accessible via keyboard but provides a very clear and noticeable :focus state to assist you with keyboard navigation.

    This is certainly one reason that it’s clearly not yet up to scratch.

    Thanks, Gez.

  11. Hi Joe,

    The biggest problem with reCAPTCHA is that it is not keyboard accessible (you have to use a mouse to change the type of challenge). That means it’s inaccessible to some assistive technology users, such as screen reader users, and people with mobility impairments that prevent them from using the mouse.