Wednesday, April 9, 2008

reCAPTCHA

Here's a great example of synergy on the Internet: You know those CAPTCHA forms that make you read some distorted text and enter it to prove you're human and not a spambot? Well, the School of Computer Science at Carnegie Mellon has come up with a way to put your effort (and that of millions of other users) to beneficent use.

Here's how it works: You're asked to enter two words to pass the test. The image of the first one is generated from a known word, and you must enter it correctly to pass. The second one is a mildly distorted image of a word, scanned from a book, that OCR software could not read. The assumption is that if you got the first word right, you probably got the second one right, too. Word by word, reCAPTCHA is digitizing pre-digital-era books for the Internet Archive's library.
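For the curious, the two-word check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not reCAPTCHA's actual code or API: the function names, the case-insensitive matching, and the `min_votes` threshold for accepting a transcription are all my own assumptions.

```python
from collections import Counter


def normalize(word: str) -> str:
    """Compare answers case-insensitively, ignoring stray whitespace (an assumption)."""
    return word.strip().lower()


def grade_response(control_answer: str, typed_control: str,
                   typed_unknown: str, votes: Counter) -> bool:
    """Pass the user only if the known (control) word is typed correctly.

    When the control word is right, the user's reading of the unknown word
    is trusted enough to be recorded as one vote.
    """
    passed = normalize(typed_control) == normalize(control_answer)
    if passed:
        votes[normalize(typed_unknown)] += 1
    return passed


def consensus(votes: Counter, min_votes: int = 3):
    """Return the most common reading once it has min_votes votes, else None.

    min_votes is a hypothetical threshold chosen for this sketch.
    """
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None


# Three users get the control word right and agree on the unknown word;
# a fourth misses the control word, so that guess is ignored.
votes = Counter()
grade_response("morning", "morning", "overlooks", votes)
grade_response("morning", "Morning", "overlooks", votes)
grade_response("morning", "morning ", "Overlooks", votes)
grade_response("morning", "mourning", "overbooks", votes)
result = consensus(votes)  # "overlooks"
```

Recording a guess only when the control word is correct is what lets the known word vouch for the unknown one.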

Best of all, you can drop their widget into your website and use it for free. Everybody wins!

2 comments:

Gods own words... said...

nice post, visit me too plz

Mark Richer said...

I don't know if getting the first word right really means you get the second one right, but I guess if they compile multiple responses for each word the OCR missed, then statistically they can assume if enough people enter the same thing, it would be correct.

So why is it so hard to find your email address anywhere? I was trying to send you a LinkedIn invite.

http://www.linkedin.com/in/markhricher