Checks that run before a website accepts a comment, forum post, or registration are commonplace, and the best known is probably the CAPTCHA test. A CAPTCHA presents the user with characters that are distorted in some way and rendered as an image, making them very difficult for anything other than a human to decipher.
Those tests are quite effective, but a company called reCAPTCHA had a better idea: make the CAPTCHA test serve a second purpose by helping to digitize books. When you solve a reCAPTCHA, you are actually turning a scanned word from one of those book digitization projects into text. This is necessary because the optical character recognition (OCR) software used on the scanned books can’t always work out what some of the words are, especially in older books or in copies that have some damage.
reCAPTCHA works on a best-guess basis. The computer isn’t sure what the unrecognized characters are, so it can’t check your answer for them directly. Instead, reCAPTCHA pairs characters it does know with ones it isn’t sure about. If a user gets the known characters right, the system can be reasonably confident that the user also read the uncertain ones correctly. Repeat that process for the same set of characters with multiple users, and the probability that they have been correctly identified goes up significantly.
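The pairing-and-voting idea can be sketched in a few lines of Python. This is only an illustration of the approach described above, not reCAPTCHA's actual implementation: the class name WordPairVerifier, its methods, and the min_votes and min_agreement thresholds are all hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


class WordPairVerifier:
    """Hypothetical sketch: trust a user's guess at an unknown word only
    when that user also typed the known (control) word correctly."""

    def __init__(self, min_votes=3, min_agreement=0.75):
        # Both thresholds are illustrative assumptions, not real values.
        self.min_votes = min_votes
        self.min_agreement = min_agreement
        # unknown word id -> tally of normalized guesses from trusted users
        self.votes = defaultdict(Counter)

    @staticmethod
    def _normalize(text):
        return text.strip().lower()

    def submit(self, control_answer, control_truth, unknown_id, unknown_answer):
        """Return True if the user passes the test, False otherwise."""
        if self._normalize(control_answer) != self._normalize(control_truth):
            # Failed the known word: reject and ignore the other guess.
            return False
        # Passed the known word: count the guess for the unknown word.
        self.votes[unknown_id][self._normalize(unknown_answer)] += 1
        return True

    def consensus(self, unknown_id):
        """Return the agreed transcription, or None if there is not
        yet enough agreement among trusted users."""
        counts = self.votes.get(unknown_id)
        if not counts:
            return None
        total = sum(counts.values())
        guess, top = counts.most_common(1)[0]
        if total >= self.min_votes and top / total >= self.min_agreement:
            return guess
        return None
```

Each call to submit plays the role of one solved test. Once enough users who passed the known word agree on the same reading, consensus returns that reading as the transcription for the uncertain word.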
Now the use of reCAPTCHA is set to expand, as Google has announced that it has acquired the company. This means we’ll see reCAPTCHAs spread across the Google services we all use every day, significantly increasing the number of reCAPTCHAs solved.
Google also intends to use the technology in its own Google Books digitization project and in Google News Archive Search.
Those who use reCAPTCHA on their own websites or to help combat e-mail spam need not worry, as the existing reCAPTCHA service will continue to function.
Read more at The Official Google Blog

