A short note on the fallibility of crowdsourcing: reCAPTCHA and long s
Dec 1, 2012 · 3 minute read · computers
Periodically I encounter a reCAPTCHA whose answer is obvious to anyone who has read old texts in English, but is easily misread by anyone who has not. When I enter the correct, intended word, my answer is always rejected as incorrect. This suggests that most people enter the wrong answer, and that the system treats the majority reading as the right one. It is just one example of how easily crowdsourcing can fail.
The letter in “Composition” that looks like an “f” is in fact an “s”. It is not a typo. It is a long s (ſ).
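To make the failure concrete, here is a minimal sketch of majority-vote transcription. It is not reCAPTCHA's actual algorithm, which I have not seen; the function names and the vote counts are made up for illustration. It only assumes that an answer is accepted when it matches whatever most earlier users typed.

```python
from collections import Counter

LONG_S = "\u017f"  # ſ, the long s found in older English printing


def modernize(word):
    """Rewrite a long s as an ordinary s, the way an informed reader would."""
    return word.replace(LONG_S, "s")


def consensus(answers):
    """Return the most common transcription and its share of the votes."""
    counts = Counter(answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)


def is_accepted(answer, answers):
    """Accept an answer only if it matches the crowd's consensus (assumed rule)."""
    winner, _ = consensus(answers)
    return answer == winner


# Hypothetical votes for a scanned word printed as "Compoſition":
# most readers mistake the long s for an f.
printed = "Compo" + LONG_S + "ition"
votes = ["Compofition"] * 7 + ["Composition"] * 3

intended = modernize(printed)           # "Composition"
accepted = is_accepted(intended, votes)  # False: the crowd outvotes the correct reading
```

Under these assumptions the intended word loses to the misreading, which matches what I see when my correct answer is rejected.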
Language evolution
An amusing thought occurred to me: thinking of language evolution as crowdsourcing gone “wrong”. (I use the scare quotes because I reject linguistic prescriptivism.)
Whether it’s the evolution of Latin into the Romance languages such as French, Spanish, and Italian, or the evolution of English from pre-Shakespeare to the present, language changes because it is a collective crowdsourced artifact.
For example, it is well known that the word “orange” in English came about because of a confusion starting in France that became ingrained: the initial “n” was apparently heard as part of the preceding indefinite article and dropped. Spanish still has “naranja”, which keeps the initial “n” of the original Sanskrit word. One could say that we should undo that crowdsourced mistake and call the fruit a “norange” or something, but I think most of us would agree that this mistake is not a big one.
There are grammar “mistakes” in English that are still controversial today, because they reflect crowdsourced language “confusions” that rub some people the wrong way. In my mind, the biggest one in English is the singular they: use of the third person plural to denote a third person singular of indeterminate gender: “they” in place of “he” or “she”, e.g., “that person didn’t do their homework”, instead of “that person didn’t do his homework” (which is what I was taught in school as being the correct construction).
Who’s right? Who’s wrong? Well, language changes. And it’s still happening. Our language, wherever it is we come from, is different from that of our grandparents. I even knew a European Portuguese woman who told of her grandmother using entire pronouns and verb conjugations that have been obsolete for some time now and don’t appear in standard textbooks! The European Portuguese second person familiar plural has disappeared, and the second person familiar singular largely fell out of Brazilian Portuguese long ago.
Conclusion
I am not a disbeliever in crowdsourcing. If I were, I would not be a happy user of Wikipedia! We just have to watch out and be aware of what might be lost or confused in the process. Meanwhile, some kinds of change are hard to classify as being objectively wrong.