January 9, 2012

Wikipedia's Identity Crisis

I assume we've all heard the term "deletionism", but for the uninitiated, it simply describes the point of view that Wikipedia should consist only of high-quality articles that belong in an encyclopedia. This is in stark contrast to inclusionism, which contends that anything remotely useful should be retained. These two factions have been at war for quite some time, with hostilities flaring up whenever a particularly contentious deletion debate arises. The funny thing is that both sides are right. The catch is that they are not fighting over what should be included in Wikipedia; they are fighting over what Wikipedia is.

The problem here is a classic case of intended usage not matching up with real usage. The crux of the issue is Wikipedia's own introduction, which states "Wikipedia is a free encyclopedia". Consequently, taking everything about Wikipedia literally, the deletionists are in fact correct. They delete articles that do not belong in an encyclopedia, and the articles in question quite often do not. Sometimes they get a little overzealous, but for the most part, they're just trying to ensure Wikipedia stays an encyclopedia.

This is a problem, because Wikipedia isn't used as an encyclopedia. As a culture, we treat Wikipedia as nothing less than a compendium of all human knowledge. We assume you can find anything on Wikipedia, and so long as what we look for stays within the bounds of an encyclopedia, we're usually OK. Unfortunately, an encyclopedia is not a repository for the sum of all human knowledge, and thus it sometimes fails to provide information that doesn't belong in an encyclopedia but would still be useful. Take a look at Wikipedia's definition of an encyclopedia:
"An encyclopedia (also spelled encyclopaedia or encyclopædia) is a type of reference work, a compendium holding a summary of information from either all branches of knowledge or a particular branch of knowledge."
It's a summary of information from branches of knowledge. This is where we start having problems. Details about a person who is not particularly notable are still information, but they are not a significant part of any branch of knowledge and therefore don't belong in any summary of that knowledge, however extended. However, that information could undoubtedly be useful to at least a few people, which is where the inclusionists are coming from. A compendium of all human knowledge is, well, all human knowledge. The problem is that there is a lot of it. Wikipedia has millions of articles even after being substantially trimmed down. It is clearly designed from the ground up to be an encyclopedia, not a collection of all human knowledge. In contrast, people tend to use Wikipedia as a place to store ALL information they deem even remotely relevant.

What can be done about this? Wikipedia is already suffering from growing pains as it tries to deal with millions of articles. A compendium of all human knowledge would probably have to deal with hundreds of millions of articles. How do you deal with that kind of information overload? Should you rework Wikipedia itself, or introduce something completely new? Do you suck all the Wikipedia articles out of its archives and "fork" the project? Do you start fresh and decide to require people to register before editing articles? Is a giant compendium of all human knowledge even useful? These are the questions we are going to have to answer as the amount of information continues to explode at an exponential rate. Perhaps a small startup will be able to find some answers.