Wikipedia: garbage in, garbage out

Jimmy Wales, the founder of Wikipedia, describes Wikipedia as "an effort to create and distribute a multi-lingual free encyclopedia of the highest possible quality to every single person on the planet in their own language." The motto is "the free encyclopedia that anyone can edit."

However, Wikipedia has several drawbacks that call into question its legitimacy as a serious service. One of these drawbacks lies in the claim that it is a site that "anyone can edit." The problem with this model is that not everyone can edit, and an even smaller portion actually would. For instance, people without computers cannot edit. Business people and executives are generally far too busy to edit. That leaves a small subset of people.

That subset consists of technophiles who are generally white, generally male and generally not experts in what they are editing. In fact, one of the original people involved with Wikipedia has criticized what he calls "anti-elitism" in Wikipedia. What this means is that there is a general distrust of "experts" and a preference for editors who are less knowledgeable. As a friend put it, "they can code Perl so they think they're experts in evolutionary biology."

Wikipedia recognizes this problem and responds with the maxim "out of mediocrity, excellence". The idea that enough mediocrity can eventually produce excellence is novel and interesting, but unconvincing. A better and perhaps more accurate mantra might be "garbage in, garbage out."

Disregarding the cognoscenti in favor of the incognizanti leads to very little verification of facts in Wikipedia. Every article that is created on Wikipedia is generally given only a quick review for compliance with policies (i.e., not vandalism, not slanderous) -- essentially to see if it passes the "smell test". If the page looks complete and looks like it contains real information, it stays.

Take, for instance, the case of Climbing Jack, an April Fool's joke article that remained on Wikipedia for over eight months before being removed. Another, more damning, example is the case of Erdosville, Nebraska. Since the town does not exist, it would have been a snap to verify that it did not exist. The entry stayed up for over a year, and interestingly enough, a real estate agent advertised being able to sell houses near "world-famous Erdosville."

Wikipedia also has an interesting test, called "verifiability", for what gets to be included in articles. This concept is summarized by Wikipedia as "the threshold for inclusion is verifiability, not truth". In practice, verifiability means something that can be verified over the web. There is a bizarre Internet-centric view of information, namely, that if it isn't on the Internet, it does not exist.

The lack of input validation into Wikipedia leads not only to outright false information but to a variety of articles that have no content as well. Wikipedia brags that it has over 1.5 million articles in the English edition (the one used for the rest of this article). Many of those articles, however, are stubs. Stubs are what Wikipedia calls articles that only have a sentence or two of content and act as placeholders for a topic yet to be written. According to Wikipedia's own estimates there are over 765,000 stubs. This estimate could be off by about 20 per cent because Wikipedia rounded up in its counts and articles might be listed as a stub in multiple categories. This leaves about 600,000 stubs. That means about two in five articles in Wikipedia are stubs. That does not include articles that are stubs but have not been tagged as such by anyone. This is in line with other estimates.

Wikipedia also has a policy forbidding the use of original research in Wikipedia. If it hasn’t been published somewhere else, it cannot be used. However, over 180,000 articles are tagged by Wikipedia as not having necessary sources. That is, over 12 per cent of the articles in Wikipedia contain assertions which have not been documented.

Lastly, there are pages that exist solely as "disambiguation" pages. These are pages listing the many alternatives for a certain word. For instance, if you search Wikipedia for "George Bush", you get a page that lists the alternatives of what you might mean. There are about 73,000 disambiguation pages. There are also about 32,000 articles that are nothing but lists of other articles.

Adding the numbers above shows that 58 per cent of articles on Wikipedia have no intellectual merit whatsoever. In addition, one study found that between 1 and 2 per cent of Wikipedia articles have been plagiarized in part or in whole. Wikipedia's own estimates list that only 34 per cent of articles are more than 2 kilobytes (300 words). Articles under that size are likely stubs or articles with no real content. This is all before looking at articles that are on useless or inconsequential subjects such as almost every sex position or piece of sexual slang.
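The arithmetic behind these shares can be retraced with a short sketch. It uses only the counts quoted in this article and makes two simplifying assumptions that are not in the original figures: the English edition is treated as exactly 1.5 million articles, and no article is counted in more than one category. The names `TOTAL_ARTICLES`, `share` and `combined_share` are illustrative only.

```python
# Counts quoted in the article for the English edition of Wikipedia.
# Assumption: "over 1.5 million" is treated as exactly 1,500,000.
TOTAL_ARTICLES = 1_500_000

counts = {
    "stubs": 600_000,           # the adjusted estimate, not the raw 765,000
    "unsourced": 180_000,
    "disambiguation": 73_000,
    "lists": 32_000,
}


def share(count, total=TOTAL_ARTICLES):
    """Return count as a percentage of total."""
    return 100 * count / total


def combined_share(counts, total=TOTAL_ARTICLES):
    """Sum all categories, assuming no article falls in two of them."""
    return share(sum(counts.values()), total)


for name, count in counts.items():
    print(f"{name}: {share(count):.1f}%")
print(f"combined: {combined_share(counts):.1f}%")
```

Under these assumptions the combined share comes to roughly 59 per cent, close to the figure quoted above; the small gap is presumably down to rounding and to the true article total being somewhat above 1.5 million. Any overlap between categories would lower the combined figure.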

Once an article or piece of information makes it into Wikipedia, the presumption is that the article or item remains. This has led to a dramatic increase in stubs, for instance, but also to capricious deletion policies. The notion of "notability" is in the eye of the beholder. Hence, there are many non-notable articles that are in Wikipedia because they've "always been there". On the flip side, articles have been deleted simply because someone is upset at the author, subject, or article.

Most article deletion votes take place with 10 or fewer people voting. In December, Wikipedia boasted 17,000 active editors. That means with 1 in 1,700 voting, "consensus" is achieved. Or more accurately, if there is a small group of cranky editors who want to cause problems, they can get their way.
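The ratio is easy to check. The sketch below takes the two figures from the paragraph above (10 voters, 17,000 active editors); the function name `one_in` is illustrative only.

```python
ACTIVE_EDITORS = 17_000  # active editors Wikipedia reported, as quoted above
VOTERS = 10              # upper bound on voters in most deletion votes, as quoted above


def one_in(voters, population):
    """Return N such that the voters make up 1 in N of the population."""
    return population / voters


def percent(voters, population):
    """Return the voters as a percentage of the population."""
    return 100 * voters / population


print(f"1 in {one_in(VOTERS, ACTIVE_EDITORS):,.0f}")
print(f"{percent(VOTERS, ACTIVE_EDITORS):.3f}% of active editors")
```

Put as a percentage, ten voters are about 0.06 per cent of the active editors.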

Due to the selection bias of authors and editors in Wikipedia, there is a bizarre sense of priority in what is included and expanded in Wikipedia. The article on Britney Spears is 13 pages long, the same length as the article on Henry VIII. The rule on notability in Wikipedia suggests that articles included should have relevance to people 100 years from now. It is hard to imagine that Britney Spears will even be remembered in 100 years. King Henry VIII, on the other hand, did found the Anglican Church and split from Rome.

Pythagoras, the father of numbers, gets only six pages on Wikipedia. That is the same length as the article on Leeroy Jenkins, who had a briefly popular five-minute web video on World of Warcraft -- and Leeroy Jenkins isn't even his real name. Pythagoras only gave us the Pythagorean Theorem. Bubb Rubb even gets an article, and all he did was get interviewed by a local news show and shout "whoooo!" into the camera.

Almost every game created for the Nintendo Entertainment System (the first one from 1985) has an article on Wikipedia. Runescape, a marginally popular online game, has 45 pages. Pokémon has over 226 pages dedicated to every aspect of it. By way of contrast, the Book of Genesis entry on Wikipedia takes up only 64 pages. Something is profoundly absurd about an encyclopaedia entry on a video game being over three times longer than the entry for the Book of Genesis.

Wikipedia notability constitutes whatever is popular, if only for one day, to technophiles. For instance, there are 174 articles on Battlestar Galactica. This includes articles not only on each of the actors, but articles on each of the characters, technology, and even the religion portrayed in the show. Granted the show is popular and very entertaining (even to me), but it is not likely to be remembered 10 years from now.

Another problem with Wikipedia is that it will not censor for the protection of minors. One could argue for the inclusion of a relevant image in an article on genital warts. However, it is purely pornographic to include an image of full-frontal nudity in an article on indecent exposure. There are galleries of pornographic images that are not used in any article. School children use Wikipedia for research and there are caches of porn contained in it. Wikipedia will do nothing about the issue for fear of "censorship."

Rules do exist on Wikipedia to provide the appearance of objective decision-making. However, one of the rules that Wikipedia has is that there are no rules. If someone with enough clout decides that a certain action will "improve Wikipedia", all other rules are jettisoned. For instance, creating or editing a biography on oneself is against the rules. Nevertheless, Jimmy Wales edited his own biography.

Wikipedia has gained tremendous popularity in the few years of its existence. In such time, the weaknesses of the model have become apparent. Some of these stem from the model they've chosen for the system, but no small part of the problem is the bias of its editors. There are ludicrous emphases on some subjects of no consequence, a startling number of articles with no content whatsoever, and a policy system that can be overruled at the whim of an administrator. Many teachers and librarians have written against Wikipedia's usefulness as a primary reference. The fact is that Wikipedia is untrustworthy as anything other than a quick place to look to find other sites with reliable information.

John Bambenek is a columnist and freelance writer who blogs at Part-Time Pundit. His biography was deleted from Wikipedia by its editors, but at one stage it falsely listed him as a child sex offender for over an hour and a half.

