December 29, 2005

Contents

Why the media can't get Wikipedia right: Stuck in their old model, the media get the story backwards.
Are leaves mulch?: Peter Morville's criticism of folksonomies, et al.
Cool Tool: Power scanning!
What I'm playing: Murderous rivolity rules.

 

Merry whatever

To each a greeting as appropriate. And to the atheists, may you enjoy the gift that has no giver.

To all of us: A healthy and peaceful new year.

Why the media can't get Wikipedia right

Anti-Executive Summary
Things this piece does not say

  1. Wikipedia is always right
  2. Wikipedia will asymptotically achieve a point of total rightness
  3. Wikipedia is the only source anyone should consult
  4. Wikipedia is impervious to criticism
  5. Wikipedia is better than science, sex and scientific sex
  6. Wikipedia is totally new and there's never been anything like it
  7. Anyone who criticizes Wikipedia is a doody head
  8. Jimmy Wales is G-d.

When the mainstream media addressed the John Seigenthaler Sr. affair (he's the respected journalist who wrote an op-ed in USA Today complaining that slanderously wrong information about him was in Wikipedia for four months), the subtext couldn't be clearer: The media were implicitly contrasting Wikipedia's credibility to their own. Ironically, some of the media got the story fundamentally wrong, in tone and sometimes in substance.

Most media reports presented the narrative line of the story roughly as follows:

A person of indisputable honor was smeared in Wikipedia. Faced with incontrovertible evidence of its failings, the mainstream media shamed Wikipedia into reluctantly becoming more like them. See, Wikipedia was unreliable all along, just like we said! We're the grownups, and now we're making Wikipedia grow up.

There were lots of little errors of tone. For example, Robert Lever, writing for Agence France-Presse, said:

In an unusual bit of self-criticism, Wikipedia notes on its site that some complain about "a perceived lack of reliability, comprehensiveness, and authority" in the encyclopaedia.

"Unusual"? Wikipedia has been in a continuous state of self-criticism that newspapers would do well to emulate. It has discussion pages for every article. It has handled inaccuracies not defensively but with the humble understanding that of course Wikipedia articles will have mistakes, so let's get on with the unending task of improving them. Wikipedia's ambitions are immodest, but Wikipedia is not.

And Daniel Terdiman wrote for CNET:

The article stayed on Wikipedia — the free, open-access encyclopedia — for four months before Seigenthaler finally got the service's founder, Jimmy Wales, to agree to take it down.

"Finally"? Sounds like Jimmy Wikipedia Wales was resistant? Nah. I asked Jimmy about this. He was contacted by Seigenthaler once. Jimmy immediately removed the previous versions of the article so people couldn't come upon it by accident. Previous versions are not indexed by the search engines, but, Jimmy said, "We do that fairly often as a courtesy to people, if there's something disparaging to people in the article." Added Jimmy, Seigenthaler "didn't request that it be deleted. He seemed to be surprised that we were willing to do that."

More serious was the report that Wikipedia was giving up on anonymous editing. For example, The Guardian's piece came with the headline:

Wikipedia bans anonymous contributors to prevent libel

The Guardian then had to run a correction:

Our article below said in error that unregistered users are to be prevented from editing pages when it is only the creation of new pages that will be restricted to registered contributors.

That's right, but the implication, surfaced in the headline and at the end of the article ("Registering users could result in court actions"), was wrong. The point of registering users is not to make it easier to sue errant Wikipedia contributors. It will in fact have the opposite effect. Wikipedia tracks unregistered users' IP addresses — which, with a court order, can usually be traced back to a real-world identity — because it has no other way of telling if a slew of trash articles are coming from a single source. Wikipedia does not track the IP addresses of registered users because their pseudonyms serve the same purpose. So requiring people to log in will make them more anonymous, not less. But it will enable Wikipedia's reputation system to operate more effectively on new entries. And it will cut down on the ~5,000 new entries created every day, of which about 3,500 are obvious junk ("Asdfasdf" is a particularly popular entry) quickly weeded out by the Wikipedians who patrol the site.

Allowing unregistered users to edit existing articles plays into that reputation system. Says Jimmy: "Why do we allow anonymous users to edit existing articles when we know that the flow of edits from anonymous users is worse than from logged-in users? It implicitly self-selects trolls because we see the IP number but not the login name."

Jimmy thinks the mainstream media misunderstood this story because they have a cognitive problem when it comes to anonymity and accountability:

The thing that people always latch onto is that it has to do with anonymity. But it doesn't have to do with knowing who you are [in the real world]. We care about pseudo-identity, not identity. The fact that a certain user has a persistent pseudo-identity over time allows us to gauge the quality of that user without having any idea of who it really is.

Trying to find out who people really are is a fool's mission on the Net. You could get a credit card ID but that doesn't even tell you very much: This is Bob Smith of Missouri. But if an editor identifies himself as Zocky [the handle of a trusted Wikipedian], I know it's good even though I don't know who Zocky is [in the real world] because I know Zocky's history on the site. I know he's not a spammer, I know he's not making things up — at least within the value of "know" that's relevant in this case.

Jimmy has been all over the news telling people that Wikipedia is not yet as reliable as the Britannica, that students shouldn't cite it, that you should take every article with a grain of salt. (One Wikipedian suggested to me that such a disclaimer ought to be on every page; I agree.) The media are acting as if this is a humbling confession when in fact it's what Jimmy and Wikipedians have been saying from the first day of this remarkable, and remarkably successful, experiment in building an inclusive encyclopedia together.

The media literally can't hear that humility, which accurately reflects the fluid and uneven quality of Wikipedia. The media, amplifying our general cultural assumptions, have come to expect knowledge to be coupled with arrogance [1]: If you claim to know X, then you've also been claiming that you're right and those who disagree are wrong. A leather-bound, published encyclopedia trades on this aura of utter rightness (as does a freebie e-newsletter, albeit to a lesser degree). The media have a cognitive problem with a publisher of knowledge that modestly does not claim perfect reliability, does not back up its claims through a chain of credentialed individuals, and does not believe the best way to assure the quality of knowledge is by disciplining individuals for their failures. Arrogance, individual heroism, accountability and discipline ... those have been the hallmarks of the institutions that propagate knowledge. [2]

With Wikipedia, the balance of knowing shifts from the individual to the social process. The solution to a failure of knowledge (as the Seigenthaler entry clearly was) is to fix the social process, while acknowledging that it will never work perfectly. There are still individuals involved, of course, but Wikipedia reputations are made and advanced by being consistent and persistent contributors to the social process. Yes, persistent violators of the social trust can be banished from Wikipedia, but the threat of banishment is not what keeps good contributors contributing well.

Wikipedia is obviously not the first and only instance of this type of knowing in our history. But the balance of heroic individual knowers and persistent, pseudonymous social processes is sufficiently different that the media generally have gone wrong with this story. After all, reporters are held accountable when they get something wrong, so why shouldn't Wikipedians?

A: Because Wikipedia isn't a newspaper and newspaper practices aren't the only way to knowledge.

Is it all good? Nah. But it is. [3]


[1] This is institutional arrogance, not personal, an arrogance with which we are complicit. For example, Simon Winchester, in The Meaning of Everything (a book I like a lot more than his more popular The Professor and the Madman), quotes the preface to the Oxford English Dictionary, written by the unfathomably learned James Murray: "Our attempts lay no claim to perfection; but they represent the most that could be done in the time and with the data at our command." (p. 145)

[2] Science is a complex case in this regard. Ever since The Structure of Scientific Revolutions we have known that science's method operates within an unstated social context. Wikipedia only has the social context. Wikipedia would be a bad way to do science; science is a bad way to write a neutral encyclopedia article on George Bush.

[3] Since I am claiming to correct someone else's report, and since I'm doing so in an article that talks about the arrogance of knowledge, the probability approaches 1.0 that this article is substantially wrong.

Are leaves mulch?

I'm very fond of Peter Morville's Ambient Findability, a highly readable exploration of what's going on in the field of information architecture, i.e., how we find stuff, written by a practitioner and thought-leader.

Larry Irons wrote to me recently, however, asking about Peter's jibe about the idea that I've been pushing, that we're moving from trees of knowledge to big piles of leaves. Peter writes archly that that's a good metaphor because the leaves rot and become food for trees (p. 139). Peter then says about folksonomies:

They are an amazing tool for trendspotting and for revealing desire lines. And as personal bookmark tools, they're not bad for keeping found things found. But when it comes to findability their inability to handle equivalence, hierarchy, and other semantic relationships causes them to fail miserably at any significant scale. If forced to choose between the old and new, I'll take the ancient tree of knowledge over the transient leaves of popularity any day.

Peter's certainly right about the current state of leaf-piling. The question is whether we will develop tools to enable it to scale. For example, the major tagging sites (Delicious.com, Flickr, Technorati) have already started clustering tags based on statistical analyses of their usage. That can lead to discovering equivalences, loose at first but getting tighter and tighter. Folksonomies result in controlled vocabularies that users violate at the expense of findability. Hierarchy can be deduced from tag clusters, although such hierarchies are unlikely to be as neat as traditional ones... but that makes up for a deficiency in traditional hierarchies.

From my point of view, the turn to leaf piles absolutely does not rule out traditional ways of organizing information. Rather, it allows multiple ways of organizing. If an "ancient tree" works, it can and will be used as one path through the leaves. In most cases, though, there will be multiple paths, sometimes deliriously many. That seems to me inevitable in almost every case.

I believe we will turn from single trees to piles of leaves because indefinite potential necessarily has more value than any single actualization.

Middle World Resources

Cool Tool

I am getting such a kick out of having a printer/scanner/fax machine with an automatic paper feeder. I've been using my new Canon MP780 — about $200 — to convert old papers into new scans. For example, I found a paper from my academic career that I couldn't get published, so, whoosh, I scanned it in and posted it. And one on John Austin that was published in 1984.

I wonder how many pages I'll feed through until it breaks?

What I'm playing

I finished Serious Sam 2, which was tons of fun if you like being overwhelmed by wave after wave of cartoon enemies as much as I do. Now I'm playing Brothers in Arms: Earned in Blood, in which you control a squad while playing in first person perspective as a member of the squad. You succeed by tactically outsmarting (= outflanking) the Germans. It's a little slow for me, and it'd be a lot more fun if it allowed arbitrary save points instead of sending you back to the beginning of the engagement every time you die (= constantly). Why do game designers think we want to replay their levels until we get them right? Sheesh.

And to all, a good night...


Editorial Lint

JOHO is a free, independent newsletter written and produced by David Weinberger. If you write him with corrections or criticisms, it will probably turn out to have been your fault.

To unsubscribe, send an email to [email protected] with "unsubscribe" in the subject line. If you have more than one email address, you must send the unsubscribe request from the email address you want unsubscribed. In case of difficulty, let me know: [email protected]

There's more information about subscribing, changing your address, etc., at www.hyperorg.com/forms/adminhome.html. In case of confusion, you can always send mail to me at [email protected]. There is no need for harshness or recriminations. Sometimes things just don't work out between people.

Dr. Weinberger is represented by a fiercely aggressive legal team who responds to any provocation with massive litigatory procedures. This notice constitutes fair warning.


The Journal of the Hyperlinked Organization is a publication of Evident Marketing, Inc. "The Hyperlinked Organization" is trademarked by Open Text Corp. For information about trademarks owned by Evident Marketing, Inc., please see our Preemptive Trademarks™™ page at http://www.hyperorg.com/misc/trademarks.html

This work is licensed under a Creative Commons License.