January 28, 2005
Trees vs. Leaves: Tagging may be shaking the leaves off of taxonomic trees, affecting not only how we organize ideas and information but how we think about organization itself.
Bridge Blogging: A new effort tries to break through the national boundaries implicit in the blogosphere.
Links: Some funnish stuff.
Bogus Contest: Wikipedia topics.
Another issue, another apology
Sorry it took me so long to publish an issue. My blog is sapping my 'zine.
It's a JOHO World After All
I gave the after-dinner speech at a conference on Blogs, Journalism and Credibility put on by the Berkman Center, the Shorenstein Center and the American Library Association. You can download the mp3 here or listen to the Real Audio stream here. You can read about why some people hated the conference here; it stirred up a boatload of hostility before it happened because it struck some people as elitist and/or too homogeneous.
Also, I gave a talk to the Library of Congress that was broadcast by C-SPAN. It's on taxonomies and stuff. There's an MP3 of the speech here and of the speech and Q&A here. There's video on the C-SPAN site.
Trees vs. Leaves
[Note: I'm writing this as I'm also working on the February issue of Esther Dyson's Release 1.0, which will be on taxonomies and tags. Some of the ideas in this will make their way into that issue. In fact, you can take this issue of JOHO as me thinking out loud about what to say in that article. The Release 1.0 article, which will be around 12,000 words, focuses on interviews with vendors and customers.]
There's a new way of classifying the things in our world and it matters beyond the efficiencies it will bring when we're trying to find stuff.
What's going on
Del.icio.us kicked "tagging" into gear by giving us a reason to tag stuff. It's a bookmarking site: If you come across a page on the Web that you want to remember, you post the URL to your personal page at del.icio.us. On the way, you tag it with a word or two that will help you find it among the mass of bookmarks you accumulate on your del.icio.us page. The kicker is that everyone else can see not only what you've bookmarked but all the bookmarks that share a particular tag. You can even subscribe to a tag as an RSS feed. For example, I subscribe to the tag "taxonomy," so every time I go into my blog aggregator, I see a list of new pages to which people have applied that tag. You can also see tagging at work at Flickr, a photo post-and-share site that lets you tag your photos or (with their permission) your friends' photos. I subscribe to the "Iraq" tag and see some amazing pictures.
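If it helps to see the moving parts, here is a toy sketch of the idea in Python. It is not how del.icio.us is actually built; the class name, the method names, and the example URLs are all made up for illustration. It only shows the basic shape: people post URLs with free-form tags, and anyone can pull up everything filed under a tag.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical toy model of a shared bookmarking site."""

    def __init__(self):
        # tag -> list of (user, url), in the order they were posted
        self.by_tag = defaultdict(list)
        # user -> list of (url, tags), i.e. each person's own page
        self.by_user = defaultdict(list)

    def post(self, user, url, tags):
        """Bookmark a URL under whatever words the user chose."""
        normalized = [t.strip().lower() for t in tags if t.strip()]
        self.by_user[user].append((url, normalized))
        for tag in normalized:
            self.by_tag[tag].append((user, url))

    def feed(self, tag):
        """Everything anyone has filed under this tag, newest first."""
        return list(reversed(self.by_tag.get(tag.lower(), [])))


index = TagIndex()
index.post("alice", "http://example.com/dewey", ["taxonomy", "libraries"])
index.post("bob", "http://example.org/folksonomy", ["Taxonomy", "tagging"])
for user, url in index.feed("taxonomy"):
    print(user, url)
```

Lowercasing the tags is a design choice of this sketch, not a claim about any real site. Subscribing to a tag amounts to polling something like `feed("taxonomy")` and showing whatever is new.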
How do you know what tags to use? You don't. You make them up. Popular tags gain momentum: If you want your photo of the beach to be found, and if 90,000 people tag beach photos as "beach" but only 100 type "shore," you will probably go with the popular one. The resulting tag sets have been called "folksonomies" (a play on "taxonomies") because they are bottom-up and self-organizing ... which has its strengths and weaknesses.
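The momentum effect is easy to sketch. The snippet below is only an illustration of "go with the popular one," using the two counts from the example above; the function name `suggest_tag` is invented here, and no real site is claimed to work this way.

```python
from collections import Counter


def suggest_tag(candidates, usage_counts):
    """Return the candidate tag that the most people already use.

    Ties go to whichever candidate is listed first.
    """
    return max(candidates, key=lambda tag: usage_counts.get(tag, 0))


# Counts taken from the beach example in the text
usage = Counter({"beach": 90_000, "shore": 100})
print(suggest_tag(["shore", "beach"], usage))
```

Every time someone follows the suggestion, the popular tag's count goes up by one, which is exactly why popular tags get more popular.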
Technorati, a site that indexes 6.3 million blogs, has pushed the tagging insurgency one step further by starting to index tags from del.icio.us and flickr, as well as counting blog categories as tags. If you search on a tag at Technorati, you get a page that displays the appropriate Flickr photos, del.icio.us URLs, and blog posts.
Those are just the beginnings.
Trees and piles of leaves
Folksonomies are different in important ways from top-down, hierarchical taxonomies — the shape we've assumed knowledge itself takes.
The old way gets some experts together who create a nested tree of concepts into which everything in a particular domain can be slotted. Think of the Dewey Decimal System. Think of the Tree of Life. The new way invites users of information to add a word or three to the objects they want to find again.
The old way provides the vocabulary we are to use. The new way lets us use our own words.
The old way puts control of the classification system in the hands of the owners of the information being classified. The new way gives control to the users of information.
The old way creates a tree. The new way rakes leaves together.
The old way knows if you search for "trucks" that trucks hang from the "vehicles" branch, so it can also show you SUVs. And, if you search for "vehicles," it knows to show you trucks, SUVs, and tanks. The new way does not initially know that trucks and SUVs are related, but it can link trucks to categories — such as "monster truck rallies" — on branches way on the other side of the tree.
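Here is a small sketch of that difference. The tree fragment, the tagged photos, and every function name are hypothetical, chosen only to mirror the trucks example. The tree can answer "what's under vehicles?" and "what sits next to trucks?" because someone built those relationships in. The tags know only which words people happened to put on the same item.

```python
# The old way: a hypothetical fragment of a taxonomy, parent -> children
TREE = {
    "vehicles": ["trucks", "SUVs", "tanks"],
    "events": ["monster truck rallies"],
}
PARENT = {child: parent for parent, children in TREE.items() for child in children}


def descendants(node):
    """Everything hanging below a branch of the tree."""
    found = []
    for child in TREE.get(node, []):
        found.append(child)
        found.extend(descendants(child))
    return found


def siblings(node):
    """Everything hanging from the same branch as this node."""
    parent = PARENT.get(node)
    if parent is None:
        return []
    return [child for child in TREE[parent] if child != node]


# The new way: items carry whatever words their taggers chose
TAGGED = {
    "photo-1": {"trucks", "monster truck rallies"},
    "photo-2": {"SUVs"},
}


def related_tags(tag):
    """Tags that show up on the same items as this one."""
    related = set()
    for tags in TAGGED.values():
        if tag in tags:
            related |= tags - {tag}
    return related


print(descendants("vehicles"))   # the tree knows what counts as a vehicle
print(siblings("trucks"))        # and what sits next to trucks
print(related_tags("trucks"))    # the tags jump to another branch entirely
print(related_tags("SUVs"))      # but know nothing about trucks
```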
This is not an either-or. The old way, trees, makes sense in controlled environments where ambiguity is dangerous and where thoroughness counts. Trees make less sense in the uncontrolled, connected world that cherishes ambiguity.
We are so at the beginning of the insurgency of leaves that we can't tell which problems will be real and how we will solve them or skirt around them.
The most obvious problems have to do with the fact that what works at the beginning of the adoption curve may not work as the curve's pitch increases. For example, del.icio.us has about 45,000 users now. As the numbers increase, both sides of the dialectic are going to become problematic: We'll have too many items clustered under the word "ocean" to be able to browse through, but we also won't have enough items clustered under "ocean" because some people will tag it as "sea," "beach," or "la mer." We'll whine about the "ocean" cluster being too big and not big enough. Damn people.
There are technological fixes that are promising, including doing statistical analyses of how individuals and social groups cluster tags and of the multiple tags attached to any single item. Maybe we'll be able to figure out not only that "sea" and "ocean" are functional synonyms but perhaps that oceans are types of bodies of water. We may be on our way to creating a trans-application thesaurus that each application helps sharpen.
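To make "statistical analyses of the multiple tags attached to any single item" a little more concrete, here is one very simple way it could go. This is a sketch of a generic co-occurrence technique, not a description of what any tagging site does. The sample data and function names are invented. The guess is that two tags are functional synonyms if they keep company with the same other tags.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt


def cooccurrence_profiles(items):
    """For each tag, count which other tags appear on the same items."""
    profiles = defaultdict(Counter)
    for tags in items:
        for a, b in permutations(set(tags), 2):
            profiles[a][b] += 1
    return profiles


def cosine(p, q):
    """Cosine similarity of two profiles: near 1 is alike, 0 is nothing shared."""
    dot = sum(p[k] * q[k] for k in p if k in q)
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


# Made-up tag sets, one list per tagged item
items = [
    ["sea", "waves", "sunset"],
    ["ocean", "waves", "sunset"],
    ["ocean", "waves", "surfing"],
    ["sea", "surfing", "waves"],
    ["taxonomy", "libraries"],
]
profiles = cooccurrence_profiles(items)
print(cosine(profiles["sea"], profiles["ocean"]))     # high: same company
print(cosine(profiles["sea"], profiles["taxonomy"]))  # zero: nothing shared
```

Note that "sea" and "ocean" never appear on the same item in this sample, which is typical of synonyms: people pick one or the other. Telling synonyms apart from "oceans are types of bodies of water" would take more than this.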
Such a thesaurus would raise another dialectical problem. On the one hand, we want to be able to share tags with everyone. On the other, we want our tags to reflect the way we think, and we think differently about the same things. Will a thesaurus be a probabilistic, multi-lingual babelfish that enables us to "Tag local but find global"? Or will it turn into an instrument of metadata colonialism? Will it be owned? And, of course, how will it be gamed by porno-spammers?
Whatever happens, the development of this type of thesaurus is just one more step up the metadata ladder. Tag sets are going to become objects that social groups work on and share. The relationships among tags won't remain flat and dumb; we'll start recording hierarchical relationships, synonyms, opposites, place names, see-also's, was-is-will-be's. There's value in those constellations of tag sets. Many will be shared. Some will be offered for sale.
Noticing how social groups tag stuff will help disambiguate collections of tagged items because members of social groups tend to think and talk about things more alike than do members of other groups. At least sometimes. But social groups will also form around tags: If I discover that you tag the same things the same way as I do, then we have lots in common. These social groups will begin first as a way to share stuff. They'll get to conversation when what gets tagged is either spectacular or pushes against the limits of the group's self-understanding: "Dude, it's an amazing picture of rotting fruit, but why did you label it 'love'?"
Tags, labels and knowledge
You label a jar of preserves "Strawberry - Aug. 2005" so you can tell what's in it and whether the green stuff on top is supposed to be there. At Flickr, you tag a digital photo of your jar of preserves "strawberry jam" so other people can find it. The label has a context: the thing that it's attached to. The tag's context is invisible and detached: It's how you think other people are going to search for it. (As Joshua Schachter, creator of del.icio.us, says, tagging is the inverse of searching.)
So, we're creating this context-free realm of free-floating metadata, like word magnets on a refrigerator door, that we will paw through and assemble, and, most important, do things we haven't yet thought of.
The fact that we are inventing this way of classifying is important. It announces that we are skeptical at a whole new level: Not just about the content of knowledge but about how it's divvied up in the first place.
This explicitly pries yet another layer off the real and pulls it into the human, for in a tagged world, it's hard to maintain that topics exist independent of us. Or disciplines. Instead, we cluster our world around our interests. New interest? Shuffle and deal again.
The project of knowledge goes from filling up containers with information to making everything public by tagging it and throwing it into the leaf pile. We're doing that together, without waiting for a plan or permission. Then we're rolling around in the leaves.
This is a knowledge economy of wild excess. It would make no sense if we were still scratching for information under rocks.
We are meaning our world together. We can't do it if we have to do it perfectly or even well. It's better just to do it.
We can sort it out later.
Shelley "BurningBird" Powers has just posted an excellent round-up and analysis of recent bloggery about tags. She is less enthusiastic than I am, which is always a good thing. I blogged a response to it.
Bridge Blogging
My friends Ethan Zuckerman and Rebecca Mackinnon, both at the Berkman Center, have, with others, started an initiative called GlobalVoices that aims at helping the blogosphere break through its natural tendency to cluster into groups that are too easily alike. GlobalVoices asks: What can we do to get the rest of the world's voices heard?
That question spawned an international track at a recent Berkman conference that brought together a few dozen bloggers from around the world. These are "bridge bloggers" — bloggers who build bridges to other cultures.
There is, of course, a blog and a wiki page. Heck, there's even a manifesto. Feel free to jump in.
Links
(These are all from my weblog...)
Here's a delightful page at the Wikipedia. And here's an animation from Jon Udell showing how it developed.
The Poynter Institute has put together a good paper on why transparency is a good thing.
Frances Moore Lappé, author of Diet for a Small Planet — a book that influenced my wife and me waaaay back when — has published an essay that out-Lakoffs Lakoff.
This chess game shows you what the computer is contemplating. Very cool. And from Germany comes this nicely-done game that pits you against the computer or another human, as you each try to take over the world by doing Google queries that turn up documents localized in various parts of the world.
Modern Drunkard claims to be mainly serious with the goal of returning "drinking to the glorious Rat Pack/Jackie Gleason Era."
Terry Heaton interviews Ed Cone about the newspaper-sponsored local blogging community.
The international version of OhMyNews has a terrific interview with Dan Gillmor about his plans and the future of news. And Rebecca Mackinnon, ex-bureau chief in China for CNN (and BridgeBlogger), has written an excellent article on what's wrong with CNN.
My daughter and I have made a short movie (under a minute) called RingTone.
This video will tell you how to fold a shirt. Yeah, I know. But you'll like it. Really.
Bogus contest: Best Wikipedia page ever
At the beginning of the Links section above, I linked to the Heavy Metal Umlaut page. What are some other great pages in Wikipedia that would never have made it into the Encyclopedia Britannica?
JOHO is a free, independent newsletter written and produced by David Weinberger. If you write him with corrections or criticisms, it will probably turn out to have been your fault.
To unsubscribe, send an email to email@example.com with "unsubscribe" in the subject line. If you have more than one email address, you must send the unsubscribe request from the email address you want unsubscribed. In case of difficulty, let me know: firstname.lastname@example.org
There's more information about subscribing, changing your address, etc., at www.hyperorg.com/forms/adminhome.html. In case of confusion, you can always send mail to me at email@example.com. There is no need for harshness or recriminations. Sometimes things just don't work out between people.
Dr. Weinberger is represented by a fiercely aggressive legal team who responds to any provocation with massive litigatory procedures. This notice constitutes fair warning.
Any email sent to JOHO may be published in JOHO and snarkily commented on unless the email explicitly states that it's not for publication.
The Journal of the Hyperlinked Organization is a publication of Evident Marketing, Inc. "The Hyperlinked Organization" is trademarked by Open Text Corp. For information about trademarks owned by Evident Marketing, Inc., please see our Preemptive Trademarks™™ page at http://www.hyperorg.com/misc/trademarks.html
This work is licensed under a Creative Commons License.