JOHO: The Journal of the Hyperlinked Organization

March 7, 2007


Contents

The abundance of meaning: If too much information is noise, what's too much meaning?
The abundance of worthiness and the new relevancy: When there's an abundance of worthwhile pages on just about any topic, search engines need to evolve. 
Book stuff: (1) Why finishing a book sucks, (2) the new book's site, and (3) the book's word cloud
A (commercial) model of miscellaneousness: BioMed Central embodies many of the current trends.
Why do movies suck?: We don't make that many movies, we invest heavily in them, and yet most of the comedies aren't funny, the suspensers aren't suspenseful, the action ones are incoherently edited. Why is that?
Cool Tool: The O'Reilly Hacks series
What I'm playing: Dreamfall and Devastation Troopers
Bogus Contest: Suggest a Daily Open-Ended Puzzle

 

The book, she is done. Now don't jinx me.

Sorry for taking so long to publish another edition of this newsletter. It's a combination of my adulterous relationship with my blog (it means nothing to me — you're the ones I love) and my general anxiety at having finished Everything Is Miscellaneous.

I turned it in so long ago. My editor at Times Books, Robin Dennis, did a most excellent job making it better. For example, while it started as a book about the journey of a three-legged rabbit out for revenge against an unnaturally lucky gambler, Robin turned it into a business-y book about the history, meaning and effect of the miscellanizing of information. (Being edited by Robin was a great experience. Thanks, Robin.)

Since turning it in, there's been nothing to do but watch soap operas and yell at the neighboring children. Somehow that's managed to suck up all my time.

See the article below about why finishing a book sucks.

It's a Joho World

Jeez, a lot has gone on since the previous issue. It seems to me that I've done a lot of writing, interviews, consulting, speaking, and a whole bunch of yelling about Net political issues (Net neutrality rules! Or at least I wish it did.), all adding up to very busy days that I can't remember even by lunchtime. I also still blog too much, and spend happy time at the Harvard Berkman Center for Internet & Society.

I've done a bad job of keeping track of all of this, unfortunately, because, somehow, Small Pieces Loosely Joined has become the title of my biography.

Did I mention the blogging?

 

 

The abundance of meaning

Every abundance generates its own pitfall. An abundance of wealth can lead to waste, moral corruption and even revolution if it isn't distributed with a modicum of fairness. An abundance of information becomes noise if we can't navigate it. But what about the abundance of meaning we've developed with the arrival of the Web? If too much information becomes noise, what does too much meaning become?

As Bill Clinton did not say, that all depends on what the meaning of meaning is.

It is ironic or perhaps predictable that "meaning" has one of the widest meanings in our language. It goes all the way from the clear and precise definition of a word in a standard dictionary, to the ultimate amorphousness of all amorphousnesses: the Meaning of Life [insert your own "42" joke here]. What something or someone means can be a definition ("I meant 'funny' in the haha sense"), a consequence ("This means war!"), an essence ("The meaning of Christmas"), or an intention ("I didn't mean to squish your hat"). And more. Meaning is all over the map. And then it's also the map.

I want to take "meaning" in one particular way: Meaning as the sense we make of something, as when we put something in context, show how it relates, draw out hidden consequences or roots, or reveal its value. That's not the Meaning of Life, but it's also not "What does the word 'eleemosynary' mean?"

It used to be hard to make sense of things, so we relied on a few people who could do it for us. The evening news broadcast featured the National Dad telling us How It Was. Encyclopedias may not have given us all the facts, but they condensed the big picture into a little picture that was unarguable. Heck, you might as well argue with the encyclopedia! Columnists and magazines with well-recognized slants gave our side its story, whatever that side was. Of course there was lively debate — before the Net we weren't complete idiots — but the debate was either between media stars (Norman Mailer vs. Gore Vidal, William F. Buckley vs. Norman Mailer, Norman Mailer vs. Norman Mailer) or was as local as our living room, classroom or bar. You could only hear distant voices if they were famous because a voice you could hear at a distance was by definition famous.

Now there is an abundance of ways of making sense of things. Blogs give a billion people the possibility of making sense of things in public. News sites such as Digg and Reddit let readers decide which stories are important. The endless traversability of links means we can construct contexts of understanding by clicking, and there are a near-infinite number of paths we can take. Tags are simple statements, context free, of what a page or a photo means to an individual, from which emerges a topology of social meaning.

While each of these helps us to make sense of things, they simultaneously throw in our face the fact that there's so much to be made sense of and we don't agree how to do it. The linked structure of the Web brings our differences close. Even if we don't often make the trip, our awareness of endlessly multiple meanings changes meaning's nature.

None of these changes rip up the old rule book. Rather, they render unarguable what we already suspected. They settle meaning's hash. And put together, perhaps they point to a larger shift. For instance:

All these little shifts work toward the big shift we've been heading towards for the past few hundred years: driving a wedge between meaning and the real. We began our Western culture with an assumption that meaning and being are two sides of the same coin: Meaning, like the real, is independent of us, eternal, knowable, and orderly. In fact, the Greeks assumed that to exist, a thing must exist as something: A man, a plan, a canal, Panama. Everything is something. 

Then, over time, meaning and reality began to go their separate ways. Early on in the delaminating of the two, G-d stepped in to hold them together: G-d created things, their meaning, and their order. Over the centuries we watched as the two separated, until Sartre nauseated us by vividly describing a tree that shed every last shred of its meaning.

But that low point in the history of meaning couldn't stand. The moments of existential dread when the world lacks the common decency to assemble itself into trees and rocks are aberrations: the world's own unwatchable Speed-O moment. Sartre's separation of reality and meaning was an intellectual value judgment that decided that meaning was the imposter, while the real was, well, real.

What looked like fatal weaknesses of meaning are emerging as strengths. Yes, meaning — the sense we make of things — is temporary, based on our concerns at the moment, and part of an incoherent and ambiguous jumble of relationships. We make meaning together. Meaning disappears with us just as surely as our fear of the dark does. Yes, the differences surfaced unavoidably on the Web cannot be resolved, will not be resolved. We are stuck talking with one another forever. Even after The Rapture ends time and history, those of us left behind will be arguing with one another about whether it was really The Rapture, about who got taken who should have been left behind, and about who looked like he could stand to lose a little weight when ascending to Heaven naked.

We are stuck with an abundance of meaning.


So, if too much information is noise, what is too much meaning?

Add enough marbles to your marbles bag, and before too long you can't find your favorites. A bag of marbles doesn't get any smarter no matter how many marbles you put in. That's what information is like.

Meaning as sense-making is the opposite of a bag of marbles. Things gain meaning by being connected to other things: It's similar to this, contained in this, derives from that, could be used for that other thing, reminds me of something that reminds me of something else. Every connection adds to the potential for understanding. And what is too much understanding? I don't know, but I sure wish we'd try it for a change.

But it's not all marbles and rapture. Meaning can overwhelm understanding. We can be aware of so many different ways of taking something that we feel powerless to choose among them. Some people take drugs to feel that way. There are moments of poetry that overflow with meaning. There are times when the multiplicity of ways in which we make sense of our world fills us with despair, and times when it fills us with joy. There is no one way in which we have too much meaning.

So, in a way that seems either inevitable or too clever by half, the answer to the question "What is too much meaning?" depends on what meaning we make of it.

The abundance of worthiness and the new relevancy

In 1995, Yahoo was the coolest kid on the block. Jerry Yang and David Filo had started it with the same impulse that kicked off blogging: Let's share the fruits of our browsing with others. They'd found so many sites to like that they had to figure out a way to organize them. Years later, Joshua Schachter faced exactly the same problem and invented del.icio.us (later bought by Yahoo) to enable users to organize a list of sites by tagging them for themselves and others. Jerry and David, in the mid-1990s, instead created a browsable taxonomic tree.

Although their initial impulse was to spread the joy of what was on the Web, Yahoo's unexpected success turned it into a gatekeeper. Getting your site selected for inclusion in the Yahoo tree was a big deal. And the selection process was a black box: You could nominate your site, but there was no way to tell why it was selected or rejected.

Nevertheless, that solution worked well when there were a million pages on the Web and search engines were wimpy. There weren't that many worthwhile pages on, say, bird watching, so you could trust Yahoo to have found a handful of good ones and to have spared you the dozens of crap ones. Sure, Yahoo faced problems as the Web got larger. The pile of pages to be sifted got bigger, and it required more and more employees to clamber through the existing tree to make sure none of its fruit had become withered with age or had gone wormy with spam. (Yeah, work that metaphor, loverboy!)

But the growth of the Web during the late '90s tipped the scale, changing the equation and our expectations. Yahoo initially dealt with the abundance of sites by finding a rough cut of the worthy ones. But now that there's an abundance not just of sites but of worthy sites, identifying a handful of them — what fits on one branch of the tree — seems arbitrary and insufficient. We go from saying, "Thank you, Yahoo, for taking a good but necessarily imperfect cut at finding the sites worth looking at on this topic," to saying, "Whoa, Yahoo and other such filtering entities! Why did you pick these twenty out of the thousand sites worth looking at? Who put you in charge?" In fact, Yahoo has de-emphasized its navigational tree over the past few years. Filtering goes from friendly to capricious when the top of the value pyramid gets big enough to house not just the Pharaoh but everyone the Pharaoh ever saw.

The last time information hit a new level of abundance, we introduced a new criterion into searching. In the 1980s, we had two: Precision measured whether a search engine turned up results irrelevant to the query, and recall measured whether the engine missed relevant pages. Then, in the '90s, an engine that did well with both precision and recall could still deliver too many results, so relevancy became a third criterion: Did the engine list the most relevant pages first, or did you have to page through 130,000 precisely recalled results to find the one page that answered your question? Search engines got remarkably good remarkably quickly at providing relevant results.
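For the arithmetically inclined, here is a rough sketch of those three criteria in Python. It is only an illustration: the function names, page IDs and scores are made up for the example, and no real search engine is this simple.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a single query.

    precision: what fraction of the returned pages are actually relevant
    recall:    what fraction of the relevant pages were actually returned
    """
    returned = set(returned)
    relevant = set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


def rank_by_relevance(scores):
    """Relevancy: list the highest-scoring pages first.

    `scores` maps a page ID to a hypothetical relevance score.
    """
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical query: the engine returned four pages, three are truly relevant.
returned = ["a", "b", "c", "d"]
relevant = ["a", "b", "e"]
precision, recall = precision_recall(returned, relevant)

ranked = rank_by_relevance({"a": 0.2, "b": 0.9, "c": 0.5})
```

Here two of the four returned pages are relevant and two of the three relevant pages were found, so the engine is half precise and two-thirds recall-ful. Ranking only changes the order you see them in.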

But now that we have an abundance of worthiness, we need another criterion. Perhaps we should take the word popularized by del.icio.us and flickr.com: Interestingness. Interesting pages are the ones that your friends would have emailed you about because they know your tastes, interests and sense of humor. Within the set of search results that are precise, recall-ful and relevant, we want to see the interesting ones first.

Maybe search engines can do this algorithmically, or maybe they have to intersect with social networks because "I think you'll like this" is as personal a statement as, "Gosh, you've really lost a lot of hair since the last time I saw you." Send your friends a series of emails recommending stupid sites and your friends not only will stop reading your email, they'll start questioning their friendship. Filtering is an intimate act.

In an age of an abundance of worthiness, when there are 1,000 good pages relevant to your query about bird watching, we need to take the next evolutionary step beyond precision, recall and relevancy.
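What might that step look like? One guess, and it is only a guess: take the results the engine already considers relevant and move up the ones your friends have bookmarked. In the sketch below, the function name, the page IDs and the `friend_bookmarks` data are all hypothetical, and counting bookmarks is just one crude stand-in for interestingness.

```python
def rerank_by_interestingness(relevant_pages, friend_bookmarks):
    """Reorder already-relevant pages so friend-recommended ones come first.

    relevant_pages:   page IDs in the engine's own relevance order
    friend_bookmarks: maps a friend's name to the set of pages they saved
    """
    def friend_count(page):
        # How many of my friends saved this page?
        return sum(1 for pages in friend_bookmarks.values() if page in pages)

    # sorted() is stable, so pages with the same friend count
    # keep the engine's original relevance order.
    return sorted(relevant_pages, key=friend_count, reverse=True)


relevant_pages = ["birds-a", "birds-b", "birds-c", "birds-d"]
friend_bookmarks = {
    "ann": {"birds-c", "birds-d"},
    "bo": {"birds-c"},
}
interesting_first = rerank_by_interestingness(relevant_pages, friend_bookmarks)
```

The page two friends saved floats to the top, the page one friend saved comes next, and the rest stay in the order the engine gave them.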

Book stuff

Why finishing a book sucks

I decided to go the traditional publishing route with Everything Is Miscellaneous because when it comes to lifetime ambitions, I'm a traditionalist. Rail as I might about the mainstream media, I would still kill a minor celebrity (please let it be Paris Hilton!) to get published in The New Yorker. Also, and not incidentally, us Volvo-driving, Birkenstock-wearing East Coast liberals have to put the tofu and kelp on the table, you know.

So, given that my book will be repurposing trees, here's why it sucks to finish one:

EverythingIsMiscellaneous.com

I have a beta of the book's web site up at www.everythingIsMiscellaneous.com (or www.EImisc.com, to keep your tunnels from getting all carpal). It's genuinely beta, as you'll see if you go there.

Send me your suggestions. Be gentle.

The Word Cloud

AMAZON ARTICLE BASED BASIC BOOKS CARDS CATALOG CATEGORIES CATEGORY CLASSIFICATIONS COURSE CUSTOMER DAY DEWEY DIFFERENT DIGITAL ELEMENTS EXAMPLE EXPLICIT FACT GOOD GROUP HAMLET HAND HUMAN IDEAS IMPLICIT INFORMATION ITEMS KNOW KNOWLEDGE LABEL LEAVE LEVEL LIBRARY LINE LINK LINNAEUS LIST LONG LOOKING MAKES MAP MEAN METADATA MILLION MISCELLANEOUS NATURE NEED NEW NUMBER OBJECTS ORDER ORGANIZATION PAGE PAPER PARTICULAR PEOPLE PERSON PHOTOS PHYSICAL PLACE PLANETS POINT PRODUCT READ REAL RELATIONSHIPS RIGHT SCIENCE SEARCH SECOND SET SINGLE SITE SOCIAL SORT SPECIES START STORE TAGS TELL THERE'S THINGS THINK THOUGHT TIMES TOPICS TREES TURN TYPE UNDERSTAND USERS VALUE WAYS WEB WIKIPEDIA WORD WORK WORLD YEARS

This is a "word cloud" that expresses the words most commonly used in Everything Is Miscellaneous, excluding common words such as "the." The size of the word indicates its relative frequency. Tag clouds work the same way, except, of course, for tags.
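If you want to see the idea in code, here is a minimal sketch in Python. It is not the Visual Basic clouder I wrote; the stop-word list, the point sizes and the function name are assumptions picked for the example.

```python
import re
from collections import Counter

# A tiny, made-up stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}


def word_sizes(text, top_n=100, min_pt=10, max_pt=48):
    """Map the most frequent words to font sizes, in alphabetical order."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    top = counts.most_common(top_n)
    if not top:
        return {}
    highest = top[0][1]
    lowest = top[-1][1]
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts tie
    # Scale each count linearly between the smallest and largest font size.
    return {
        word.upper(): round(min_pt + (n - lowest) / span * (max_pt - min_pt))
        for word, n in sorted(top)
    }


sizes = word_sizes("the tags and the tags and the web of tags")
```

In this toy input "tags" shows up three times and "web" once, so TAGS gets the biggest type and WEB the smallest. Feed it tags instead of words and you have a tag cloud.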

You can create your own word cloud for an online page at SnapShirts.com. But, frankly, the graphic that site generates is ugly. So I wrote my own word clouder (Windows only). If you're a Visual Basic sort of person, I'll be happy to send you a copy so long as you don't expect it to actually work.

A (commercial) model of miscellaneousness

BioMed Central is a commercial publisher of peer-reviewed scientific research that permits open (= free) access to all of its content. In so doing, it happens to exemplify a whole bunch of trends, many of which are associated with "Web 2.0." It is not a voice from the future, describing visions we cannot yet imagine. It's in some ways more valuable than that, for it's an existing business, dealing with the future in practical ways. In it we can see not just where the Web may go, but where it is right now:

BioMed mixes the commercial and non-commercial. As with typical scientific research journals, authors—or, more likely, their sponsoring institutions—pay for the privilege of being included. But, unlike most scientific journals, at BioMed readers don't pay. Why not? Because putting knowledge behind a wall with a slot for dollar bills makes our species stupider. And given the economic differences in the ability to pay, it cuts off too much of the world. So, BioMed's business model incorporates a sense of responsibility to the community, not just to the investors. In fact, the founder, Matthew Cockerill, started it in 1999 after working for Elsevier for two years; Elsevier has traditionally wrung every penny it can from institutions by pricing some subscriptions north of $10,000/year, a fee well beyond what poorer nations can afford. (Elsevier now provides over 1,000 journals for free to developing countries.) Open access is being forced on the scientific publishing industry: Last year, CERN decided to publish the results of the upcoming runs of their new supercollider only in open access journals. As Cockerill put it when I interviewed him a few weeks ago, "CERN is in the position to call the shots."
 
BioMed makes its processes as transparent as possible. Many of its publications and "all the medical journals in the BioMed series" use open peer review, says Cockerill, in which the comments of the reviewers are public and non-anonymous. So are the revisions to the paper. For example, if you wonder why "A population-based study of human immunodeficiency virus in south India reveals major differences from sentinel surveillance-based estimates" is the way it is, you can read the original submission, two reviewers' reports, the resubmission, the reviewer's report, and the third resubmission. What's working at BioMed failed in a trial at Nature, however, which this summer allowed authors of submissions to have their articles posted for public comment. Or, more exactly, Nature's admirable attempt (Nature is the very prototype of an Establishment journal and thus might be expected to be far more resistant to change) got some of the variables wrong. It's hard to know which, but it could be anything from something as small as not allowing pseudonyms to something as large as the fact that comments did not affect the editors' decisions. Nature is treating this as a single failed experiment, not as a proof that the traditional peer review process is sacrosanct.
 
BioMed includes more data than is necessary. It urges scientists to publish their raw data so that others can mine it for knowledge. That data may not be "cleaned up" (the term scientists use when they remove inconvenient anomalies) but it is useful even in its less than perfect condition.

BioMed makes the reliability of its information apparent. When you're providing lots of data, it's important to let readers know what the quality of that information is. Since BioMed is peer-reviewed, the articles have an evident imprimatur of quality. But the backing data is understood to be published as-is.

BioMed spreads authority around. Commenters can comment on the articles, disputing and clarifying, supporting and denouncing, focusing and extending. Compare this to traditional scientific journals at which the only authority belongs to the anonymous reviewers and the editors, and all we are told is that the Deciders have given it a thumbs up.

BioMed does not believe in moments of time. Because paper publishing requires committing ink to absorbent paper, there is a publication date. Before that, the information is not public. After, it is public and pretty much unalterable. But at BioMed, once it's published, it's open for comment. (Pre-pub sites such as arXiv.org are also good examples of this.)

BioMed provides a mix of top-down and bottom-up metadata. It's trying to find the right network of ontologies or controlled vocabularies to make it easier for researchers to find everything that's relevant to their interests; Cockerill says the permitted keywords may be presented to authors through an AJAX-y interface that suggests words as the user types into the form. The site does not provide end-user tagging, but it does "work closely with CiteULike," says Cockerill. "We have some plans to incorporate their tagging within our own journals. We definitely see data coming from users helping to fit articles together." Tagging can of course also be done using an external resource, such as del.icio.us.

BioMed confronts the endless granularity of information. It's trying to get metadata associated not just with articles but with the elements of articles. 

BioMed provides multiple authorities to guide us to worthwhile research. For example, its Faculty of 1000 provides recommendations based on the judgment of over 1,000 "leading scientists." BioMed also does the expected reader-based guidance by publishing a "most viewed" articles list, as well as having a "similar articles" function. (We need tags, dammit! Tags!)

BioMed "intertwingles" its content, spinning a Web of links. It lists other articles that cite a particular BioMed article via Google Scholar, ISI Web of Science and PubMed Central. It lists other articles by the same authors. It creates a permanent record "card" for each article, listing a unique identifier so other articles can reference it unambiguously.

It has lots of feeds. Feeds let users keep up with the latest articles of interest without having to check the BioMed site itself. Feeds distribute information. They are an info diaspora. Many businesses don't like feeds because they keep people from spending time on the site. But such short-sighted sites are short cited.

BioMed experiments. As Clay Shirky points out with his characteristic sharpness in a recent HBR article, the Web is innovation-friendly because it lowers the cost of failure. But institutions that have traditionally relied upon their authority for their value still have difficulty embracing experimentation. Not BioMed. For example, BioMed's Biology Direct says it uses a "novel system of peer review" in which the author is responsible for getting members of the editorial board to review her article. Rather than revising the article, the author can instead publish the comments and suggested revisions as an additional section of the manuscript. Interesting experiment.

BioMed does not pretend it's perfect. Cockerill is quick to say (in an email) that at BioMed, "[S]oundness is paramount, and BioMed Central has proven its commitment to scientific accuracy over the last 6 years." But by showing us the sausage being made, we are simultaneously confronted with our species' fallibility and are given additional reason to trust the final result.

It's not just open; it's generous. Openness implies a passive potential. But openness on the Web is active. It encourages others to take content and make more of it. Generosity has built the Web.

Now, I don't single out BioMed because it is the single greatest site ever, or that it's the answer to science's prayers. There are a number of sites doing open access science in really interesting ways. In fact, it seems the BioMed PR person called me because BioMed wants some of the notice that Public Library of Science is getting and deserves. BioMed is not the most open and most free of all the open access endeavors. It's commercial, which may turn out to be what enables open access science to maintain itself or may be a debilitating weakness, or neither, or both. But the fact that BioMed is just one example—an excellent one—of what's going on is exactly the point.

 

Why do movies suck?

Why are so few movies any good? I don't mean why do so few make money. I mean, why do most of them suck? Why are they incompetently written, directed and/or edited? After all, not that many are made every year, and they're quite expensive, so one would think there would be enough competent creators that a high percentage would be good of their kind. You could be assured that if it's a romantic comedy, it's a good romantic comedy, and if it's an action movie, it's a good action movie.

But, generally they're not. Romantic comedies are usually predictable and not very funny. Action movies are usually directed and edited so badly that you can't tell who's clubbing whom and which car is outracing which car. Why are the movies themselves so often so bad?

Of course, writing, directing and editing may be harder than I think. Or maybe there are systemic issues. But good TV series—from the US version of The Office to House to Dexter to The Sopranos—are able to turn out high quality products week after week with a variety of writers and directors. And they're made under far worse time pressure. Why are movies so much more inconsistent? Why are movies so often so bad?


On a barely related topic: Having watched more than my share of romantic comedies on planes — and I mean watched, not heard — I want to know why the reconciliation that ends so many of them occurs in public spaces such as airports, train stations, and on the steps of public buildings. These are the very places that inhibit intimacy. Is that why?


Changes in the grammar of movies happen rarely, and often in a burst.

We're ready for another change.

And not just the change in content already happening due to our new ability to portray any imaginable visual image via digital construction. Jar Jar Binks-ing films is not the Big Change digital allows. Rather, the going digital of movies lets them adopt a grammar at least as radical as Welles' turning cameras into G-d's eyes.

I don't know what that grammar will be. I'm more or less the opposite of an innovator in film: I mainly go to the movies for the popcorn. But, I bet that (i) the new grammar will give us more information, and (ii) at first it will seem jumbled, gimmicky and intrusive, but then it will be as invisible as a reaction shot or a crane shot. Maybe it'll be a change in the expectation that audiences will sit quietly through a movie. (Judging from our local theaters, that change is well underway.)

 

Middle World Resources

Cool Tool
For the Hyperlinked Organization
 

I always look forward to the latest in the "Hacks" series from O'Reilly Media. Each is packed with tips and tricks to help you get the most out of some piece of technology. Blackberry Hacks, for example, has made my Blackberry far more usable. Some in the series are aimed at developers, and I haven't looked at some of the odder titles, such as Mind Hacks and Baseball Hacks, but the application ones have been well worth the cover price.

What I'm playing

DreamFall: The Longest Journey is a follow-up to the previous Longest Journey adventure, which I played many years ago with one of our children. He now claims to be too old to sit on my lap — it's so cute to see a 16 year old act all grown up! — so I'm playing it myself. And so far, it's been pretty good. On the other hand, at this point The Indigo Prophecy was also good, and then in its final third it became the stupidest game since Pick-Up Sticks: Binge Drinking Edition. DreamFall's graphics are good, the UI is fairly intuitive, and the voice acting is above average. The story is better than in many adventures, although I wouldn't read a book with this plot. The puzzles are generally solvable (and there's an online walkthrough when they're not). There's been a tiny bit of button-mashing fighting, which I totally do not enjoy because I somehow have developed the reflexes of a 56-year-old. But, I've been enjoying the game.

Since writing that, I've finished it and also pretty much completed Devastation Zone Troopers. It's an old-fashioned 3D, third-person shooter that feels more arcade-y than Doom-ish, but it's fun. And it comes from a small games publisher, Manifesto, that charges only $20.

Bogus contest: DOEP

Over at my blog, for no particular reason, I've been running Daily Open-Ended Puzzles. Except not every day. Some of them are just excuses for me to be dumb in public ("Why don't microwaves have heat settings the way toaster ovens do?"), and others are so vague as to be meaningless ("What would it take to make a bee happy?"...the answer to which is, I believe, "100 million years of evolution"). But, if you like the Bogus contest—and judging from the response over the years, I would say that none of you do—you should check for DOEPs on my blog.

So, instead of just trying to fob you off onto the DOEPs, here's this issue's Bogus Contest: Got any DOEPs I can use?


Editorial Lint

JOHO is a free, independent newsletter written and produced by David Weinberger. If you write him with corrections or criticisms, it will probably turn out to have been your fault.

To unsubscribe, send an email to [email protected] with "unsubscribe" in the subject line. If you have more than one email address, you must send the unsubscribe request from the email address you want unsubscribed. In case of difficulty, let me know: [email protected]

There's more information about subscribing, changing your address, etc., at www.hyperorg.com/forms/adminhome.html. In case of confusion, you can always send mail to me at [email protected]. There is no need for harshness or recriminations. Sometimes things just don't work out between people.

Dr. Weinberger is represented by a fiercely aggressive legal team who responds to any provocation with massive litigatory procedures. This notice constitutes fair warning.

Any email sent to JOHO may be published in JOHO and snarkily commented on unless the email explicitly states that it's not for publication.


The Journal of the Hyperlinked Organization is a publication of Evident Marketing, Inc. "The Hyperlinked Organization" is trademarked by Open Text Corp. For information about trademarks owned by Evident Marketing, Inc., please see our Preemptive Trademarks™™ page at http://www.hyperorg.com/misc/trademarks.html

This work is licensed under a Creative Commons License.