December 21, 2009

[2b2k] Struggling with who cares

Yesterday I wrote a little — which will probably turn out to be too much — about the history of fact-finding missions. They’re really quite new, becoming a conspicuous part of international dispute settlement only with the creation of The Hague Convention in 1899. If you do a search on the phrase at the NY Times, you’ll see that there are only intermittent references until the 1920s when suddenly there are lots of them.

It strikes me as odd that we didn’t always have fact-finding missions, which is why I find it interesting. But I don’t think I can convince the reader that it’s interesting, which is why I’ve probably gone on too long about them. (There were obviously previous times when we tried to ascertain facts, but the phrase and the institutionalizing of fact-finding missions or commissions is what’s relatively new.)

Today I’m thinking I really need to shore up the opening section of this first chapter in order to show why the next section (on the history of facts, including fact-finding missions) matters. I think I’ll try to do that by briefly sketching our normal “architecture” of knowledge. For this it’d be good to come up with an easy example. Working on it…


September 6, 2009

Data and metadata: Together again

Terry Jones has an excellent post that lists the problems introduced by maintaining a hard distinction between metadata and data.

Terry cites Everything Is Miscellaneous (thanks, Terry), which argues that the distinction, which is hard-coded in the Age of Databases, becomes a merely functional difference in the Age of Messy Links: Metadata is what you know and data is what you’re looking for. For example, the year of a CD is metadata about the CD if you know the year a Bob Dylan CD came out but you don’t remember the title, and the title can be metadata if you know the title but want to find the year. And in both cases, it could all be metadata in your search for lyrics.
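Here is a minimal sketch of that functional view, in Python. The catalog, the field names, and the `find` helper are all made up for illustration. Nothing in the records is marked as metadata or data; the role each field plays depends only on what you already know and what you are asking for.

```python
# Hypothetical catalog: the field names and this tiny dataset are
# illustrative assumptions, not taken from any real system.
albums = [
    {"artist": "Bob Dylan", "title": "Blood on the Tracks", "year": 1975},
    {"artist": "Bob Dylan", "title": "Desire", "year": 1976},
    {"artist": "Bob Dylan", "title": "Time Out of Mind", "year": 1997},
]


def find(records, want, **known):
    """Return the `want` field of every record that matches all `known` fields.

    Whatever is passed in `known` plays the role of metadata (what you know);
    whatever is named by `want` plays the role of data (what you're looking for).
    """
    return [
        record[want]
        for record in records
        if all(record.get(key) == value for key, value in known.items())
    ]


# Here the year is metadata and the title is data.
titles_from_1975 = find(albums, "title", artist="Bob Dylan", year=1975)

# Here the title is metadata and the year is data.
year_of_desire = find(albums, "year", title="Desire")
```

The same records answer both questions without any change to how they are stored, which is the sense in which the distinction is functional rather than hard-coded.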

This is all very squishy and messy because the distinction is, as Terry says, artificial. It comes from thinking about experience as content that gets processed, as if we worked the way computers do. More exactly, it comes from thinking about experience as a set of Experience Atoms that then have to be assembled; metadata are the labels that tell you that Atom A goes into Atom Z. But experience is far more like language than like particle physics or Ikea assembly instructions. And that’s for a very good reason: linguistic creatures’ experience cannot be understood apart from language. Language doesn’t neatly separate into content and meta-content. It all comes together and it’s all intertwingled. Language is so very non-atomic that it makes atoms realize how lonely they’ve been.

That doesn’t mean that computer software that separates metadata from data is useless. Lord knows I love a good database. But it also means that computer software that can treat anything as metadata depending on what we’re trying to do opens up some interesting possibilities…

August 25, 2009

Wikipedia’s tactical change mistaken for strategic

At the English language version of Wikipedia now, changes to articles about living people won’t be posted until a Wikipedian has reviewed them. Those articles are now moderated. (See Slashdot for details and discussion.)

I am surprised by the media being surprised by this. Wikipedia has a complex set of rules, processes, and roles in place in order to help it achieve its goal of becoming a great encyclopedia. (See Andrew Lih’s The Wikipedia Revolution, and How Wikipedia Works by Phoebe Ayers, Charles Matthews, and Ben Yates, for book-length explanations.) This new change, which seems to me to be a reasonable approach worth a try, is just one more process, not a signal that Wikipedia has failed in its original intent to be completely open and democratic. In effect, edits to this class of articles are simply being reviewed before being posted rather than after.

The new policy is only surprising if you insist on thinking that Wikipedia has failed if it isn’t completely open and free. No, Wikipedia fails if it doesn’t become a great encyclopedia. In my view, Wikipedia has in many of the most important ways succeeded already.

PS: If you think I’ve gotten this wrong, please please let me know, in the comments or at, since I’ll be on KCBS at 2:20pm EDT to be interviewed about this for four minutes.

July 19, 2009

Transparency is the new objectivity

A friend asked me to post an explanation of what I meant when I said at PDF09 that “transparency is the new objectivity.” First, I apologize for the cliché of “x is the new y.” Second, what I meant is that transparency is now fulfilling some of objectivity’s old role in the ecology of knowledge.

Outside of the realm of science, objectivity is discredited these days as anything but an aspiration, and even that aspiration is looking pretty sketchy. The problem with objectivity is that it tries to show what the world looks like from no particular point of view, which is like wondering what something looks like in the dark. Nevertheless, objectivity — even as an unattainable goal — served an important role in how we came to trust information, and in the economics of newspapers in the modern age.

You can see this in newspapers’ early push-back against blogging. We were told that bloggers have agendas, whereas journalists give us objective information. Of course, if you don’t think objectivity is possible, then you think that the claim of objectivity is actually hiding the biases that inevitably are there. That’s what I meant when, during a bloggers press conference at the 2004 Democratic National Convention, I asked Pulitzer-prize winning journalist Walter Mears whom he was supporting for president. He replied (paraphrasing!), “If I tell you, how can you trust what I write?,” to which I replied that if he doesn’t tell us, how can we trust what he blogs?

So, that’s one sense in which transparency is the new objectivity. What we used to believe because we thought the author was objective we now believe because we can see through the author’s writings to the sources and values that brought her to that position. Transparency gives the reader information by which she can undo some of the unintended effects of the ever-present biases. Transparency brings us to reliability the way objectivity used to.

This change is, well, epochal.

Objectivity used to be presented as a stopping point for belief: If the source is objective and well-informed, you have sufficient reason to believe. The objectivity of the reporter is a stopping point for the reader’s inquiry. That was part of high-end newspapers’ claimed value: You can’t believe what you read in a slanted tabloid, but our news is objective, so your inquiry can come to rest here. Credentialing systems had the same basic rhythm: You can stop your quest once you come to a credentialed authority who says, “I got this. You can believe it.” End of story.

We thought that that was how knowledge works, but it turns out that it’s really just how paper works. Transparency prospers in a linked medium, for you can literally see the connections between the final draft’s claims and the ideas that informed it. Paper, on the other hand, sucks at links. You can look up the footnote, but that’s an expensive, time-consuming activity more likely to result in failure than success. So, during the Age of Paper, we got used to the idea that authority comes in the form of a stop sign: You’ve reached a source whose reliability requires no further inquiry.

In the Age of Links, we still use credentials and rely on authorities. Those are indispensable ways of scaling knowledge, that is, letting us know more than any one of us could authenticate on our own. But, increasingly, credentials and authority work best for vouchsafing commoditized knowledge, the stuff that’s settled and not worth arguing about. At the edges of knowledge — in the analysis and contextualization that journalists nowadays tell us is their real value — we want, need, can have, and expect transparency. Transparency puts within the report itself a way for us to see what assumptions and values may have shaped it, and lets us see the arguments that the report resolved one way and not another. Transparency — the embedded ability to see through the published draft — often gives us more reason to believe a report than the claim of objectivity did.

In fact, transparency subsumes objectivity. Anyone who claims objectivity should be willing to back that assertion up by letting us look at sources, disagreements, and the personal assumptions and values supposedly bracketed out of the report.

Objectivity without transparency increasingly will look like arrogance. And then foolishness. Why should we trust what one person — with the best of intentions — insists is true when we instead could have a web of evidence, ideas, and argument?

June 12, 2009

Newsy is meta-newsy

Newsy, a project in collaboration with Univ. of Missouri’s Journalism School, pulls together a half-dozen media reports on a topic, stringing them together with their own reporter-at-a-desk commentary. The sources include mainstream news and less mainstream news. For example, here’s Newsy’s meta-coverage of China’s new Net blockage:

Newsy is a manual curation and production project. At least during this beta phase, it seems to be doing one or two a day, which means they may have more luck getting their stories embedded elsewhere than in drawing a regular crowd to their own site. In fact, the site has announced a syndication deal with Mediacom to provide stories for mid-Missouri cable tv subscribers. (The project is also probably a Fair Use lawsuit magnet, unfortunately.)

June 10, 2009

[newmedia] Journalism panel

Dan Gillmor is not as pessimistic as many others about the future of journalism. We’re in a fertile period of innovation. But we need better audiences. Passive consumers need to be active readers, and this ought to be part of school curricula, starting in pre-school.

Jim VandeHei from Politico agrees with Dan that we’re going to end up with more and better journalism, although he has no idea what it’s going to look like and he thinks that newspapers are in much worse shape than most acknowledge.

Nick Wrenn from CNN says they use social media like Twitter both to engage the audience and as an early warning system.

David Kirkpatrick of Fortune (who’s writing a book about Facebook) is not so sure it’s a great time to go into journalism because the business model isn’t there. “I’m happy I’m getting out of it.” Yet the “number of kids who want to be journalists is astonishingly high.” He makes a few points. First, if Google gets better at its search, its ads become less relevant and valuable, and he thinks Bing is intended to force Google to get better at searching for that reason. Second, the number of minutes spent on Facebook has gone up hugely; it is uniquely influential as a media platform, both as a place where people create content and distribute others’ content.

Dan agrees that the business models aren’t there, but he’s jealous of his students because they get to invent their jobs and invent what journalism will be. Jim thinks that over time, there will be more organizations (like Politico) that can pay journalists. There will be lots of journalism, but just not dominated by the big papers and broadcasters. It’ll be non-profits, startups, etc. Politico makes money out of ads. Over the next six months, Politico will experiment with charging for some specialized content.

Q: Is it time to put the broadsheet out of its misery?
A: Dan: Print won’t shut down quickly because there’s still a whole lot of cash flow. And if you reset the debt via bankruptcies, there’s still profit to be had.
A: CNN: Newsrooms have to figure out how to deal with the changes. It’s amazing that newspapers still report on yesterday’s news.

Q: Who’s going to pay to gather dull but important information at the local level?
A: Dan: The newspapers aren’t gathering it now. No one is. We are going to lose eat-your-spinach journalism. Back when newspapers sent journalists to the boring meetings, the journalists were deterrents to bad behavior. Maybe we should hire circuit forensic accountants to work with journalists…
A: David: But now every member of the school board can be a broadcaster. So, the role of the community newspaper can be different. I am incredibly optimistic about the future of society in terms of info being distributed. But I’m not optimistic about the future of journalism.

[2b2k] Chapter 4 – inappropriately concrete?

Chapter 3 left readers with a problem without resolution. If facts don’t provide as firm a bedrock as we’d thought, then are we left to believe whatever we want? Is there no hope? [Spoiler: No, we’re not free to believe whatever we want.]

Because Chapter 3 was pretty abstract, I want to be sure to address its question in some concrete ways. So, Chapter 4 opens with a brief scene-setter that says that we all love diversity, but when there’s too much, we can’t get anything done. I’m now at the beginning of a section that will give maybe four general rules for “scoping” diversity so that a group has enough internal difference to be smarter than the smartest individual, but not so much that they can’t get past bickering. I plan on following that with a more abstract section that asks whether the Net is making us more open or closed to other people’s ideas. At the moment, I like the idea of beginning with the concrete and moving to the abstract, in large part because I think the abstract question is pretty much impossible to answer.

I can’t tell yet if the chapter structure is going to work. There is so much to say about this topic. And I have a concern that the reader is not expecting the book to take this turn. But I won’t be able to tell that until I have enough distance on the prior chapters to be able to read them with some degree of freshness.


June 9, 2009

[berkman] Lewis Hyde on the Commons

Lewis Hyde is giving a Berkman talk about the book he’s working on. The book is about the ownership of art and ideas, and argues that they should lie in a cultural commons, rather than be treated as property.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Lewis begins by talking about what a commons is. The term comes from medieval property ideas, and Lewis thinks of commons as a kind of property. He asks the group for a definition of property. Suggestions from the audience: “Exclusive rights.” “Anything I can use and have some degree of control over, not necessarily exclusively.” Lewis says that a 1900 dictionary defines property as that over which one has “rights of action.” Property is a bundle of rights of action. Lewis likes this definition because it includes human actors. Blackstone defines property rights in maximalist terms: the right to exclude the entire universe. Scalia also thinks property is the right to exclude. Lewis thinks the right to exclude is one of the bundle, not the whole thing. This is because, he says, he’s interested in commons. (He notes that in medieval times, “common” could be used as a verb. E.g., “a man may commons in the forest.”)

Lewis talks about Hardin’s “The Tragedy of the Commons” essay. In fact, traditionally commons had governance rules to prevent the destruction of the commons’ asset, including the right of exclusion. “Commons were in fact not tragic. They lasted for millennia in Europe. Not tragic because they were rule-governed and stinted.” Why has the phrase “The tragedy of the commons” persisted? In part, because the phrase is catchy. In part because Hardin proposed it during the Cold War and it was taken as showing that common-ism doesn’t work.

There used to be an annual ritual of “beating the bounds,” to guard against any gradual encroachment on the commons. “These were convivial affairs.” Lewis wonders if there are ways we can recover this resistance to encroachment.

Applied to the cultural realm, Lewis thinks cultural products are by nature in a commons. In the 18th century you get the idea that we could own poems, novels, etc. Until then, people thought of property as applying only to land. If something is not excludable, there’s no property in it. Many argued in the 18th century that therefore artistic works can’t be property. (Lewis recommends Terry Fisher’s article on philosophies of property. Terry points to four: labor, moral rights, commercial utilitarianism, and civic utilitarianism.)

The first copyright law was in 1710 (Statute of Anne). By giving authors and publishers rights, it removed the “in perpetuity” of the crown’s monopolistic grants. It also created the public domain by creating a clear limit on the term of ownership: After 14 years, it enters the public domain. It’s as if the commons is the default state, says Lewis.

Jamie Boyle talks about the “second enclosure” in which everything is copyrighted by default and the term is extended. The second enclosure is an enclosure of the mind, says Boyle. Lewis now thinks there might be a third enclosure: The enclosure of wilderness of the mind. Lewis agrees that it makes sense to let the creator of a work, say a novel, get rewarded for it. “I wrote it, so it’s mine.” But, asks Lewis, what does the “I” mean? What is the self? He cites a 12th century Buddhist: “We study the self to forget the self.” To forget the self is to wake up to the world around you. Creativity comes out of self-abnegation. To get to something truly new, you have to have a door open to the unknown. We usually think that the outside of owned property is the public domain. But that’s a domesticated sphere, things we are familiar with. There’s an old tradition that during the period of maturation, you have to leave the known world, go away from where instruction is given, and become familiar with your ignorance. (Lewis says he’s drawing on Thoreau.)

He takes an example from Jonathan Zittrain. When the Apple II came out, there was a spurt in sales because the first spreadsheet emerged, something that had not been expected. If you want a generative Internet, you have to be careful about what you lock down. Another example: In the 1980s, San Diego cell biologists patented a sequence of amino acids. They didn’t know its biological purpose. Ten years later, other researchers think that that sequence blocks blood to tumors. The patent owners sued the researchers. The patent gums up the system. Exploratory science goes into the unknown. “To enclose wilderness means giving property rights in areas where we as yet have no understanding what’s happening.” Lewis adds: “This makes no sense.” Lewis would like us to restore the idea that there are things that are unowned.

Emblematic of the third enclosure is silence. John Cage in 1952 came to Harvard to see/hear a completely soundproofed room. But Cage could hear a low rumbling and high whining. The low rumbling is the sound of your blood and the high whining is the sound of your nervous system. Silence for Cage meant not no sound but non-intention. He composed “4 mins and 33 seconds” which is a stretch of silence. The audience hears the ambient noise. In 2002 a rock group called the Planets put in a minute of silence. As a joke/homage, they credited it to Cage. The royalty-collecting societies started to send checks to Cage’s publisher. The publisher sued for copyright infringement on moral rights grounds (i.e., misattribution). They settled. But Cage held a Buddhist-like view of artistic creation. He tried to remove the self. A lot of copyright law assumes the work contains the imprint of the author’s personality. That’s one of the reasons we give a copyright. But those laws can get in the way of our ability to live in the wilderness, i.e., the third enclosure. How do you become a creator in a world in which scientists can patent unknown sequences and silence can be copyrighted?

Q: Maybe part of the problem in defending the commons is that we say we’re defending freedom, not as in free beer. Fighting for free beer is more compelling than fighting for free speech.
A: Beating the bounds was a fun event. So, yes, people have to want to do this.

Q: [me] How do we counter the fairness argument: If I did it, I ought to get the reward. How do we respond to that?
A: It’s hard to do this in political debate because it’s a long argument. I raise the question of the “I”: To what extent is my contribution really from me? With cultural works, you’re working in a vast sea of existing material. What you create is not entirely yours. Even if it becomes popular and useful, it’s other people who made it so. You can also point to the utilitarian consequences: The public interest is advanced by enabling things to enter the public domain.

Q: [jason] You’re making a creativity defense, i.e., that the commons is generative. But, if we take Cage or Thoreau to heart and say that true creativity consists of transcending the self, could we say that that leads to saying all works should be owned, so that you’re forced to create something new?
A: The puzzle is how much you can actually go to the wilderness. You can face it, but there’s no way to escape the world you come out of. Thoreau has The Iliad with him. There’s no way to escape the known. You always work from materials you’ve collected elsewhere.

Q: [ethanz] What’s so bad about private property? You’re hearkening back to a romantic conception that worked for a very small set of people. We’ve got an enormous amount of development based on increasingly strong enclosure movements. Those movements have given us a great deal of what we love. Despite the first and second enclosures, creativity seems not to have been much hindered. Why should we worry about the third enclosure? Couldn’t we say that you’re attempting to protect and defend something that most of us have not experienced? How do we know that your romantic vision is superior to the world we’re interacting with?
A: I’m not against private property. The question is always where the lines should be drawn. I think we’ve extended the right to exclude too far. Yes, the world is quite creative. But we don’t know what we’re missing. With the enclosing of wilderness, we’re enclosing that which we don’t know about. Researchers are reluctant to do certain kinds of work, for fear of being sued.
Ethan: My diabetes medicine — recombinant DNA — exists because Eli Lilly worked within enclosures. How do we know we would have made the same progress if those enclosures weren’t there?
A: Let’s leave that hanging as a question. It’s a good question. You’re right that the existing dominant system has produced remarkable results.

Q: Michael Heller in The Gridlock Economy goes through the economic models that explain what we lose by locking stuff down. What’s the cultural loss?
A: Lessig and others write books about this…


Meaning-mining Wikipedia

DBpedia extracts information from Wikipedia, building a database that you can query. This isn’t easy because much of the information in Wikipedia is unstructured. On the other hand, there’s an awful lot that’s structured enough so that an algorithm can reliably deduce the semantic content from the language and the layout. For example, the boxed info on bio pages is pretty standardized, so your algorithm can usually assume that the text that follows “Born: ” is a date and not a place name. As the DBpedia site says:

The DBpedia knowledge base currently describes more than 2.6 million things, including at least 213,000 persons, 328,000 places, 57,000 music albums, 36,000 films, 20,000 companies. The knowledge base consists of 274 million pieces of information (RDF triples). It features labels and short abstracts for these things in 30 different languages; 609,000 links to images and 3,150,000 links to external web pages; 4,878,100 external links into other RDF datasets, 415,000 Wikipedia categories, and 75,000 YAGO categories.

Over time, the site will get better and better at extracting info from Wikipedia. And as it does so, it’s building a generalized corpus of query-able knowledge.
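As a rough illustration of the “Born: ” example above, here is a small Python sketch. It is not DBpedia’s extractor: the infobox text is a made-up “Label: value” layout (real Wikipedia infoboxes use template markup), and the predicate names `birthDate` and `deathDate` are placeholders chosen for this example. It turns the labeled lines into subject-predicate-object triples and then matches them against a simple pattern.

```python
import re

# Made-up infobox text in a simplified "Label: value" layout (an assumption;
# real Wikipedia infoboxes are written in template markup).
infobox = """\
Name: Ada Lovelace
Born: 10 December 1815
Died: 27 November 1852
"""

# Accept values shaped like "10 December 1815" and nothing else.
DATE = re.compile(r"^\d{1,2} [A-Z][a-z]+ \d{4}$")

# Hypothetical mapping from infobox labels to predicate names.
LABEL_TO_PREDICATE = {"Born": "birthDate", "Died": "deathDate"}


def extract_triples(subject, text):
    """Turn date-bearing infobox lines into (subject, predicate, object) triples."""
    triples = []
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Label: value" line
        predicate = LABEL_TO_PREDICATE.get(label.strip())
        value = value.strip()
        # Only trust the label when the value really looks like a date.
        if predicate and DATE.match(value):
            triples.append((subject, predicate, value))
    return triples


def query(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


triples = extract_triples("Ada_Lovelace", infobox)
born = query(triples, predicate="birthDate")
```

The date check is what lets the extractor assume that the text after “Born: ” is a date rather than a place name; a line that fails the check is simply skipped instead of being guessed at.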

As of now, the means of querying the knowledge requires some familiarity with building database queries. But, the world has accumulated lots of facility with putting front-ends onto databases. DBpedia is working on something different: accumulating an encyclopedic database, open to all and expressed in the open language of the Semantic Web.

(Via Mirek Sopek.)


June 8, 2009

Next, he dehydrates water

Rob Matthews has printed out and bound Wikipedia’s featured articles, creating a 5,000 page volume.

In case you were wondering, featured articles are articles that get a gold star from Wikipedia – about one in every 1,140 at the moment, for the English language version.

(If Rob hadn’t copyrighted the excellent photos, they’d be popping up in every third slide deck from now on.)

