Joho the Blog » 2b2k

March 28, 2013

[annotation][2b2k] Paolo Ciccarese on the Domeo annotation platform

Paolo Ciccarese begins by reminding us just how vast the scientific literature is. We can’t possibly read everything we should. But “science is social” so we rely on each other, and build on each other’s work. “Everything we do now is connected.”

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Today’s media do provide links, but not enough. Things are so deeply linked. “How do we keep track of it?” How do we communicate with others so that when they read the same paper they get a little bit of our mental model, and see why we found the article interesting?

Paolo’s project — Domeo [twitter:DomeoTool] — is a web app for “producing, browsing, and sharing manual and semi-automatic (structured and unstructured) annotations, using open standards.” Domeo shows you an article and lets you annotate fragments. You can attach a tag or an unstructured comment. The tag can be defined by the user or by a defined ontology. Domeo doesn’t care which ontologies you use, which means you could use it for annotating recipes as well as science articles.

Domeo also enables discussions; it has a threaded messaging facility. You can also run text mining and entity recognition systems (Calais, etc.) that automatically annotate the work with those words, which helps with search, understanding, and curation. This too can be a social process. Domeo lets you keep the annotation private or share it with colleagues, groups, communities, or the Web. Also, Domeo can be extended. In one example, it produces information about experiments that can be put into a database where it can be searched and linked up with other experiments and articles. Another example: “hypothesis management” lets readers add metadata to pick out the assertions and the evidence. (It uses RDF.) You can visualize the network of knowledge.

It supports open APIs for integrating with other systems, including the Neuroscience Information Framework and Drupal. “Domeo is a platform.” It aims at supporting rich source, and will add the ability to follow authors and topics, etc., and to enable mashups.


[annotation][2b2k] Neel Smith: Scholarly annotation + Homer

Neel Smith of Holy Cross is talking about the Homer Multitext project, a “long term project to represent the transmission of the Iliad in digital form.”

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He shows the oldest extant ms of the Iliad, which includes 10th century notes. “The medieval scribes create a wonderful hypermedia” work.

“Scholarly annotation starts with citation.” He says we have a good standard: URNs, which can point to, for example, an ISBN number. His project uses URNs to refer to texts in a FRBR-like hierarchy [works at various levels of abstraction]. These are semantically rich and machine-actionable. You can google a URN and get the object. You can put a URN into a URL for direct Web access. You can embed an image into a Web page via its URN [using a service, I believe].
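
[As a rough illustration of putting a URN into a URL, here is a minimal Python sketch. The ISBN-style URN and the resolver address `https://example.org/resolve/` are placeholders invented for this example; the talk did not give the project's actual identifiers or service address.]

```python
from urllib.parse import quote, unquote

# Hypothetical resolver endpoint (assumption; not the project's real service).
RESOLVER_BASE = "https://example.org/resolve/"

def urn_to_url(urn: str) -> str:
    """Embed a URN in a URL path so it can be fetched over the Web."""
    if not urn.lower().startswith("urn:"):
        raise ValueError(f"not a URN: {urn!r}")
    # Percent-encode everything, including ':', so the URN stays one path segment.
    return RESOLVER_BASE + quote(urn, safe="")

def url_to_urn(url: str) -> str:
    """Recover the URN from a URL built by urn_to_url."""
    return unquote(url[len(RESOLVER_BASE):])

example_urn = "urn:isbn:0451450523"  # illustrative ISBN-style URN
example_url = urn_to_url(example_urn)
```

The point of the design is that the URN stays the stable citation, while the URL is just one disposable way of reaching it.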

An annotation is an association. In a scholarly annotation, it’s associated with a citable entity. [He shows some great examples of the possibilities of cross linking and associating.]

The metadata is expressed as RDF triples. Within the Homer project, they’re inductively building up a schema of the complete graph [network of connections]. For end users, this means you can see everything associated with a particular URN. Building a facsimile browser, for example, becomes straightforward, mainly requiring the application of XSL and CSS to style it.
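
[To make “see everything associated with a particular URN” concrete, here is a small Python sketch that treats the graph as a plain list of subject-predicate-object triples. The URNs and predicate names are made up for illustration and are not the Homer Multitext project's actual vocabulary; a real system would presumably use an RDF store rather than a list.]

```python
# Each triple is (subject, predicate, object). All URNs and predicate
# names below are invented for illustration.
TRIPLES = [
    ("urn:example:page:12r", "illustratedBy", "urn:example:image:0042"),
    ("urn:example:page:12r", "hasTextPassage", "urn:example:iliad:1.1-1.25"),
    ("urn:example:scholion:7", "commentsOn", "urn:example:iliad:1.1-1.25"),
]

def associated_with(urn, triples):
    """Return (predicate, other_end, direction) for every triple touching urn."""
    links = []
    for subject, predicate, obj in triples:
        if subject == urn:
            links.append((predicate, obj, "outgoing"))
        elif obj == urn:
            links.append((predicate, subject, "incoming"))
    return links

page_links = associated_with("urn:example:page:12r", TRIPLES)
passage_links = associated_with("urn:example:iliad:1.1-1.25", TRIPLES)
```

Here `page_links` holds the image and the text passage tied to the page, while `passage_links` holds the page and the scholion that point at the passage. A facsimile browser would then mostly be a matter of styling those results.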

Another example: Mise en page: automated layout analysis. This in-progress project analyzes the layout of annotation info on the Homeric pages.


[annotations][2b2k] Rob Sanderson on annotating digitized medieval manuscripts

Rob Sanderson [twitter:@azaroth42] of Los Alamos is talking about annotating Medieval manuscripts.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He says many Medieval manuscripts are being digitized. The Mellon Foundation is funding many such projects. But these have tended to reinvent the same tech, and have not been designed for interoperability with other projects. So the Digital Medieval Initiative was founded, with a long list of prestigious partners. They thought about what they’d like: distributed, linked data, interoperable, etc. For this they need a shared description format.

The traditional approach is to annotate an image of a page. But it can be very difficult to know which images to annotate; he gives as an example a page that has fold-outs. “The naive assumption is that an image equals a page.” But there may be fragments, or only portions of the page have been digitized (e.g., the illuminations), etc. There may be multiple images on a page, revealed by multi-spectral imaging. There may be multiple orientations of the page, etc.

The solution? The canvas paradigm. A canvas is an empty space corresponding to the rectangle (or whatever) of the page. You allow rich resources to be associated with it, and allow users to comment. For this, they use Open Annotation. You can specify a choice of images. You can associate text with an area of the canvas. There are lots of different ways to visualize those comments: overlays, side-by-side, etc.
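
[A toy Python sketch of the canvas idea, under my own simplifying assumptions: the class names, the `kind` labels, and the file names are hypothetical, and the real SharedCanvas model expresses this with Open Annotation rather than with classes like these. The point is only that images, text, and comments all hang off regions of an abstract canvas, not off any one image.]

```python
from dataclasses import dataclass, field

@dataclass
class Region:
    """A rectangle on the canvas, in canvas coordinates."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)

@dataclass
class Annotation:
    body: str       # an image file, a transcription, a comment...
    kind: str       # hypothetical labels: "image", "text", "comment"
    target: Region  # the part of the canvas this body is about

@dataclass
class Canvas:
    """An empty space standing in for the page; content is attached to it."""
    width: int
    height: int
    annotations: list = field(default_factory=list)

    def annotate(self, body: str, kind: str, region: Region) -> None:
        self.annotations.append(Annotation(body, kind, region))

    def at_point(self, px: int, py: int) -> list:
        """Everything associated with a given point on the page."""
        return [a for a in self.annotations if a.target.contains(px, py)]

page = Canvas(width=1000, height=1500)
page.annotate("full-page-scan.jpg", "image", Region(0, 0, 1000, 1500))
page.annotate("illumination-detail.jpg", "image", Region(100, 100, 300, 300))
page.annotate("Transcribed line of text", "text", Region(100, 500, 800, 40))
hits = page.at_point(150, 150)
```

Because two images can target overlapping regions, a viewer built on this can offer a choice of images for the same part of the page.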

You can build hybrid pages. For example, an old scan might have a new color scan of its illustrations pointing at it. Or you could have a recorded performance of a piece of music pointing at the musical notation.

In summary, the SharedCanvas model uses open standards (HTML 5, Open Annotation, TEI, etc.) and can be implemented in a distributed fashion across repositories, encouraging engagement by domain experts.


[annotation][2b2k] Philip Desenne

I’m at a workshop on annotation at Harvard. Philip Desenne is giving one of the keynotes.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

We’re here to talk about the Web 3.0, Phil says — making the Web more fully semantic.

Phil says that we need to re-write the definition of annotation. We should be talking about hyper-nota: digital media-rich annotations. Annotations are important, he says. Try to imagine social networks without the ratings, stars, comments, etc. Annotations also spawn new scholarship.

The new digital annotation paradigm is the gateway to Web 3.0: connecting knowledge through a common semantic language. There are many annotation tools out there. “All are very good in their own media…But none of them share a common model to interoperate.” That’s what we’re going to work on today. “The Open Annotation Framework” is the new digital paradigm. But it’s not a simple model because it’s a complex framework. Phil shows a pyramid: Create / Search / Seek patterns / Analyze / Publish / Share. [Each of these has multiple terms and ideas that I didn't have time to type out.]

Of course we need to abide by open standards. He points to W3C, Open Source and Creative Commons. And annotations need to include multimedia notes. We need to be able to see annotations relating to one another, building networks across the globe. [Knowledge networks FTW!] Hierarchies of meaning allow for richer connections. We can analyze text and other media and connect that metadata. We can look across regional and cultural patterns. We can publish, share and collaborate. All if we have a standard framework.

For this to happen we need a standardized referencing system for segments or fragments of a work. We also need to be able to export them into standard formats such as XML TEI.

Lots of work has been done on this: RDF Models and Ontologies, the Open Annotation Community Group, the Open Annotation Model. “The Open Annotation Model is the common language.”

If we don’t adopt standards for annotation we’ll have disassociated, stagnant info. We’ll have decreased innovation in research, teaching, and learning. This is especially an issue when one thinks about MOOCs — a course with 150,000 students creating annotations.

Connective Collective Knowledge has existed for millennia, he says. As far back as Aristarchus, marginalia had symbols to allow pointing to different scrolls in the Library of Alexandria. Where are the connected collective knowledge systems today? Who is networking the commentaries on digital works? “Shouldn’t this be the mission of the 21st century library?”

Harvard has a portal for info about annotations: annotations.harvard.edu


March 6, 2013

[2b2k] Cliff Lynch on preserving the ever-expanding scholarly record

Cliff Lynch is giving a talk this morning to the extended Harvard Library community on information stewardship. Cliff leads the Coalition for Networked Information, a project of the Association of Research Libraries and Educause, that is “concerned with the intelligent uses of information technology and networked information to enhance scholarship and intellectual life.” Cliff is helping the Harvard Library with the formulation of a set of information stewardship principles. Originally he was working with IT and the Harvard Library on principles, services, and initial projects related to digital information management. Given that his draft set of principles is broader than digital asset management, Cliff has been asked to address the larger community (says Mary Lee Kennedy).

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Cliff begins by saying that the principles he’s drafted are for discussion; how they apply to any particular institution is always a policy issue, with resource implications, that needs to be discussed. He says he’ll walk us through these principles, beginning with some concepts that underpin them.

When it comes to information stewardship, “university community” should include grad students whose research materials the university supports and maintains. Undergrads, too, to some extent. The presence of a medical school here also extends and smudges the boundaries.

Cliff then raises the policy question of the relation of the alumni to the university. There are practical reasons to keep the alumni involved, but particularly for grads of the professional schools, access to materials can be crucial.

He says he uses “scholarly record” for human-created things that convey scholarly ideas across time and space: books, journals, audio, web sites, etc. “This is getting more complicated and more diverse as time goes on.” E.g., author’s software can be part of that record. And there is a growing set of data, experimental records, etc., that are becoming part of the scholarly record.

Research libraries need to be concerned about things that support scholarship but are not usually considered part of the historical record. E.g., newspapers, popular novels, movies. These give insight into the scholarly work. There are also datasets that are part of the evidentiary record, e.g., data about the Earth gathered from sensors. “It’s so hard to figure out when enough is enough.” But as more of it goes digital, it requires new strategies for acquisition, curation and access. “What are the analogs of historical newspapers for the 21st century?” he asks. They are likely to be databases from corporations that may merge and die and that have “variable and often haphazard policies about how they maintain those databases.” We need to be thinking about how to ensure that data’s continued availability.

Provision of access: Part of that is being able to discover things. This shouldn’t require knowing which Harvard-specific access mechanism to come to. “We need to take a broad view of access” so that things can be found through the “key discovery mechanisms of the day,” beyond the institution’s. (He namechecks the Digital Public Library of America.)

And access isn’t just for “the relatively low-bandwidth human reader.” [APIs, platforms and linked data, etc., I assume.]

Maintaining a record of the scholarly work that the community does is a core mission of the university. So, he says, in his report he’s used the vocabulary of obligation; that is for discussion.

The 5 principles

1. The scholarly output of the community should be captured, preserved, organized, and made accessible. This should include the evidence that underlies that output. E.g., the experimental data that underlies a paper should be preserved. This takes us beyond digital data to things like specimens and cell lines, and requires including museums and other partners. (Congress is beginning to delve into this, Cliff notes, especially with regard to preserving the evidence that enables experiments to be replicated.)

The university is not alone in addressing these needs.

2. A university has the obligation to provide its community with the best possible access to the overall scholarly record. This is something to be done in partnership with research libraries around the world. But Harvard has a “leadership role to play.”

Here we need to think about providing alumni with continued access to the scholarly record. We train students and then send them out into the world and cut off their access. “In many cases, they’re just out of luck. There seems to be something really wrong there.”

Beyond the scholarly record, there are issues about providing access to the cultural record and sources. No institution alone can do this. “There’s a rich set of partnerships” to be formed. It used to be easier to get that cultural record by buying it from book jobbers, DVD suppliers, etc. Now it’s data with differing license terms and subscription limitations. A lot of it’s out on the public Web. “We’re all hoping that the Internet Archive will do a good job,” but most of our institutions of higher learning aren’t contributing to that effort. Some research libraries are creating interesting partnerships with faculty, collecting particular parts of the Web in support of particular research interests. “Those are signposts toward a future where the engagement to collect and preserve the cultural records scholars need is going to get much more complex” and require much more positive outreach by libraries, and much more discussion with the community (and the faculty in particular) about which elements are going to be important to preserve.

“Absolutely the desirable thing is to share these collections broadly,” as broadly as possible.

3. “The time has come to recognize that good stewardship means creating digital records of physical objects” in order to preserve them and make them accessible. They should be stored away from the physical objects.

4. A lot goes on here in addition to faculty research. People come through putting on performances, talks, colloquia. “You need a strategy to preserve these and get them out there.”

“The stakes are getting much higher” when it comes to archives. The materials are not just papers and graphs. They include old computers and storage materials, “a microcosm of all of the horrible consumer recording technology of the 20th century,” e.g., 8mm film, Sony Betamax, etc.

We also need to think about what to archive of the classroom. We don’t have to capture every calculus discussion section, but you want to get enough to give a sense of what went on in the courses. The documentation of teaching and learning is undergoing a tremendous change. The new classroom tech and MOOCs are creating lots of data, much of it personally identifiable. “Most institutions have little or no policies around who gets to see it, how long they keep it, what sort of informed consent they need from students.” It’s important data and very sensitive data. Policy and stewardship discussions are needed. There are also record management issues.

5. We know that scholarly communication is…being transformed (not as fast as some of us would like; online scientific journals often look like paper versions) by the affordances of digital technology. “Create an ongoing partnership with the community and with other institutions to extend and broaden the way scholarly communication happens.” The institutional role is terribly important in this. We need to find the balance between innovation and sustainability.

Q&A

Q: Providing alumni with remote access is expensive. Harvard has about 100,000 living alumni, which includes people who spent one semester here. What sort of obligation does a university have to someone who, for example, spent a single semester here?

A: It’s something to be worked out. You can define alumnus as someone who has gotten a degree. You may ask for a co-payment. At some institutions, active members of the alumni association get some level of access. Also, grads of different schools may get access to different materials. Also, the most expensive items are typically those for which there is a commercial market. For example, professional grade resources for the financial industry probably won’t allow licensing to alumni because it would cannibalize their market. On the other hand, it’s probably not expensive to make JSTOR available to alumni.

Q: [Robert Darnton] Very helpful. We’re working on all 5 principles at Harvard. But there is a fundamental problem: we have to advance simultaneously on the digital and analog fronts. More printed books are published each year, and the output of the digital increases even faster. The pressures on our budget are enormous. What do you recommend as a strategy? And do you think Harvard has a special responsibility since our library is so much bigger, except for the Library of Congress? Smaller libraries can rely on Hathi etc. to acquire works.

A: “Those are really tough questions.” [audience laughs] It’s a large task but a finite one. Calculating how much money would take an institution how far “is a really good opportunity for fund raising.” Put in place measures that talk about the percentage of the collection that’s available, rather than a raw number of images. But, we are in a bad situation: continuing growth of traditional media (e.g., books), enormous expansion of digital resources. “My sense is…that for Harvard to be able to navigate this, it’s going to have to get more interdependent with other research libraries.” It’s ironic, because Harvard has been willing to shoulder enormous responsibility, and so has become a resource for other libraries. “It’s made life easier for a lot of the other research libraries” because they know Harvard will cover around the margins. “I’m afraid you may have to do that a little more for your scholars, and we are going to see more interdependence in the system. It’s unavoidable given the scope of the challenge.” “You need to be able to demonstrate that by becoming more interdependent, you’re getting more back than you’re giving up.” It’s a hard core problem, and “the institutional traditions make the challenge here unique.”


February 23, 2013

[2b2k] Why it’s ok to get your news through people who share your beliefs

I was steeling myself a couple of days ago to say something in a talk that I believe but don’t want to: We shouldn’t feel guilty about relying on sources with whom we agree to contextualize breaking news. It’s ok. It’s even rational.

For example, if the Supreme Court hands down a ruling I don’t understand, or the FCC issues a policy that sounds like gobbledygook to my ears, I turn to sites whose politics I basically agree with. On the one hand, I know that that’s wrong on echo chamber grounds: I’m getting reconfirmed in beliefs that I instead should be challenging. On the other hand, if I want to understand a new finding in evolutionary biology I’m not going to go to a creationist site, and if I want to understand the implications of a change in Obamacare, I’m not going to go to a Tea Party site. [Hint: I'm a liberal.] Oh, I might go afterwards to see what Those Folks are thinking, but to understand something, I’m going to go first to people with whom I basically agree.

Unfortunately, saying that in my talk meant I’d have to acknowledge that if I can go to, say, DailyKos for primary contextualization, then it’s fine for right-wingers to go to Fox News. Then I was going to have to explain how Fox and DailyKos are not truly equivalent, since Kos acknowledges facts that are unpleasant for their beliefs, and because Kos allows lots and lots of community participation. But that’s a distraction: If it’s ok for me to go to a lefty site to contextualize my news, it’s ok for you to go to your righty site. That feels wrong to me, and not only because I think right sites are wrong.

I finally realized that I’m using the wrong sort of sites for my example. I do feel queasy about recommending that people get news interpreted for them by going to sites that operate in the broadcast mode. Fox News is like that. So are Slate and Salon, although to a lesser extent because they allow comments and because they present themselves as opinion sites, not news sites. Kos much less so because of the prominence of blogs and community. But I have no bad feelings whatsoever about taking my questions about the news to my social networks.

Because I’m old, much of my social networking occurs on mailing lists. Some of the lists are based on topic, and contain people who broadly agree, but who disagree about most of the particulars; that’s what conversations are for. For example, a couple of the lists I’m on this morning are talking about what it would mean if Tom Wheeler [someone give that man a Wikipedia page!] were appointed as Chair of the FCC as seems increasingly likely. Tom comes out of the cable TV industry, which raises suspicions on my side of the swimming pool. So there has been an active set of discussions on my mailing lists among people who know much more than I do. The opinions range from he’s likely to be relatively centrist (although veering to the wrong side, where “wrong” is generally agreed upon by the list) to he’s never once stood up for users or for increasing competition and openness. Along the way, people have pointed out the occasional good point about him, although overall the tenor is negative and depressed.

Now, do I need to hear from the cable and telecoms industry about what a wonderful choice Tom would be? Sure, at some point. I even need to have my more fundamental views challenged. At some point. But not when I’m trying to find out about who this Tom Wheeler guy is. If we take understanding as a tool used for a purpose, it becomes a wildly inefficient tool — a hammer that’s all handle — if we have to go back to first principles in order to understand anything. Understanding is an efficient tool because it’s incremental: Given that I favor a wildly open Internet and given that I favor achieving this via vigorous competition, then what should I make of a Tom Wheeler FCC chairmanship? That’s my question this morning, not whether a wildly open Internet is a good thing and not whether the best way to achieve this is by increasing competition. Those are fine questions for another morning, but if I have to ask those questions every time I hear something about the FCC, then understanding has failed at its job.

So, I don’t feel bad about consulting my social network for help understanding the news.

And now, like the fine print in an offer that’s too good to be true, here are the caveats: My social networks may not be typical. Some types of news need more fundamental challenge than others. Reliance exclusively on social networks for news may put you into an impenetrable filter bubble. I acknowledge the risks, but given the situatedness of understanding, every act of interpretation is risky.

And yet there is something right in what I’m saying. I know this because going to “opposition” sites to understand the meaning of particular FCC appointments would require me to uncertainly translate out of their own unstated assumptions, and sites that try for objectivity don’t have the nuanced conversations enabled by shared, unstated assumptions. So, there is something right in what I’m saying, as well as risk and wrongness.


February 15, 2013

[2b2k] Data, facts, and the comfort of decisions

Just a quick note updating my post yesterday about the musky Tesla-Times affair. [I'm in an airport with just a few minutes before boarding.]

Times Man John Broder has posted his step-by-step rebuttal-explanation-apologia of Elon Musk's data-driven accusations that Broder purposefully drove a Tesla S into a full stop. Looked at purely as a drama of argument, it just gets more and more fascinating. But it is of course not merely a drama or an example; reputations of people are at stake, and reputations determine careers and livelihoods.

Broder's overall defense is that he was on the phone with Tesla support at most of the turning points, and followed instructions scrupulously. As a result, just about every dimension of this story is now in play and in question: Were the data accurate or did Broder misremember turning on cruise control? Were the initial conditions accounted for (e.g., different size wheels)? Were the calculations based on that data accurate, or are the Tesla algorithms off when the weather is cold? Does being a first-time driver count as a normal instance? Does being 100% reliant on the judgment of support technicians make a test optimal or atypical? Should Broder have relied on what the instruments in the car said or what Support told him? If a charging pump is in a service area but no one sees it, does it exist?

And then there's the next level. We humans live with this sort of uncertainty — multi-certainty? — all the time. It's mainly what we talk about when given a chance. For most of us, it's idle chatter — you get to rail against the NY Times, I get to write about data and knowledge, and Tesla car owners get to pronounce in high dudgeon. Fun for all. But John Broder's boss is going to have to decide how to respond. It's quite likely that that decision is going to reflect the murky epistemology of the situation. Evidence will be weighed and announced to be probabilistic. Policy guidelines will be consulted. Ultimately the decision is likely to be pegged to a single point of policy, phrased as something like, "In order to maintain the NYT's reputation against even unlikely accusations, we have decided to ..." or "Because our reviewer followed every instruction given him by Tesla..." Or some such; I'm not trying to predict the actual decision, but only that it will prioritize one principle from among dozens of possibilities.

Thus, as is usually the case, the decision will force a false sense of closure. It will pick one principle, and over time, the decision will push an even grosser simplification, for people will remember which way the bit flipped — fired, suspended, backed fully, whatever — but not the principle, not the doubt, not the unredeemable uncertainty. This case will become yet one more example of something simple, masking the fathomless complexity revealed even by a single review of a car.

That complexity is now permanently captured in the web of blue underlined text. We can always revisit it. But, we won't, because the matter was decided, and decisions betray complexity.

[Damn. Wish I had time to re-read this before posting! Forgive typos, thinkos, etc.?]


February 14, 2013

[2b2k] The public ombudsman (or Facts don’t work the way we want)

I don’t care about expensive electric sports cars, but I’m fascinated by the dustup between Elon Musk and the New York Times.

On Sunday, the Times ran an article by John Broder on driving the Tesla S, an all-electric car made by Musk’s company, Tesla. The article was titled “Stalled Out on Tesla’s Electric Highway,” which captured the point quite concisely.

Musk on Wednesday in a post on the Tesla site contested Broder’s account, and revealed that every car Tesla lends to a reviewer has its telemetry recorders set to 11. Thus, Musk had the data that proved that Broder was driving in a way that could have no conceivable purpose except to make the Tesla S perform below spec: Broder drove faster than he claimed, drove circles in a parking lot for a while, and didn’t recharge the car to full capacity.

Boom! Broder was caught red-handed, and it was data that brung him down. The only two questions left were why did Broder set out to tank the Tesla, and would it take hours or days for him to be fired?

Except…

Rebecca Greenfield at Atlantic Wire took a close look at the data — at least at the charts and maps that express the data — and evaluated how well they support each of Musk’s claims. Overall, not so much. The car’s logs do seem to contradict Broder’s claim to have used cruise control. But the mystery of why Broder drove in circles in a parking lot seems to have a reasonable explanation: he was trying to find exactly where the charging station was in the service center.

But we’re not done. Commenters on the Atlantic piece have both taken it to task and provided some explanatory hypotheses. Greenfield has interpolated some of the more helpful ones, as well as updating her piece with testimony from the tow-truck driver, and more.

But we’re still not done. Margaret Sullivan [twitter:sulliview], the NYT “public editor” — a new take on what in the 1960s we started calling “ombudspeople” (although actually in the ’60s we called them “ombudsmen”) — has jumped into the fray with a blog post that I admire. She’s acting like a responsible adult by withholding judgment, and she’s acting like a responsible webby adult by talking to us even before all the results are in, acknowledging what she doesn’t know. She’s also been using social media to discuss the topic, and even to try to get Musk to return her calls.

Now, this whole affair is both typical and remarkable:

It’s a confusing mix of assertions and hypotheses, many of which are dependent on what one would like the narrative to be. You’re up for some Big Newspaper Schadenfreude? Then John Broder was out to do dirt to Tesla for some reason your own narrative can supply. You want to believe that old dinosaurs like the NYT are behind the curve in grasping the power of ubiquitous data? Yup, you can do that narrative, too. You think Elon Musk is a thin-skinned capitalist who’s willing to destroy a man’s reputation in order to protect the Tesla brand? Yup. Or substitute “idealist” or “world-saving environmentally-aware genius,” and, yup, you can have that narrative too.

Not all of these narratives are equally supported by the data, of course — assuming you trust the data, which you may not if your narrative is strong enough. Data signals but never captures intention: Was Broder driving around the parking lot to run down the battery or to find a charging station? Nevertheless, the data do tell us how many miles Broder drove (apparently just about the amount that he said) and do nail down (except under the most bizarre conspiracy theories) the actual route. Responsible adults like you and me are going to accept the data and try to form the story that “makes the most sense” around them, a story that likely is going to avoid attributing evil motives to John Broder and evil conspiratorial actions to the NYT.

But the data are not going to settle the hash. In fact, we already have the relevant numbers (er, probably) and yet we’re still arguing. Musk produced the numbers thinking that they’d bring us to accept his account. Greenfield went through those numbers and gave us a different account. The commenters on Greenfield’s post are arguing yet more, sometimes casting new light on what the data mean. We’re not even close to done with this, because it turns out that facts mean less than we’d thought and do a far worse job of settling matters than we’d hoped.

That’s depressing. As always, I am not saying there are no facts, nor that they don’t matter. I’m just reporting empirically that facts don’t settle arguments the way we were told they would. Yet there is something profoundly wonderful and even hopeful about this case that is so typical and so remarkable.

Margaret Sullivan’s job is difficult in the best of circumstances. But before the Web, it must have been so much more terrifying. She would have been the single point of inquiry as the Times tried to assess a situation in which it has deep, strong vested interests. She would have interviewed Broder and Musk. She would have tried to find someone at the NYT or externally to go over the data Musk supplied. She would have pronounced as fairly as she could. But it would have all been on her. That’s bad not just for the person who occupies that position, it’s a bad way to get at the truth. But it was the best we could do. In fact, most of the purpose of the public editor/ombudsperson position before the Web was simply to reassure us that the Times does not think it’s above reproach.

Now every day we can see just how inadequate any single investigator is for any issue that involves human intentions, especially when money and reputations are at stake. We know this for sure because we can see what an inquiry looks like when it’s done in public and at scale. Of course lots of people who don’t even know that they’re grinding axes say all sorts of mean and stupid things on the Web. But there are also conversations that bring to bear specialized expertise and unusual perspectives, that let us turn the matter over in our hands, hold it up to the light, shake it to hear the peculiar rattle it makes, roll it on the floor to gauge its wobble, sniff at it, and run it through sophisticated equipment perhaps used for other purposes. We do this in public — I applaud Sullivan’s call for Musk to open source the data — and in response to one another.

Our old idea was that the thoroughness of an investigation would lead us to a conclusion. Sadly, it often does not. We are likely to disagree about what went on in Broder’s review, and how well the Tesla S actually performed. But we are smarter in our differences than we ever could be when truth was a lonelier affair. The intelligence isn’t in a single conclusion that we all come to — if only — but in the linked network of views from everywhere.

There is a frustrating beauty in the way that knowledge scales.


February 12, 2013

[2b2k] Margaret Sullivan on Objectivity

Margaret Sullivan [twitter:Sulliview] is the public editor of the New York Times. She’s giving a lunchtime talk at the Harvard Shorenstein Center [twitter:ShorensteinCtr]. Her topic is: how is social media changing journalism? She says she’s open to any other topic during the Q&A as well.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Margaret says she’s going to talk about Tom Kent, the standards editor for the Associated Press, and Jay Rosen [twitter:jayrosen_nyu]. She begins by saying she respects them both. [Disclosure: Jay is a friend] She cites Tom [which I'm only getting roughly]: At heart, objective journalism sets out to establish the facts, state the range of opinions, and take a first cut at which arguments are the most rigorous. Journalists should show their commitment to balance by keeping their opinions to themselves. Tom wrote a memo to his staff (leaked to Romenesko) about expressing personal opinions on social networks. [Margaret wrote an excellent column about this a month ago.]

Jay Rosen, she says, thinks that objectivity is an outdated concept. Journalists should tell their readers where they’re coming from so you can judge their output based on that. “The grounds for trust are slowly shifting. The view from nowhere is getting harder to trust, and ‘here’s where I’m coming from’ is becoming more trustworthy.” [approx] Objectivity is a cop out, says Jay.

Margaret says that these are the two poles, although both are very reasonable people.

Now she’s going to look at two real situations. The NYT Jerusalem bureau chief Jodi Rudoren is relatively new. It is one of the most difficult positions. Within a few weeks she had sent some “twitter messages” (NYT won’t allow the word “tweets,” she says, although when I tweeted this, some people disagreed; Alex Jones and Margaret bantered about this, so she was pretty clear about the policy). She was criticized for whom she praised in the tweets, e.g., Peter Beinart. She also linked without comment to a pro-Hezbollah newspaper. The NYT had an editor “work with her” on her social media; that is, she no longer had free access to those media. Margaret notes that many believe “this is against the entire ethos of social media. If you’re going to be on social media, you don’t want a NYT editor sitting next to you.”

The early reporting from Newtown was “pretty bad” across the entire media, she says. In the first few hours, a shooter was named — Ryan Lanza — and a Facebook photo of him was shown. But it was the wrong Ryan Lanza. And then it turned out it was that other Ryan Lanza’s brother. The NYT in its early Web reporting said “according to early Web reports” the shooter was Ryan Lanza. Lots of other wrong information was floated, and got into early Web reports (although generally not into the NYT). “Social media was a double edged sword because it perpetuated these inaccuracies and then worked to correct them.” It often happens that way, she says.

So, where’s the right place to be on the spectrum between Tom and Jay? “It’s no longer possible to be completely faceless. Journalists are on social media. They’re honing their personal brands. Their newspapers are there…They’re trying to use the Web to get their message out, and in that process they’re exposing who they are. Is that a bad thing? Is it a bad thing for us to know what a political reporter’s politics are? I don’t think that question is easily answerable now. I come down a little closer to where Tom Kent is. I think that it makes a lot of sense for hard news reporters … for the White House reporter, I think it makes a lot of sense to keep their politics under wraps. I don’t see how it helps for people to be prejudging and distrusting them because ‘You’re in the tank for so-and-so.’” Phil Corbett, the standards editor for the NYT, rejects the idea there is no impartial journalism. He rejects that it’s a pretense or charade.

Margaret says, “The one thing I’m very sure of is that this business of impartiality and balance should no longer mean” going down the middle in a he-said-she-said. That’s false equivalence. “That’s changing and should change.” There are facts that we fully believe are true. Evolution and Creationism are not equivalents.

Q&A

Q: Alex Jones: It used to be that the NYT wouldn’t let you cite an anonymous negative comment, along the lines of “This or that person sucks.”

A: Everyone agrees doing so is bad, but I still see it from time to time.

Q: Alex Jones: The NYT policy used to be that you must avoid an appearance of conflict of interest. E.g., a reporter’s son was in the Israeli Army. Should that reporter be forbidden from covering Israel?

A: When Ethan Bronner went to cover Israel, his son wasn’t in the military. But then his son decided to go join up. “It certainly wasn’t ideal.” Should Ethan have been yanked out the moment his son joined? I’m not sure, Margaret says. It’s certainly problematic. I don’t know the answer.

Q: Objectivity doesn’t always draw a clear line. How do you engage with people whose ideas are diametrically opposed to yours?

A: Some issues are extremely difficult and you’re probably not going to come to a meeting of the minds on it. Be respectful. Accept that you’re not going to make much headway.

Q: Wouldn’t transparency fragment the sources? People will only listen to sources that agree.

A: Yes, this further fractures a fractured environment. It’s useful to have some news sources that set out to be in neither camp. The DC bureau chief of the NYT knows a lot about economics. For him to tell us about his views on that is helpful, but it doesn’t help to know who he voted for.

Q: [Martin Nisenholtz] The NYT audience is smart but it hasn’t lit up the NYT web site. Do you think the NYT should be a place where people can freely offer their opinions/reviews even if they’re biased? E.g., at Yelp you don’t know if the reviewer is the owner, a competitor… How do you feel about this central notion of user ID and the intersection with commentary?

A: I disagree that readers haven’t lit up the web site. The commentary beneath stories is amazing…

Q: I meant in reviews, not hard news…

A: A real ID policy improves the tenor.

Q: How about the snarkiness of twitter?

A: The best way to be mocked on Twitter is to be earnest. It’s a place to be snarky. It’s regrettable. Reporters should be very careful before they hit the “tweet” button. The tone is a problem.

Q: If you want to build a community — and we reporters are constantly pushed to do that — you have to engage your readers. How can you do that without disclosing your stands? We all have opinions, and we share them with a circle we feel safe in. But sometimes those leak. I’d hope that my paper would protect me.

A: I find Twitter to be invaluable. Incredible news source. Great way to get your message out. The best thing for me is not people’s sarcastic comments. It’s the link to a story. It’s “Hey, did you see this?” To me that’s the most useful part. Even though I describe it as snarky, I’ve also found it to be a very supportive place. When you take a stand, as I did on Sunday about the press not holding things back for national security reasons, you can get a lot of support there. You just have to be careful. Use it for the best possible reasons: to disseminate info, rather than to comment sarcastically.

Q: Between Kent and Rosen, I don’t think there is some higher power of morality that decides this. It depends on where you sit and what you own. If you own NYT, you have billions of dollars in good will you’ve built up. Your audience comes to you with a certain expectation. There’s an inherent bias in what they cover, but also expectations about an effort toward objectivity. Social media is a distribution channel, not a place to bare your soul. A foreign correspondent for Time made a late-night blog post. (“I’d put a breathalyzer on keyboards,” he says.) A seasoned reporter said offhandedly that maybe the victim of some tragedy deserved it. This got distributed via social media as Time Mag’s position. Reporters’ tweets should be edited first. The institution has every right to have a policy that constrains what reporters say on social media. But now there are legal cases. Social media has become an inalienable right. In the old days, the WSJ fired a reporter for handing out political leaflets in a subway station. If you’re Jay Rosen and your business is to throw bombs at the institutional media, and to say everything you do is wrong [!], then that’s ok. But if you own a newspaper, you have to stand up for objectivity.

A: I don’t disagree, although I think Jay is a thoughtful person.

Q: I blog on the HuffPo. But at Harvard, blogging is not considered professional. It’s thought of as tossed off…

A: A blog is just a delivery system. It’s not inherently good or bad, slapdash or well-researched. It’s a way to get your message out.

A: [Alex Jones] Actually there’s a fair number of people who blog at Harvard. The Berkman Center, places like that. [Thank you, Alex :)]

Q: How do you think about the evolution of your job as public editor? Are you thinking about how you interact with the readers and the rhythm of how you publish?

A: When I was brought in 5 months ago, they wanted to take it to the new media world. I was very interested in that. The original idea was to get rid of the print column altogether. But I wanted to do both. I’ve been doing both. It’s turned into a conversation with readers.

Q: People are deeply convinced of wrong ideas. Goebbels’ diaries show an upside down world in which Churchill is a gangster. How do you know what counts as fact?

A: Some things are just wrong. Paul Ryan was wrong about criticizing Obama for allowing a particular GM plant to close. The plant closed before Obama took office. That’s a correctable. When it’s more complex, we have to hear both sides out.


Then I got to ask the last question, which I asked so clumsily that it practically forced Margaret to respond, “Then you’re locking yourself into a single point of view, and that’s a bad way to become educated.” Ack.

I was trying to ask the same question as the prior one, but to get past the sorts of facts that Margaret noted. I think it’d be helpful to talk about the accuracy of facts (about which there are their own questions, of course) and focus the discussion of objectivity at least one level up the hermeneutic stack. I tried to say that I don’t feel bad about turning to partisan social networks when I need an explanation of the meaning of an event. For my primary understanding I’m going to turn to people with whom I share first principles, just as I’m not going to look to a Creationism site to understand some new paper about evolution. But I put this so poorly that I drew the Echo Chamber rebuke.

What it really comes down to, for me, is the theory of understanding and knowledge that underlies the pursuit of objectivity. Objectivity imagines a world in which we understand things by considering all sides from a fresh, open start. But in fact understanding is far more incremental, far more situated, and far more pragmatic than that. We understand from a point of view and a set of commitments. This isn’t a flaw in understanding. It is what enables understanding.

Nor does this free us from the responsibility to think through our opinions, to sympathetically understand opposing views, and to be open to the possibility that we are wrong. It’s just to say that understanding has a job to do. In most cases, it does that job by absorbing the new into our existing context. There is a time and place for revolution in our understanding. But that’s not the job we need to do as we try to make sense of the world pressing in on us. Reason can’t function in the world the way objectivity would like it to.


I’m glad the NY Times is taking these questions seriously, and Margaret is impressive (and not just because she takes Jay Rosen very seriously). I’m a little surprised that we’re still talking about objectivity, however. I thought that the discussion had usefully broken the concept up into questions of accuracy, balance, and fairness — with “balance” coming into question because of the cowardly he-said-she-said dodges that have become all too common, and that Margaret decries. I’m not sure what the concept of objectivity itself adds to this mix except a set of difficult assumptions.


February 4, 2013

[2b2k] Are all good conversations echo chambers?

Bora Zivkovic, the blog editor at Scientific American, has a great post about bad comment threads. This is a topic that has come up every day this week, which may just be a coincidence, or perhaps is a sign that the Zeitgeist is recognizing that when it talks to itself, it sounds like an idiot.

Bora cites a not-yet-published paper that presents evidence that a nasty, polarized comment thread can cause readers who arrive with no opinion about the paper’s topic to come to highly polarized opinions about it. This is in line with off-line research Cass Sunstein cites that suggests echo chambers increase polarization, except this new research indicates that it increases polarization even on first acquaintance. (Bora considers the echo chamber idea to be busted, citing a prior post that is closely aligned with the sort of arguments I’ve been making, although I am more worried about the effects of homophily — our tendency to hang out with people who agree with us — than he is.)

Much of Bora’s post is a thoughtful yet strongly voiced argument that it is the responsibility of the blog owner to facilitate good discussions by moderating comments. He writes:

So, if I write about a wonderful dinner I had last night, and somewhere in there mention that one of the ingredients was a GMO product, but hey, it was tasty, then a comment blasting GMOs is trolling.

Really? Then why did Bora go out of his way to mention that it was a GMO product? He seems to me to be trolling for a response. Now, I think Bora just picked a bad example in this case, but it does show that the concept of “off-topic” contains a boatload of norms and assumptions. And Bora should be fine with this, since his piece begins by encouraging bloggers to claim their conversation space as their own, rather than treating it as a public space governed by the First Amendment. It’s up to the blogger to do what’s necessary to enable the type of conversations that the blogger wants. All of which I agree with.

Nevertheless, Bora’s particular concept of being on-topic highlights a perpetual problem of conversation and knowledge. He makes a very strong case — nicely argued — for why he nukes climate-change denials from his comment thread. Read his post, but the boiled down version is: (a) These comments are without worth because they do not cite real evidence and most of them are astroturf anyway. (b) They create a polarized environment that has the bad effect of raising unjustified doubts in the minds of readers of the post (as per the research he mentions at the beginning of his post). (c) They prevent conversation from advancing thought because they stall the conversation at first principles. Sounds right to me. And I agree with his subsequent denial of the echo chamber effect as well:

The commenting threads are not a place to showcase the whole spectrum of opinions, no matter how outrageous some of them are, but to educate your readers, and to, in turn, get educated by your readers who always know something you don’t.

But this is why the echo chamber idea is so slippery. Conversation consists of the iteration of small differences upon a vast ground of agreement. A discussion of a scientific topic among readers of Scientific American has value insofar as they can assume that, say, evolution is an established theory, that assertions need to be backed by facts of a certain evidentiary sort (e.g., “God told me” doesn’t count), that some assertions are outside of the scope of discussion (“Evolution is good/evil”), etc. These are criteria of a successful conversation, but they are also the marks of an echo chamber. The good Scientific American conversation that Bora curates looks like an echo chamber to the climate change deniers and the creationists. If one looks only at the structure of the conversation, disregarding all the content and norms, the two conversations are indistinguishable.

But now I have to be really clear about what I’m not saying. I am not saying that there’s no difference between creationists and evolutionary biologists, or that they are equally true. I am not saying that both conversations follow the same rules of evidence. I am certainly not saying that their rules of evidence are equally likely to lead to scientific truths. I am not even saying that Bora needs to throw open the doors of his comments. I’m saying something much more modest than that: To each side, the other’s conversation looks like a bunch of people who are reinforcing one another in their wrong beliefs by repeating those beliefs as if they were obviously right. Even the conversation I deeply believe is furthering our understanding — the evolutionary biologists, if you haven’t guessed where I stand on this issue — has the structure of an echo chamber.

This seems to me to have two implications.

First, it should keep us alert to the issue that Bora’s post tries to resolve. He encourages us to exclude views challenging settled science because including ignorant trolls leads casual visitors to think that the issues discussed are still in play. But climate change denial and creationist sites also want to promote good conversations (by their lights), and thus Bora is apparently recommending that those sites also should exclude those who are challenging the settled beliefs that form the enabling ground of conversation — even though in this case it would mean removing comments from all those science-y folks who keep “trolling” them. It seems to me that this leads to a polarized culture in which the echo chamber problem gets worse. Now, I continue to believe that Bora is basically right in his recommendation. I just am not as happy about it as he seems to be. Perhaps Bora is in practice agreeing with Too Big to Know’s recommendation that we recognize that knowledge is fragmented and is not going to bring us all together.

Second, the fact that we cannot structurally distinguish a good conversation from a bad echo chamber I think indicates that we don’t have a good theory of conversation. The echo chamber fear grows in the space that a theory of conversation should inhabit.

I don’t have a theory of conversation in my hip pocket to give you. But I presume that such a theory would include the notion, evident in Bora’s post, that conversations have aims, and that when a conversation is open to the entire world (a radically new phenomenon…thank you WWW!) those aims should be explicitly stated. Likewise for the norms of the conversation. I’m also pretty sure that conversations are never only about what they say they’re about because they are always embedded in complex social environments. And because conversations iterate on differences on a vast ground of similarity, conversations rarely are about changing people’s minds about those grounds. Also, I personally would be suspicious of any theory of conversation that began by viewing conversations as composed fundamentally of messages that are encoded by the sender and decoded by the recipient; that is, I’m not at all convinced that we can get a theory of conversation out of an information-based theory of communication.

But I dunno. I’m confused by this entire topic. Nothing that a good conversation wouldn’t cure.

