Joho the Blog » facts

February 14, 2013

[2b2k] The public ombudsman (or Facts don’t work the way we want)

I don’t care about expensive electric sports cars, but I’m fascinated by the dustup between Elon Musk and the New York Times.

On Sunday, the Times ran an article by John Broder on driving the Tesla S, an all-electric car made by Musk’s company, Tesla. The article was titled “Stalled Out on Tesla’s Electric Highway,” which captured the point quite concisely.

Musk on Wednesday in a post on the Tesla site contested Broder’s account, and revealed that every car Tesla lends to a reviewer has its telemetry recorders set to 11. Thus, Musk had the data that proved that Broder was driving in a way that could have no conceivable purpose except to make the Tesla S perform below spec: Broder drove faster than he claimed, drove circles in a parking lot for a while, and didn’t recharge the car to full capacity.

Boom! Broder was caught red-handed, and it was data that brung him down. The only two questions left were why did Broder set out to tank the Tesla, and would it take hours or days for him to be fired?

Except…

Rebecca Greenfield at Atlantic Wire took a close look at the data — at least at the charts and maps that express the data — and evaluated how well they support each of Musk’s claims. Overall, not so much. The car’s logs do seem to contradict Broder’s claim to have used cruise control. But the mystery of why Broder drove in circles in a parking lot seems to have a reasonable explanation: he was trying to find exactly where the charging station was in the service center.

But we’re not done. Commenters on the Atlantic piece have both taken it to task and provided some explanatory hypotheses. Greenfield has interpolated some of the more helpful ones, as well as updating her piece with testimony from the tow-truck driver, and more.

But we’re still not done. Margaret Sullivan [twitter:sulliview], the NYT “public editor” — a new take on what in the 1960s we started calling “ombudspeople” (although actually in the ’60s we called them “ombudsmen”) — has jumped into the fray with a blog post that I admire. She’s acting like a responsible adult by withholding judgment, and she’s acting like a responsible webby adult by talking to us even before all the results are in, acknowledging what she doesn’t know. She’s also been using social media to discuss the topic, and even to try to get Musk to return her calls.

Now, this whole affair is both typical and remarkable:

It’s a confusing mix of assertions and hypotheses, many of which are dependent on what one would like the narrative to be. You’re up for some Big Newspaper Schadenfreude? Then John Broder was out to do dirt to Tesla for some reason your own narrative can supply. You want to believe that old dinosaurs like the NYT are behind the curve in grasping the power of ubiquitous data? Yup, you can do that narrative, too. You think Elon Musk is a thin-skinned capitalist who’s willing to destroy a man’s reputation in order to protect the Tesla brand? Yup. Or substitute “idealist” or “world-saving environmentally-aware genius,” and, yup, you can have that narrative too.

Not all of these narratives are equally supported by the data, of course — assuming you trust the data, which you may not if your narrative is strong enough. Data signals but never captures intention: Was Broder driving around the parking lot to run down the battery or to find a charging station? Nevertheless, the data do tell us how many miles Broder drove (apparently just about the amount that he said) and do nail down (except under the most bizarre conspiracy theories) the actual route. Responsible adults like you and me are going to accept the data and try to form the story that “makes the most sense” around them, a story that likely is going to avoid attributing evil motives to John Broder and evil conspiratorial actions by the NYT.

But the data are not going to settle the hash. In fact, we already have the relevant numbers (er, probably) and yet we’re still arguing. Musk produced the numbers thinking that they’d bring us to accept his account. Greenfield went through those numbers and gave us a different account. The commenters on Greenfield’s post are arguing yet more, sometimes casting new light on what the data mean. We’re not even close to done with this, because it turns out that facts mean less than we’d thought and do a far worse job of settling matters than we’d hoped.

That’s depressing. As always, I am not saying there are no facts, nor that they don’t matter. I’m just reporting empirically that facts don’t settle arguments the way we were told they would. Yet there is something profoundly wonderful and even hopeful about this case that is so typical and so remarkable.

Margaret Sullivan’s job is difficult in the best of circumstances. But before the Web, it must have been so much more terrifying. She would have been the single point of inquiry as the Times tried to assess a situation in which it has deep, strong vested interests. She would have interviewed Broder and Musk. She would have tried to find someone at the NYT or externally to go over the data Musk supplied. She would have pronounced as fairly as she could. But it would have all been on her. That’s bad not just for the person who occupies that position, it’s a bad way to get at the truth. But it was the best we could do. In fact, most of the purpose of the public editor/ombudsperson position before the Web was simply to reassure us that the Times does not think it’s above reproach.

Now every day we can see just how inadequate any single investigator is for any issue that involves human intentions, especially when money and reputations are at stake. We know this for sure because we can see what an inquiry looks like when it’s done in public and at scale. Of course lots of people who don’t even know that they’re grinding axes say all sorts of mean and stupid things on the Web. But there are also conversations that bring to bear specialized expertise and unusual perspectives, that let us turn the matter over in our hands, hold it up to the light, shake it to hear the peculiar rattle it makes, roll it on the floor to gauge its wobble, sniff at it, and run it through sophisticated equipment perhaps used for other purposes. We do this in public — I applaud Sullivan’s call for Musk to open source the data — and in response to one another.

Our old idea was that the thoroughness of an investigation would lead us to a conclusion. Sadly, it often does not. We are likely to disagree about what went on in Broder’s review, and how well the Tesla S actually performed. But we are smarter in our differences than we ever could be when truth was a lonelier affair. The intelligence isn’t in a single conclusion that we all come to — if only — but in the linked network of views from everywhere.

There is a frustrating beauty in the way that knowledge scales.


November 9, 2012

[2b2k] What do we learn from our failure to believe the polls?

There’s lots being written about why the Republicans were so wrong in their expectations about this week’s election. They had the same data as the rest of us, yet they apparently deeply believed they were going to win. I think it’s a fascinating question. But I want to put it to different use.

The left-wing subtext about the Republican leadership’s failure to interpret the data is that it’s comeuppance for their failure to believe in science or facts. But that almost surely is a misreading. The Republicans thought they had factual grounds for disbelieving the polls. The polls, they thought, were bad data that over-counted Democrats. The Republicans thus applied an unskewing algorithm in order to correct them. Thus, the Republicans weren’t pooh-poohing the importance of facts. They were being good scientists, cleaning up the data. Now, of course their assumptions about the skewing of the data were wrong, and there simply has to be an element of wish-fulfillment (and thus reality denial) in their belief that the polls were skewed. But, their arguments were based on what they thought was a fact about a problem with the data. They were being data-based. They just did a crappy job of it.

So what do we conclude? First, I think it’s important to recognize that it wasn’t just the Republicans who looked the data in the face and drew entirely wrong conclusions. Over and over the mainstream media told us that this race was close, that it was a toss-up. But it wasn’t. Yes, the popular vote was close, although not as close as we’d been led to believe. But the outcome of the race wasn’t a toss-up, wasn’t 50-50, wasn’t close. Obama won the race decisively and not very long after the last mainland polls closed…just as the data said he would. Not only was Nate Silver right, his record, his methodology, and the transparency of his methodology were good reasons for thinking he would be right. Yet, the mainstream media looked at the data and came to the wrong conclusion. It seems likely that they did so because they didn’t want to look like they were shilling for Obama and because they wanted to keep us attached to the TV for the sake of their ratings and ad revenues.

I think the media’s failure to draw the right and true conclusions from the data is a better example of a non-factual dodge around inconvenient truths than is the Republicans’ swerve.

Put the two failures together, and I think this is an example of the inability of facts and data to drive us to agreement. Our temptation might be to look at both of these as fixable aberrations. I think a more sober assessment, however, should lead us to conclude that some significant portion of us is always going to find a way to be misled by facts and data. As a matter of empirical fact, data does not drive agreement, or at least doesn’t drive it sufficiently strongly that by itself it settles issues. For one reason or another, some responsible adults are going to get it wrong.

This doesn’t mean we should give up. It certainly doesn’t lead to a relativist conclusion. It instead leads to an acceptance of the fact that we are never going to agree, even when the data is good, plentiful, and right in front of our eyes. And, yeah, that’s more than a little scary.


March 31, 2012

[2b2k] The commoditizing and networking of facts

Ars Technica has a post about Wikidata, a proposed new project from the folks that brought you Wikipedia. From the project’s introductory page:

Many Wikipedia articles contain facts and connections to other articles that are not easily understood by a computer, like the population of a country or the place of birth of an actor. In Wikidata you will be able to enter that information in a way that makes it processable by the computer. This means that the machine can provide it in different languages, use it to create overviews of such data, like lists or charts, or answer questions that can hardly be answered automatically today.

Because I had some questions not addressed in the Wikidata pages that I saw, I went onto the Wikidata IRC chat (http://webchat.freenode.net/?channels=#wikimedia-wikidata) where Denny_WMDE answered some questions for me.

[11:29] hi. I’m very interested in wikidata and am trying to write a brief blog post, and have a n00b question.

[11:29] go ahead!

[11:30] When there’s disagreement about a fact, will there be a discussion page where the differences can be worked through in public?

[11:30] two-fold answer

[11:30] 1. there will be a discussion page, yes

[11:31] 2. every fact can always have references accompanying it. so it is not about “does berlin really have 3.5 mio people” but about “does source X say that berlin has 3.5 mio people”

[11:31] wikidata is not about truth

[11:31] but about referenceable facts

When I asked which fact would make it into an article’s info box when the facts are contested, Denny_WMDE replied that they’re working on this, and will post a proposal for discussion.

So, on the one hand, Wikidata is further commoditizing facts: making them easier and thus less expensive to find and “consume.” Historically, this is a good thing. Literacy did this. Tables of logarithms did it. Almanacs did it. Wikipedia has commoditized a level of knowledge one up from facts. Now Wikidata is doing it for facts in a way that not only will make them easy to look up, but will enable them to serve as data in computational quests, such as finding every city with a population of at least 100,000 that has an average temperature below 60F.

On the other hand, because Wikidata is doing this commoditizing in a networked space, its facts are themselves links — “referenceable facts” are both facts that can be referenced, and simultaneously facts that come with links to their own references. This is what Too Big to Know calls “networked facts.” Those references serve at least three purposes: 1. They let us judge the reliability of the fact. 2. They give us a pointer out into the endless web of facts and references. 3. They remind us that facts are not where the human responsibility for truth ends.
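To make the idea of a “computational quest” over referenceable facts concrete, here is a minimal sketch in Python. It is not Wikidata’s data model or API: the Fact record, the property names, the city names, the figures, and the source labels are all made up for illustration. The only point is that each fact carries a pointer to its reference, so the answer to the query can say where its numbers came from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One referenceable fact: a claim plus the source that makes it."""
    subject: str
    prop: str
    value: float
    source: str  # hypothetical reference label, not a real citation


# Placeholder data invented for this sketch; none of it is real.
FACTS = [
    Fact("Coldville", "population", 250_000, "hypothetical census A"),
    Fact("Coldville", "avg_temp_f", 48.0, "hypothetical weather service B"),
    Fact("Warmton", "population", 900_000, "hypothetical census A"),
    Fact("Warmton", "avg_temp_f", 71.5, "hypothetical weather service B"),
    Fact("Tinyburg", "population", 12_000, "hypothetical census C"),
    Fact("Tinyburg", "avg_temp_f", 41.0, "hypothetical weather service B"),
]


def cities_matching(facts, min_population, max_avg_temp_f):
    """Return (city, population source, temperature source) for every city
    with population >= min_population and average temperature below
    max_avg_temp_f. Cities missing either fact are skipped."""
    by_city = {}
    for fact in facts:
        by_city.setdefault(fact.subject, {})[fact.prop] = fact

    matches = []
    for city, props in sorted(by_city.items()):
        pop = props.get("population")
        temp = props.get("avg_temp_f")
        if pop is None or temp is None:
            continue
        if pop.value >= min_population and temp.value < max_avg_temp_f:
            matches.append((city, pop.source, temp.source))
    return matches


if __name__ == "__main__":
    for city, pop_src, temp_src in cities_matching(FACTS, 100_000, 60.0):
        print(f"{city}: population per {pop_src}; temperature per {temp_src}")
```

Returning the sources alongside the city names is the design choice that matters here: the query answers “which sources say this” rather than “what is true,” which is the distinction Denny_WMDE drew in the chat.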


February 27, 2012

[2b2k] Moi

EconTalk has posted an hour-long interview with me by Russ Roberts about some of the topics in Too Big to Know that don’t come up so often.


February 4, 2012

[2b2k] Moi moi moi

Steve Cottle has done a great job live-blogging my wrap-up talk at the Tech@State event. Thanks, Steve!

I was the guest on Tummelvision a couple of nights ago, which is a podcast tumble-tumult of persons and ideas. It doesn’t get much more fun than that. Thanks, Heather, Kevin, and Deb!

The Berkman Center has posted the video of my book talk. Look on the bottom left to find the player and the links.

KMWorld’s Hugh McKellar has posted his interview with me.

And NYTECH has just posted a video of my talk there on Jan 25. The talk is about 45 mins and then there’s a lively Q&A. Thanks, NYTECH!

Brandeins has posted an interview with Doc Searls and me about Cluetrain. (They translated it into German.)


November 10, 2011

[2b2k] Census Bureau ends Statistical Abstract

The Census Bureau is no longer going to fund the creation of the Statistical Abstract of the United States, apparently in order to save $3M a year. As David Cay Johnston puts it:

Last year the online site was accessed 5.6 million times. If the absence of a Statistical Abstract increases search time by even two minutes, then the cost, based on the all-in average pay of reference librarians, will be about five times the federal savings. Were Congress to order up a cost-benefit study, the figure would be a loser, costing society at least $5 for every dollar of tax money saved.

Not to mention the symbolic slap in the face to supporting fact-based public discourse.

(The Census Bureau attempts to ameliorate this by pointing out that all the info is still available, dispersed across agencies and sources. Yeah, but if the Statistical Abstract ever had value — which it did — it’s because it aggregated data that can be difficult to chase down.)
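The arithmetic behind Johnston’s claim can be sketched in a few lines of Python. The quote does not state the hourly pay figure it relies on, so this sketch does not assume one. It takes only the numbers given above ($3M saved, 5.6 million accesses, two extra minutes each, a cost of about five times the savings) and works backward to the all-in hourly rate those numbers imply. Treating every access as incurring the full two minutes is an assumption of the sketch.

```python
# Figures taken from the post and the Johnston quote.
ANNUAL_SAVINGS_USD = 3_000_000    # "$3M a year"
ACCESSES_PER_YEAR = 5_600_000     # "accessed 5.6 million times"
EXTRA_MINUTES_PER_ACCESS = 2      # "even two minutes"
COST_MULTIPLE = 5                 # "about five times the federal savings"


def extra_hours(accesses, minutes_each):
    """Total extra search time per year, in hours."""
    return accesses * minutes_each / 60


def implied_hourly_rate(savings_usd, multiple, hours):
    """Hourly cost that makes the lost time worth `multiple` times the savings."""
    return savings_usd * multiple / hours


hours = extra_hours(ACCESSES_PER_YEAR, EXTRA_MINUTES_PER_ACCESS)
rate = implied_hourly_rate(ANNUAL_SAVINGS_USD, COST_MULTIPLE, hours)

if __name__ == "__main__":
    print(f"Extra search time: {hours:,.0f} hours per year")
    print(f"Implied all-in rate: ${rate:,.2f} per hour")
```

Changing any one input (the minutes lost per search, say) shows how sensitive the five-to-one ratio is to that input.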


June 3, 2010

[pdf] Truth, factchecking and online media

Brendan Greeley is moderating a panel on truth and factchecking. He begins by wondering if we need argument-checking as well as fact-checking.

Bill Adair of Michele Bachmann.) They also check pundits. And they have an Obameter that tracks how Obama is delivering on his 500+ campaign promises. They have also begun state sites. “It’s a whole new form of journalism,” he concludes.

Brendan: Couldn’t you have done this before the Net? No, says Bill. You couldn’t do the research. Plus, the corrections would have only run in the one edition of the paper, and if you missed it, you would have missed it forever.

There is a jurisprudence to the Truthometer, Bill says. They’ve had to invent how to distinguish an “untrue” from a “pants on fire.”

Jay Rosen says that 58 yrs ago, Joe McCarthy exploited defects in the media to make a name for himself, at great cost. Charges are news. What happens today is news. Senators are worth reporting on and have some credibility. News can’t be untold. Eventually, the media figured out that they’d been exploited; the press had been put in the service of untruth. So, reporters changed the rules: It suddenly became ok to do “interpretation.” I.e., it was ok for them to point out that a public official might have another motive for what he said. Fifty years later, politicians are exploiting different weaknesses. The best known is “he said she said” journalism. That’s a response to the quest for innocence, i.e., a demonstration that you are neutral in the cultural/political wars. Rather than having an agenda for the left or right, the press has an innocence agenda. He-said-she-said also helps journalists make their deadlines: you don’t have time to interrupt, so you get someone to state the other side.

In December, Jay tweeted that Meet the Press ought to fact check its guests and run the results on Wednesday. ABC has started doing it, for ABC This Week. MtP has refused, possibly because the person who’s been the most frequent guest is John McCain, who Politifact rates as a pants-on-fire liar. But, some college students have put up MeetTheFacts.com to

Marc Ambinder says that he’s getting more comfortable going outside of traditional journalism’s box, and getting angry about being told to stay inside of it. E.g., there’s nothing to the story of the White House offering a job to Sestak, but the press is covering it as if it’s an issue. The solution is for reputable journalists to say that it isn’t a story and then cover something else, but you’re dealing with an entrenched set of habits.

Bill points to TechCrunch as his favorite voice on the Web, which, as he says, is strongly voiced and non-neutral. Jay says that it used to be that you lost credibility if you judged, but that has flipped. This is part of a culture war in which the press is an object of attack, Jay says.

Brendan says that Jay was right 5 yrs ago to say that the war between journalism and blogging is over. Now there’s the same sort of controversy over factchecking. How do we get past the conflict, Brendan asks. Bill says we need to get past the “bucket of quotes” mentality. Factchecking should be a standard part of the journalist’s toolkit. Jay says that the birther phenomenon is interesting. That Obama was born in the US is as verified as a fact can get. But, within politics, the overriding of that fact has given rise to a political movement. There is no journalistic response to this. They can’t treat it as a claim within the spectrum; it’s actually a repudiation of journalism. Marc and Jay agree that the remedy is not within journalism but within the political system: Republicans ought to shame the birthers.

Q: What about factchecking that goes wrong?
A: There’s still room for journalists.
A: (jay) Reputation systems work.
A: (brendan) But email is anonymous.

Q: Reputation systems can be gamed. And we need the Sunday shows to do the factchecking on the same episode so people can see it.
A: Yes. We’re seeing progress, but… ABC deserves credit.

Brendan: There’s selection bias in factchecking. Factcheckers decide what to count as worth checking?
Bill: Is it something that Mabel — our typical reader — would wonder about?
Jay: News orgs used to establish trust by advertising their viewlessness. Now they need to say where they’re coming from.


March 3, 2010

[ahole] [2b2k] Me having tea with The Economist

I have to say that Tea with the Economist was a fun experience. The Economist has been videoing tea-time discussions with various folks. In line with that magazine’s tradition of anonymous authoring, the interviewer is unnamed, but I can assure you that he is as astute as he is delightful.

We talk about what people will do with the big loads of data that some governments are releasing, and the general problem of the world being too big to know.


January 2, 2010

[2b2k] Almost complete first draft of Chapter 1

And when I say “first draft,” what I actually mean is the fifth draft of the first draft. Even that’s not right, since I go through the chapter continuously, and create a new draft (or what I should perhaps call a “version”) whenever I’m about to make a big change I think I may regret.

Anyway, I think and hope that it’s in roughly the shape it needs to be in, although I’ll re-read it tomorrow and may decide to scrap it. And when I’ve finished the last chapter, I may well see that I need to throw out this one and begin again. Life in the book writing biz.

There are definitely things I don’t like about the current version. For example, the beginning. And the ending. Also, some stuff in the middle.

The current draft begins with the question “If we didn’t have a word for knowledge, would we feel the need to create one?” I don’t answer that in this chapter. I’m thinking I’ll come back to it at the end of the book. Instead, I quickly go through some of the obvious reasons we’d answer “yes.” But then I need to suggest that the answer might be “no,” and I don’t think I do a good enough job on that. It’s difficult, because the whole book whittles away at that answer, so it’s hard to come up with a context-free couple of paragraphs that will do the job. I want this chapter to focus on the nature of knowledge as a structure, so I contrast traditional guide books with the open-endedness of the Web, hoping to suggest that knowledge has gotten too big to be thought of as structure or even as a realm. (I can only hint at this at this point.) But, the Web example seems so old hat to me that I even have to apologize for it in the text (“Just another day on the Web…”). I’d rather open by having me in some actual place that I can write about — someplace where I can point to obvious features that are only obvious because we make non-obvious assumptions about the finitude, structure, and know-ability of knowledge. A library? I’d like to think of something more novel.

Since I last updated this blog about my “progress,” I’ve added a section on the data-information-knowledge-wisdom hierarchy, which traces back to T.S. Eliot. I glom onto some of the definitions of “knowledge” proposed by those who promulgate that hierarchy and point out that they have little to do with what we usually mean by knowledge (and what Eliot meant by it); rather they slap the label “knowledge” on whatever seems to be the justification for investing in information processing equipment. I then swerve from giving my own definition — a swerve I should justify more explicitly — and instead spend some time describing the nature of traditional knowledge. The result of that section is that we think of knowledge as something built on firm foundations. These days, we take facts as the bricks of knowledge. But it wasn’t always so. And that I hope leads the reader smoothly enough into a discussion of the history of fact-based knowledge (which I’m maintaining really came into its own in the early 19th century British social reform movement).

I also added a brief bit about what non-fact-based knowledge looked like. I’d already discussed the medieval idea of assembling knowledge based on analogies, but I wanted to give a more modern example. So, I looked at Malthus, whose big book came out in 1798. I was disappointed to find that Malthus’ book is full of learned discussions of statistics and facts, and thus not only wasn’t a suitable example but seemed to disprove my thesis. Then I realized I was looking at the 6th edition. Malthus revised and republished his book for the next thirty years or so. If you compare the 6th edition with the first, you are struck by how stat-free edition #1 is and how stat-full #6 is. The first edition is a deductive argument based on seemingly self-evident propositions. The support he gives for his conclusion is based on anthropological sketches and guesses about why various populations have been kept in check. The difference between #1 and #6 actually helps my case.

The last section now introduces the idea of “knowledge overload” (which is still distressingly vague and I may have to drop it) and foreshadows some of the changes that overload is bringing. I’m having trouble getting the foreshadowing right, though, since it requires stating themes that will take entire chapters to unpack.

So, having obsessively worked on this every day for the past few weeks with no days off from it, I’m going to let it sit for a day or two. I think I’ll start sketching Chapter 2.


December 27, 2009

[2b2k] First draft of first chapter sort of done

[NOTE: These posts tagged “2b2k” (“Too Big to Know”) are about the process of writing a book. They therefore talk about the ideas in the book rather incidentally.]

It’s not quite right to say that I’ve finished a first draft of chapter one. More accurately: I’ve stopped typing and have gone back to the beginning. It needs so much work that it doesn’t even constitute a draft.

I read it to our son last night as he trotted on the elliptical trainer in the basement. He thought it’s better than I do, but that’s why we have families. He also offered useful comments: Opening with a recitation of factoids about the growth of info has been done (although he professed to find it amusing); I say three or four times too often that the basics of knowledge are changing; it wasn’t entirely clear how the idea of information overload has gone from a psychological syndrome to a cultural challenge. All too true.

Hearing it out loud helps a lot; I always read drafts of chapters to my wife. I realized, for example, that the long (too long) section on the history of facts adopts an off-putting academic tone. That doesn’t worry me, because adjusting the tone is a normal part of re-writing, although it does require the painful removal of “good stuff” that actually isn’t very interesting. I remain quite concerned about the overall structure, and, worse, whether the chapter is clear in its readerly aims.

So, I’m going to put in a new opening. Although the technique is overdone and predictable, I will probably start with some very quick examples intended to show that knowledge is becoming networked. Then I will tighten the section on information overload, which aims at suggesting that knowledge overload results in a change in the nature of knowledge (in a way that info overload did not change the nature of info). Then, into the reduced section on the history of facts, which aims to challenge our notion that knowledge is a building that depends on having a firm foundation. (I also want to shake the reader by the shoulders and say that the idea of knowledge is not as obvious and eternal as we’ve thought.)

Also, I changed the title of Chapter One yesterday, from “Undoing Knowledge” to “The Great Unnailing.”

And, this morning, while on the ol’ elliptical, I read a review of Amartya Sen’s The Idea of Justice, which, because of its discussion of the inevitability of disagreements, seems like it might be relevant. A few paces on, it also seemed to me that a suitable ending for the book might be a brief section that asks: If we didn’t have a concept of knowledge, would we now invent one? Is that concept still useful? I mean something inchoate by this, for clearly it is useful to distinguish between reliable and unreliable ideas. But that’s always a matter of degree. Would we separate out a special class of specially reliable information, and, more to the point, would we think of it as a realm of truth, a mirror of nature, or our highest calling? I think not. But I don’t know if this is an idea with which to open the book, close the book, or ignore.
