I don’t care about expensive electric sports cars, but I’m fascinated by the dustup between Elon Musk and the New York Times.
On Sunday, the Times ran an article by John Broder on driving the Tesla S, an all-electric car made by Musk’s company, Tesla. The article was titled “Stalled Out on Tesla’s Electric Highway,” which captured the point quite concisely.
Musk on Wednesday in a post on the Tesla site contested Broder’s account, and revealed that every car Tesla lends to a reviewer has its telemetry recorders set to 11. Thus, Musk had the data that proved that Broder was driving in a way that could have no conceivable purpose except to make the Tesla S perform below spec: Broder drove faster than he claimed, drove circles in a parking lot for a while, and didn’t recharge the car to full capacity.
Boom! Broder was caught red-handed, and it was data that brung him down. The only two questions left were why did Broder set out to tank the Tesla, and would it take hours or days for him to be fired?
Rebecca Greenfield at Atlantic Wire took a close look at the data — at least at the charts and maps that express the data — and evaluated how well they support each of Musk’s claims. Overall, not so much. The car’s logs do seem to contradict Broder’s claim to have used cruise control. But the mystery of why Broder drove in circles in a parking lot seems to have a reasonable explanation: he was trying to find exactly where the charging station was in the service center.
But we’re not done. Commenters on the Atlantic piece have both taken it to task and provided some explanatory hypotheses. Greenfield has interpolated some of the more helpful ones, as well as updating her piece with testimony from the tow-truck driver, and more.
But we’re still not done. Margaret Sullivan [twitter:sulliview], the NYT “public editor” — a new take on what in the 1960s we started calling “ombudspeople” (although actually in the ’60s we called them “ombudsmen”) — has jumped into the fray with a blog post that I admire. She’s acting like a responsible adult by withholding judgment, and she’s acting like a responsible webby adult by talking to us even before all the results are in, acknowledging what she doesn’t know. She’s also been using social media to discuss the topic, and even to try to get Musk to return her calls.
Now, this whole affair is both typical and remarkable:
It’s a confusing mix of assertions and hypotheses, many of which are dependent on what one would like the narrative to be. You’re up for some Big Newspaper Schadenfreude? Then John Broder was out to do dirt to Tesla for some reason your own narrative can supply. You want to believe that old dinosaurs like the NYT are behind the curve in grasping the power of ubiquitous data? Yup, you can do that narrative, too. You think Elon Musk is a thin-skinned capitalist who’s willing to destroy a man’s reputation in order to protect the Tesla brand? Yup. Or substitute “idealist” or “world-saving environmentally-aware genius,” and, yup, you can have that narrative too.
Not all of these narratives are equally supported by the data, of course — assuming you trust the data, which you may not if your narrative is strong enough. Data signals but never captures intention: Was Broder driving around the parking lot to run down the battery or to find a charging station? Nevertheless, the data do tell us how many miles Broder drove (apparently just about the amount that he said) and do nail down (except under the most bizarre conspiracy theories) the actual route. Responsible adults like you and me are going to accept the data and try to form the story that “makes the most sense” around them, a story that likely is going to avoid attributing evil motives to John Broder and evil conspiratorial actions by the NYT.
But the data are not going to settle the hash. In fact, we already have the relevant numbers (er, probably) and yet we’re still arguing. Musk produced the numbers thinking that they’d bring us to accept his account. Greenfield went through those numbers and gave us a different account. The commenters on Greenfield’s post are arguing yet more, sometimes casting new light on what the data mean. We’re not even close to done with this, because it turns out that facts mean less than we’d thought and do a far worse job of settling matters than we’d hoped.
That’s depressing. As always, I am not saying there are no facts, nor that they don’t matter. I’m just reporting empirically that facts don’t settle arguments the way we were told they would. Yet there is something profoundly wonderful and even hopeful about this case that is so typical and so remarkable.
Margaret Sullivan’s job is difficult in the best of circumstances. But before the Web, it must have been so much more terrifying. She would have been the single point of inquiry as the Times tried to assess a situation in which it has deep, strong vested interests. She would have interviewed Broder and Musk. She would have tried to find someone at the NYT or externally to go over the data Musk supplied. She would have pronounced as fairly as she could. But it would have all been on her. That’s bad not just for the person who occupies that position, it’s a bad way to get at the truth. But it was the best we could do. In fact, most of the purpose of the public editor/ombudsperson position before the Web was simply to reassure us that the Times does not think it’s above reproach.
Now every day we can see just how inadequate any single investigator is for any issue that involves human intentions, especially when money and reputations are at stake. We know this for sure because we can see what an inquiry looks like when it’s done in public and at scale. Of course lots of people who don’t even know that they’re grinding axes say all sorts of mean and stupid things on the Web. But there are also conversations that bring to bear specialized expertise and unusual perspectives, that let us turn the matter over in our hands, hold it up to the light, shake it to hear the peculiar rattle it makes, roll it on the floor to gauge its wobble, sniff at it, and run it through sophisticated equipment perhaps used for other purposes. We do this in public — I applaud Sullivan’s call for Musk to open source the data — and in response to one another.
Our old idea was that the thoroughness of an investigation would lead us to a conclusion. Sadly, it often does not. We are likely to disagree about what went on in Broder’s review, and how well the Tesla S actually performed. But we are smarter in our differences than we ever could be when truth was a lonelier affair. The intelligence isn’t in a single conclusion that we all come to — if only — but in the linked network of views from everywhere.
There is a frustrating beauty in the way that knowledge scales.
Tagged with: 2b2k
Date: February 14th, 2013 dw
I picked up a copy of Bernard Knox’s 1994 Backing into the Future because somewhere I saw it referenced about the weird fact that the ancient Greeks thought that the future was behind them. Knox presents evidence from The Odyssey and Oedipus the King to back this up, so to speak. But that’s literally on the first page of the book. The rest of it consists of brilliant and brilliantly written essays about ancient life and scholarship. Totally enjoyable.
True, he undoes one of my favorite factoids: that Greeks in Homer’s time did not have a concept of the body as an overall unity, but rather only had words for particular parts of the body. This notion comes most forcefully from Bruno Snell in The Discovery of Mind, although I first read about it — and was convinced — by a Paul Feyerabend essay. In his essay “What Did Achilles Look Like?,” Knox convincingly argues that the Greeks had both a word and a concept for the body as a unity. In fact, they may have had three. Knox then points to Homeric uses that seem to indicate, yeah, Homer was talking about a unitary body. E.g., “from the bath he [Odysseus] stepped, in body [demas] like the immortals,” and Poseidon “takes on the likeness of Calchas, in bodily form,” etc. [p. 52] I don’t read Greek, so I’ll believe whatever the last expert tells me, and Knox is the last expert I’ve read on this topic.
In a later chapter, Knox comes back to Bernard Williams’s criticism, in Shame and Necessity, of the “Homeric Greeks had no concept of a unitary body” idea, and also discusses another wrong thing that I had been taught. It turns out that the Greeks did have a concept of intention, decision-making, and will. Williams argues that they may not have had distinct words for these things, but Homer “and his characters make distinctions that can only be understood in terms of” those concepts. Further, Williams writes that Homer has
no word that means, simply, “decide.” But he has the notion…All that Homer seems to have left out is the idea of another mental action that is supposed necessarily to lie between coming to a conclusion and acting on it: and he did well in leaving it out, since there is no such action, and the idea of it is the invention of bad philosophy. [p. 228]
Wow. Seems pretty right to me. What does the act of “making a decision” add to the description of how we move from conclusion to action?
Knox also has a long appreciation of Martha Nussbaum’s The Fragility of Goodness (1986) which makes me want to go out and get that book immediately, although I suspect that Knox is making it considerably more accessible than the original. But it sounds breath-takingly brilliant.
Knox’s essay on Nussbaum, “How Should We Live,” is itself rich with ideas, but one piece particularly struck me. In Book 6 of the Nicomachean Ethics, Aristotle dismisses one of Socrates’ claims (that no one knowingly does evil) by saying that such a belief is “manifestly in contradiction with the phainomena.” I’ve always heard the word “phainomena” translated in (as Knox says) Baconian terms, as if Aristotle were anticipating modern science’s focus on the facts and careful observation. We generally translate phainomena as “appearances” and contrast it with reality. The task of the scientist and the philosopher is to let us see past our assumptions to reveal the thing as it shows itself (appears) free of our anticipations and interpretations, so we can then use those unprejudiced appearances as a guide to truths about reality.
But Nussbaum takes the word differently, and Knox is convinced. Phainomena are “the ordinary beliefs and sayings” and the sayings of the wise about things. Aristotle’s method consisted of straightening out whatever confusions and contradictions are in this body of beliefs and sayings, and then showing that at least the majority of those beliefs are true. This is a complete inversion of what I’d always thought. Rather than “attending to appearances” meaning dropping one’s assumptions to reveal the thing in its untouched state, it actually means taking those assumptions — of the many and of the wise — as containing truth. It is a confirming activity, not a penetrating and an overturning. Nussbaum says for Aristotle (and in contrast to Plato), “Theory must remain committed to the ways human beings live, act, see.” (Note that it’s entirely possible I’m getting Aristotle, Nussbaum, and Knox wrong. A trifecta of misunderstanding!)
Nussbaum’s book sounds amazing, and I know I should have read it, oh, 20 years ago, but it came out the year I left the philosophy biz. And Knox’s book is just wonderful. If you ever doubted why we need scholars and experts — why would you think such a thing? — this book is a completely enjoyable reminder.
I’m not sure how I came into possession of a copy of The Indexer, a publication by the Society of Indexers, but I thoroughly enjoyed it despite not being a professional indexer. Or, more exactly, because I’m not a professional indexer. It brings me joy to watch experts operate at levels far above me.
The issue of The Indexer I happen to have — Vol. 30, No. 1, March 2012 — focuses on digital trends, with several articles on the Semantic Web and XML-based indexes as well as several on broad trends in digital reading and digital books, and on graphical visualizations of digital indexes. All good.
I also enjoyed a recurring feature: Indexes reviewed. This aggregates snippets of book reviews that mention the quality of the indexes. Among the positive reviews, the Sunday Telegraph thinks that for the book My Dear Hugh, “the indexer had a better understanding of the book than the editor himself.” That’s certainly going on someone’s resumé!
I’m not sure why I enjoy works of expertise in fields I know little about. It’s true that I know a little about indexing because I’ve written about the organization of digital information, and even a little about indexing. And I have a lot of interest in the questions about the future of digital books that happen to be discussed in this particular issue of The Indexer. That enables me to make more sense of the journal than might otherwise be the case. But even so, what I enjoy most are the discussions of topics that exhibit the professionals’ deep involvement in their craft.
But I think what I enjoy most of all is the discovery that something as seemingly simple as generating an index turns out to be indefinitely deep. There are endless technical issues, but also fathomless questions of principle. There’s even indexer humor. For example, one of the index reviews notes that Craig Brown’s The Lost Diaries “gives references with deadpan precision (‘Greer, Germaine: condemns Queen, 13-14…condemns pineapple, 70…condemns fat, thin and medium sized women, 93…condemns kangaroos, 122’).”
As I’ve said before, everything is interesting if observed at the right level of detail.
From TheHeart.org, an article by Lisa Nainggolan:
Gothenburg, Sweden – Further support for the concept of the obesity paradox has come from a large study of patients with acute coronary syndrome (ACS) in the Swedish Coronary Angiography and Angioplasty Registry (SCAAR). Those who were deemed overweight or obese by body-mass index (BMI) had a lower risk of death after PCI [percutaneous coronary intervention, aka angioplasty] than normal-weight or underweight participants up to three years after hospitalization, report Dr Oskar Angerås (University of Gothenburg, Sweden) and colleagues in their paper, published online September 5, 2012 in the European Heart Journal.
Can confirm. My grandmother in the 1930s was instructed to make sure she fed her husband lots and lots of butter to lubricate his heart after a heart attack. This proved to work extraordinarily well, at least until his next heart attack.
I refer once again to the classic 1999 The Onion headline: Eggs Good for You This Week.
Tagged with: 2b2k
Date: September 10th, 2012 dw
Ars Technica has a post about Wikidata, a proposed new project from the folks that brought you Wikipedia. From the project’s introductory page:
Many Wikipedia articles contain facts and connections to other articles that are not easily understood by a computer, like the population of a country or the place of birth of an actor. In Wikidata you will be able to enter that information in a way that makes it processable by the computer. This means that the machine can provide it in different languages, use it to create overviews of such data, like lists or charts, or answer questions that can hardly be answered automatically today.
Because I had some questions not addressed in the Wikidata pages that I saw, I went onto the Wikidata IRC chat (http://webchat.freenode.net/?channels=#wikimedia-wikidata) where Denny_WMDE answered some questions for me.
[11:29] hi. I’m very interested in wikidata and am trying to write a brief blog post, and have a n00b question.
[11:29] go ahead!
[11:30] When there’s disagreement about a fact, will there be a discussion page where the differences can be worked through in public?
[11:30] two-fold answer
[11:30] 1. there will be a discussion page, yes
[11:31] 2. every fact can always have references accompanying it. so it is not about “does berlin really have 3.5 mio people” but about “does source X say that berlin has 3.5 mio people”
[11:31] wikidata is not about truth
[11:31] but about referenceable facts
When I asked which fact would make it into an article’s info box when the facts are contested, Denny_WMDE replied that they’re working on this, and will post a proposal for discussion.
So, on the one hand, Wikidata is further commoditizing facts: making them easier and thus less expensive to find and “consume.” Historically, this is a good thing. Literacy did this. Tables of logarithms did it. Almanacs did it. Wikipedia has commoditized a level of knowledge one up from facts. Now Wikidata is doing it for facts in a way that not only will make them easy to look up, but will enable them to serve as data in computational quests, such as finding every city with a population of at least 100,000 that has an average temperature below 60F.
On the other hand, because Wikidata is doing this commoditizing in a networked space, its facts are themselves links — “referenceable facts” are both facts that can be referenced, and simultaneously facts that come with links to their own references. This is what Too Big to Know calls “networked facts.” Those references serve at least three purposes: 1. They let us judge the reliability of the fact. 2. They give us a pointer out into the endless web of facts and references. 3. They remind us that facts are not where the human responsibility for truth ends.
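To make the idea of a "referenceable fact" concrete, here is a minimal sketch in Python. It is not Wikidata's actual data model or API. The `Claim` class, the property names, the source labels, the made-up cities and every number are assumptions invented for illustration (the 3.5 million figure for Berlin simply echoes the chat above). Each claim carries its own source, two sources are allowed to disagree, and a query like the population-and-temperature one returns the supporting claims along with the answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """One sourced statement: 'source says that subject's prop is value'."""
    subject: str
    prop: str
    value: float
    source: str


# Entirely made-up sample data, for illustration only.
CLAIMS = [
    Claim("Berlin", "population", 3_500_000, "Source X"),
    Claim("Berlin", "population", 3_400_000, "Source Y"),  # sources may disagree
    Claim("Berlin", "avg_temp_f", 50.0, "Source Z"),
    Claim("Smallville", "population", 40_000, "Source X"),
    Claim("Smallville", "avg_temp_f", 55.0, "Source Z"),
    Claim("Hotburg", "population", 900_000, "Source Y"),
    Claim("Hotburg", "avg_temp_f", 75.0, "Source Z"),
]


def claims_for(subject, prop, claims=CLAIMS):
    """Return every sourced claim about one property of one subject."""
    return [c for c in claims if c.subject == subject and c.prop == prop]


def cities_matching(min_population, max_avg_temp_f, claims=CLAIMS):
    """Map each qualifying city to the claims that support its inclusion.

    A city qualifies if at least one source puts its population at or above
    min_population and at least one source puts its average temperature
    below max_avg_temp_f.
    """
    result = {}
    for city in sorted({c.subject for c in claims}):
        pop = [c for c in claims_for(city, "population", claims)
               if c.value >= min_population]
        temp = [c for c in claims_for(city, "avg_temp_f", claims)
                if c.value < max_avg_temp_f]
        if pop and temp:
            result[city] = pop + temp
    return result


if __name__ == "__main__":
    for city, support in cities_matching(100_000, 60).items():
        print(city)
        for claim in support:
            print(f"  {claim.prop} = {claim.value} (according to {claim.source})")
```

The design choice worth noticing is that the query never decides which source is right about Berlin. It hands back both population claims with their sources attached, which is roughly what "not about truth but about referenceable facts" amounts to in practice.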
Tagged with: 2b2k
• big data
Date: March 31st, 2012 dw
Scott F. Johnson has posted a dystopic provocation about the present of digital scholarship and possibly about its future.
Here’s the crux of his argument:
… as the deluge of information increases at a very fast pace — including both the digitization of scholarly materials unavailable in digital form previously and the new production of journals and books in digital form — and as the tools that scholars use to sift, sort, and search this material are increasingly unable to keep up — either by being limited in terms of the sheer amount of data they can deal with, or in terms of becoming so complex in terms of usability that the average scholar can’t use it — then the less likely it will be that a scholar can adequately cover the research material and write a convincing scholarly narrative today.
Thus, I would argue that in the future, when the computational tools (whatever they may be) eventually develop to a point of dealing profitably with the new deluge of digital scholarship, the backward-looking view of scholarship in our current transitional period may be generally disparaging. It may be so disparaging, in fact, that the scholarship of our generation will be seen as not trustworthy, or inherently compromised in some way by comparison with what came before (pre-digital) and what will come after (sophisticatedly digital).
Scott tentatively concludes:
For the moment one solution is to read less, but better. This may seem a luddite approach to the problem, but what other choice is there?
First, I should point out that the rest of Scott’s post makes it clear that he’s no Luddite. He understands the advantages of digital scholarship. But I look at this a little differently.
I agree with most of Scott’s description of the current state of digital scholarship and with the inevitability of an ever increasing deluge of scholarly digital material. But, I think the issue is not that the filters won’t be able to keep up with the deluge. Rather, I think we’re just going to have to give up on the idea of “keeping up” — much as newspapers and half hour news broadcasts have to give up the pretense that they are covering all the day’s events. The idea of coverage was always an internalization of the limitation of the old media, as if a newspaper, a broadcast, or even the lifetime of a scholar could embrace everything important there is to know about a field. Now the Net has made clear to us what we knew all along: most of what knowledge wanted to do was a mere dream.
So, for me the question is what scholarship and expertise look like when they cannot attain a sense of mastery by artificially limiting the material with which they have to deal. It was much easier when you only had to read at the pace of the publishers. Now you’d have to read at the pace of the writers…and there are so many more writers! So, lacking a canon, how can there be experts? How can you be a scholar?
I’m bad at predicting the future, and I don’t know if Scott is right that we will eventually develop such powerful search and filtering tools that the current generation of scholars will look like betwixt-and-between fools (or as an “asterisk,” as Scott says). There’s an argument that even if the pace of growth slows, the pace of complexification will increase. In any case, I’d guess that deep scholars will continue to exist because that’s more a personality trait than a function of the available materials. For example, I’m currently reading Armies of Heaven, by Jay Rubenstein. The depth of his knowledge about the First Crusade is astounding. Astounding. As more of the works he consulted come on line, other scholars of similar temperament will find it easier to pursue their deep scholarship. They will read less and better not as a tactic but because that’s how the world beckons to them. But the Net will also support scholars who want to read faster and do more connecting. Finally (and to me most interestingly) the Net is already helping us to address the scaling problem by facilitating the move of knowledge from books to networks. Books don’t scale. Networks do. Although, yes, that fundamentally changes the nature of knowledge and scholarship.
[Note: My initial post embedded one draft inside another and was a total mess. Ack. I've cleaned it up - Oct. 26, 2011, 4:03pm edt.]
Tagged with: 2b2k
• open access
Date: October 26th, 2011 dw
I’ve come to love Reddit. What started as a better Digg (and is yet another happy outcome of the remarkable Y Combinator) has turned into a way of sharing and interrogating news. Reddit as it stands is not the future of news. It is, however, a hope for news.
As at other sites, at Reddit readers post items they find interesting. Some come from the media, but many are home-made ideas, photos, drawings, videos, etc. You can vote them up or down, resulting in a list ordered by collective interests. Each is followed by threaded conversations, and those comments are also voted up or down.
It’s not clear why Reddit works so well, but it does. The comments in particular are often fiercely insightful or funny, turning into collective, laugh-out-loud riffs. Perhaps it helps that the ethos — the norm — is that comments are short. Half-tweets. You can go on for paragraphs if you want, but you’re unlikely to be up-voted if you do. The brevity of the individual comments can give them a pithiness that paragraphs would blunt, and the rapid threading of responses can quickly puncture inflated ideas or add unexpected perspectives.
But more relevant to the future of news are the rhetorical structures that Reddit has given names to. They’re no more new than Frequently Asked Questions are, but so what? FAQs have become a major new rhetorical form, of unquestioned value, because they got a name. Likewise TIL, IAMA, and AMA are hardly startling in their novelty, but they are pretty amazing in practice.
TIL = Today I Learned. People post an answer to a question you didn’t know you had, or a fact that counters your intuition. They range from the trivial (“TIL that Gilbert Gottfried has a REAL voice.”) to the opposite of the trivial (“TIL there is a US owned Hydrogen bomb that has been missing off the coast of Georga for over 50 years.”)
IAMA = I Am A. AMA = Ask Me Anything. People offer to answer questions about whatever it is that they are. Sometimes they are famous people, but more often they are people in circumstances we’re curious about: a waiter at an upscale restaurant, a woman with something like Elephant Man’s disease, a miner, or this morning’s: “IAmA guy who just saw the final Harry Potter movie without reading/watching any Harry Potter material beforehand. Being morbidly confused, I made up an entire previous plot for the movie to make sense in my had. I will answer your HP Series question based on the made up previous plot in my head AMA.” The invitation to Ask Me Anything typically unfetters the frankest of questions. It helps that Reddit discourages trolling and amidst the geeky cynicism permits honest statements of admiration and compassion.
The topics of IAMA’s are themselves instructive. Many are jokes: “IAmA person who has finished a whole tube of chapstick without losing it. AMA” But many enable us to ask questions that would falter in the face of conventional propriety: “IAmA woman married to a man with Asperger’s Syndrome AMA”. Some open up for inquiry a perspective that we take for granted or that was too outside our normal range of consideration: “IAMA: I was a German child during WWII that was in the Hitler Youth and had my city bombed by the U.S.”
Reddit also lets readers request an IAMA. For example, someone is asking if one of Michele Bachmann’s foster kids would care to engage. Might be interesting, don’t you think?
So, my hypothesis is that IAMA and AMA are an important type of citizen journalism. Call it “community journalism.”
Now, if you’ve clicked through to any of these IAMA’s, you may be disappointed at the level of “journalism” you’ve seen. For example, look at yesterday’s “IAMA police officer who was working during the London Riots. AMA.” Many of the comments are frivolous or off-topic. Most are responses to other comments, and many threads spin out into back-and-forth riffing that can be pretty damn funny. But it’s not exactly “60 Minutes.” So what? This is one way citizen journalism looks. At its best, it asks questions we all want asked, unearths questions we didn’t know we wanted asked, asks them more forthrightly than most American journalists dare, and gets better — more honest — answers than we hear from the mainstream media.
You can also see in the London police officer’s IAMA one of the main ways Reddit constitutes itself as a community: it binds itself together by common cultural references. The more obscure, the tighter the bond. For example, during the IAMA with the police officer in the London riots, someone asks if they’ve caught the guy who knocked over the trash can. This is an unlinked reference to a posting from a few days before of a spoof video of a middle class guy looking around an empty street and then casually knocking over a garbage can. The comments devolve into some silliness about arresting a sea gull for looting. The police officer threads right in:
[police officer] I do assure you we take it very seriously, however. Here, please have a Victim of Crime pack and a crime reference number. We will look into this issue as a matter of priority, and will send you a telegram in six-to-eight-weeks.
Telegram? Are you that cop who got transported back to the 1970s?
My friends call me Murphy.
Lawl, I’m watching RoboCop right now.
This community is both Reddit’s strength as a site, and its greatest weakness as a form of citizen journalism. Reddit illustrates why there are few quotes that simultaneously delight and scare me more than “If the news is important, it will find me.” This was uttered, according to Jane Buckingham (and reported in a 2008 Brian Stelter NY Times article) by a college student in a focus group. In my view, the quote would be more accurate if it read, “If the news is interesting to my social group, it will find me.” What’s interesting to a community is not enough to make us well informed because our community’s interests tend to be parochial and self-reinforcing. This is not so much a limitation of community as a way that communities constitute themselves.
And here’s where I think Reddit offers some hope.
First, it’s important to remember that Reddit is not intending to cover the news, even though its tag line is “The front page of the Internet.” It feels no responsibility to post and upvote a story simply because it is important. Rather, Reddit is a supplement to the news. If something is sufficiently covered by the mainstream — today the stock market went up dramatically, today the Supreme Court decided something — it exactly will not be covered as news at Reddit. Reddit is for what didn’t make it into the mainstream news. So, Reddit does not answer the question: How will we get news when the main stream dries up?
But it does make manifest a phenomenon that should take some of the gloom off our outlook. Take Reddit as a type of internet tabloid. Mainstream tabloids are sensationalistic: They indulge and enflame what are properly thought of as lower urges. But Reddit feeds and stimulates a curiosity about the world. It turns out that a miner — or a person who works at Subway — has a lot to tell us. It turns out that a steely British cop has a sense of humor. It turns out that American planes dropping bombs on a German city did not fly with halos over them. True, there’s a flood of trivial curios and tidbits at Reddit. Nevertheless, from mainstream tabloids you learn that humans are a weak and corrupt species that revels in the misfortunes of others. From Reddit you learn that we are creatures with a wild curiosity, indiscriminate in its fascinations. And you learn that we are a social species that takes little seriously and enjoys the multiplicity of refractions.
But is the curiosity exhibited at Reddit enough? I find this question rocks back and forth. The Reddit community constitutes itself through a set of references that belong to a particular group and that exclude those who just don’t get nods to Robocop. Yet it is a community that reaches for what is beyond its borders. Not far enough, sure. But it’s never far enough. Reddit’s interests are generally headed in the right direction: outward. Those interests often embrace more than what the mainstream has found room for. Still, the interests of any group are always going to reflect that group’s standpoint and self-filters. Reddit’s curiosity is unsystematic, opportunistic, and indiscriminate. You will not find all the news you need there. That’s why I say Reddit offers not a solution to the impending News Hole, but a hope. The hope is that while communities are based on shared interests and thus are at least somewhat insular, some communities can generate an outward-bound curiosity that delights in the unabashed exploration of what we have taken for granted and in the discovery of that which is outside its same-old boundaries.
But then there is the inevitable triviality of Reddit. Reddit topics, no matter how serious, engender long arcs of wisecracks and silliness. But this too tells us something, this time about the nature of curiosity. One of the mistakes we’ve made in journalism and education is to insist that curiosity is a serious business. Perhaps not. Perhaps curiosity needs a sense of humor.
I’m at an education conference put on by CET in Tel Aviv. This is the second day of the conference. The opening session is on business models for supporting the webification of the educational system.
NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.
Eli Hurvitz (former deputy director of the Rothschild Foundation, the funder of CET) is the moderator. The speakers are Michael Jon Jensen (Dir of Strategic Web Communications, National Academies Press), Eric Frank (co-founder of Flat World Knowledge) and Sheizaf Rafaelli (Dir. of the Sagy Center for Internet Research at Haifa Univ.)
Michael Jensen says he began with computers in 1980, thinking that books would be online within 5 yrs. He spent three years at Project Muse (1995-8), but left because they were spending half their money on keeping people away from their content. He went to the National Academies Press (part of the National Academy of Sciences). The National Academies does about 200 reports a year, the result of studies by about 20 experts focused on some question. While there are many wonderful things about crowd-sourcing, he says, “I’m in favor of expertise. Facts and opinions on the Web are cheap…but expertise, expert perspective and sound analysis are costly.” E.g., that humans are responsible for climate change is not in doubt, should not be presented as if it were in doubt, and should not be crowd-sourced, he says.
The National Academy has 4,800 books online, all available to be read online for free. (This includes an algorithmic skimmer that extracts the most important two-sentence chunk from every page.) [Now that should be crowd-sourced!] Since 2005, 65% are free for download in PDF. They get 1.4M visitors/month, each reading 7 pages on average. But only 0.2% buy anything.
The National Academy Press’ goal is access and sustainability. In 2001, they did an experiment: When people were buying a book, they were offered a download of a PDF for 80% of the price, then 60%, then 40%, then for free. 42% took the free PDF. But it would have been too expensive to make all PDFs free. The 65% that are now free PDFs are the “long tail” of books. “We are going to be in transition for the next 20 yrs.” Book sales have gone from 450,000/yr in 2002 to 175,000 in 2010. But, as they have given away more, they are disseminating about 850,000 units per year. “That means we’re fulfilling our publishing mission.” 260,000 people have opted in for getting notified of new books.
Michael goes through the available business options. NAP’s offerings are too broad for subscriptions. They will continue selling products. Authors fund some of the dissemination. And booksellers provide some revenue. There are different models for long-form content vs. articles vs. news vs. databases. Further, NAP has to provide multiple and new forms of content.
General lessons: Understand your mission. Make sure your strategy supports your mission. But digital strategies are a series of tactics. Design for the future, and “The highest resolution is never enough…Never dumb down.” “The print-based mindset will work for the next few years, but is a long-term dead end.” “‘Free’ of some kind is required.” Understand your readers, and develop relationships with them. Go where the audiences are. “Continue experimenting.” There is no single best model. “We are living in content hyperabundance, and must compete with everything else in the world.”
Eric Frank of Flat World Knowledge (“the largest commercial publisher of” open source textbooks) says that old business models are holding us back from achieving what’s possible with the Net. He points to a “value gap” in the marketplace. Many college textbooks are $200. The pain is not evenly distributed. Half of college students are in 2 yr colleges, where the cost of textbooks can be close to their tuition costs. The Net is disrupting the text book market already, e.g., through the online sale of used books, or text book rental models, or “piracy.” So, publishers are selling fewer units per year, and are raising prices to protect their revenues. There’s a “vicious downward spiral,” making everyone more and more unhappy.
Flat World Knowledge has two business models. First, it puts textbooks through an editorial process, and publishes them under open licenses. They vet their authors, and peer review the books. They publish their books under a Creative Commons license (attribution, non-commercial, share-alike); they retain the copyright, but allow users to reuse, revise, remix, and redistribute them. They provide a customization platform that looks quite slick: re-order the table of contents, add content, edit the content. It then generates multiple formats, including html, pdf, ePub, .mobi, digital Braille, .mp3. Students can choose the format that works best for them. The Web-based version and the versions for students with disabilities are free. They sell softcover books ($35 for b&w, $70 for color) and the other formats. They also sell study guides, online quizzes, and flashcards. 44% read for free online. 66% purchase something: 33% print, 3% audiobooks, 17% print it yourself, 3% ebooks.
Second business model: They license all of their intellectual property to an institution that buys a site license at $20/student, who then get access to the material in every format. Paper publishers’ unit sales tend to zero out over just a few semesters as students turn to other ways of getting the book. Flat World Knowledge’s unit sales tend to be steady. They pay authors 20% royalty (as opposed to a standard 13%), which results in higher cumulative revenues for the authors.
They currently have 112 authors (they launched in 2007 and published their first book in Spring 2009). 36 titles published; 42 in pipeline. Their costs are about a third of the industry and declining. Their time to market is about half of the traditionals (18 months vs. 40 months). 1,600 faculty have formally adopted their books, in 44 countries. Sales are growing at 320%. Their conversion rate of free to paid is currently at 61% and growing. They’ve raised $30M in venture capital. Bertelsmann has put in $15M. Random House today invested.
He ends by citing Kevin Kelly: The Net is a giant copy machine. When copies are super-abundant, they become worthless. So, you need to sell stuff that can’t be copied. Kevin lists 8 things that can’t be copied: immediacy, personalization, interpretation (study aids), authenticity (what the prof wants you to read), accessibility, embodiment (print copy), patronage (people want to pay creators), findability. Future for FWK: p2p tutoring, user-generated marketplace, self-assessment embedded within the books, data sales. “Knowledge is the black gold of the 21st century.”
[Sheizaf Rafaelli's talk was excellent — primarily about what happens when books lose bindings — but he spoke very quickly, and the talk itself did not lend itself to livebloggery, in part because I was hearing it in translation, which required more listening and less typing. Sorry. His slides are here. ]
I got to attend the Digital Public Library of America’s first workshop yesterday. It was an amazing experience that left me with the best kind of headache: Too much to think about! Too many possibilities for goodness!
Mainly because the Chatham House Rule was in effect, I tweeted instead of live-blogged; it’s hard to do a transcript-style live-blog when you’re not allowed to attribute words to people. (The tweet stream was quite lively.) Fortunately, John Palfrey, the head of the steering committee, did some high-value live-blogging, which you can find here: 1 2 3 4.
The DPLA is more of an intention than a plan. The DPLA is important because the intention is for something fundamentally liberating, the people involved have been thinking about and working on related projects for years, and the institutions carry a great deal of weight. So, if something is going to happen that requires widespread institutional support, this is the group with the best chance. The year of workshops that began yesterday aims at helping to figure out how the intention could become something real.
So, what is the intention? Something like: To bring the benefits of public libraries to every American. And there is, of course, no consensus even about a statement that broad. For example, the session opened with a discussion of public versus research libraries (with the “versus” thrown into immediate question). And, Terry Fisher at the very end of the day suggested that the DPLA ought to stand for a principle: Knowledge should be free and universally accessible. Throughout the course of the day, many other visions and pragmatic possibilities were raised by the sixty attendees. [Note: I've just violated the Chatham Rule by naming Terry, but I'm trusting he won't mind. Also, I very likely got his principle wrong. It's what I do.]
I came out of it invigorated and depressed at the same time. Invigorated: An amazing set of people, very significant national institutions ready to pitch in, an alignment on the value of access to the works of knowledge and culture. Depressed: The !@#$%-ing copyright laws are so draconian and, well, stupid, that it is hard to see how to take advantage of the new ways of connecting to ideas and to one another. As one well-known Internet archivist said, we know how to make works of the 19th and 21st centuries accessible, but the 20th century is pretty much lost: Anything created after 1923 will be in copyright about as long as there’s a Sun to read by, and the gigantic mass of works that are out of print, but the authors are dead or otherwise unreachable, is locked away as firmly as an employee restroom at a Disney theme park.
So, here are some of the issues we discussed yesterday that I found came home with me. Fortunately, most are not intractable, but all are difficult to resolve and, some, to implement:
Should the DPLA aggregate content or be a directory? Much of the discussion yesterday focused on the DPLA as an aggregation of e-works. Maybe. But maybe it should be more of a directory. That’s the approach taken by the European online library, Europeana. But being a directory is not as glamorous or useful. And it doesn’t use the combined heft of the participating institutions to drive more favorable licensing terms or legislative changes since it itself is not doing any licensing.
Who is the user? How generic? Does the DPLA have to provide excellent tools for scholars and researchers, too? (See the next question.)
Site or ecology? At one extreme, the DPLA could be nothing but a site where you find e-content. At the other extreme, it wouldn’t even have a site but would be an API-based development platform so that others can build sites that are tuned to specific uses and users. I think the room agrees that it has to do both, although people care differently about the functions. It will have to provide a convenient way for users to find ebooks, but I hope that it will have an incredibly robust and detailed API so that someone who wants to build a community-based browse-and-talk environment for scholars of the Late 19th Century French Crueller can. And if I personally had to decide between the DPLA being a site or metadata + protocols + APIs, I’d go with the righthand disjunct in a flash.
Should the DPLA aim at legislative changes? My sense of the room is that while everyone would like to see copyright heavily amended, DPLA needs to have a strategy for launching while working within existing law.
Should the DPLA only provide access to materials users can access for free? That meets much of what we expect from public libraries (although many local libraries do charge a little for DVDs), but it fails Terry Fisher’s principle. (I don’t mean to imply that everyone there agreed with Terry, btw.)
What should the DPLA do to launch quickly and well? The sense of the room was that it’s important that DPLA not get stuck in committee for years, but should launch something quickly. Unfortunately, the easiest stuff to launch with is public domain works, many of which are already widely available. There were some suggestions for other sources of public domain works, such as government documents. But, then the DPLA would look like a specialty library, instead of the first place people turn to when they want an e-book or other such content.
How to pay for it? There was little talk of business models yesterday, but it was a short day for a big topic. There were occasional suggestions, such as just outright buying e-books (rather than licensing them), in part to meet the library’s traditional role of preserving works as well as providing access to them.
How important is expert curation? There seemed to be a genuine divide — pretty much undiscussed, possibly because it’s a divisive topic — about the value of curation. A few people suggested quite firmly that expert curation is a core value provided by libraries: you go to the library because you know you can trust what is in it. I personally don’t see that scaling, think there are other ways of meeting the same need, and worry that the promise is itself illusory. This could turn out to be a killer issue. Who determines what gets into the DPLA (if the concept of there being an inside to the DPLA even turns out to make sense)?
Is the environment stable enough to build a DPLA? Much of the conversation during the workshop assumed that book and journal publishers are going to continue as the mediating centers of the knowledge industry. But, as with music publishers, much of the value of publishers has left the building and now lives on the Net. So, the DPLA may be structuring itself around a model that is just waiting to be disrupted. Which brings me to the final question I left wondering about:
How disruptive should the DPLA be? No one’s suggesting that the DPLA be a rootin’ tootin’ bay of pirates, ripping works out of the hands of copyright holders and setting them free, all while singing ribald sea shanties. But how disruptive can it be? On the one hand, the DPLA could be a portal to e-works that are safely out of copyright or licensed. That would be useful. But, if the DPLA were to take Terry’s principle as its mission — knowledge ought to be free and universally accessible — the DPLA would worry less about whether it’s doing online what libraries do offline, and would instead start from scratch asking: Given the astounding set of people and institutions assembled around this opportunity, what can we do together to make knowledge as free and universally accessible as possible? Maybe a library is not the best transformative model.
Of course, given the greed-based, anti-knowledge, culture-killing copyright laws, the fact may be that the DPLA simply cannot be very disruptive. Which brings me right back to my depression. And yet, exhilaration.
The DPLA wiki is here.
The deadline for my book is looming, but I spoke today with Michael Edson, Director of Web and New Media Strategy at the Smithsonian, and I’d love to include his idea for a Smithsonian Commons.
The Smithsonian Commons would make publicly available digital content and information drawn from the magnificent Smithsonian collections, allowing visitors to interact with it, repost it, add to it, and mash it up. It begins with being able to find everything about, say, Theodore Roosevelt, that is currently dispersed across multiple collections and museums: photos, books, the original Teddy bear, recordings of the TR campaign song, a commemorative medal, a car named after him, contemporary paintings of his exploits, the chaps he wore on his ranch…But Michael is actually most enthusiastic about the “network effects” that can accrue to knowledge when you let lots of people add what they know, either on the Commons site itself or out across the whole linked Internet.
Smithsonian Commons goes way beyond putting online as much of our national museum as possible — which should be enough to justify its creation. It goes beyond bringing to bear everything curators, experts, and passionate visitors know to increase our understanding of what is there. By allowing us to discover connections, link in and out, and add ideas and knowledge, what used to be a “mere” collection will be an embedded part of countless webs of knowledge that in turn add value to one another. That is to say, we will be able to take up the objects of our heritage in ways that will make them more distinctly and uniquely ours than ever before.
Let’s hope Smithsonian Commons goes from idea to a national — global — center of ideas, creativity, knowledge, and learning.