
December 28, 2012

[2b2k][eim] Over my head

I’m not sure how I came into possession of a copy of The Indexer, a publication by the Society of Indexers, but I thoroughly enjoyed it despite not being a professional indexer. Or, more exactly, because I’m not a professional indexer. It brings me joy to watch experts operate at levels far above me.

The issue of The Indexer I happen to have — Vol. 30, No. 1, March 2012 — focuses on digital trends, with several articles on the Semantic Web and XML-based indexes as well as several on broad trends in digital reading and digital books, and on graphical visualizations of digital indexes. All good.

I also enjoyed a recurring feature: Indexes reviewed. This aggregates snippets of book reviews that mention the quality of the indexes. Among the positive reviews, the Sunday Telegraph thinks that for the book My Dear Hugh, “the indexer had a better understanding of the book than the editor himself.” That’s certainly going on someone’s résumé!

I’m not sure why I enjoy works of expertise in fields I know little about. It’s true that I know a little about this field because I’ve written about the organization of digital information, and even a bit about indexing itself. And I have a lot of interest in the questions about the future of digital books that happen to be discussed in this particular issue of The Indexer. That enables me to make more sense of the journal than might otherwise be the case. But even so, what I enjoy most are the discussions of topics that exhibit the professionals’ deep involvement in their craft.

But I think what I enjoy most of all is the discovery that something as seemingly simple as generating an index turns out to be indefinitely deep. There are endless technical issues, but also fathomless questions of principle. There’s even indexer humor. For example, one of the index reviews notes that Craig Brown’s The Lost Diaries “gives references with deadpan precision (‘Greer, Germaine: condemns Queen, 13-14…condemns pineapple, 70…condemns fat, thin and medium sized women, 93…condemns kangaroos,122’).”

As I’ve said before, everything is interesting if observed at the right level of detail.


Categories: everythingIsMiscellaneous, experts, too big to know Tagged with: 2b2k • everythingIsMiscellaneous • experts • index Date: December 28th, 2012 dw


September 10, 2012

Obesity is good for your heart

From TheHeart.org, an article by Lisa Nainggolan:

Gothenburg, Sweden – Further support for the concept of the obesity paradox has come from a large study of patients with acute coronary syndrome (ACS) in the Swedish Coronary Angiography and Angioplasty Registry (SCAAR) [1]. Those who were deemed overweight or obese by body-mass index (BMI) had a lower risk of death after PCI [percutaneous coronary intervention, aka angioplasty] than normal-weight or underweight participants up to three years after hospitalization, report Dr Oskar Angerås (University of Gothenburg, Sweden) and colleagues in their paper, published online September 5, 2012 in the European Heart Journal.

Can confirm. My grandmother in the 1930s was instructed to make sure she fed her husband lots and lots of butter to lubricate his heart after a heart attack. This proved to work extraordinarily well, at least until his next heart attack.

I refer once again to the classic 1999 Onion headline: “Eggs Good for You This Week.”


Categories: experts, science, too big to know Tagged with: 2b2k • experts • medicine • obesity Date: September 10th, 2012 dw


March 31, 2012

[2b2k] The commoditizing and networking of facts

Ars Technica has a post about Wikidata, a proposed new project from the folks that brought you Wikipedia. From the project’s introductory page:

Many Wikipedia articles contain facts and connections to other articles that are not easily understood by a computer, like the population of a country or the place of birth of an actor. In Wikidata you will be able to enter that information in a way that makes it processable by the computer. This means that the machine can provide it in different languages, use it to create overviews of such data, like lists or charts, or answer questions that can hardly be answered automatically today.

Because I had some questions not addressed in the Wikidata pages that I saw, I went onto the Wikidata IRC chat (http://webchat.freenode.net/?channels=#wikimedia-wikidata) where Denny_WMDE answered some questions for me.

[11:29] hi. I’m very interested in wikidata and am trying to write a brief blog post, and have a n00b question.

[11:29] go ahead!

[11:30] When there’s disagreement about a fact, will there be a discussion page where the differences can be worked through in public?

[11:30] two-fold answer

[11:30] 1. there will be a discussion page, yes

[11:31] 2. every fact can always have references accompanying it. so it is not about “does berlin really have 3.5 mio people” but about “does source X say that berlin has 3.5 mio people”

[11:31] wikidata is not about truth

[11:31] but about referenceable facts

When I asked which fact would make it into an article’s info box when the facts are contested, Denny_WMDE replied that they’re working on this, and will post a proposal for discussion.

So, on the one hand, Wikidata is further commoditizing facts: making them easier and thus less expensive to find and “consume.” Historically, this is a good thing. Literacy did this. Tables of logarithms did it. Almanacs did it. Wikipedia has commoditized a level of knowledge one up from facts. Now Wikidata is doing it for facts in a way that not only will make them easy to look up, but will enable them to serve as data in computational quests, such as finding every city with a population of at least 100,000 that has an average temperature below 60F.
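
What makes that kind of computational quest possible is that each fact becomes queryable data. Here’s a rough sketch of the population half of that example, written in Python against the SPARQL endpoint that Wikidata later shipped (my illustration, not part of the proposal itself; P31 = “instance of”, Q515 = “city”, and P1082 = “population” are real Wikidata identifiers, and I’ve dropped the temperature clause because Wikidata’s climate coverage is sparse):

```python
# A sketch of the "computational quest" above, run against the SPARQL
# endpoint Wikidata later shipped. P31 = "instance of", Q515 = "city",
# P1082 = "population". The temperature half of the example is omitted.
import requests

QUERY = """
SELECT ?city ?cityLabel ?population WHERE {
  ?city wdt:P31 wd:Q515 ;
        wdt:P1082 ?population .
  FILTER(?population >= 100000)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 10
"""

resp = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": QUERY, "format": "json"},
    headers={"User-Agent": "wikidata-sketch/0.1"},
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["cityLabel"]["value"], row["population"]["value"])
```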

On the other hand, because Wikidata is doing this commoditizing in a networked space, its facts are themselves links — “referenceable facts” are both facts that can be referenced, and simultaneously facts that come with links to their own references. This is what Too Big to Know calls “networked facts.” Those references serve at least three purposes: 1. They let us judge the reliability of the fact. 2. They give us a pointer out into the endless web of facts and references. 3. They remind us that facts are not where the human responsibility for truth ends.
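
To make “facts that come with links to their own references” concrete: every Wikidata statement carries a list of references that can be fetched along with the fact itself. A minimal sketch, again my own illustration rather than anything from the Wikidata docs, using the real wbgetclaims API and assuming Berlin = Q64 and population = P1082:

```python
# A minimal sketch of a "networked fact": each Wikidata claim carries
# its own references list. Assumes Berlin = Q64, population = P1082.
import requests

resp = requests.get(
    "https://www.wikidata.org/w/api.php",
    params={
        "action": "wbgetclaims",
        "entity": "Q64",      # Berlin
        "property": "P1082",  # population
        "format": "json",
    },
    headers={"User-Agent": "networked-facts-sketch/0.1"},
)
resp.raise_for_status()
for claim in resp.json()["claims"].get("P1082", []):
    amount = claim["mainsnak"]["datavalue"]["value"]["amount"]
    refs = claim.get("references", [])
    print(f"population {amount}: backed by {len(refs)} reference(s)")
```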


Categories: experts, too big to know Tagged with: 2b2k • big data • facts • wikidata • wikipedia Date: March 31st, 2012 dw


October 26, 2011

[2b2k] Will digital scholarship ever keep up?

Scott F. Johnson has posted a dystopic provocation about the present of digital scholarship and possibly about its future.

Here’s the crux of his argument:

… as the deluge of information increases at a very fast pace — including both the digitization of scholarly materials unavailable in digital form previously and the new production of journals and books in digital form — and as the tools that scholars use to sift, sort, and search this material are increasingly unable to keep up — either by being limited in terms of the sheer amount of data they can deal with, or in terms of becoming so complex in terms of usability that the average scholar can’t use it — then the less likely it will be that a scholar can adequately cover the research material and write a convincing scholarly narrative today.

…

Thus, I would argue that in the future, when the computational tools (whatever they may be) eventually develop to a point of dealing profitably with the new deluge of digital scholarship, the backward-looking view of scholarship in our current transitional period may be generally disparaging. It may be so disparaging, in fact, that the scholarship of our generation will be seen as not trustworthy, or inherently compromised in some way by comparison with what came before (pre-digital) and what will come after (sophisticatedly digital).

Scott tentatively concludes:

For the moment one solution is to read less, but better. This may seem a luddite approach to the problem, but what other choice is there?

First, I should point out that the rest of Scott’s post makes it clear that he’s no Luddite. He understands the advantages of digital scholarship. But I look at this a little differently.

I agree with most of Scott’s description of the current state of digital scholarship and with the inevitability of an ever-increasing deluge of scholarly digital material. But, I think the issue is not that the filters won’t be able to keep up with the deluge. Rather, I think we’re just going to have to give up on the idea of “keeping up” — much as newspapers and half-hour news broadcasts have to give up the pretense that they are covering all the day’s events. The idea of coverage was always an internalization of the limitations of the old media, as if a newspaper, a broadcast, or even the lifetime of a scholar could embrace everything important there is to know about a field. Now the Net has made clear to us what we knew all along: most of what knowledge wanted to do was a mere dream.

So, for me the question is what scholarship and expertise look like when they cannot attain a sense of mastery by artificially limiting the material with which they have to deal. It was much easier when you only had to read at the pace of the publishers. Now you’d have to read at the pace of the writers…and there are so many more writers! So, lacking a canon, how can there be experts? How can you be a scholar?

I’m bad at predicting the future, and I don’t know if Scott is right that we will eventually develop such powerful search and filtering tools that the current generation of scholars will look like betwixt-and-between fools (or as an “asterisk,” as Scott says). There’s an argument that even if the pace of growth slows, the pace of complexification will increase. In any case, I’d guess that deep scholars will continue to exist because that’s more a personality trait than a function of the available materials. For example, I’m currently reading Armies of Heaven, by Jay Rubenstein. The depth of his knowledge about the First Crusade is astounding. Astounding. As more of the works he consulted come online, other scholars of similar temperament will find it easier to pursue their deep scholarship. They will read less and better not as a tactic but because that’s how the world beckons to them. But the Net will also support scholars who want to read faster and do more connecting. Finally (and to me most interestingly), the Net is already helping us to address the scaling problem by facilitating the move of knowledge from books to networks. Books don’t scale. Networks do. Although, yes, that fundamentally changes the nature of knowledge and scholarship.

[Note: My initial post embedded one draft inside another and was a total mess. Ack. I’ve cleaned it up – Oct. 26, 2011, 4:03pm edt.]


Categories: experts, too big to know Tagged with: 2b2k • libraries • open access • scholarship Date: October 26th, 2011 dw


August 13, 2011

Reddit and community journalism

I’ve come to love Reddit. What started as a better Digg (and is yet another happy outcome of the remarkable Y Combinator) has turned into a way of sharing and interrogating news. Reddit as it stands is not the future of news. It is, however, a hope for news.

As at other sites, at Reddit readers post items they find interesting. Some come from the media, but many are home-made ideas, photos, drawings, videos, etc. You can vote them up or down, resulting in a list ordered by collective interests. Each is followed by threaded conversations, and those comments are also voted up or down.

It’s not clear why Reddit works so well, but it does. The comments in particular are often fiercely insightful or funny, turning into collective, laugh-out-loud riffs. Perhaps it helps that the ethos — the norm — is that comments are short. Half-tweets. You can go on for paragraphs if you want, but you’re unlikely to be up-voted if you do. The brevity of the individual comments can give them a pithiness that paragraphs would blunt, and the rapid threading of responses can quickly puncture inflated ideas or add unexpected perspectives.

But more relevant to the future of news are the rhetorical structures that Reddit has given names to. They’re no more new than Frequently Asked Questions are, but so what? FAQs have become a major new rhetorical form, of unquestioned value, because they got a name. Likewise TIL, IAMA, and AMA are hardly startling in their novelty, but they are pretty amazing in practice.

TIL = Today I Learned. People post an answer to a question you didn’t know you had, or a fact that counters your intuition. They range from the trivial (“TIL that Gilbert Gottfried has a REAL voice.”) to the opposite of the trivial (“TIL there is a US owned Hydrogen bomb that has been missing off the coast of Georga for over 50 years.”).

IAMA = I Am A. AMA = Ask Me Anything. People offer to answer questions about whatever it is that they are. Sometimes they are famous people, but more often they are people in circumstances we’re curious about: a waiter at an upscale restaurant, a woman with something like Elephant Man’s disease, a miner, or this morning’s: “IAmA guy who just saw the final Harry Potter movie without reading/watching any Harry Potter material beforehand. Being morbidly confused, I made up an entire previous plot for the movie to make sense in my had. I will answer your HP Series question based on the made up previous plot in my head AMA.” The invitation to Ask Me Anything typically unfetters the frankest of questions. It helps that Reddit discourages trolling and amidst the geeky cynicism permits honest statements of admiration and compassion.

The topics of IAMA’s are themselves instructive. Many are jokes: “IAmA person who has finished a whole tube of chapstick without losing it. AMA” But many enable us to ask questions that would falter in the face of conventional propriety: “IAmA woman married to a man with Asperger’s Syndrome AMA”. Some open up for inquiry a perspective that we take for granted or that was too outside our normal range of consideration: “IAMA: I was a German child during WWII that was in the Hitler Youth and had my city bombed by the U.S.”

Reddit also lets readers request an IAMA. For example, someone is asking if one of Michele Bachmann’s foster kids would care to engage. Might be interesting, don’t you think?

So, my hypothesis is that IAMA and AMA are an important type of citizen journalism. Call it “community journalism.”

Now, if you’ve clicked through to any of these IAMA’s, you may be disappointed at the level of “journalism” you’ve seen. For example, look at yesterday’s “IAMA police officer who was working during the London Riots. AMA.” Many of the comments are frivolous or off-topic. Most are responses to other comments, and many threads spin out into back-and-forth riffing that can be pretty damn funny. But it’s not exactly “60 Minutes.” So what? This is one way citizen journalism looks. At its best, it asks questions we all want asked, unearths questions we didn’t know we wanted asked, asks them more forthrightly than most American journalists dare, and gets better — more honest — answers than we hear from the mainstream media.

You can also see in the London police officer’s IAMA one of the main ways Reddit constitutes itself as a community: it binds itself together by common cultural references. The more obscure, the tighter the bond. For example, during the IAMA with the police officer in the London riots, someone asks if they’ve caught the guy who knocked over the trash can. This is an unlinked reference to a posting from a few days before of a spoof video of a middle class guy looking around an empty street and then casually knocking over a garbage can. The comments devolve into some silliness about arresting a sea gull for looting. The police officer threads right in:

[police officer] I do assure you we take it very seriously, however. Here, please have a Victim of Crime pack and a crime reference number. We will look into this issue as a matter of priority, and will send you a telegram in six-to-eight-weeks.

AmbroseChapel
Telegram? Are you that cop who got transported back to the 1970s?

[police officer]
My friends call me Murphy.

derpedatbirth
Lawl, I’m watching RoboCop right now.

This community is both Reddit’s strength as a site, and its greatest weakness as a form of citizen journalism. Reddit illustrates why there are few quotes that simultaneously delight and scare me more than “If the news is important, it will find me.” This was uttered, according to Jane Buckingham (and reported in a 2008 Brian Stelter NY Times article) by a college student in a focus group. In my view, the quote would be more accurate if it read, “If the news is interesting to my social group, it will find me.” What’s interesting to a community is not enough to make us well informed because our community’s interests tend to be parochial and self-reinforcing. This is not so much a limitation of community as a way that communities constitute themselves.

And here’s where I think Reddit offers some hope.

First, it’s important to remember that Reddit is not intending to cover the news, even though its tag line is “The front page of the Internet.” It feels no responsibility to post and upvote a story simply because it is important. Rather, Reddit is a supplement to the news. If something is sufficiently covered by the mainstream — today the stock market went up dramatically, today the Supreme Court decided something — it is exactly what will not be covered as news at Reddit. Reddit is for what didn’t make it into the mainstream news. So, Reddit does not answer the question: How will we get news when the main stream dries up?

But it does make manifest a phenomenon that should take some of the gloom off our outlook. Take Reddit as a type of internet tabloid. Mainstream tabloids are sensationalistic: They indulge and enflame what are properly thought of as lower urges. But Reddit feeds and stimulates a curiosity about the world. It turns out that a miner — or a person who works at Subway — has a lot to tell us. It turns out that a steely British cop has a sense of humor. It turns out that American planes dropping bombs on a German city did not fly with halos over them. True, there’s a flood of trivial curios and tidbits at Reddit. Nevertheless, from mainstream tabloids you learn that humans are a weak and corrupt species that revels in the misfortunes of others. From Reddit you learn that we are creatures with a wild curiosity, indiscriminate in its fascinations. And you learn that we are a social species that takes little seriously and enjoys the multiplicity of refractions.

But is the curiosity exhibited at Reddit enough? I find this question rocks back and forth. The Reddit community constitutes itself through a set of references that belong to a particular group and that exclude those who just don’t get nods to RoboCop. Yet it is a community that reaches for what is beyond its borders. Not far enough, sure. But it’s never far enough. Reddit’s interests are generally headed in the right direction: outward. Those interests often embrace more than what the mainstream has found room for. Still, the interests of any group are always going to reflect that group’s standpoint and self-filters. Reddit’s curiosity is unsystematic, opportunistic, and indiscriminate. You will not find all the news you need there. That’s why I say Reddit offers not a solution to the impending News Hole, but a hope. The hope is that while communities are based on shared interests and thus are at least somewhat insular, some communities can generate an outward-bound curiosity that delights in the unabashed exploration of what we have taken for granted and in the discovery of that which is outside its same-old boundaries.

But then there is the inevitable triviality of Reddit. Reddit topics, no matter how serious, engender long arcs of wisecracks and silliness. But this too tells us something, this time about the nature of curiosity. One of the mistakes we’ve made in journalism and education is to insist that curiosity is a serious business. Perhaps not. Perhaps curiosity needs a sense of humor.


Categories: culture, experts, journalism, social media, too big to know Tagged with: 2b2k • citizen journalism • journalism • media • reddit Date: August 13th, 2011 dw


March 28, 2011

ePublishing business models

I’m at an education conference put on by CET in Tel Aviv. This is the second day of the conference. The opening session is on business models for supporting the webification of the educational system.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Eli Hurvitz (former deputy director of the Rothschild Foundation, the funder of CET) is the moderator. The speakers are Michael Jon Jensen (Dir of Strategic Web Communications, National Academies Press), Eric Frank (co-founder of Flat World Knowledge) and Sheizaf Rafaelli (Dir. of the Sagy Center for Internet Research at Haifa Univ.)

Michael Jensen says he began with computers in 1980, thinking that books would be online within 5 yrs. He spent three years at Project Muse (1995-8), but left because they were spending half their money on keeping people away from their content. He went to the National Academies Press (part of the National Academy of Sciences). The National Academies does about 200 reports a year, the result of studies by about 20 experts focused on some question. While there are many wonderful things about crowd-sourcing, he says, “I’m in favor of expertise. Facts and opinions on the Web are cheap…but expertise, expert perspective and sound analysis are costly.” E.g., that humans are responsible for climate change is not in doubt, should not be presented as if it were in doubt, and should not be crowd-sourced, he says.

The National Academy has 4,800 books online, all available to be read online for free. (This includes an algorithmic skimmer that extracts the most important two-sentence chunk from every page.) [Now that should be crowd-sourced!] Since 2005, 65% are free for download in PDF. They get 1.4M visitors/month, each reading 7 pages on average. But only 0.2% buy anything.
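
NAP hasn’t said how that skimmer works, so here’s a hypothetical sketch of one naive approach (my assumption, not NAP’s actual algorithm): score each sentence by the average frequency of its words across the page, and keep the top two in their original order.

```python
# A hypothetical two-sentence skimmer (one naive approach, not NAP's
# actual algorithm): score each sentence by the average page-wide
# frequency of its words, then keep the top two in reading order.
import re
from collections import Counter

def skim(page_text: str, keep: int = 2) -> str:
    sentences = re.split(r"(?<=[.!?])\s+", page_text.strip())
    freq = Counter(re.findall(r"[a-z']+", page_text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    top = set(sorted(sentences, key=score, reverse=True)[:keep])
    # Preserve the original reading order of the chosen sentences.
    return " ".join(s for s in sentences if s in top)
```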

The National Academy Press’ goal is access and sustainability. In 2001, they did an experiment: When people were buying a book, they were offered a download of a PDF for 80% of the price, then 60%, then 40%, then for free. 42% took the free PDF. But it would have been too expensive to make all PDFs free. The 65% that are now free PDFs are the “long tail” of books. “We are going to be in transition for the next 20 yrs.” Book sales have gone from 450,000/yr in 2002 to 175,000 in 2010. But, as they have given away more, they are disseminating about 850,000 units per year. “That means we’re fulfilling our publishing mission.” 260,000 people have opted in for getting notified of new books.

Michael goes through the available business options. NAP’s offerings are too broad for subscriptions. They will continue selling products. Authors fund some of the dissemination. And booksellers provide some revenue. There are different models for long-form content vs. articles vs. news vs. databases. Further, NAP has to provide multiple and new forms of content.

General lessons: Understand your mission. Make sure your strategy supports your mission. But digital strategies are a series of tactics. Design for the future. And “The highest resolution is never enough…Never dumb down.” “The print-based mindset will work for the next few years, but is a long-term dead end.” “‘Free’ of some kind is required.” Understand your readers, and develop relationships with them. Go where the audiences are. “Continue experimenting.” There is no single best model. “We are living in content hyperabundance, and must compete with everything else in the world.”

Eric Frank of Flat World Knowledge (“the largest commercial publisher of” open source textbooks) says that old business models are holding us back from achieving what’s possible with the Net. He points to a “value gap” in the marketplace. Many college textbooks are $200. The pain is not evenly distributed. Half of college students are in 2-yr colleges, where the cost of textbooks can be close to their tuition costs. The Net is disrupting the textbook market already, e.g., through the online sale of used books, or textbook rental models, or “piracy.” So, publishers are selling fewer units per year, and are raising prices to protect their revenues. There’s a “vicious downward spiral,” making everyone more and more unhappy.

Flat World Knowledge has two business models. First, it puts textbooks through an editorial process, and publishes them under open licenses. They vet their authors, and peer review the books. They publish their books under a Creative Commons license (attribution, non-commercial, share-alike); they retain the copyright, but allow users to reuse, revise, remix, and redistribute them. They provide a customization platform that looks quite slick: re-order the table of contents, add content, edit the content. It then generates multiple formats, including HTML, PDF, ePub, .mobi, digital Braille, and .mp3. Students can choose the format that works best for them. The Web-based version and the versions for students with disabilities are free. They sell softcover books ($35 for b&w, $70 for color) and the other formats. They also sell study guides, online quizzes, and flashcards. 44% read for free online. 66% purchase something: 33% print, 3% audiobooks, 17% print it yourself, 3% ebooks.

Second business model: They license all of their intellectual property to an institution that buys a site license at $20/student; students then get access to the material in every format. Paper publishers’ unit sales tend to zero out over just a few semesters as students turn to other ways of getting the book. Flat World Knowledge’s unit sales tend to be steady. They pay authors a 20% royalty (as opposed to a standard 13%), which results in higher cumulative revenues for the authors.

They currently have 112 authors (they launched in 2007 and published their first book in Spring 2009). 36 titles published; 42 in pipeline. Their costs are about a third of the industry and declining. Their time to market is about half of the traditionals (18 months vs. 40 months). 1,600 faculty have formally adopted their books, in 44 countries. Sales are growing at 320%. Their conversion rate of free to paid is currently at 61% and growing. They’ve raised $30M in venture capital. Bertelsmann has put in $15M. Random House today invested.

He ends by citing Kevin Kelly: The Net is a giant copy machine. When copies are super-abundant, they become worthless. So, you need to sell stuff that can’t be copied. Kevin lists 8 things that can’t be copied: immediacy, personalization, interpretation (study aids), authenticity (what the prof wants you to read), accessibility, embodiment (print copy), patronage (people want to pay creators), findability. Future for FWK: p2p tutoring, user-generated marketplace, self-assessment embedded within the books, data sales. “Knowledge is the black gold of the 21st century.”

[Sheizaf Rafaelli’s talk was excellent — primarily about what happens when books lose bindings — but he spoke very quickly, and the talk itself did not lend itself to livebloggery, in part because I was hearing it in translation, which required more listening and less typing. Sorry. His slides are here. ]


Categories: business, education, experts, libraries, liveblog, open access, too big to know Tagged with: copyright • e-books • ebooks • publishing Date: March 28th, 2011 dw


March 2, 2011

Questions from and for the Digital Public Library of America workshop

I got to attend the Digital Public Library of America‘s first workshop yesterday. It was an amazing experience that left me with the best kind of headache: Too much to think about! Too many possibilities for goodness!

Mainly because the Chatham House Rule was in effect, I tweeted instead of live-blogged; it’s hard to do a transcript-style live-blog when you’re not allowed to attribute words to people. (The tweet stream was quite lively.) Fortunately, John Palfrey, the head of the steering committee, did some high-value live-blogging, which you can find here: 1 2 3 4.

The DPLA is more of an intention than a plan. The DPLA is important because the intention is for something fundamentally liberating, the people involved have been thinking about and working on related projects for years, and the institutions carry a great deal of weight. So, if something is going to happen that requires widespread institutional support, this is the group with the best chance. The year of workshops that began yesterday aims at helping to figure out how the intention could become something real.

So, what is the intention? Something like: To bring the benefits of public libraries to every American. And there is, of course, no consensus even about a statement that broad. For example, the session opened with a discussion of public versus research libraries (with the “versus” thrown into immediate question). And, Terry Fisher at the very end of the day suggested that the DPLA ought to stand for a principle: Knowledge should be free and universally accessible. Throughout the course of the day, many other visions and pragmatic possibilities were raised by the sixty attendees. [Note: I’ve just violated the Chatham Rule by naming Terry, but I’m trusting he won’t mind. Also, I very likely got his principle wrong. It’s what I do.]

I came out of it invigorated and depressed at the same time. Invigorated: An amazing set of people, very significant national institutions ready to pitch in, an alignment on the value of access to the works of knowledge and culture. Depressed: The !@#$%-ing copyright laws are so draconian and, well, stupid, that it is hard to see how to take advantage of the new ways of connecting to ideas and to one another. As one well-known Internet archivist said, we know how to make works of the 19th and 21st centuries accessible, but the 20th century is pretty much lost: Anything created after 1923 will be in copyright about as long as there’s a Sun to read by, and the gigantic mass of works that are out of print, but the authors are dead or otherwise unreachable, is locked away as firmly as an employee restroom at a Disney theme park.

So, here are some of the issues we discussed yesterday that came home with me. Fortunately, most are not intractable, but all are difficult to resolve and some to implement:

Should the DPLA aggregate content or be a directory? Much of the discussion yesterday focused on the DPLA as an aggregation of e-works. Maybe. But maybe it should be more of a directory. That’s the approach taken by the European online library, Europeana. But being a directory is not as glamorous or useful. And it doesn’t use the combined heft of the participating institutions to drive more favorable licensing terms or legislative changes since it itself is not doing any licensing.

Who is the user? How generic? Does the DPLA have to provide excellent tools for scholars and researchers, too? (See the next question.)

Site or ecology? At one extreme, the DPLA could be nothing but a site where you find e-content. At the other extreme, it wouldn’t even have a site but would be an API-based development platform so that others can build sites that are tuned to specific uses and users. I think the room agrees that it has to do both, although people care differently about the functions. It will have to provide a convenient way for users to find ebooks, but I hope that it will have an incredibly robust and detailed API so that someone who wants to build a community-based browse-and-talk environment for scholars of the Late 19th Century French Crueller can. And if I personally had to decide between the DPLA being a site or metadata + protocols + APIs, I’d go with the righthand disjunct in a flash.
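
For what it’s worth, the API half of that disjunct is easy to sketch. Here’s a hedged example in Python, written against the public v2 items endpoint that the DPLA eventually shipped (it didn’t exist when this was posted, and YOUR_KEY stands in for a registered api_key):

```python
# A sketch of the API-first vision: query the DPLA items endpoint
# (the public v2 interface that launched later) and print titles.
# Assumption: you have registered for an api_key; YOUR_KEY below is
# a placeholder, not a real credential.
import requests

def search_dpla(query: str, api_key: str, page_size: int = 5) -> None:
    resp = requests.get(
        "https://api.dp.la/v2/items",
        params={"q": query, "page_size": page_size, "api_key": api_key},
    )
    resp.raise_for_status()
    for doc in resp.json().get("docs", []):
        print(doc.get("sourceResource", {}).get("title"))

# Example: search_dpla("public libraries", api_key="YOUR_KEY")
```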

Should the DPLA aim at legislative changes? My sense of the room is that while everyone would like to see copyright heavily amended, DPLA needs to have a strategy for launching while working within existing law.

Should the DPLA only provide access to materials users can access for free? That meets much of what we expect from public libraries (although many local libraries do charge a little for DVDs), but it fails Terry Fisher’s principle. (I don’t mean to imply that everyone there agreed with Terry, btw.)

What should the DPLA do to launch quickly and well? The sense of the room was that it’s important that DPLA not get stuck in committee for years, but should launch something quickly. Unfortunately, the easiest stuff to launch with are public domain works, many of which are already widely available. There were some suggestions for other sources of public domain works, such as government documents. But, then the DPLA would look like a specialty library, instead of the first place people turn to when they want an e-book or other such content.

How to pay for it? There was little talk of business models yesterday, but it was a short day for a big topic. There were occasional suggestions, such as just outright buying e-books (rather than licensing them), in part to meet the library’s traditional role of preserving works as well as providing access to them.

How important is expert curation? There seemed to be a genuine divide — pretty much undiscussed, possibly because it’s a divisive topic — about the value of curation. A few people suggested quite firmly that expert curation is a core value provided by libraries: you go to the library because you know you can trust what is in it. I personally don’t see that scaling, think there are other ways of meeting the same need, and worry that the promise is itself illusory. This could turn out to be a killer issue. Who determines what gets into the DPLA (if the concept of there being an inside to the DPLA even turns out to make sense)?

Is the environment stable enough to build a DPLA? Much of the conversation during the workshop assumed that book and journal publishers are going to continue as the mediating centers of the knowledge industry. But, as with music publishers, much of the value of publishers has left the building and now lives on the Net. So, the DPLA may be structuring itself around a model that is just waiting to be disrupted. Which brings me to the final question I left wondering about:

How disruptive should the DPLA be? No one’s suggesting that the DPLA be a rootin’ tootin’ bay of pirates, ripping works out of the hands of copyright holders and setting them free, all while singing ribald sea shanties. But how disruptive can it be? On the one hand, the DPLA could be a portal to e-works that are safely out of copyright or licensed. That would be useful. But, if the DPLA were to take Terry’s principle as its mission — knowledge ought to be free and universally accessible — the DPLA would worry less about whether it’s doing online what libraries do offline, and would instead start from scratch asking: Given the astounding set of people and institutions assembled around this opportunity, what can we do together to make knowledge as free and universally accessible as possible? Maybe a library is not the best transformative model.

Of course, given the greed-based, anti-knowledge, culture-killing copyright laws, the fact may be that the DPLA simply cannot be very disruptive. Which brings me right back to my depression. And yet, exhilaration.

Go figure.

The DPLA wiki is here.


Categories: berkman, everythingIsMiscellaneous, experts, libraries, too big to know Tagged with: 2b2k • berkman • copyright • dpla • libraries • metadata Date: March 2nd, 2011 dw

