June 21, 2013

[lodlam] Sean Thomas and Sands Fish on getting Open Access into the right hands

Sands Fish [twitter: sandsfish] and Sean Thomas [twitter: sean_m_thomas] at MIT are interested in pursuing a project to see if the new wealth of Open Access research is getting into the hands of people who can use it to solve problems. What is the distribution of access to OA?

May 17, 2013

Lobby for FaceBook, Yahoo, NewsCorp and Elsevier opposes the White House Open Access order, among others

Peter Suber points out that FaceBook, Yahoo, NewsCorp and Elsevier have joined the lobby that has issued a clarion call against open access that blurs the line between lies and gibberish. Peter blows the statements apart, leaving nothing but clean air and a whiff of ozone. The lobby, NetChoice, is publicizing its monthly “iAWFUL” (Internet advocates watchlist for ugly laws) list of policies that it doesn’t like. The list has little to do with advocating for the Internet, and everything to do with supporting the interests of Internet businesses (“committed to tearing down barriers to e-commerce”). For example, this month’s iAWFUL list includes data breach notification bills and a CT bill that “would force publishers to sell digital books at ‘reasonable’ prices to state libraries.” That’s in addition to opposing actions (including the recent epochal White House Memorandum) that support public access to research — often research that the public has paid for. But they have it all bollixed up.

What makes it more distressing, then, is that reputable journals, including Computerworld, CIO and PC World, are running NetChoice’s iAWFUL PR puffery.

Thankfully, Peter Suber is on the case.


April 9, 2013

Elsevier acquires Mendeley + all the data about what you read, share, and highlight

I liked the Mendeley guys. Their product is terrific — read your scientific articles, annotate them, be guided by the reading behaviors of millions of other people. I’d met with them several times over the years about whether our LibraryCloud project (still very active but undergoing revisions) could get access to the incredibly rich metadata Mendeley gathers. I also appreciated Mendeley’s internal conflict about the urge to openness and the need to run a business. They were making reasonable decisions, I thought. At the very least they felt bad about the tension :)

Thus I was deeply disappointed by their acquisition by Elsevier. We could have a fun contest to come up with the company we would least trust with detailed data about what we’re reading and what we’re attending to in what we’re reading, and maybe Elsevier wouldn’t win. But Elsevier would be up there. The idea of my reading behaviors adding economic value to a company making huge profits by locking scholarship behind increasingly expensive paywalls is, in a word, repugnant.

In tweets back and forth with Mendeley’s William Gunn [twitter: mrgunn], he assures us that Mendeley won’t become “evil” so long as he is there. I do not doubt Bill’s intentions. But there is no more perilous position than standing between Elsevier and profits.

I seriously have no interest in judging the Mendeley folks. I still like them, and who am I to judge? If someone offered me $45M (the minimum estimate that I’ve seen) for a company I built from nothing, and especially if the acquiring company assured me that it would preserve the values of that company, I might well take the money. My judgment is actually on myself. My faith in the ability of well-intentioned private companies to withstand the brute force of money has been shaken. After all this time, I was foolish to have believed otherwise.

MrGunn tweets: “We don’t expect you to be joyous, just to give us a chance to show you what we can do.” Fair enough. I would be thrilled to be wrong. Unfortunately, the real question is not what Mendeley will do, but what Elsevier will do. And in that I have much less faith.


I’ve been getting the Twitter handles of Mendeley and Elsevier wrong. Ack. The right ones: @Mendeley_com and @ElsevierScience. Sorry!


March 28, 2013

[annotations][2b2k] Rob Sanderson on annotating digitized medieval manuscripts

Rob Sanderson [twitter:@azaroth42] of Los Alamos is talking about annotating Medieval manuscripts.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He says many Medieval manuscripts are being digitized. The Mellon Foundation is funding many such projects. But these have tended to reinvent the same tech, and have not been designed for interoperability with other projects. So the Digital Medieval Initiative was founded, with a long list of prestigious partners. They thought about what they’d like: distributed, linked data, interoperable, etc. For this they need a shared description format.

The traditional approach is to annotate an image of a page. But it can be very difficult to know which images to annotate; he gives as an example a page that has fold-outs. “The naive assumption is that an image equals a page.” But there may be fragments, or only portions of the page have been digitized (e.g., the illuminations), etc. There may be multiple images on a page, revealed by multi-spectral imaging. There may be multiple orientations of the page, etc.

The solution? The canvas paradigm. A canvas is an empty space corresponding to the rectangle (or whatever) of the page. You allow rich resources to be associated with it, and allow users to comment. For this, they use Open Annotation. You can specify a choice of images. You can associate text with an area of the canvas. There are lots of different ways to visualize those comments: overlays, side-by-side, etc.

You can build hybrid pages. For example, an old scan might have a new color scan of its illustrations pointing at it. Or you could have a recorded performance of a piece of music pointing at the musical notation.
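
As a minimal sketch of the canvas idea, assuming nothing about the real SharedCanvas or Open Annotation vocabularies: the canvas is just an empty coordinate space, and images, transcriptions and comments are all annotations that point at the whole canvas or at a rectangle within it. Every class, field and file name below is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical names throughout: this illustrates the canvas idea only and is
# not the SharedCanvas or Open Annotation data model.
Region = Tuple[int, int, int, int]  # x, y, width, height in canvas units


@dataclass
class Annotation:
    body: str                        # an image file, a transcription, a comment...
    kind: str                        # e.g. "image", "text", "comment"
    region: Optional[Region] = None  # None means the annotation targets the whole canvas


@dataclass
class Canvas:
    label: str
    width: int
    height: int
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, body: str, kind: str, region: Optional[Region] = None) -> Annotation:
        """Attach a resource to the whole canvas or to a rectangle inside it."""
        if region is not None:
            x, y, w, h = region
            inside = x >= 0 and y >= 0 and w > 0 and h > 0
            inside = inside and x + w <= self.width and y + h <= self.height
            if not inside:
                raise ValueError("region falls outside the canvas")
        annotation = Annotation(body, kind, region)
        self.annotations.append(annotation)
        return annotation

    def at(self, x: int, y: int, kind: Optional[str] = None) -> List[Annotation]:
        """Return the annotations covering the point (x, y), optionally filtered by kind."""
        hits = []
        for a in self.annotations:
            if kind is not None and a.kind != kind:
                continue
            if a.region is None:
                hits.append(a)
                continue
            rx, ry, rw, rh = a.region
            if rx <= x < rx + rw and ry <= y < ry + rh:
                hits.append(a)
        return hits


# A hybrid page: an old scan of the whole folio, a newer color scan of one
# illustration, and a transcription of one line of text.
page = Canvas("folio 12r", width=1000, height=1500)
page.annotate("old-bw-scan.jpg", "image")
page.annotate("new-color-scan.jpg", "image", region=(100, 200, 400, 300))
page.annotate("Incipit liber...", "text", region=(100, 600, 800, 100))

# Both images cover this point, so a viewer could offer a choice between them.
choices = page.at(150, 250, kind="image")
```

Because the page itself is never equated with a single image, fragments, fold-outs and multi-spectral captures all become more annotations on the same canvas, and how to display them (overlay, side by side) is left to the viewer.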

In summary, the SharedCanvas model uses open standards (HTML 5, Open Annotation, TEI, etc.) and can be implemented in a distributed way across repositories, encouraging engagement by domain experts.

November 22, 2012

Is Open Access only for rich countries?

From an email:

…an online discussion on Open Access (OA) from the perspective of the developing world.

Funded by DFID, through the Mobilising Knowledge for Development (MK4D) programme in the Institute for Development Studies at Sussex University, and managed through the African Commons project in South Africa and the Centre for Internet and Society in India, the discussion will be hosted on UNESCO’s WSIS Open Access Community Forum. This open access dialogue will provide a valuable space to discuss different perspectives on what open access means for the developing world and what it can offer.

There is compelling evidence which indicates that OA has finally entered mainstream discourse. Yet, in the developing world context there remain specific challenges and untapped opportunities for OA. A series of open access discussions aimed at developing world critical thinkers, activists and academics, seeks to explore insights and articulate opinion on OA in the developing world. Join us for stimulating debate!

Register here:

August 5, 2012

Open Access facts from Peter Suber

I’m enjoying my friend Peter Suber’s small book Open Access. He’s a very clear and concise writer, and of course he knows this topic better than anyone.

Here are some facts Peter mentions:

  • In 2008, Harvard subscribed to 98,900 serials. Yale subscribed to 73,900. “The best-funded research library in India…subscribed to 10,600.” And, Peter points out, some Sub-Saharan universities cannot afford to subscribe to any. (pp. 30-32) Way to make yourself smart, humanity!

  • “In 2010, Elsevier’s journal division had a profit margin of 35.7 percent while ExxonMobil had only 28.1 percent.” (p. 32)

  • The cost of journals has caused a dramatic decrease in the percentage of their budgets research libraries spend on books, from 44% in 1986 to 28% now. “Because academic libraries now buy fewer books, academic book publishers now accept fewer manuscripts…” (p. 33)
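
To put those figures side by side, here is a small back-of-the-envelope calculation. It uses only the numbers quoted above; the variable names are made up for the example, and the comparisons are simple ratios and differences, not anything taken from the book.

```python
# Figures as quoted above from Peter Suber's "Open Access" (pp. 30-33).
serials_2008 = {
    "Harvard": 98_900,
    "Yale": 73_900,
    "best-funded research library in India": 10_600,
}

# How many times more serials Harvard subscribed to than the Indian library.
harvard_to_india = serials_2008["Harvard"] / serials_2008["best-funded research library in India"]

# Profit margins in 2010, in percent.
elsevier_margin = 35.7
exxonmobil_margin = 28.1
margin_gap = elsevier_margin - exxonmobil_margin  # percentage points

# Share of research library budgets spent on books.
book_share_1986 = 0.44
book_share_now = 0.28
point_drop = (book_share_1986 - book_share_now) * 100                    # percentage points
relative_drop = (book_share_1986 - book_share_now) / book_share_1986    # fraction of the 1986 share

print(f"Harvard subscribed to {harvard_to_india:.1f} times as many serials")
print(f"Elsevier's margin was {margin_gap:.1f} points higher than ExxonMobil's")
print(f"Book share fell {point_drop:.0f} points, a {relative_drop:.0%} relative decline")
```

In other words, the book share of budgets fell by roughly a third of its 1986 level, and Harvard had more than nine times as many serials as the best-funded library in India.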

Peter’s book will help you understand better why you already favor Open Access.


June 6, 2012


I learned yesterday from Robin Wendler (who worked mightily on the project) that Harvard’s library catalog dataset of 12.3M records has been bulk downloaded a thousand times, excluding the Web spiderings. That seems like an awful lot to me, and makes me happy.

The library catalog dataset comprises bibliographic records of almost all of Harvard Library’s gigantic collection. It’s available under a CC 0 public domain license for bulk download, and can be accessed through an API via the DPLA’s prototype platform. More info here.

February 9, 2012

[2b2k] The Federally-funded research should be open Act

The Federal Research Public Access Act has been reintroduced in the U.S. House. It would require federally-funded research to be made public within six months of publication (with security exceptions, natch). More here.

Go FRPAA! (Ok, not the catchiest slogan ever.)

January 8, 2012

[2b2k] Why Is Open-Internet Champion Darrell Issa Supporting an Attack on Open Science?

I’ve swiped the title of this post from Rebecca J. Rosen’s excellent post at The Atlantic. Darrell Issa has been generally good on open Internet issues, so why is he supporting a bill that would forbid the government from requiring researchers to openly post the results of their research? [Later that day: I revised the previous sentence, which was gibberish. Sorry.]

Rebecca cites danah boyd’s awesome post: Save Scholarly Ideas, Not the Publishing Industry (a rant). InfoDocket has a helpful roundup, including a link to Peter Suber’s Google+ discussion.

November 19, 2011

[avignon] [2b2k] Robert Darnton on the history of copyright, open access, the dpla…

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

We begin with a report on a Ministerial meeting yesterday here on culture — a dialogue among the stakeholders on the Internet. [No users included, I believe.] All agreed on the principles proposed at Deauville: It is a multi-stakeholder ecosystem that complies with law. In this morning’s discussion, I was struck by the convergence: we all agree about remunerating copyright holders. [Selection effect. I favor copyright and remunerating rights holders, but not as the supreme or exclusive value.] We agree that there are more legal alternatives. We agree that the law needs to be enforced. No one argued with that. [At what cost?] And we all agree we need international cooperation, especially to fight piracy.

Now Robert Darnton, Harvard Librarian, gives an invited talk about the history of copyright.

Darnton: I am grateful to be here. And especially grateful you did not ask me to talk about the death of the book. The book is not dead. More books are being produced in print and online every year than in the previous year. This year, more than 1 million new books will be produced. China has doubled its production of books in the past ten years. Brazil has a booming book industry. Even old countries like the US find book production is increasing. We should not bemoan the death of the book.

Should we conclude that all is well in the world of books? Certainly not. Listen to the lamentations of authors, publishers, booksellers. They are clearly frightened and confused. The ground is shifting beneath their feet and they don’t know where to stake a claim. The pace of tech is terrifying. What took millennia, then centuries, then decades, now happens all the time. Homesteading in the new info ecology is made difficult by uncertainty about copyright and economics.

Throughout early modern Europe, publishing was dominated by guilds of booksellers and printers. Modern copyright did not exist, but booksellers accumulated privileges, which Condorcet objected to. These privileges (AKA patents) gave them the exclusive rights to reproduce texts, with the support of the state. The monarchy in the 17th century eliminated competitors, especially ones in the provinces, reinforcing the guild, thus gaining control of publishing. But illegal production throve. Avignon was a great center of piracy in the 18th century because it was not French. It was surrounded by police intercepting the illegal books. It took a revolution to break the hegemony of the Parisian guild. For two years after the Bastille, the French press enjoyed liberty. Condorcet and others had argued for the abolition of constraints on the free exchange of ideas. It was a utopian vision that didn’t last long.

Modern copyright began with the 1793 French copyright law that established a new model in Europe. The exclusive right to sell a text was limited to the author for lifetime + 10 years. Meanwhile, the British Statute of Anne in 1710 created copyright. Background: The stationers’ monopoly required booksellers — and all had to be members — to register. The oligarchs of the guild crushed their competitors through monopolies. They were so powerful that they provoked revolts even within the book trade. Parliament rejected the guild’s attempt to secure the licensing act in 1695. The British celebrate this as the beginning of the end of pre-publication censorship.

The booksellers lobbied for the modern concept of copyright. For new works: 14 years, renewable once. At its origin, copyright law tried to strike a balance between the public good and the private benefit of the copyright owner. According to a liberal view, Parliament got the balance right. But the publishers refused to comply, invoking a general principle inherent in common law: When an author creates work, he acquires an unlimited right to profit from his labor. If he sold it, the publisher owned it in perpetuity. This was Diderot’s position. The same argument occurred in France and England.

In England, the argument culminated in a 1774 case, Donaldson vs. Beckett, that reaffirmed 14 years renewable once. Then we Americans followed in our Constitution and in the first copyright law in 1790 (“An act for the encouragement of learning”, echoing the British 1710 Act): 14 years renewable once.

The debate is still alive. The 1998 copyright extension act in the US was considerably shaped by Jack Valenti and the Hollywood lobby. It extended copyright to life + 70 (or, for corporate works, 95 years from publication). We are thus putting most literature out of the public domain and into copyright that seems perpetual. Valenti was asked if he favored perpetual copyright and said “No. Copyright should last forever minus one day.”

This history is meant to emphasize the interplay of two elements that go right through the copyright debate: A principle directed toward the public gain vs. self-interest for private gain. It would be wrong-headed and naive to only assert the former. But to assert only the latter would be cynical. So, do we have the balance right today?

Consider knowledge and power. We all agree that patents help, but no one would want the knowledge of DNA to be exploited as private property. The privatization of knowledge has become an enclosure movement. Consider academic periodicals. Most knowledge first appears in digitized periodicals. The journal article is the principal outlet for the sciences, law, philosophy, etc. Journal publishers therefore control access to most of the knowledge being created, and they charge a fortune. The price of academic journals rose ten times faster than the rate of inflation in the 1990s. The J of Comparative Neurology is $29,113/year. The Brain costs $23,000. The average list price in chemistry is over $3,000. Most of the research was subsidized by taxpayers. It belongs in the public domain. But commercial publishers have fenced off parts of that domain and exploited it. Their profit margins run as high as 40%. Why aren’t they constrained by the laws of supply and demand? Because they have crowded competitors out, and the demand is not elastic: Research libraries cannot cancel their subscriptions without an uproar from the faculty. Of course, professors and students produced the research and provided it for free to the publishers. Academics are therefore complicit. They advance their prestige by publishing in journals, but they fail to understand the damage they’re doing to the Republic of Letters.

How to reverse this trend? Open access journals. Journals that are subsidized at the production end and are made free to consumers. They get more readers, too, which is not surprising since search engines index them and it’s easy for readers to get to them. Open Access is easy access, and the ease has economic consequences. Doctors, journalists, researchers, housewives, nearly everyone wants information fast and costless. Open Access is the answer. It is a little simple, but it’s the direction we have to take to address this problem at least in academic journals.

But the Forum is thinking about other things. I admire Google for its technical prowess, but also because it demonstrated that free access to info can be profitable. But it ran into problems when it began to digitize books and make them available. It got sued for alleged breach of copyright. It tried to settle by turning it into a gigantic business and sharing the profits with the authors and publishers who sued them. Libraries had provided the books. Now they’d have to buy them back at a price set by Google. Google was fencing off access to knowledge. A federal judge rejected it because, among other points, it threatened to create a monopoly. By controlling access to books, Google occupied a position similar to that of the guilds in London and Paris.

So why not create a library as great as anything imagined by Google, but that would make works available to users free of charge? Harvard held a workshop on Oct. 1, 2010 to explore this. Like Condorcet, a utopian fantasy? But it turns out to be eminently reasonable. A steering committee, a secretariat, 6 workgroups were established. A year later we launched the Digital Public Library of America at a conference hosted by the major cultural institutions in DC, and in April 2013 we’ll have a preliminary version of it.

Let me emphasize two points. 1. The DPLA will serve a wide and varied constituency throughout the US. It will be a force in education, and will provide a stimulus to the economy by putting knowledge to work. 2. It will spread to everyone on the globe. The DPLA’s technical infrastructure is being designed to be interoperable with Europeana, which is aggregating the digital collections of 27 countries. National digital libraries are sprouting up everywhere, even Mongolia. We need to bring them together. Books have never respected boundaries. Within a few decades, we’ll have worldwide access to all the books in the world, and images, recordings, films, etc.

Of course a lot remains to be done. But, the book is dead? Long live the book!

Q: It is patronizing to think that the USA and Europe will set the policy here. India and China will set this policy.

A: We need international collaboration. And we need an infrastructure that is interoperable.

