November 4, 2014

How to autograph an e-book

I went to see To Be Takei last night, and George himself was there for an interview afterwards. It occurred to me that I’d like him to autograph his book Oh Myyy, but I only have a copy on my Kindle.

So, here’s a proposal for the Kindle, the Nook, and for any other DRM-ed ebook reader: Allow us to embed one and only one photo into our copy of an ebook. That photo can never be replaced. It can be deleted, but then the slot is gone forever. This could be implemented as a special one-time-only annotation, and it would be managed by your fearsome machinery of control.

That way, I could take a selfie with George, post it into my Kindle copy of his book, and have the digital equivalent of an autographed copy.
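For the curious, the one-time-only slot is easy to sketch. This is a toy illustration in Python, not anything Amazon or Barnes & Noble actually offers; the class name `AutographSlot` and its methods are made up for this example, and a real version would live inside the reader's DRM layer rather than in a simple object.

```python
class AutographSlot:
    """A single photo slot that can be filled exactly once (hypothetical design)."""

    def __init__(self):
        self._photo = None
        self._used = False  # flips to True the first time a photo is embedded

    def embed(self, photo_bytes):
        # Refuse any second write, even if the first photo was later deleted.
        if self._used:
            raise PermissionError("this slot has already been used")
        self._photo = photo_bytes
        self._used = True

    def delete(self):
        # Deleting removes the photo but does not reopen the slot.
        self._photo = None

    @property
    def photo(self):
        return self._photo
```

The only design point is that `_used` is never reset: deleting the photo empties the slot without giving you a second chance to fill it.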

I don’t see a way of doing this for open access e-books. Stupid open access e-books what with their “Oooh look everyone can read me!” smirks and their “Now everyone can learn and participate in culture” attitudes.

PS: To Be Takei was really enjoyable. Totally worth seeing, especially with an appreciative crowd.

October 31, 2014

R paragraphs 2 long?

Over the years as I’ve edited my own writing, I’ve come to rely on two heuristics: 1. Most paragraphs are better off without a topic sentence. 2. The ends of paragraphs sometimes make better beginnings.

My obvious hypothesis is that the Web has made us impatient readers who won’t wait to get to the end of the paragraph to decide whether the paragraph is worth reading. That’s true for me, anyway. Thorough reading takes more of an act of will than I remember.

TL;DR: Paragraphs are obsolete. Skip to the TL;DR.


October 29, 2014

Louis Menand, say what???

Can someone help me understand how Louis Menand sets up his Oct. 20 piece on copyright in the New Yorker? Menand’s a great writer, and the piece has gone through the NYer’s famous editorial process, so I am confident that it’s my fault that I am stuck staring at a couple of paragraphs not understanding what he’s talking about. I expect to be slapping my forehead momentarily.

Let me tell you why this matters to me, beyond my high expectations for New Yorker writing. When the New Yorker takes the Internet as its subject, it tends to be in the Traditional Resistant camp, although I acknowledge that this may well be just my observer’s bias. Their writers acknowledge the importance of the Net and nod at the good it does, but then with some frequency focus on the negative side, or the over-inflated side. Of course that’s fine. They’ve got some great writers. And Menand is not taking that side in this article. But if Menand’s description of how the Web works is as wildly wrong as it seems to me to be, then it raises some special concerns. If the New Yorker can’t get these basics right, then we have further to go than I’d thought. (Keep in mind that I am not at all confident in how I’m reading this passage in the Menand article.)

So, Menand begins by imagining that an anthology called “Most Thoughtful Essays” includes his essay without his permission. Then he asks us to…

…suppose that a Web site,, ran an item that said something like “This piece on copyright is a great read!” with a hyperlink on the word “piece” to my article’s page on The New Yorker’s Web site. You wouldn’t think this was banditry at all. You would find it unexceptionable.

Some courts have questioned the use of links that import content from another Web site without changing the URL, a practice known as “framing.” But it’s hard to see much difference. Either way, when you’re reading a linked page, you may still be “at”, as clicking the back button on your browser can instantly confirm. Effectively, has stolen content from, just as the compiler of “Most Thoughtful Essays” stole content from me. The folks at and their V. C. backers are attracting traffic to their Web site, with its many banner ads for awesome stuff, using material created by other people.

When he says “it’s hard to see much difference,” the two cases seem to be including a hyperlink “to my article’s page on the NYer’s Web site” and embedding the entire article at their site in an iframe. But in the first case (clicking on the normal link) you are taken to and are not on

Even more confusing, when you’re now at, clicking the back button will confirm that you were in fact not at, for the page will change from to And, if has embedded Menand’s article via an iframe, clicking on the back button will take you to whatever page you were at before awesomestuff, thus proving nothing.

Finally, since the point of all this is to show us how linking is equivalent to printing Menand’s article in a paper anthology without his permission, it’s weird that Menand leaves out what is by far the most common case that might be equivalent: when a page neither links to another page nor uses an iframe to embed its content, but simply copies and pastes from another site.

So, as far as I can tell, the most coherent way of taking the words that Menand has written — and he’s a precise writer — contradicts the most basic experience of the Web: clicking on a link and going to a new page.

So where am I going wrong in reading him???

By the way, the rest of the article provides a good general overview of the copyright question, and is sympathetic to the reformist sensibility, although it is surprisingly primer-like for a NYer article. IMO, natch.


October 28, 2014

Paul McCartney’s end of the end

I’ve transferred my Google Play Music from one account to another (because of something I’ll explain in a post coming soon) and have found in it some albums I don’t own, have never heard of, and sometimes from singers I never heard of. No, no extra U2. Plus, some of the names of singers whose albums I do own have been mangled: Amanda Palma is sonorous, although I personally prefer Amander Palmer.

Anyway, one lagniappe I appreciated was a Paul McCartney album I’d missed. I still find it hard to listen to The Beatles without being overwhelmed: awe at their genius, longing for my youth, depression at how badly I and my generation failed you, regret for who I was then and what I am now. You know, the whole lifelong shitteroo. (Christ, get me some chocolates!) But Paul’s solo albums I can listen to without being overwhelmed. If I like half the songs, it’s a good album.

So, this morning I listened for the first time to McCartney’s Memory Almost Full (2007), which had unexpectedly materialized in my Google Play collection. As the title implies, it’s mainly about looking down as you near the peak of Mt. Old. The excellent Wikipedia article tells me that it was a Top Five album, went gold, and was Grammy-nominated. Apparently I have not been paying sufficient attention.

His song “End of the End” has some lovely lyrics, although I prefer the verses to the chorus. Here’s one of each:

On the day that I die I’d like bells to be rung
And songs that were sung to be hung out like blankets
That lovers have played on
And laid on while listening to songs that were sung

At the end of the end
It’s the start of a journey
To a much better place
And a much better place
Would have to be special
No reason to cry
No need to be sad
At the end of the end

The line “like blankets that lovers have played on and laid on while listening to songs that were sung” makes me glad that Paul knows what his music has meant to some of us. And I like the wrapping of the metaphor — “songs that were sung … while listening to songs that were sung.”

The slightly sappy chorus nevertheless makes me glad Paul appreciates the sweetness of his life, even though I’m not much convinced that any of us are going anywhere at the end of the end.

But when someone says about their impending death “Don’t be sad. I had a full life,” or whatever, they’re acting as if their death only happens to them. We may not be sad for you, but how about for us? It’s not all about you, you know! Though I do have to acknowledge that in this case most of it is.

Furthermore, the idea that we’ll “always have them in our hearts,” is not consolation. It’s what we need consolation for.

Where are those chocolates already?

October 27, 2014

[liveblog] Christine Borgman

Christine Borgman, chair of Info Studies at UCLA, and author of the essential Scholarship in the Digital Age, is giving a talk on The Knowledge Infrastructure of Astronomy. Her new book is Big Data, Little Data, No Data: Scholarship in the Networked World, but you’ll have to wait until January. (And please note that precisely because this is a well-organized talk with clearly marked sections, it comes across as choppy in these notes.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Her new book draws on 15 yrs of studying various disciplines and 7-8 years focusing on astronomy as a discipline. It’s framed around the change to more data-intensive research across the sciences and humanities, plus the policy push for open access to content and to data. (The team site.)

They’ve been looking at four groups:

The world thinks that astronomy and genomics have figured out how to do data-intensive science, she says. But scientists in these groups know that it’s not that straightforward. Christine’s group is trying to learn from these groups and help them learn from one another.

Knowledge Infrastructures are “string and baling wire.” Pieces pulled together. The new layered on top of the old.

The first English scientific journal began almost 350 yrs ago. (Philosophical Transactions of the Royal Society.) We no longer think of the research object as a journal but as a set of articles, objects, and data. People don’t have a simple answer to what is their data. The raw files? The tables of data? When they’re told to share their data, they’re not sure what data is meant. “Even in astronomy we don’t have a single, crisp idea of what are our data.”

It’s very hard to find and organize all the archives of data. Even establishing a chronology is difficult. E.g., “Yes, that project has that date stamp but it’s really a transfer from a prior project twenty years older than that.” It’s hard to map the pieces.

Seamless Astronomy: ADS All Sky Survey, mapping data onto the sky. Also, they’re trying to integrate various link mappings, e.g., Chandra, NED, Simbad, WorldWide Telescope,, Visier, Aladin. But mapping these collections doesn’t tell you why they’re being linked, what they have in common, or what are their differences. What kind of science is being accomplished by making those relationships? Christine hopes her project will help explain this, although not everyone will agree with the explanations.

Her group wants to draw some maps and models: “A Christmas Tree of Links!” She shows a variety of maps, possible ways of organizing the field. E.g., one from 5 yrs ago clusters services, repositories, archives and publishers. Another scheme: Publications, Objects, Observations; the connection between pubs (citations) and observations is the most loosely coupled. “The trend we’re seeing is that astronomy is making considerable progress in tying together the observations, publications, and data.” “Within astronomy, you’ve built many more pieces of your infrastructure than any other field we’ve looked at.”

She calls out Chris Erdmann [sitting immediately in front of me] as a leader in trying to get data curation and custodianship taken up by libraries. Others are worrying about bit-rot and other issues.

Astronomy is committed to open access, but the resource commitments are uneven.

Strengths of astronomy:

  • Collaboration and openness.

  • International coordination.

  • Long term value of data.

  • Agreed standards.

  • Shared resources.

Gaps of astronomy:

  • Investment in data stewardship: varies by mission and by type of research. E.g., space-based missions get more investment than the ground-based ones. (An audience member says that that’s because the space research was so expensive that there was more insistence on making the data public and usable. A lively discussion ensues…)

  • The access to data varies.

  • Curation of tools and technologies

  • International coordination. Should we curate existing data? But you don’t get funding for using existing data. So, invest in getting new data from new instruments??

Christine ends with some provocative questions about openness. What does it mean exactly? What does it get us?


Q: As soon as you move out of the Solar System to celestial astronomy, all the standards change.

A: When it takes ten years to build an instrument, it forces you to make early decisions about standards. But when you’re deploying sensors in lakes, you don’t always note that this is #127 that Eric put the tinfoil on top of because it wasn’t working well. Or people use Google Docs and don’t even label the rows and columns because all the readers know what they mean. That makes going back to it much harder. “Making it useful for yourself is hard enough.” It’s harder still to make it useful for someone in 5 yrs, and harder still to make it useful for an unknown scientist in another country speaking another language and maybe from another discipline.

Q: You have to put a data management plan into every proposal, but you can’t make it a budget item… [There is a lively discussion of which funders reasonably fund this]

Q: Why does Europe fund ground-based data better than the US does?

A: [audience] Because of Riccardo Giacconi.

A: [Christine] We need to better fund the invisible workforce that makes science work. We’re trying to cast a light on this invisible infrastructure.

October 24, 2014

[clickbait] Copyright is sodomy

A year ago, Harold Feld posted one of the most powerful ways of framing our excessive zeal for copyright that I have ever read. I was welling up even before he brought Aaron Swartz into the context.

Harold’s post is within a standard Jewish genre: the d’var Torah, an explanation of a point in the portion of the Torah being read that week. As is expected of the genre, he draws upon a long, self-reflective history of interpretation. I urge you to read it because of the light it sheds on our culture of copyright, but it’s also worth noticing the form of the discussion.

The content: In the Jewish tradition, Sodom’s sin wasn’t sexual but rather an excessive possessiveness leading to a fanatical unwillingness to share. Harold cites from a collection of traditional commentary, The Ethics of Our Fathers:

“There are four types of moral character. One who says: ‘what is mine is mine and what is yours is yours.’ This is an average person. Some say it is the Way of Sodom. The one who says: ‘what is mine is yours and what is yours is mine,’ is ignorant of the world. ‘What is mine is yours and what is yours is yours’ is the righteous. ‘What is mine is mine and what is yours is mine’ is the wicked.”

In a PowerPoint, it’d be a 2×2 chart. Harold’s point will be that the ‘what is mine is mine and what is yours is yours’ of the average person becomes wicked when enforced without compassion or flexibility. Harold evokes the traditional Jewish examples of Sodom’s wickedness and compares them to what have become our dominant “average” assumptions about how copyright ought to work.

I am purposefully not explaining any further. Read Harold’s piece.

The form: I find the space of explanation within which this d’var Torah — and most others that I’ve heard — operates to be fascinating. At the heart of Harold’s essay is a text accepted by believers as having been given by God, yet the explanation is accomplished by reference to a history of human interpretations that disagree with one another, with guidance by a set of values (e.g., sharing is good) that persevere in a community thanks to that community’s insistent adherence to its tradition. The result is that an agnostic atheist like me (I’m only pretty sure there is no God) can find truth and wisdom in the interpretation of a text I take as being ungrounded in a divine act.

But forget all that. Read Harold’s post, bubbelah.


October 23, 2014

Pieceful Collaboration

I gave a talk last night at the BookBuilders of Boston collaboration awards. It’s a non-profit that since 1937 has networked publishers, book manufacturers, and other book folk…although I don’t think people would have described it as “networking” back then. The nominees each gave a 2.5 minute presentation on their collaborative publishing project, many of which were very cool. Plus it was in the Brattle Theater.

I was the filler as the judges went into a sealed room to decide on the winners. So I gave a 30-minute talk pitched around a pun that I sort of like: a pieceful difference.

The idea was that lots of collaborative efforts bring together multiple people to build a single object — a barn raising or a Wikipedia page. But other collaborations break something apart and allow different people to build different things.

The ability to bring strangers together around a project is a gift of the Net. But so is its making available lots of little pieces that can be made into mosaics by a mosaic of people. The Johnny Cash Project is one sort of example. But so is any set of things created from stuff retrieved through an API or mashed-up APIs.

I’m not sure why I am drawn to pieceful collaboration, other than because of the cheap pun. I guess I like the way individuality is maintained around a shared but differentiated set of materials. I’m a little surprised. I thought I was less of an individualist than that.

October 16, 2014

What we could do with a gigabit

Here’s the start of a piece I posted at Medium about one thing we might do with a gigabit connection.

It’s 2017 and this year’s riot is in San Diego. It involves pandas, profit-driven zoo executives, and a Weight Watchers sponsorship. Doesn’t matter. People are massing in the streets and it’s heading toward a confrontation.

You first hear about this on Twitter. The embedded link takes you to FlyEye, a site that is unrelated to whatever sites and companies own trademarks like it in 2014. (Stand down, lawyers! This is all made up!)

Thankfully, San Diego in 2017 provides gigabit connectivity. In fact, the entire nation has gigabit, thanks to a personal appearance by Jesus H. Christ in the Comcast headquarters in late 2015.

At the FlyEye site you scan a huge video wall that shows you a feed from every person out in the streets who is sporting a meshed GoPro or Google Glass wearable video camera. Thousands of them. All 4K, of course.

Read the rest here.

October 13, 2014

Library as starting point

A new report on Ithaka S+R‘s annual survey of libraries suggests that library directors are committed to libraries being the starting place for their users’ research, but that the users are not in agreement. This calls into question the expenditures libraries make to achieve that goal. (Hat tip to Carl Straumsheim and Peter Suber.)

The question is good. My own opinion is that libraries should let Google do what it’s good at, while they focus on what they’re good at. And libraries are very good indeed at particular ways of discovery. The goal should be to get the mix right, not to make sure that libraries are the starting point for their communities’ research.

The Ithaka S+R survey found that “The vast majority of the academic library directors…continued to agree strongly with the statement: ‘It is strategically important that my library be seen by its users as the first place they go to discover scholarly content.'” But the survey showed that only about half think that that’s happening. This gap can be taken as room for improvement, or as a sign that the aspiration is wrongheaded.

The survey confirms that many libraries have responded to this by moving to a single-search-box strategy, mimicking Google. You just type in a couple of words about what you’re looking for and it searches across every type of item and every type of system for managing those items: images, archival files, books, maps, museum artifacts, faculty biographies, syllabi, databases, biological specimens… Just like Google. That’s the dream, anyway.

I am not sold on it. Roger cites Lorcan Dempsey, who is always worth listening to:

Lorcan Dempsey has been outspoken in emphasizing that much of “discovery happens elsewhere” relative to the academic library, and that libraries should assume a more “inside-out” posture in which they attempt to reveal more effectively their distinctive institutional assets.

Yes. There’s no reason to think that libraries are going to be as good at indexing diverse materials as Google et al. are. So, libraries should make it easier for the search engines to do their job. Library platforms can help. So can as a way of enriching HTML pages about library items so that the search engines can easily recognize the library item metadata.

But assuming that libraries shouldn’t outsource all of their users’ searches, then what would best serve their communities? This is especially complicated since the survey reveals that preference for the library web site vs. the open Web varies based on just about everything: institution, discipline, role, experience, and whether you’re exploring something new or keeping up with your field. This leads Roger to provocatively ask:

While academic communities are understood as institutionally affiliated, what would it entail to think about the discovery needs of users throughout their lifecycle? And what would it mean to think about all the different search boxes and user login screens across publishes [sic] and platforms as somehow connected, rather than as now almost entirely fragmented? …Libraries might find that a less institutionally-driven approach to their discovery role would counterintuitively make their contributions more relevant.

I’m not sure I agree, in part because I’m not entirely sure what Roger is suggesting. If it’s that libraries should offer an experience that integrates all the sources scholars consult throughout the lifecycle of their projects or themselves, then, I’d be happy to see experiments, but I’m skeptical. Libraries generally have not shown themselves to be particularly adept at creating grand, innovative online user experiences. And why should they be? It’s a skill rarely exhibited anywhere on the Web.

If designing great Web experiences is not a traditional strength of research libraries, the networked expertise of their communities is. So is the library’s uncompromised commitment to serving its community’s interests. A discovery system that learns from its community can do something that Google cannot: it can find connections that the community has discerned, and it can return results that are particularly relevant to that community. (It can make those connections available to the search engines also.)

This is one of the principles behind the Stacklife project that came out of the Harvard Library Innovation Lab that until recently I co-directed. It’s one of the principles of the Harvard LibraryCloud platform that makes Stacklife possible. It’s one of the reasons I’ve been touting a technically dumb cross-library measure of usage. These are all straightforward ways to start to record and use information about the items the community has voted for with its library cards.

That is just the start, by far. Anonymization and opt-in could provide rich sets of connections and patterns of usage. Imagine we could know what works librarians recommend in response to questions. Imagine if we knew which works were being clustered around which topics in lib guides and syllabi. (Support the Open Syllabus Project!) Imagine if we knew which books were being put on lists by faculty and students. Imagine if we knew what books were on participating faculty members’ shelves. Imagine we could learn which works the community thinks are awesome. Imagine if we could do this across institutions so that communities could learn from one another. Imagine we could do this with data structures that support wildly messily linked sources, many of them within the library but many of them outside of it. (Support Linked Data!)

Let the Googles and Bings do what they do better than any sane person could have imagined twenty years ago. Let libraries do what they have been doing better than anyone else for centuries: supporting and learning from networked communities of scholars, librarians, and students who together are a profound source of wisdom and working insight.

October 8, 2014

A dumb idea for opening up library usage data

A dumb idea, but its dumbness is its virtue.

The idea is that libraries that want to make available data about how relevant items are to their communities could algorithmically assign a number from 1 to 100 to those items. This number would present a very low risk of re-identification, would be easily compared across libraries, and would give local libraries control over how they interpret relevance.
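Here is one way that could look, as a minimal sketch in Python. It assumes each library already has some per-item usage count of its own choosing (checkouts, reserves, whatever it decides counts as relevance) and simply turns that into a percentile rank within its own collection. The function name `relevance_scores` and the sample item IDs are made up for illustration; this is not the actual formula any library uses.

```python
from bisect import bisect_right


def relevance_scores(usage_counts):
    """Map each item's raw usage count to an integer score from 1 to 100.

    The score is the item's percentile rank within this library's own
    collection, so only relative standing is published, not raw counts.
    """
    if not usage_counts:
        return {}
    ordered = sorted(usage_counts.values())
    total = len(ordered)
    scores = {}
    for item_id, count in usage_counts.items():
        # How many items were used no more often than this one (includes itself).
        at_or_below = bisect_right(ordered, count)
        # Integer ceiling of 100 * at_or_below / total, which lands in 1..100.
        scores[item_id] = (100 * at_or_below + total - 1) // total
    return scores


# Hypothetical usage counts keyed by made-up item IDs.
sample = {"item-a": 0, "item-b": 3, "item-c": 3, "item-d": 40}
print(relevance_scores(sample))
```

Because the score is relative to the local collection, two libraries can compare an item's number without either one revealing who borrowed what, and each library stays free to decide what feeds the count in the first place.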

I explain this idea in a post at The Chronicle of Higher Ed.


