Joho the Blog » apis

November 24, 2014

[siu] Panel: Capturing the research lifecycle

It’s the first panel of the morning at Shaking It Up. Six men from six companies give brief overviews of their products. The session is led by Courtney Soderberg from the Center for Open Science, which sounds great. [Six panelists means that I won’t be able to keep up. Or keep straight who is who, since there are no name plates. So, I’ll just distinguish them by referring to them as “Another White Guy,” ‘k?]

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Riffyn: “Manufacturing-grade quality in the R&D process.” This can easily double R&D productivity “because you stop missing those false negatives.” It starts with design

GitHub: “GitHub is a place where people do software development together.” 10M people. 15M software repositories. He points to Zenodo, a repository for research outputs. Open source communities are better at collaborating than most academic research communities are. The principles of open source can be applied to private projects as well. A key principle: everything has a URL. Also, the processes should be “lock-free” so they can be done in parallel and the decision about branching can be made later.

Texas Advanced Computing Center: Agave is a Science-as-a-Service platform. It’s a platform that provides lots of services as well as APIs. “It’s SalesForce for science.”

CERN is partnering with GitHub. “GitHub meets Zenodo.” But it also exports the software into INSPIRE which links the paper with the software. [This might be the INSPIRE he’s referring to. Sorry. I know I should know this.]

Overleaf was inspired by Etherpad, the collaborative editor. But Etherpad doesn’t do figures or equations. Overleaf does that and much more.

Publiscize helps researchers translate their work into terms that a broader audience can understand. He sees three audiences: intradisciplinary, interdisciplinary, and the public. The site helps scientists create a version readable by the public, and helps them disseminate it through social networks.


Some white guys provided answers I couldn’t quite hear to questions I couldn’t hear. They all seem to favor openness, standards, users owning their own data, and interoperability.

[They turned on the PA, so now I can hear. Yay. I missed the first couple of questions.]

GitHub: Libraries have uploaded 100,000 open access books, all for free. “Expect the unexpected. That happens a lot.” “Academics have been among the most abusive of our platform…in the best possible way.”

Zenodo: The most unusual uses come from people who want to install a copy at their local institutions. “We’re happy to help them fork off Zenodo.”

Q: Where do you see physical libraries fitting in?

AWG: We keep track of some people’s libraries.

AWG: People sometimes accidentally delete their entire company’s repos. We can get it back for you easily if you do.

AWG: Zenodo works with Chris Erdmann at Harvard Library.

AWG: We work with FigShare and others.

AWG: We can provide standard templates for Overleaf so, for example, your grad students’ theses can be managed easily.

AWG: We don’t do anything particular with libraries, but libraries are great.

Courtney: We’re working with ARL on a shared notification system.

Q: Mr. GitHub (Arfon Smith), you said in your comments that reproducibility is a workflow issue?

GitHub: You get reproducibility as a by-product of using tools like the ones represented on this panel. [The other panelists agree. Reproducibility should be just part of the infrastructure that you don’t have to think about.]


November 21, 2014

APIs are magic

(This is cross-posted at Medium.)

Dave Winer recalls a post of his from 2007 about an API that he’s now revived:

“Because Twitter has a public API that allows anyone to add a feature, and because the NY Times offers its content as a set of feeds, I was able to whip up a connection between the two in a few hours. That’s the power of open APIs.”

Ah, the power of APIs! They’re a deep magic that draws upon five skills of the Web as Mage:

First, an API matters typically because some organization has decided to flip the default: it assumes data should be public unless there’s a reason to keep it private.

Second, an API works because it provides a standard, or at least well-documented, way for an application to request that data.

Third, open APIs tend to be “RESTful,” which means that they work using the normal Web way of proceeding (i.e., Web protocols). All you or your program have to do is go to the API’s site using a standard URL of the sort you enter in a browser. The site comes back not with a Web page but with data. For example, click on this URL (or paste it into your browser) and you’ll get data from Wikipedia’s API: (This is from the Wikipedia API tutorial.)
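Here is a minimal sketch of that idea in Python. It is not the URL from the tutorial; it assumes the standard MediaWiki endpoint at en.wikipedia.org/w/api.php and asks for basic page info (`action=query`, `prop=info`). The function names and the User-Agent string are made up for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the standard MediaWiki endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_query_url(title):
    """Build an ordinary URL asking the API for basic info about one page."""
    params = {
        "action": "query",  # read data rather than edit it
        "prop": "info",     # basic page metadata
        "titles": title,
        "format": "json",   # ask for data, not a Web page
    }
    return API_ENDPOINT + "?" + urlencode(params)


def page_lengths(response_body):
    """Map each page title in a JSON response to its length in bytes."""
    data = json.loads(response_body)
    pages = data.get("query", {}).get("pages", {})
    return {p["title"]: p.get("length") for p in pages.values()}


def fetch_page_lengths(title):
    """Go to the URL, much as a browser would, and parse what comes back."""
    # The User-Agent value is a placeholder; identify your own program here.
    request = Request(build_query_url(title),
                      headers={"User-Agent": "api-sketch/0.1"})
    with urlopen(request) as response:
        return page_lengths(response.read().decode("utf-8"))
```

The URL that `build_query_url` returns can be pasted into a browser as-is. Calling `fetch_page_lengths("Emily Dickinson")` needs a network connection and should hand back a small dictionary instead of a Web page.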

Fourth, you need people anywhere on the planet who have ideas about how that data can be made more useful or delightful. (cf. Dave Winer.)

Fifth, you need a worldwide access system that makes the results of that work available to everyone on the Internet.

In short, APIs show the power of a connective infrastructure populated by ingenuity and generosity.

In shorter shortness: APIs embody the very best of the Web.


October 24, 2013


The Emily Dickinson archive went online today. It’s a big deal not only because of the richness of the collection, and the excellent technical work by the Berkman Center, but also because it is a good sign for Open Access. Amherst, one of the major contributors, had open accessed its Dickinson material earlier, and now the Harvard University Press has open accessed some of its most valuable material. Well done!

The collection makes available in one place the great Dickinson collections held by Amherst, Harvard, and others. The metadata for the items is (inevitably) inconsistent in quantity, but the system has been tuned so that items with less metadata are not systematically buried by its search engine.

The Berkman folks tell me that they’re going to develop an open API. That will be extra special cool.


June 11, 2012

DPLA West meeting online

The sessions from the DPLA Plenary meeting on April 27 in SF are now online. Here’s the official announcement:

…all media and work outputs from the two day-long events that made up DPLA West–the DPLA workstream meetings held on April 26, 2012 at the San Francisco Public Library, and the public plenary held on April 27, 2012 at the Internet Archive in San Francisco, CA–are now available online on the “DPLA West: Media and Outputs” page:

There you will find:

  • Key takeaways from the April 26, 2012 workstream meetings;

  • Notes from the April 27, 2012 Steering Committee meeting;

  • Complete video of the April 27, 2012 public plenary;

  • Photographs and graphic notes from the public plenary;

  • Video interviews with DPLA West participants;

  • And audio interviews with DPLA West scholarship recipients.

More information about DPLA West can be found online at

Folks from the Harvard Library Innovation Lab and the Berkman Center worked long and hard to create a prototype software platform for the DPLA in time for this event. The platform is up and gives live access to about 20M books and thousands of images and other items from various online collections. The session at which we introduced, explained, and demo’ed it is now available for your viewing pleasure. (I was interim head of the project.)
