Joho the Blog » interop

February 1, 2014

Linked Data for Libraries: And we’re off!

I’m just out of the first meeting of the three universities participating in a Mellon grant (Cornell, Harvard, and Stanford, with Cornell as the grant instigator and leader) to build, demonstrate, and model using library resources expressed as Linked Data as a tool for researchers, students, teachers, and librarians. (Note that I’m putting all this in my own language, and I was certainly the least knowledgeable person in the room. Don’t get angry at anyone else for my mistakes.)

This first meeting, two days long, was very encouraging indeed: it’s a superb set of people, we are starting out on the same page in terms of values and principles, and we enjoyed working with one another.

The project is named Linked Data for Libraries (LD4L) (minimal home page), although that doesn’t entirely capture it, for the actual beneficiaries of it will not be libraries but scholarly communities taken in their broadest sense. The idea is to help libraries make progress with expressing what they know in Linked Data form so that their communities can find more of it, see more relationships, and contribute more of what the communities learn back into the library. Linked Data is not only good at expressing rich relations, it makes it far easier to update the dataset with relationships that had not been anticipated. This project aims at helping libraries continuously enrich the data they provide, and making it easier for people outside of libraries — including application developers and managers of other Web sites — to connect to that data.

As the grant proposal promised, we will use existing ontologies, adapting them only when necessary. We do expect to be working on an ontology for library usage data of various sorts, an area in which the Harvard Library Innovation Lab has done some work, so that’s very exciting. But overall this is the opposite of an attempt to come up with new ontologies. Thank God. Instead, the focus is on coming up with implementations at all three universities that can serve as learning models, and that demonstrate the value of having interoperable sets of Linked Data across three institutions. We are particularly focused on showing the value of the high-quality resources that libraries provide.

There was a great deal of emphasis in the past two days on partnerships and collaboration. And there was none of the “We’ll show ‘em where they got it wrong, by gum!” attitude that in my experience all too often infects discussions on the pioneering edge of standards. So, I just got to spend two days with brilliant library technologists who are eager to show how a new generation of tech, architecture, and thought can amplify the already immense value of libraries.

There will be more coming about this effort soon. I am obviously not a source for tech info; that will come soon and from elsewhere.

June 21, 2013

[lodlam] Kevin Ford on the state of BIBFRAME

Kevin Ford, a principal member of the team behind the Library of Congress’ BIBFRAME effort (a modern replacement for the aging MARC standard), gives an update on its status and addresses a controversy about whether it’s “webby” enough. (I liveblogged a session about this at LODLAM.)

[lodlam] Kitio Fofack on why Linked Data

Kitio Fofack turned to Linked Data when creating a prototype app that aggregated researcher events. He explains why.

March 28, 2013

[annotation][2b2k] Critique^it

Ashley Bradford of Critique-It describes his company’s way of keeping review and feedback engaging.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

To what extent can and should we allow classroom feedback to be available in the public sphere? The classroom is a type of Habermasian civic society. Owning one’s discourse in that environment is critical. It has to feel human if students are to learn.

So, you can embed text, audio, and video feedback in documents, video and images. It translates docs into HTML. To make the feedback feel human, it uses slightly stamps. You can also type in comments, marking them as neutral, positive, or critique. A “critique panel” follows you through the doc as you read it, so you don’t have to scroll around. It rolls up comments and stats for the student or the faculty.

It works the same in different doc types, including Powerpoint, images, and video.

Critiques can be shared among groups. Groups can be arbitrarily defined.

It uses HTML 5. It’s written in JavaScript and PHP, and uses MySQL.

“We’re starting with an environment. We’re building out tools.” Ashley aims for Critique^It to feel very human.

[annotation][2b2k] Mediathread

Jonah Bossewich and Mark Philipson from Columbia University talk about Mediathread, an open source project that makes it easy to annotate various digital sources. It’s used in many courses at Columbia, as well as around the world.

It comes from Columbia’s Center for New Media Teaching and Learning. It began with Vital, a video library tool. It let students clip and save portions of videos, and comment on them. Mediathread connects annotations to sources by bookmarking, via a bookmarklet that interoperates with a variety of collections. The bookmarklet scrapes the metadata because “We couldn’t wait for the standards to be developed.” Once an item is in Mediathread, it embeds the metadata as well.

It has always been conceived of as a “small-group sharing and collaboration space.” It’s designed for classes. You can only see the annotations by people in your class. It does item-level annotation, as well as regions.

Mediathread connects assignments and responses, as well as other workflows. [He's talking quickly :)]

Mediathread’s bookmarklet approach requires it to accommodate the particularities of individual sites. They are aiming at making the annotations interoperable in standard forms.

[annotation][2b2k] Phil Desenne on Harvard annotation tools

Phil Desenne begins with a brief history of annotation tools at Harvard. There are a lot, for annotating everything from texts to scrolls to music scores to video. Most of them are collaborative tools. The collaborative tool has gone from Adobe AIR to Harvard iSites, to open source HTML 5. “It’s been a wonderful experience.” It’s been picked up by groups in Mexico, South America and Europe.

Phil works on edX. “We’re beginning to introduce annotation into edX.” It’s being used to encourage close reading. “It’s the beginning of a new way of thinking about teaching and assessing students.” Students tag the text, which “is the beginning of a semantic tagging system…Eventually we want to create a semantic ontology.”

What are the implications for the “MOOC Generation”? MOOC students are out finding information anywhere they can. They stick within a single learning management system (LMS). LMS’s usually have commentary tools “but none of them talk with one another. Even within the same LMS you don’t have cross-referencing of the content.” We should have an interoperable layer that rides on top of the LMS’s.

Within edX, there are discussions within classes, courses, tutorials, etc. These should be aggregated so that the conversations can reach across the entire space, and, of course, outside of it. edX is now working on annotation systems that will do this. E.g., imagine being able to discuss a particular image or fragments of videos, and being able to insert images into streams of commentary. Plus analytics of these interactions. Heatmaps of activity. And a student should be able to aggregate all her notes, journal-like, so they can be exported, saved, and commented on. “We’re talking about a persistent annotation layer with API access.” “We want to go there.”

For this we need stable repositories. They’ll use URNs.

[annotation][2b2k] Paolo Ciccarese on the Domeo annotation platform

Paolo Ciccarese begins by reminding us just how vast the scientific literature is. We can’t possibly read everything we should. But “science is social” so we rely on each other, and build on each other’s work. “Everything we do now is connected.”

Today’s media do provide links, but not enough. Things are so deeply linked. “How do we keep track of it?” How do we communicate with others so that when they read the same paper they get a little bit of our mental model, and see why we found the article interesting?

Paolo’s project, Domeo [twitter:DomeoTool], is a web app for “producing, browsing, and sharing manual and semi-automatic (structured and unstructured) annotations, using open standards.” Domeo shows you an article and lets you annotate fragments. You can attach a tag or an unstructured comment. The tag can be defined by the user or by a defined ontology. Domeo doesn’t care which ontologies you use, which means you could use it for annotating recipes as well as science articles.

Domeo also enables discussions; it has a threaded msg facility. You can also run text mining and entity recognition systems (Calais, etc.) that automatically annotate the work with those words, which helps with search, understanding, and curation. This too can be a social process. Domeo lets you keep the annotation private or share it with colleagues, groups, communities, or the Web. Also, Domeo can be extended. In one example, it produces information about experiments that can be put into a database where it can be searched and linked up with other experiments and articles. Another example: “hypothesis management” lets readers add metadata to pick out the assertions and the evidence. (It uses RDF.) You can visualize the network of knowledge.

It supports open APIs for integrating with other systems, including into the Neuroscience Information Framework and Drupal. “Domeo is a platform.” It aims at supporting rich source, and will add the ability to follow authors and topics, etc., and enable mashups.

[annotation][2b2k] Neel Smith: Scholarly annotation + Homer

Neel Smith of Holy Cross is talking about the Homer Multitext project, a “long term project to represent the transmission of the Iliad in digital form.”

He shows the oldest extant ms of the Iliad, which includes 10th century notes. “The medieval scribes create a wonderful hypermedia” work.

“Scholarly annotation starts with citation.” He says we have a good standard: URNs, which can point to, for example, an ISBN. His project uses URNs to refer to texts in a FRBR-like hierarchy [works at various levels of abstraction]. These are semantically rich and machine-actionable. You can google a URN and get the object. You can put a URN into a URL for direct Web access. You can embed an image into a Web page via its URN [using a service, I believe].
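[A minimal sketch of what “putting a URN into a URL” might look like, in Python. This is my illustration, not the Homer Multitext code: the resolver address `https://example.org/resolve` and the function names are made up, and the ISBN is just a sample value.]

```python
from urllib.parse import urlencode


def parse_urn(urn: str) -> tuple[str, str]:
    """Split a URN into its namespace identifier and namespace-specific string."""
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or not parts[1] or not parts[2]:
        raise ValueError(f"not a URN: {urn!r}")
    return parts[1].lower(), parts[2]


def urn_to_url(urn: str, base: str = "https://example.org/resolve") -> str:
    """Wrap a URN in a URL for a resolver service (the base URL is hypothetical)."""
    parse_urn(urn)  # raises ValueError if the input is not shaped like a URN
    return f"{base}?{urlencode({'urn': urn})}"


namespace, identifier = parse_urn("urn:isbn:0451450523")
url = urn_to_url("urn:isbn:0451450523")
print(namespace, identifier)
print(url)
```

The URN itself stays stable; only the resolver in front of it would change if the service moved.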

An annotation is an association. In a scholarly annotation, it’s associated with a citable entity. [He shows some great examples of the possibilities of cross linking and associating.]

The metadata is expressed as RDF triples. Within the Homer project, they’re inductively building up a schema of the complete graph [network of connections]. For end users, this means you can see everything associated with a particular URN. Building a facsimile browser, for example, becomes straightforward, mainly requiring the application of XSL and CSS to style it.
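[To make “see everything associated with a particular URN” concrete, here is a rough Python sketch that treats triples as plain tuples. It assumes nothing about the project’s actual vocabulary: every URN and predicate below is invented for illustration, and a real system would use an RDF store rather than a list.]

```python
# Each triple is (subject, predicate, object).
# All identifiers here are hypothetical, made up for this example.
Triple = tuple[str, str, str]

TRIPLES: list[Triple] = [
    ("urn:example:text:iliad.1.1", "ex:appearsOn", "urn:example:page:12r"),
    ("urn:example:note:42", "ex:commentsOn", "urn:example:text:iliad.1.1"),
    ("urn:example:page:12r", "ex:hasImage", "urn:example:image:12r-color"),
]


def associated_with(triples: list[Triple], urn: str) -> list[Triple]:
    """Return every triple in which the URN appears as subject or object."""
    return [t for t in triples if urn in (t[0], t[2])]


for subject, predicate, obj in associated_with(TRIPLES, "urn:example:text:iliad.1.1"):
    print(subject, predicate, obj)
```

Adding a new kind of relationship is just appending another tuple with a new predicate; nothing about the existing data has to change.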

Another example: Mise en page: automated layout analysis. This in-progress project analyzes the layout of annotation info on the Homeric pages.

[annotations][2b2k] Rob Sanderson on annotating digitized medieval manuscripts

Rob Sanderson [twitter:@azaroth42] of Los Alamos is talking about annotating Medieval manuscripts.

He says many Medieval manuscripts are being digitized. The Mellon Foundation is funding many such projects. But these have tended to reinvent the same tech, and have not been designed for interoperability with other projects. So the Digital Medieval Initiative was founded, with a long list of prestigious partners. They thought about what they’d like: distributed, linked data, interoperable, etc. For this they need a shared description format.

The traditional approach is to annotate an image of a page. But it can be very difficult to know which images to annotate; he gives as an example a page that has fold-outs. “The naive assumption is that an image equals a page.” But there may be fragments, or only portions of the page have been digitized (e.g., the illuminations), etc. There may be multiple images on a page, revealed by multi-spectral imaging. There may be multiple orientations of the page, etc.

The solution? The canvas paradigm. A canvas is an empty space corresponding to the rectangle (or whatever) of the page. You allow rich resources to be associated with it, and allow users to comment. For this, they use Open Annotation. You can specify a choice of images. You can associate text with an area of the canvas. There are lots of different ways to visualize those comments: overlays, side-by-side, etc.
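[Here is how I picture the canvas idea, as a small Python sketch. It is a simplification under my own assumptions, not the SharedCanvas or Open Annotation data model: the class names, fields, and example URLs are all hypothetical.]

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Region:
    """A rectangle on the canvas, in canvas coordinates."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass
class Annotation:
    body: str                        # an image URL, a transcription, a comment...
    region: Optional[Region] = None  # None means the whole canvas


@dataclass
class Canvas:
    """An empty space standing in for the page; resources are attached to it."""
    label: str
    width: int
    height: int
    annotations: list[Annotation] = field(default_factory=list)

    def annotations_at(self, px: int, py: int) -> list[Annotation]:
        """Everything associated with the point (px, py) on this canvas."""
        return [a for a in self.annotations
                if a.region is None or a.region.contains(px, py)]


canvas = Canvas("folio 12r", width=1000, height=1500)
canvas.annotations.append(Annotation("https://example.org/images/12r-full.jpg"))
canvas.annotations.append(
    Annotation("https://example.org/images/12r-illumination.jpg", Region(100, 200, 300, 300)))
canvas.annotations.append(
    Annotation("Scribal note in the margin", Region(800, 100, 150, 400)))

for a in canvas.annotations_at(150, 250):
    print(a.body)
```

The point of the indirection is that images, text, and comments all target the canvas rather than each other, so a partial scan or a second image of the same page is just one more annotation.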

You can build hybrid pages. For example, an old scan might have a new color scan of its illustrations pointing at it. Or you could have a recorded performance of a piece of music pointing at the musical notation.

In summary, the SharedCanvas model uses open standards (HTML 5, Open Annotation, TEI, etc.) and can be implemented in a distributed way across repositories, encouraging engagement by domain experts.
