Joho the Blog: linked data Archives - Page 2 of 2

June 19, 2013

[lodlam] The state of Linked Data

Jon Voss, an organizer of the LODLAM conference in Montreal, talks about what we can learn about the current state of Linked Data for libraries, archives, and museums by looking at the topics proposed at this unconference:


[lodlam] Topics

I’m at LODLAM (linked open data for libraries, archives, and museums) in Montreal. It’s an unconference with 100 people from 16 countries. Here are the topics being suggested at the opening session. (There will be more added to the agenda board.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

(Because this is an unconference, I probably will not be doing much more liveblogging.)

  • Taxonomy alignment

  • How to build a case for LOD

  • How to build a pattern library (a clear articulation for a problem, the context where the problem appears, and a pattern for its solution) for cultural linked open data

  • How to take PDF to the next level, integrating triples to make it open data? How to make it into a “portable data format”?

  • How can we efficiently convert our data to LOD? USC has Karma and would like to convene a workshop about tools.

  • How to convert simple data to LOD? How to engage users in making that data better?

  • A cultural heritage standard.

  • User interfaces. What do we do after we create all of this data? [applause]

  • Progress since the prior LODLAM (in San Francisco)? BIBFRAME?

  • Preserving linked data

  • The NSA has built the ultimate linked data tool chain. What can we learn?

  • Internal use cases for linked data.

  • How to make use of dirty metadata

  • A draft ontology for MODS metadata (MODSRDF)

  • Collaborating on a harvesting/enrichment tool

  • Getty Vocabulary is being released as LOD [applause], but they need help building a community and making sure they have the right ontologies, early adopters, etc.

  • The data exhaust from DSpace and linking it to world problems: find the disconnects between the people who have problems and the people with info helpful for those problems

  • Identities and authorities — linked data as an app-independent way of doing identity control and management

  • RDF cataloging interface

  • Curation and social relationships

  • Linked Open Data ecosystems

  • A new understanding of search: the ways LODers search aren’t familiar to most people


  • Open Annotation tools enabling end users to enrich the graph

  • Our collections are different for a reason. That manifests itself in the data structure. We should talk about this.

  • In the business writ large, maybe we need the confidence to be invisible. What does that mean?

  • Feedback loops once data has been exposed

  • Wikidata — the database that supports Wikipedia

  • Forming an international group to discuss archival data, particularly in LOD


October 16, 2012

[eim][semtechbiz] Enterprise Linked Data

David Wood is talking about Callimachus, an open source project that is also available through his company. [NOTE: Liveblogging. All bets on accuracy are off.]

We’re moving from PCs to mobile, he says. This is rapidly changing the Internet. 51% of Internet traffic is non-human, he says (as of Feb 2012). 35 hours of video are uploaded to YouTube every minute. Traditionally we dealt with this type of demand via data warehousing: put it all in one place for easy access. But that never really happened: we never got it all in one place, accessible through one interface. Jeffrey Pollock says we should be talking not about data integration but about interoperability, because the latter implies a looser coupling.

He gives some use cases:

  • BBC wanted to have a Web presence for all of its 1500 broadcasts per day. They couldn’t do it manually. So, they decided to grab data from the linked open data cloud and assemble the pages automatically. They hired fulltime editors to curate Wikipedia. RDF enabled them to assemble the pages.

  • O’Reilly Media switched to RDF reluctantly but for purely pragmatic reasons.

  • BestBuy, too. They used RDFa to embed metadata into their pages to improve their SEO.

  • Elsevier uses Linked Data to manage their assets, from acquisition to delivery.
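
To make the BBC case a bit more concrete, here is a minimal sketch of the general idea: statements from different sources that use the same identifier can be merged and turned into a page. Everything in it is an assumption for illustration (the `ex:` identifiers, the sample data, and the `index` and `render_page` helpers); it is not how the BBC actually built its pages.

```python
from collections import defaultdict

# Hypothetical in-house statements about a broadcast (subject, predicate, object).
EDITORIAL = [
    ("ex:programme/1", "ex:title", "An Example Broadcast"),
    ("ex:programme/1", "ex:features", "ex:artist/7"),
]

# Hypothetical statements pulled from an open data source about the same artist.
OPEN_DATA = [
    ("ex:artist/7", "ex:name", "Example Artist"),
    ("ex:artist/7", "ex:summary", "A short description from an open source."),
]


def index(triples):
    """Group triples by subject. Simplification: one value per predicate."""
    graph = defaultdict(dict)
    for subject, predicate, obj in triples:
        graph[subject][predicate] = obj
    return graph


def render_page(graph, programme):
    """Build a plain-text page by following the link from programme to artist."""
    info = graph[programme]
    lines = [info["ex:title"]]
    artist = graph.get(info.get("ex:features"), {})
    if artist:
        lines.append(f"Featuring {artist['ex:name']}: {artist['ex:summary']}")
    return "\n".join(lines)


# Because both sources use the same artist identifier, merging is just concatenation.
GRAPH = index(EDITORIAL + OPEN_DATA)
print(render_page(GRAPH, "ex:programme/1"))
```

The point of the sketch is the shared identifier: neither source needs to know about the other's schema for the page to come together.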

This is not science fiction, he says. It’s happening now.

Then two negative examples:

  • David says that Microsoft adopted RDF in the late 90s. But Netscape came out with a portal tech based on RDF that scared Microsoft out of the standards effort. But they needed the tech, so they’ve reinvented it three times in proprietary ways.

  • Borders was too late in changing its tech.

Then he does a product pitch for Callimachus Enterprise: a content management system for enterprises.


July 3, 2012

[2b2k] The inevitable messiness of digital metadata

This is cross-posted at the Harvard Digital Scholarship blog.

Neil Jeffries, research and development manager at the Bodleian Libraries, has posted an excellent op-ed at Wikipedia Signpost about how to best represent scholarly knowledge in an imperfect world.

He sets out two basic assumptions: (1) Data has meaning only within context; (2) We are not going to agree on a single metadata standard. In fact, we could connect those two points: Contexts of meaning are so dependent on the discipline and the user's project and standpoint that it is unlikely that a single metadata standard could suffice. In any case, the proliferation of standards is simply a fact of life at this point.

Given those constraints, he asks, what's the best way to increase the interoperability of the knowledge and data that are accumulating online at a pace that provokes extremes of anxiety and joy in equal measures? He sees a useful consensus emerging on three points: (a) There are some common and basic types of data across almost all aggregations. (b) There is increasing agreement that these data types have some simple, common properties that suffice to identify them and to give us humans an idea about whether we want to delve deeper. (c) Aggregations themselves are useful for organizing data, even when they are loose webs rather than tight hierarchies.

Neil then proposes RDF and linked data as appropriate ways to capture the very important relationships among ideas, pointing to the Semantic MediaWiki as a model. But, he says, we need to capture additional metadata that qualifies the data, including who made the assertion, links to differences of scholarly opinion, omissions from the collection, and the quality of the evidence. "Rather than always aiming for objective statements of truth we need to realise that a large amount of knowledge is derived via inference from a limited and imperfect evidence base, especially in the humanities," he says. "Thus we should aim to accurately represent the state of knowledge about a topic, including omissions, uncertainty and differences of opinion."
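
Here is one rough way to picture what "metadata that qualifies the data" could look like. It is only a sketch under my own assumptions: the `Assertion` fields, the `ex:` identifiers, the evidence labels, and the sample claims are all made up for illustration and are not from Neil's piece or from any particular RDF vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """A single claim plus the metadata that qualifies it (hypothetical fields)."""
    subject: str
    predicate: str
    obj: str
    asserted_by: str  # who made the assertion
    evidence: str     # illustrative labels: "strong", "weak", "inferred"


# Made-up sample data: three scholars, two competing attributions.
ASSERTIONS = [
    Assertion("ex:manuscript42", "ex:author", "ex:personA", "ex:scholar1", "inferred"),
    Assertion("ex:manuscript42", "ex:author", "ex:personB", "ex:scholar2", "weak"),
    Assertion("ex:manuscript42", "ex:author", "ex:personA", "ex:scholar3", "strong"),
]


def state_of_knowledge(assertions, subject, predicate):
    """Group every recorded claim about (subject, predicate) by the value claimed."""
    claims = {}
    for a in assertions:
        if a.subject == subject and a.predicate == predicate:
            claims.setdefault(a.obj, []).append((a.asserted_by, a.evidence))
    return claims


def is_disputed(assertions, subject, predicate):
    """A question is disputed when more than one distinct value has been asserted."""
    return len(state_of_knowledge(assertions, subject, predicate)) > 1


for value, backers in state_of_knowledge(ASSERTIONS, "ex:manuscript42", "ex:author").items():
    print(value, backers)
```

Nothing here picks a winner. The structure keeps the disagreement, and who stands behind each side, as part of the record.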

Neil's proposals have the strengths of acknowledging the imperfection of any attempt to represent knowledge, and of recognizing that the value of representing knowledge lies mainly in its being linked to its sources, its context, its controversies, and to other disciplines. It seems to me that such a system would not only have tremendous pragmatic advantages; for all its messiness and lack of coherence, it is in fact a more accurate representation of knowledge than a system that is fully neatened up and nailed down. That is, messiness is not only the price we pay for scaling knowledge aggressively and collaboratively, it is a property of networked knowledge itself.


