Joho the Blog » Peter Suber on the 4-star openness rating

Peter Suber on the 4-star openness rating

One of the outcomes of the LOD-LAM conference was a draft of an idea for a 4-star classification of openness of metadata from cultural institutions. The classification is nicely counter-intuitive, which is to say that it’s useful.

I asked Peter Suber, the Open Access guru, what he thought of it. He replied in an email:

First, I support the open knowledge definition and I support a star system to make it easy to refer to different degrees of openness.

* I’m not sure where this particular proposal comes from. But I recommend working with the Open Knowledge Foundation, which developed the open knowledge definition. The more key players who accept the resulting star system, the more widely it will be used.

* This draft overlooks some complexity in the 3-star entry and the 2-star entry. Currently it suggests that attribution through linking is always more open than attribution by other means (say, by naming without linking). But this is untrue. Sometimes one is more difficult than the other. In a given case, the easier one is more open by lowering the barrier to distribution.

If you or your software had both names and links for every datasource you wanted to attribute, then attribution by linking and attribution by naming would be about equal in difficulty and openness. But if you had names without links, then obtaining the links would be an extra burden that would delay or impede distribution.

The disparity in openness grows as the number of datasources increases. On this point, see the Protocol for Implementing Open Access Data (by John Wilbanks for Science Commons, December 2007).

Relevant excerpt: “[T]here is a problem of cascading attribution if attribution is required as part of a license approach. In a world of database integration and federation, attribution can easily cascade into a burden for scientists….Would a scientist need to attribute 40,000 data depositors in the event of a query across 40,000 data sets?” In the original context, Wilbanks uses this (cogently) as an argument for the public domain, or for shedding an attribution requirement. But in the present context, it complicates the ranking system. If you *did* have to attribute a result to 40,000 data sources, and if you had names but not links for many of those sources, then attribution by naming would be *much* easier than attribution by linking.

Solution? I wouldn’t use stars to distinguish methods of attribution. Make CC-BY (or the equivalent) the first entry after the public domain, and let it cover any and all methods of attribution. But then include an annotation explaining that some methods of attribution increase the difficulty of distribution, and that increasing the difficulty will decrease openness. Unfortunately, however, we can’t generalize about which methods of attribution raise and lower this barrier, because it depends on what metadata the attributing scholar may already possess or have ready to hand.

* The overall implication is that anything less open than CC-BY-SA deserves zero stars. On the one hand, I don’t mind that, since I’d like to discourage anything less open than CC-BY-SA. On the other, while CC-BY-NC and CC-BY-ND are less open than CC-BY-SA, they’re more open than all-rights-reserved. If we wanted to recognize that in the star system, we’d need at least one more star to recognize more species.

I responded with a question: “WRT your naming vs. linking comments: I assumed the idea was that it’s attribution-by-link vs. attribution-by-some-arbitrary-requirement. So, if I require you to attribute by sticking in a particular phrase or mark, I’m making it harder for you to just scoop up and republish my data: Your aggregating sw has to understand my rule, and you have to follow potentially 40,000 different rules if you’re aggregating from 40,000 different databases.”

Peter responded:

You’re right that “if I require you to attribute by sticking in a particular phrase or mark, I’m making it harder for you to just scoop up and republish my data.” However, if I already have the phrases or marks, but not the URLs, then requiring me to attribute by linking would be the same sort of barrier. My point is that the easier path depends on which kinds of metadata we already have, or which kinds are easier for us to get. It’s not the case that one path is always easier than another.

But it might be the case that one path (attribution by linking) is *usually* easier than another. That raises a nice question: should that shifting, statistical difference be recognized with an extra star? I wouldn’t mind, provided we acknowledged the exceptions in an annotation.
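
To make Peter’s point concrete, here is a minimal Python sketch of the idea that the easier attribution method depends on which metadata an aggregator already holds. Everything in it is hypothetical and for illustration only: the `DataSource` record, the `missing_count` and `easier_method` helpers, the example.org URLs, and the assumption that the aggregator has links for only 10,000 of the 40,000 sources in the Wilbanks example. It measures "burden" crudely, as the number of sources whose required metadata would still have to be tracked down.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataSource:
    """Hypothetical record of what an aggregator already knows about a source."""
    name: Optional[str] = None  # a citable name, if we already have one
    url: Optional[str] = None   # a link, if we already have one


# Which field each attribution method needs (illustrative, not from any license).
REQUIRED_FIELD = {"name": "name", "link": "url"}


def missing_count(sources: List[DataSource], method: str) -> int:
    """Count sources whose metadata for `method` would still have to be obtained."""
    field = REQUIRED_FIELD[method]
    return sum(1 for s in sources if getattr(s, field) is None)


def easier_method(sources: List[DataSource]) -> str:
    """Return "name", "link", or "tie", whichever leaves less metadata to obtain."""
    by_name = missing_count(sources, "name")
    by_link = missing_count(sources, "link")
    if by_name == by_link:
        return "tie"
    return "name" if by_name < by_link else "link"


# Assumed scenario: names for all 40,000 sources, links for only the first 10,000.
sources = [
    DataSource(
        name=f"depositor-{i}",
        url=f"https://example.org/{i}" if i < 10_000 else None,
    )
    for i in range(40_000)
]

print(missing_count(sources, "name"))  # sources still needing a name
print(missing_count(sources, "link"))  # sources still needing a link
print(easier_method(sources))
```

In this assumed scenario attribution by naming comes out easier; flip which field is missing and linking wins instead. Neither method is intrinsically the more open one, which is the exception an annotation on the star system would need to acknowledge.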
