eim Archives - Joho the Blog

August 14, 2016

Coinstar's list of unacceptable items seems to have been written by Tim Burton

Coinstar makes vending machines into which you drop coins and from which you get bills or gift cards. Its list of unacceptable items is quite odd, presumably intentionally.

[Image: Coinstar’s list of unacceptable items]

I’d think that this is based on things people have actually tried to shove into Coinstar slots, except I don’t see “fishing line with gum at its end” or “your dick” on the list.

(Tip o’ the hat to my brother Andy who definitely was not trying to “redeem” 70,000 #6 steel washers.)


March 8, 2016

Making library miscellaneousness awesome

Sitterwerk Art Library in St. Gallen, Switzerland, has 25,000 items on its shelves in no particular order. This video explains why that is a brilliant approach. And then the story just gets better and better.

Werkbank from Astrom / Zimmer on Vimeo.

That the shelves have no persistent order doesn’t mean they have no order. Rather, works are reshelved by users in the clusters the users have created for their research. All the items have RFID tags in them, and the shelves are automatically scanned so that the library can always tell users where items are located.

As a result, if you look up a particular item, you will see it surrounded by works that some other user thought were related to it in some way. This creates a richer browsing experience because it is shaped and reshaped by how its community of users sees the items’ inter-relationships.

The library has now installed Werkbank, which is a plain old table where you can spread out a pile of books and do your research. But, unlike truly plain old tables, this one combines RFID sensors and cameras with recognition software so it knows which works you’ve put on the table and how you’ve organized them. Werkbank notes those associations and stores them, creating a rich network of related works.

It also lets the individual save a research set, and even compile a booklet documenting those items, with notes. It can be printed on the spot and taken home … or put into the shelves as a user-generated lib guide.

This is awesome.

Here’s a bit more about it:

…the new table sports a grid of 12 antennas. It also has two cameras attached: one for scanning the tabletop and through custom image recognition software determine the exact position and rotation of books; one for making high-resolution scans of pages, notes or objects not yet in the Sitterwerk catalogue. Just like before, the new server and its interface provides a real-time digital rendering of the table and its contents, but in two dimensions instead of one. It also lets you attach scans, photos and texts to individual objects, and to the virtual table itself. Once you save your collection, it merges with a growing network of other collections, books, materials, thoughts and people

Anthon Astrom tells me that the project currently runs against an internal API, and they are planning to create a public API at some point. That way, the world can benefit from what Sitterwerk’s users are teaching it.



At the Harvard Library Innovation Lab, we wanted to do something that touches on some elements of this. With 73 libraries and 13 million items in Harvard Library it never even crossed our minds to install continuous RFID scanners in the stacks. So, our StackLife project and the LibraryCloud platform underneath it wanted simply to record which books were checked out with others, on the grounds that those clusters often have meaning. But, Harvard cyber-security researchers warned that this could be used to identify who took the books out. We thought about ways of smudging the data, and about making it opt-in, but it was not a fight we could win at that point. Werkbank might have the same issues when recording clusters but because it’s an art library, there may be less concern about the government demanding to know who was researching The Scream, Delacroix’s Liberty Leading the People, and Guernica because that person is clearly up to no good.
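To make the “checked out with others” idea concrete, here is a minimal sketch in Python. It is not StackLife or LibraryCloud code; the event data, the item names, and the functions `co_checkout_counts` and `related_to` are all made up for illustration. It assumes each loan transaction is recorded only as a set of items, with no patron identifier attached.

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout events: each set holds the items borrowed in one
# transaction. No patron identifier is stored, only the grouping.
checkouts = [
    {"book_a", "book_b", "book_c"},
    {"book_a", "book_b"},
    {"book_b", "book_d"},
]

def co_checkout_counts(events):
    """Count how often each unordered pair of items was checked out together."""
    counts = Counter()
    for items in events:
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts

def related_to(item, counts, min_count=1):
    """Items seen with `item` at least `min_count` times, most frequent first."""
    related = Counter()
    for (a, b), n in counts.items():
        if n < min_count:
            continue
        if a == item:
            related[b] = n
        elif b == item:
            related[a] = n
    return [other for other, _ in related.most_common()]

counts = co_checkout_counts(checkouts)
print(related_to("book_a", counts))               # pairs seen at least once
print(related_to("book_a", counts, min_count=2))  # only pairs seen twice or more
```

Raising `min_count` drops pairs that were seen only rarely, which is one crude way of smudging the data. Nothing here should be read as showing that a threshold like that would have answered the re-identification worry.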

In any case the Sitterwerk library and Werkbank have far exceeded our imagination. More than that: it’s real. Awesomely real.


February 1, 2014

Linked Data for Libraries: And we’re off!

I’m just out of the first meeting of the three universities participating in a Mellon grant — Cornell, Harvard, and Stanford, with Cornell as the grant instigator and leader — to build, demonstrate, and model using library resources expressed as Linked Data as a tool for researchers, students, teachers, and librarians. (Note that I’m putting all this in my own language, and I was certainly the least knowledgeable person in the room. Don’t get angry at anyone else for my mistakes.)

This first meeting, two days long, was very encouraging indeed: it’s a superb set of people, we are starting out on the same page in terms of values and principles, and we enjoyed working with one another.

The project is named Linked Data for Libraries (LD4L) (minimal home page), although that doesn’t entirely capture it, for the actual beneficiaries of it will not be libraries but scholarly communities taken in their broadest sense. The idea is to help libraries make progress with expressing what they know in Linked Data form so that their communities can find more of it, see more relationships, and contribute more of what the communities learn back into the library. Linked Data is not only good at expressing rich relations, it makes it far easier to update the dataset with relationships that had not been anticipated. This project aims at helping libraries continuously enrich the data they provide, and making it easier for people outside of libraries — including application developers and managers of other Web sites — to connect to that data.
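Here is a toy illustration of why unanticipated relationships are cheap to add when data is expressed as subject-predicate-object statements. It is a sketch in plain Python, not LD4L code and not a real RDF library. Every identifier (`ex:book1`, `ex:assignsReading`, and so on) is invented, and real Linked Data would use URIs drawn from published ontologies.

```python
# Each statement is a (subject, predicate, object) triple. All identifiers
# here are invented for illustration only.
triples = {
    ("ex:book1", "ex:title", "Moby-Dick"),
    ("ex:book1", "ex:author", "ex:melville"),
    ("ex:melville", "ex:name", "Herman Melville"),
}

def match(store, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# A relationship nobody planned for is just one more triple: there is no
# table to alter and no schema to migrate.
triples.add(("ex:course42", "ex:assignsReading", "ex:book1"))

# Ask "what points at this book?" without knowing the predicates in advance.
print(match(triples, o="ex:book1"))
```

The same wildcard query keeps working as new kinds of statements arrive, which is roughly the property that makes it easier for communities to contribute what they learn back into the dataset.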

As the grant proposal promised, we will use existing ontologies, adapting them only when necessary. We do expect to be working on an ontology for library usage data of various sorts, an area in which the Harvard Library Innovation Lab has done some work, so that’s very exciting. But overall this is the opposite of an attempt to come up with new ontologies. Thank God. Instead, the focus is on coming up with implementations at all three universities that can serve as learning models, and that demonstrate the value of having interoperable sets of Linked Data across three institutions. We are particularly focused on showing the value of the high-quality resources that libraries provide.

There was a great deal of emphasis in the past two days on partnerships and collaboration. And there was none of the “We’ll show ‘em where they got it wrong, by gum!” attitude that in my experience all too often infects discussions on the pioneering edge of standards. So, I just got to spend two days with brilliant library technologists who are eager to show how a new generation of tech, architecture, and thought can amplify the already immense value of libraries.

There will be more coming about this effort soon. I am obviously not a source for tech info; that will come soon and from elsewhere.


June 13, 2013

[eim][misc] Tagging rises

Both Facebook and Apple have announced the use of tags. Yay!

Tags have continued to percolate through the ecosystem after their most auspicious introduction in (Note the phrase “most auspicious”; tags have always been with us.) It’s great to see them increase both because they are a great way to get use out of the craziness while preserving it in its original form for others, and because there is great value in scaling tags, as Flickr has shown.

So, yay for tags. And yay for the crazy.


May 20, 2013

[misc] The loneliness of the long distance ISBN

NOTE on May 23: OCLC has posted corrected numbers. I’ve corrected them in the post below; the changes are mainly fractional. So you can ignore the note immediately below.

NOTE a couple of hours later: OCLC has discovered a problem with the analysis. So please ignore the following post until further notice. Apologies from the management.

Ever since the 1960s, publishers have used ISBN numbers as identifiers of editions of books. Since the world needs unique ways to refer to unique books, you would think that ISBN would be a splendid solution. Sometimes and in some instances it is. But there are problems, highlighted in the latest analysis run by OCLC on its database of almost 300 million records.

[Table: Number of ISBNs | Percentage of the records]
So, 78% of OCLC’s humungous collection of book records have no ISBN, and only 1.6% have the single ISBN that God intended.

As Roy Tennant [twitter: royTennant] of OCLC points out (and thanks to Roy for providing these numbers), many works in this collection of records pre-date the 1960s. Even so, the books with multiple ISBNs reflect the weakness of ISBNs as unique identifiers. ISBNs are essentially SKUs to identify a product. The assigning of ISBNs is left up to publishers, and they assign a new one whenever they need to track a book as an inventory item. This does not always match how the public thinks about books. When you want to refer to, say, Moby-Dick, you probably aren’t distinguishing between one with illustrations, a large-print edition, and one with an introduction by the Deadliest Catch guys. But publishers need to make those distinctions, and that’s who ISBN is intended to serve.

This reflects the more general problem that books are complex objects, and we don’t have settled ways of sorting out all the varieties allowed within the concept of the “same book.” Same book? I doubt it!

Still, these numbers from OCLC exhibit more confusion within the ISBN number space than I’d expected.

MINUTES LATER: Folks on a mailing list are wondering if the very high percentage of records with two ISBNs is due to the introduction of 13-digit ISBNs to supplement the initial 10-digit ones.
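If that guess is right, many two-ISBN records would simply hold the 10-digit and 13-digit forms of the same number. Here is a sketch of how one might check, using the standard rule for converting an ISBN-10 to its 978-prefixed ISBN-13 (drop the old check character, prepend 978, recompute the check digit with alternating weights of 1 and 3). The function names and the sample record are mine, and this is not OCLC’s analysis.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to its 978-prefixed ISBN-13 form."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10:
        raise ValueError("expected 10 characters after removing separators")
    core = "978" + chars[:9]  # the old check character is discarded
    if not all(ch in "0123456789" for ch in core):
        raise ValueError("expected digits in the first nine positions")
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ... over 12 digits.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)

def normalize(isbn: str) -> str:
    """Return a 13-digit form so 10- and 13-digit ISBNs can be compared."""
    chars = isbn.replace("-", "").replace(" ", "")
    return isbn10_to_isbn13(chars) if len(chars) == 10 else chars

def distinct_isbns(record_isbns):
    """Count the ISBNs in a record after collapsing 10/13-digit duplicates."""
    return len({normalize(isbn) for isbn in record_isbns})

# A made-up record carrying both forms of one ISBN.
record = ["0-306-40615-2", "978-0-306-40615-7"]
print(isbn10_to_isbn13("0-306-40615-2"))
print(distinct_isbns(record))
```

A record like the sample one would show up as having two ISBNs in a raw count but only one after normalizing. This sketch does not validate the incoming check digits, so malformed numbers would need separate handling.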


March 2, 2013

[misc] The Wars on Terrorism, Al Qaeda, Cancer, and Dessert

Steve Coll has a good piece in the New Yorker about the importance of Al Qaeda as a brand:

…as long as there are bands of violent Islamic radicals anywhere in the world who find it attractive to call themselves Al Qaeda, a formal state of war may exist between Al Qaeda and America. The Hundred Years War could seem a brief skirmish in comparison.

This is a different category of issue than the oft-criticized “war on terror,” which is a war against a tactic, not against an enemy. The war against Al Qaeda implies that there is a structurally unified enemy organization. How do you declare victory against a group that refuses to enforce its trademark?

In this, the war against Al Qaeda (which is quite preferable to a war against terror — and I think Steve agrees) is similar to the war on cancer. Cancer is not a single disease and the various things we call cancer are unlikely to have a single cause and thus are unlikely to have a single cure (or so I have been told). While this line of thinking would seem to reinforce politicians’ referring to terrorism as a “cancer,” the same applies to dessert. Each of these terms probably does have a single identifying characteristic, which means they are not classic examples of Wittgensteinian family resemblances: all terrorism involves a non-state attack that aims at terrifying the civilian population, all cancers involve “unregulated cell growth” [thank you Wikipedia!], and all desserts are designed primarily for taste not nutrition and are intended to end a meal. In fact, the war on Al Qaeda is actually more like the war on dessert than like the war on cancer, because just as there will always be some terrorist group that takes up the Al Qaeda name, there will always be some boundary-pushing chef who declares that beef jerky or glazed ham cubes are the new dessert. You can’t defeat an enemy that can just rebrand itself.

I think that Steve Coll comes to the wrong conclusion, however. He ends his piece this way:

Yet the empirical case for a worldwide state of war against a corporeal thing called Al Qaeda looks increasingly threadbare. A war against a name is a war in name only.

I agree with the first sentence, but I draw two different conclusions. First, this has little bearing on how we actually respond to terrorism. The thinking that has us attacking terrorist groups (and at times their family gatherings) around the world is not made threadbare by the misnomer “war against Al Qaeda.” Second, isn’t it empirically obvious that a war against a name is not a war in name only?


October 16, 2012

[eim][semtechbiz] Enterprise Linked Data

David Wood of is talking about Callimachus, an open source project that is also available through his company. [NOTE: Liveblogging. All bets on accuracy are off.]

We’re moving from PCs to mobile, he says. This is rapidly changing the Internet. 51% of Internet traffic is non-human, he says (as of Feb 2012). 35hrs of video are uploaded to YouTube every minute. Traditionally we dealt with this type of demand via data warehousing: put it all in one place for easy access. But that’s not true: we never really got it all in one place accessible through one interface. Jeffrey Pollock says we should be talking not about data integration but interoperability because the latter implies a looser coupling.

He gives some use cases:

  • BBC wanted to have a Web presence for all of its 1500 broadcasts per day. They couldn’t do it manually. So, they decided to grab data from the linked open data cloud and assemble the pages automatically. They hired fulltime editors to curate Wikipedia. RDF enabled them to assemble the pages.

  • O’Reilly Media switched to RDF reluctantly but for purely pragmatic reasons.

  • BestBuy, too. They used RDFa to embed metadata into their pages to improve their SEO.

  • Elsevier uses Linked Data to manage their assets, from acquisition to delivery.

This is not science fiction, he says. It’s happening now.

Then two negative examples:

  • David says that Microsoft adopted RDF in the late 90s. But Netscape came out with a portal tech based on RDF that scared Microsoft out of the standards effort. But they needed the tech, so they’ve reinvented it three times in proprietary ways.

  • Borders was too late in changing its tech.

Then he does a product pitch for Callimachus Enterprise: a content management system for enterprises.


July 4, 2012

[eim] XKCD goes miscellaneous

Except Randall Munroe thinks going miscellaneous means giving up, rather than embracing the new organizational possibilities of blah blah blah.

(I am, of course, an awestruck fan of XKCD.)


September 21, 2010

Everything is Warburg

The NY Review of Books gives a substantial taste of an upcoming article by Anthony Grafton and Jeffrey Hamburger about the library of the Warburg Institute. It organizes books on the shelves — it’s an open stacks library — into clusters of related materials, cutting across the usual subject classification. The University of London, which rescued and preserved the library, now is planning on dispersing its contents.

[The next day:] The full article has now been posted. Thanks, NYRB!


June 20, 2010

Twitter metadata and where standards come from

Matthew Ingram at GigaOM blogs about an upcoming Twitter feature called Twitter Annotations. Well, it’s not actually a feature. It’s the ability to attach metadata to a tweet. This is potentially great news, since it will give us a way to add context to tweets and to enable machine-processing of tweets, not to mention that URLs could be sent as metadata rather than as subtractions from the 140-character limit. This is yet another example of information scaling to the point where we have to introduce more information to manage it. How about one of those bogus “laws” people seem to like (well, I know I do): Information sufficiently scaled creates a need for more information.

Twitter is specifying the way in which Annotations will be encoded, but not what the metadata types will be. You can declare a “type” with its own set of “attributes.” What types? Whatever you (or, more exactly, developers and hackers) find useful. Matthew cites a number of folks who are basically positive but who express a variety of worries, including Google open advocate Chris Messina who warns that there could be a mare’s nest of standards, that is, values for types and attributes. Dave Winer takes Google to task for slagging off on Twitter for this. I agree with his sentiment that Goliath Google ought to be careful about their casual criticisms. Nevertheless, I think Chris is right: Specifying the syntax but not the actual types and attributes will inevitably give rise to confusion: What one person tags as “topic,” someone else will tag as “subject,” and some people might have the nerve to actually use words for types in, say, Spanish or Arabic. The nerve! [THE NEXT DAY: Here’s Chris’ original post on the topic, which is more balanced than the bit Matthew excerpts, and which basically agrees with the next paragraph:]

But, so what? I’d put my money on Ev Williams and Biz Stone any time (important note: If I had money). You couldn’t have seriously proposed an idea as ridiculous as Twitter in the first place if you didn’t deeply understand the Web. So, yes, Chris is right that there’ll be some confusion, but he’s wrong in his fear. After the confusion there will be a natural folksonomic (and capitalist) pull toward whatever terms we need the most. Twitter can always step in and suggest particular terms, or surface the relative popularity of the various types, so that if you want to make money by selling via tweets, you’ll learn to use the type “price” instead of “cost_to_user,” or whatever. Or you’ll figure out that most of the Twitter clients are looking for a type called “rating” rather than “stars” or “popularity.” There’ll be some mess. There’ll be some angry angry hash tags. But better open confusion than expecting anyone — even the Twitter Lads — to do a better job of guessing what its users need and what clever developers will invent than those users and developers themselves.
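To picture how “surfacing the relative popularity of the various types” might work, here is a small Python sketch. The tweet structure below (each annotation as a mapping from one type name to its attributes) is an assumption made for illustration, not Twitter’s actual encoding, and `type_popularity` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical annotated tweets. The shape of "annotations" is assumed for
# illustration; it is not claimed to be Twitter's format.
tweets = [
    {"text": "New album out", "annotations": [{"price": {"amount": "9.99"}}]},
    {"text": "Tickets on sale", "annotations": [{"cost_to_user": {"amount": "40"}}]},
    {"text": "Loved this film",
     "annotations": [{"price": {"amount": "12"}}, {"rating": {"value": "4"}}]},
    {"text": "Great coffee", "annotations": [{"stars": {"value": "5"}}]},
    {"text": "Solid sequel", "annotations": [{"rating": {"value": "3"}}]},
    {"text": "No metadata here"},
]

def type_popularity(tweets):
    """Tally how many times each annotation type name appears across tweets."""
    counts = Counter()
    for tweet in tweets:
        for annotation in tweet.get("annotations", []):
            counts.update(annotation.keys())
    return counts

popularity = type_popularity(tweets)
for type_name, n in popularity.most_common():
    print(type_name, n)
```

A tally like this is all it would take for a developer to notice that “price” is beating “cost_to_user” and “rating” is beating “stars,” which is the folksonomic pull doing its work.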

