Joho the Blog

Google Books metadata meta-wreck

Geoff Nunberg has a fantastic post warning about the poor quality of the metadata attached to the books Google is scanning into its soon to be dominant-to-the-point-of-monopoly digital library. Apparently, the attempt to gather metadata automatically from the scans has resulted in the introduction of legions of errors. But the real problems are, as Geoff points out, that Google seems not to have a plan for dealing with this problem and that it has not opened up the metadata design process.


4 Responses to “Google Books metadata meta-wreck”

  1. I’m wondering if Google is receptive to users’ input. In another of their services, the translation available for web pages with many languages included, they ask for users to suggest better translations. On the occasions when I have had Google translate for me, the results were fair to poor. But do you think they have a strategy for learning to translate better that uses our input constructively? I’m inclined to believe they do, and if so maybe the digital library could do the same.

  2. When it comes to any metadata-related issue, Google seems to be very secretive. That is true for books, and it is also true for their initial steps toward the Semantic Web. No one understands why they used a primitive and useless ontology they invented themselves.
    I’m not in favour of conspiracy theories, but sometimes I tend to suspect Google of doing something strange when it comes to the semantics of the popular objects they handle…

    BTW, The original post was great – I laughed at:

    “The Mosaic Navigator: The essential guide to the Internet Interface” dated 1939 and attributed to Sigmund Freud ….”


  3. There’s a lengthy posting by Google’s metadata manager Jon Orwant on Language Log, attached to Geoff Nunberg’s post (link from DW above), that gives some indication of internal policy.

  4. Google Book Search certainly does have metadata problems. But so do library catalogs. See my example based on “Everything is Miscellaneous” [thanks for that wonderful book!] …
