Google Books metadata meta-wreck

Posted on September 4th, 2009

Geoff Nunberg has a fantastic post warning about the poor quality of the metadata attached to the books Google is scanning into its soon to be dominant-to-the-point-of-monopoly digital library. Apparently, the attempt to gather metadata automatically from the scans has resulted in the introduction of legions of errors. But the real problems are, as Geoff points out, that Google seems not to have a plan for dealing with this problem and that it has not opened up the metadata design process.

Tagged with: everythingIsMiscellaneous • everything_is_miscellaneous • google_books • libraries • metadata • worldcat

4 Responses to “Google Books metadata meta-wreck”

  1. Brian Cartwright, on September 5th, 2009 at 5:24 pm Said:

    I’m wondering if Google is receptive to users’ input. In another of their services, the translation available for web pages with many languages included, they ask for users to suggest better translations. On the occasions when I have had Google translate for me, the results were fair to poor. But do you think they have a strategy for learning to translate better that uses our input constructively? I’m inclined to believe they do, and if so maybe the digital library could do the same.

  3. Mirek Sopek, on September 6th, 2009 at 12:58 am Said:

    When it comes to any metadata-related issue, Google seems to be very secretive. That is true for books, and it is also true for their initial steps toward the Semantic Web. No one understands why they used a primitive and useless ontology they invented themselves.
    I’m not in favour of conspiracy theories, but sometimes I tend to suspect Google of doing something strange when it comes to semantics of popular objects they handle…

    BTW, The original post was great – I laughed at:

    “The Mosaic Navigator: The essential guide to the Internet Interface” dated 1939 and attributed to Sigmund Freud …

    :-)

  5. Brian Cartwright, on September 6th, 2009 at 9:19 am Said:

    There’s a lengthy posting by Google’s metadata manager, Jon Orwant, on Language Log, attached to Geoff Nunberg’s post (link from DW above), that gives some indication of internal policy.

  7. Eric Rumsey, on September 15th, 2009 at 3:58 pm Said:

    Google Book Search certainly does have metadata problems. But so do Library catalogs — See my example based on “Everything is Miscellaneous” [thanks for that wonderful book!] …
    http://blog.lib.uiowa.edu/hardinmd/2009/09/03/metadata-about-metadata-library-catalog-fail/

Creative Commons License
Joho the Blog by David Weinberger is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. Share it freely, but attribute it to me, and don't use it commercially without my permission.
