During this seemingly endless interregnum when we have e-books that suck at letting us take notes, I buy paper books when I’m doing research. I have a complex little application I’ve endlessly developed over the years that lets me type notes into a plain text editor or OPML-based outliner using a minimal markup. The app turns the notes into a database that I can then slice ‘n’ dice. Someday I’ll get it stable and done enough to publish. And that day is never.
A couple of years ago I wrote a Chrome extension (“Kindle Highlights Exporter”) that scrapes all of the passages you’ve highlighted on your Kindle, exporting them as a CSV, XML, or JSON file. The only problem is that I seem to be the only person it works for. More precisely, it crashed for the only person I ever showed it to, my supersmart developer nephew. It still works for me, though. If you want (yet another) chance to laugh at me, feel free to download it and install it. Suckers.
So, how about if someone were to write some software that lets me import photographs of the pages of a book that I’ve highlighted in, say, yellow? The app finds the highlighted portions of each page, looks for the page number, does the requisite OCR, and returns a well-marked-up set of those annotations. (These days, outputting in the Open Annotation standard, as well as the usual suspects, would be extra cool.) That way, when I’m done with a book, I could snap images of all the pages with highlights and get a list at the end, instead of doing what I do now: type them in as I read.
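To make the idea a bit more concrete, here is a toy sketch in Python of two of those steps: spotting the yellow-highlighted bands on a page and wrapping a recognized passage as an annotation. It is nowhere near a working app. It assumes the page photo has already been decoded into rows of (r, g, b) pixels, it skips OCR and page-number detection entirely (those would need a real imaging library and OCR engine), and the hue and saturation thresholds are guesses. The function names are made up for illustration. The annotation record is only loosely modeled on the W3C Web Annotation Data Model, the successor to Open Annotation, and the `page` field is my own addition rather than part of that model.

```python
import colorsys


def is_highlighter_yellow(rgb, hue_range=(40, 70), min_sat=0.35, min_val=0.6):
    """Return True if an (r, g, b) pixel (0-255 each) looks like yellow highlighter.

    The thresholds are guesses, not calibrated against real photos.
    """
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue_deg = h * 360.0
    return hue_range[0] <= hue_deg <= hue_range[1] and s >= min_sat and v >= min_val


def highlighted_row_bands(pixels, min_fraction=0.2):
    """Group consecutive pixel rows that contain enough yellow.

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    Returns a list of (first_row, last_row) pairs, inclusive.
    """
    bands = []
    start = None
    for y, row in enumerate(pixels):
        yellow = sum(1 for px in row if is_highlighter_yellow(px))
        is_hit = bool(row) and yellow / len(row) >= min_fraction
        if is_hit and start is None:
            start = y  # a new highlighted band begins here
        elif not is_hit and start is not None:
            bands.append((start, y - 1))  # the band ended on the previous row
            start = None
    if start is not None:
        bands.append((start, len(pixels) - 1))  # band runs to the bottom edge
    return bands


def to_annotation(book_id, page_number, quoted_text):
    """Wrap one recognized highlight in a dict loosely shaped like a Web Annotation.

    `book_id` is a hypothetical identifier for the book; `page` is a
    non-standard field added here for convenience.
    """
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "highlighting",
        "target": {
            "source": book_id,
            "selector": {"type": "TextQuoteSelector", "exact": quoted_text},
            "page": page_number,
        },
    }
```

Each band returned by `highlighted_row_bands` would then be cropped out and handed to whatever OCR tool you trust, and the resulting text passed to `to_annotation` and dumped as JSON. Working in hue/saturation/value rather than raw RGB is the usual trick here, since it tolerates uneven lighting a little better, though a photo taken under a yellowish lamp would surely still fool it.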
I’d give it a try, but processing images is waaay beyond my hobbyist-programmer capabilities. As for the possible copyright violation: OH FOR HEAVEN’S SAKE WHAT THE HELL IS WRONG WITH US? (Note: The previous sentence should not be construed as legal advice.)
In any case, as the digital/networked world continues to develop its superpowers, the mud wall that confines the physical becomes more and more aggravating.