Joho the Blog » digital humanities

November 24, 2014

[siu] Accessing content

Alex Hodgson of ReadCube is leading a panel called “Accessing Content: New Thinking and New Business Models or Accessing Research Literature” at the Shaking It Up conference.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Robert McGrath is from ReadCube, a platform for managing references. You import your pdfs, read them with their enhanced reader, and can annotate them and discover new content. You can click on references in the PDF and go directly to the sources. If you hit a pay wall, they provide a set of options, including a temporary “checkout” of the article for $6. Academic libraries can set up a fund to pay for such access.

Eric Hellman talks about Unglue.it. Everyone in the book supply chain wants a percentage. But free e-books break the system because there are no percentages to take. “Even libraries hate free ebooks.” So, how do you give access to Oral Literature in Africain Africa? Unglue.it ran a campaign, raised money, and liberated it. How do you get free textbooks into circulation? Teachers don’t know what’s out there. Unglue.it is creating MARC records for these free books to make it easy for libraries to include the. The novel Zero Sum Game is a great book that the author put it out under a Creative Commons license, but how do you find out that it’s available? Likewise for Barbie: A Computer Engineer, which is a legal derivative of a much worse book. Unglue.it has over 1,000 creative commons licensed books in their collection. One of Unglue.it’s projects: an author pledges to make the book available for free after a revenue target has been met. [Great! A bit like the Library License project from the Harvard Library Innovation Lab. They’re now doing Thanks for Ungluing which aggregates free ebooks and lets you download them for free or pay the author for it. [Plug: John Sundman’s Biodigital is available there. You definitely should pay him for it. It’s worth it.]

Marge Avery, ex of MIT Press and now at MIT Library, says the traditional barriers sto access are price, time, and format. There are projects pushing on each of these. But she mainly wants to talk about format. “What does content want to be?” Academic authors often have research that won’t fit in the book. Univ presses are experimenting with shorter formats (MIT Press Bits), new content (Stanford Briefs), and publishing developing, unifinished content that will become a book (U of Minnesota). Cambridge Univ Press published The History Manifesto, created start to finish in four months and is available as Open Access as well as for a reasonable price; they’ve sold as many copies as free copies have been downloaded, which is great.

William Gunn of Mendeley talks about next-gen search. “Search doesn’t work.” Paul Kedrosky was looking for a dishwasher and all he found was spam. (Dishwashers, and how Google Eats Its Own Tail). Likewise, Jeff Atwood of StackExchange: “Trouble in the House of Google.” And we have the same problems in scholarly work. E.g., Google Scholar includes this as a scholarly work. Instead, we should be favoring push over pull, as at Mendeley. Use behavior analysis, etc. “There’s a lot of room for improvement” in search. He shows a Mendeley search. It auto-suggests keyword terms and then lets you facet.

Jenn Farthing talks about JSTOR’s “Register and Read” program. JSTOR has 150M content accesses per year, 9,000 institutions, 2,000 archival journals, 27,000 books. Register and Read: Free limited access for everyone. Piloted with 76 journals. Up to 3 free reads over a two week period. Now there are about 1,600 journals, and 2M users who have checked out 3.5M articles. (The journals are opted in to the program by their publishers.)

Q&A

Q: What have you learned in the course of these projects?

ReadCube: UI counts. Tracking onsite behavior is therefore important. Iterate and track.

Marge: It’d be good to have more metrics outside of sales. The circ of the article is what’s really of importance to the scholar.

Mendeley: Even more attention to the social relationships among the contributors and readers.

JSTOR: You can’t search for only content that’s available to you through Read and Register. We’re adding that.

Unglue.it started out as a crowdfunding platform for free books. We didn’t realize how broken the supply chain is. Putting a book on a Web site isn’t enough. If we were doing it again, we’d begin with what we’re doing now, Thanks for Ungluing, gathering all the free books we can find.

Q: How to make it easier for beginners?

Unglue .it: The publishing process is designed to prevent people from doing stuff with ebooks. That’s a big barrier to the adoption of ebooks.

ReadCube: Not every reader needs a reference manager, etc.

Q: Even beginning students need articles to interoperate.

Q: When ReadCube negotiates prices with publishers, how does it go?

ReadCube: In our pilots, we haven’t seen any decline in the PDF sales. Also, the cost per download in a site license is a different sort of thing than a $6/day cost. A site license remains the most cost-effective way of acquiring access, so what we’re doing doesn’t compete with those licenses.

Q: The problem with the pay model is that you can’t appraise the value of the article until you’ve paid. Many pay models don’t recognize that barrier.

ReadCube: All the publishers have agreed to first-page previews, often to seeing the diagrams. We also show a blurred out version of the pages that gives you a sense of the structure of the article. It remains a risk, of course.

Q: What’s your advice for large legacy publishers?

ReadCube: There’s a lot of room to explore different ways of brokering access — different potential payers, doing quick pilots, etc.

Mendeley: Make sure your revenue model is in line with your mission, as Geoff said in the opening session.

Marge: Distinguish the content from the container. People will pay for the container for convenience. People will pay for a book in Kindle format, while the content can be left open.

Mendeley: Reading a PDF is of human value, but computing across multiple articles is of emerging value. So we should be getting past the single reader business model.

JSTOR: Single article sales have not gone down because of Read and Register. They’re different users.

Unglue.it: Traditional publishers should cut their cost basis. They have fancy offices in expensive locations. They need to start thinking about how they can cut the cost of what they do.

1 Comment »

May 2, 2014

[2b2k] Digital Humanities: Ready for your 11AM debunking?

The New Republic continues to favor articles debunking claims that the Internet is bringing about profound changes. This time it’s an article on the digital humanities, titled “The Pseudo-Revolution,” by Adam Kirsch, a senior editor there. [This seems to be the article. Tip of the hat to Jose Afonso Furtado.]

I am not an expert in the digital humanities, but it’s clear to the people in the field who I know that the meaning of the term is not yet settled. Indeed, the nature and extent of the discipline is itself a main object of study of those in the discipline. This means the field tends to attract those who think that the rise of the digital is significant enough to warrant differentiating the digital humanities from the pre-digital humanities. The revolutionary tone that bothers Adam so much is a natural if not inevitable consequence of the sociology of how disciplines are established. That of course doesn’t mean he’s wrong to critique it.

But Adam is exercised not just by revolutionary tone but by what he perceives as an attempt to establish claims through the vehemence of one’s assertions. That is indeed something to watch out for. But I think it also betrays a tin-eared reading by Adam. Those assertions are being made in a context the authors I think properly assume readers understand: the digital humanities is not a done deal. The case has to be made for it as a discipline. At this stage, that means making provocative claims, proposing radical reinterpretations, and challenging traditional values. While I agree that this can lead to thoughtless triumphalist assumptions by the digital humanists, it also needs to be understood within its context. Adam calls it “ideological,” and I can see why. But making bold and even over-bold claims is how discourses at this stage proceed. You challenge the incumbents, and then you challenge your cohort to see how far you can go. That’s how the territory is explored. This discourse absolutely needs the incumbents to push back. In fact, the discourse is shaped by the assumption that the environment is adversarial and the beatings will arrive in short order. In this case, though, I think Adam has cherry-picked the most extreme and least plausible provocations in order to argue against the entire field, rather than against its overreaching. We can agree about some of the examples and some of the linguistic extensions, but that doesn’t dismiss the entire effort the way Adam seems to think it does.

It’s good to have Adam’s challenge. Because his is a long and thoughtful article, I’ll discuss the thematic problems with it that I think are the most important.

First, I believe he’s too eager to make his case, which is the same criticism he makes of the digital humanists. For example, when talking about the use of algorithmic tools, he talks at length about Franco Moretti‘s work, focusing on the essay “Style, Inc.: Reflections on 7,000 Titles.” Moretti used a computer to look for patterns in the titles of 7,000 novels published between 1740 and 1850, and discovered that they tended to get much shorter over time. “…Moretti shows that what changed was the function of the title itself.” As the market for novels got more crowded, the typical title went from being a summary of the contents to a “catchy, attention-grabbing advertisement for the book.” In addition, says Adam, Moretti discovered that sensationalistic novels tend to begin with “The” while “pioneering feminist novels” tended to begin with “A.” Moretti tenders an explanation, writing “What the article ‘says’ is that we are encountering all these figures for the first time.”

Adam concludes that while Moretti’s research is “as good a case for the usefulness of digital tools in the humanities as one can find” in any of the books under review, “its findings are not very exciting.” And, he says, you have to know which questions to ask the data, which requires being well-grounded in the humanities.

That you need to be well-grounded in the humanities to make meaningful use of digital tools is an important point. But here he seems to me to be arguing against a straw man. I have not encountered any digital humanists who suggest that we engage with our history and culture only algorithmically. I don’t profess expertise in the state of the digital humanities, so perhaps I’m wrong. But the digital humanists I know personally (including my friend Jeffrey Schnapp, a co-author of a book, Digital_Humanities, that Adam reviews) are in fact quite learned lovers of culture and history. If there is indeed an important branch of digital humanities that says we should entirely replace the study of the humanities with algorithms, then Adam’s criticism is trenchant…but I’d still want to hear from less extreme proponents of the field. In fact, in my limited experience, digital humanists are not trying to make the humanities safe for robots. They’re trying to increase our human engagement with and understanding of the humanities.

As to the point that algorithmic research can only “illustrate a truism rather than discovering a truth,” — a criticism he levels even more fiercely at the Ngram research described in the book Uncharted — it seems to me that Adam is missing an important point. If computers can now establish quantitatively the truth of what we have assumed to be true, that is no small thing. For example, the Ngram work has established not only that Jewish sources were dropped from German books during the Nazi era, but also the timing and extent of the erasure. This not only helps make the humanities more evidence-based —remember that Adam criticizes the digital humanists for their argument-by-assertion —but also opens the possibility of algorithmically discovering correlations that overturn assumptions or surprise us. One might argue that we therefore need to explore these new techniques more thoroughly, rather than dismissing them as adding nothing. (Indeed, the NY Times review of Uncharted discusses surprising discoveries made via Ngram research.)

Perhaps the biggest problem I have with Adam’s critique I’ve also had with some digital humanists. Adam thinks of the digital humanities as being about the digitizing of sources. He then dismisses that digitizing as useful but hardly revolutionary: “The translation of books into digital files, accessible on the Internet around the world, can be seen as just another practical tool…which facilitates but does not change the actual humanistic work of thinking and writing.”

First, that underplays the potential significance of making the works of culture and scholarship globally available.

Second, if you’re going to minimize the digitizing of books as merely the translation of ink into pixels, you miss what I think is the most important and transformative aspect of the digital humanities: the networking of knowledge and scholarship. Adam in fact acknowledges the networking of scholarship in a twisty couple of paragraphs. He quotes the following from the book Digital_Humanities:

The myth of the humanities as the terrain of the solitary genius…— a philosophical text, a definitive historical study, a paradigm-shifting work of literary criticism — is, of course, a myth. Genius does exist, but knowledge has always been produced and accessed in ways that are fundamentally distributed…

Adam responds by name-checking some paradigm-shifting works, and snidely adds “you can go to the library and check them out…” He then says that there’s no contradiction between paradigm-shifting works existing and the fact that “Scholarship is always a conversation…” I believe he is here completely agreeing with the passage he thinks he’s criticizing: genius is real; paradigm-shifting works exist; these works are not created by geniuses in isolation.

Then he adds what for me is a telling conclusion: “It’s not immediately clear why things should change just because the book is read on a screen rather than on a page.” Yes, that transposition doesn’t suggest changes any more worthy of research than the introduction of mass market paperbacks in the 1940s [source]. But if scholarship is a conversation, might moving those scholarly conversations themselves onto a global network raise some revolutionary possibilities, since that global network allows every connected person to read the scholarship and its objects, lets everyone comment, provides no natural mechanism for promoting any works or comments over any others, inherently assumes a hyperlinked rather than sequential structure of what’s written, makes it easier to share than to sequester works, is equally useful for non-literary media, makes it easier to transclude than to include so that works no longer have to rely on briefly summarizing the other works they talk about, makes differences and disagreements much more visible and easily navigable, enables multiple and simultaneous ordering of assembled works, makes it easier to include everything than to curate collections, preserves and perpetuates errors, is becoming ubiquitously available to those who can afford connection, turns the Digital Divide into a gradient while simultaneously increasing the damage done by being on the wrong side of that gradient, is reducing the ability of a discipline to patrol its edges, and a whole lot more.

It seems to me reasonable to think that it is worth exploring whether these new affordances, limitations, relationships and metaphors might transform the humanities in some fundamental ways. Digital humanities too often is taken simply as, and sometimes takes itself as, the application of computing tools to the humanities. But it should be (and for many, is) broad enough to encompass the implications of the networking of works, ideas and people.

I understand that Adam and others are trying to preserve the humanities from being abandoned and belittled by those who ought to be defending the traditional in the face of the latest. That is a vitally important role, for as a field struggling to establish itself digital humanities is prone to over-stating its case. (I have been known to do so myself.) But in my understanding, that assumes that digital humanists want to replace all traditional methods of study with computer algorithms. Does anyone?

Adam’s article is a brisk challenge, but in my opinion he argues too hard against his foe. The article becomes ideological, just as he claims the explanations, justifications and explorations offered by the digital humanists are.

More significantly, focusing only on the digitizing of works and ignoring the networking of their ideas and the people discussing those ideas, glosses over the locus of the most important changes occurring within the humanities. Insofar as the digital humanities focus on digitization instead of networking, I intend this as a criticism of that nascent discipline even more than as a criticism of Adam’s article.

3 Comments »

January 24, 2012

Digital humanities

Skip Walter’s post about his growing acceptance and understanding of the need for digital humanities hits on so many of my intellectual pleasure spots, starting with Russ Ackoff’s knowledge network, and including Kate Hayles and Cathy Davidson, and more and more. (Yes, he mentions “Too Big to Know” in passing, but that’s irrelevant to my reaction.)

2 Comments »