Joho the Blog » scholarship

March 6, 2013

[2b2k] Cliff Lynch on preserving the ever-expanding scholarly record

Cliff Lynch is giving a talk this morning to the extended Harvard Library community on information stewardship. Cliff leads the Coalition for Networked Information, a project of the Association of Research Libraries and Educause, that is “concerned with the intelligent uses of information technology and networked information to enhance scholarship and intellectual life.” Cliff is helping the Harvard Library with the formulation of a set of information stewardship principles. Originally he was working with IT and the Harvard Library on principles, services, and initial projects related to digital information management. Given that his draft set of principles is broader than digital asset management, Cliff has been asked to address the larger community (says Mary Lee Kennedy).

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Cliff begins by saying that the principles he’s drafted are for discussion; how they apply to any particular institution is always a policy issue, with resource implications, that needs to be discussed. He says he’ll walk us through these principles, beginning with some concepts that underpin them.

When it comes to information stewardship, “university community” should include grad students whose research materials the university supports and maintains. Undergrads, too, to some extent. The presence of a medical school here also extends and smudges the boundaries.

Cliff then raises the policy question of the relation of the alumni to the university. There are practical reasons to keep the alumni involved, but particularly for grads of the professional schools, access to materials can be crucial.

He says he uses “scholarly record” for human-created things that convey scholarly ideas across time and space: books, journals, audio, web sites, etc. “This is getting more complicated and more diverse as time goes on.” E.g., author’s software can be part of that record. And there is a growing set of data, experimental records, etc., that are becoming part of the scholarly record.

Research libraries need to be concerned about things that support scholarship but are not usually considered part of the historical record. E.g., newspapers, popular novels, movies. These give insight into the scholarly work. There are also datasets that are part of the evidentiary record, e.g., data about the Earth gathered from sensors. “It’s so hard to figure out when enough is enough.” But as more of it goes digital, it requires new strategies for acquisition, curation and access. “What are the analogs of historical newspapers for the 21st century?” he asks. They are likely to be databases from corporations that may merge and die and that have “variable and often haphazard policies about how they maintain those databases.” We need to be thinking about how to ensure that data’s continued availability.

Provision of access: Part of that is being able to discover things. This shouldn’t require knowing which Harvard-specific access mechanism to come to. “We need to take a broad view of access” so that things can be found through the “key discovery mechanisms of the day,” beyond the institution’s. (He namechecks the Digital Public Library of America.)

And access isn’t just for “the relatively low-bandwidth human reader.” [APIs, platforms and linked data, etc., I assume.]

Maintaining a record of the scholarly work that the community does is a core mission of the university. So, he says, in his report he’s used the vocabulary of obligation; that is for discussion.

The 5 principles

1. The scholarly output of the community should be captured, preserved, organized, and made accessible. This should include the evidence that underlies that output. E.g., the experimental data that underlies a paper should be preserved. This takes us beyond digital data to things like specimens and cell lines, and requires including museums and other partners. (Congress is beginning to delve into this, Cliff notes, especially with regard to preserving the evidence that enables experiments to be replicated.)

The university is not alone in addressing these needs.

2. A university has the obligation to provide its community with the best possible access to the overall scholarly record. This is something to be done in partnership with research libraries around the world. But Harvard has a “leadership role to play.”

Here we need to think about providing alumni with continued access to the scholarly record. We train students and then send them out into the world and cut off their access. “In many cases, they’re just out of luck. There seems to be something really wrong there.”

Beyond the scholarly record, there are issues about providing access to the cultural record and sources. No institution alone can do this. “There’s a rich set of partnerships” to be formed. It used to be easier to get that cultural record by buying it from book jobbers, DVD suppliers, etc. Now it’s data with differing license terms and subscription limitations. A lot of it’s out on the public Web. “We’re all hoping that the Internet Archive will do a good job,” but most of our institutions of higher learning aren’t contributing to that effort. Some research libraries are creating interesting partnerships with faculty, collecting particular parts of the Web in support of particular research interests. “Those are signposts toward a future where the engagement to collect and preserve the cultural records scholars need is going to get much more complex” and require much more positive outreach by libraries, and much more discussion with the community (and the faculty in particular) about which elements are going to be important to preserve.

“Absolutely the desirable thing is to share these collections broadly,” as broadly as possible.

3. “The time has come to recognize that good stewardship means creating digital records of physical objects” in order to preserve them and make them accessible. They should be stored away from the physical objects.

4. A lot goes on here in addition to faculty research. People come through putting on performances, talks, colloquia. “You need a strategy to preserve these and get them out there.”

“The stakes are getting much higher” when it comes to archives. The materials are not just papers and graphs. They include old computers and storage materials, “a microcosm of all of the horrible consumer recording technology of the 20th century,” e.g., 8mm film, Sony Betamax, etc.

We also need to think about what to archive of the classroom. We don’t have to capture every calculus discussion section, but you want to get enough to give a sense of what went on in the courses. The documentation of teaching and learning is undergoing a tremendous change. The new classroom tech and MOOCs are creating lots of data, much of it personally identifiable. “Most institutions have little or no policies around who gets to see it, how long they keep it, what sort of informed consent they need from students.” It’s important data and very sensitive data. Policy and stewardship discussions are needed. There are also record management issues.

5. We know that scholarly communication is…being transformed (not as fast as some of us would like; online scientific journals often look like paper versions) by the affordances of digital technology. “Create an ongoing partnership with the community and with other institutions to extend and broaden the way scholarly communication happens.” The institutional role is terribly important in this. We need to find the balances between innovation and sustainability.

Q&A

Q: Providing alumni with remote access is expensive. Harvard has about 100,000 living alumni, which includes people who spent one semester here. What sort of obligation does a university have to someone who, for example, spent a single semester here?

A: It’s something to be worked out. You can define alumnus as someone who has gotten a degree. You may ask for a co-payment. At some institutions, active members of the alumni association get some level of access. Also, grads of different schools may get access to different materials. Also, the most expensive items are typically those for which there is a commercial market. For example, professional grade resources for the financial industry probably won’t allow licensing to alumni because it would cannibalize their market. On the other hand, it’s probably not expensive to make JSTOR available to alumni.

Q: [robert darnton] Very helpful. We’re working on all 5 principles at Harvard. But there is a fundamental problem: we have to advance simultaneously on the digital and analog fronts. More printed books are published each year, and the output of the digital increases even faster. The pressures on our budget are enormous. What do you recommend as a strategy? And do you think Harvard has a special responsibility since our library is so much bigger, except for the Library of Congress? Smaller libraries can rely on Hathi etc. to acquire works.

A: “Those are really tough questions.” [audience laughs] It’s a large task but a finite one. Calculating how much money would take an institution how far “is a really good opportunity for fund raising.” Put in place measures that talk about the percentage of the collection that’s available, rather than a raw number of images. But, we are in a bad situation: continuing growth of traditional media (e.g., books), enormous expansion of digital resources. “My sense is…that for Harvard to be able to navigate this, it’s going to have to get more interdependent with other research libraries.” It’s ironic, because Harvard has been willing to shoulder enormous responsibility, and so has become a resource for other libraries. “It’s made life easier for a lot of the other research libraries” because they know Harvard will cover around the margins. “I’m afraid you may have to do that a little more for your scholars, and we are going to see more interdependence in the system. It’s unavoidable given the scope of the challenge.” “You need to be able to demonstrate that by becoming more interdependent, you’re getting more back than you’re giving up.” It’s a hard core problem, and “the institutional traditions make the challenge here unique.”


October 1, 2012

[2b2k] Your business needs scholars

My latest column in KMWorld is about why your business needs scholars. In fact, though, it’s about why the idea of scholarship is more helpful than focusing your thinking on knowledge.


October 26, 2011

[2b2k] Will digital scholarship ever keep up?

Scott F. Johnson has posted a dystopic provocation about the present of digital scholarship and possibly about its future.

Here’s the crux of his argument:

… as the deluge of information increases at a very fast pace — including both the digitization of scholarly materials unavailable in digital form previously and the new production of journals and books in digital form — and as the tools that scholars use to sift, sort, and search this material are increasingly unable to keep up — either by being limited in terms of the sheer amount of data they can deal with, or in terms of becoming so complex in terms of usability that the average scholar can’t use it — then the less likely it will be that a scholar can adequately cover the research material and write a convincing scholarly narrative today.

Thus, I would argue that in the future, when the computational tools (whatever they may be) eventually develop to a point of dealing profitably with the new deluge of digital scholarship, the backward-looking view of scholarship in our current transitional period may be generally disparaging. It may be so disparaging, in fact, that the scholarship of our generation will be seen as not trustworthy, or inherently compromised in some way by comparison with what came before (pre-digital) and what will come after (sophisticatedly digital).

Scott tentatively concludes:

For the moment one solution is to read less, but better. This may seem a luddite approach to the problem, but what other choice is there?

First, I should point out that the rest of Scott’s post makes it clear that he’s no Luddite. He understands the advantages of digital scholarship. But I look at this a little differently.

I agree with most of Scott’s description of the current state of digital scholarship and with the inevitability of an ever increasing deluge of scholarly digital material. But, I think the issue is not that the filters won’t be able to keep up with the deluge. Rather, I think we’re just going to have to give up on the idea of “keeping up” — much as newspapers and half hour news broadcasts have to give up the pretense that they are covering all the day’s events. The idea of coverage was always an internalization of the limitation of the old media, as if a newspaper, a broadcast, or even the lifetime of a scholar could embrace everything important there is to know about a field. Now the Net has made clear to us what we knew all along: most of what knowledge wanted to do was a mere dream.

So, for me the question is what scholarship and expertise look like when they cannot attain a sense of mastery by artificially limiting the material with which they have to deal. It was much easier when you only had to read at the pace of the publishers. Now you’d have to read at the pace of the writers…and there are so many more writers! So, lacking a canon, how can there be experts? How can you be a scholar?

I’m bad at predicting the future, and I don’t know if Scott is right that we will eventually develop such powerful search and filtering tools that the current generation of scholars will look like betwixt-and-between fools (or as an “asterisk,” as Scott says). There’s an argument that even if the pace of growth slows, the pace of complexification will increase. In any case, I’d guess that deep scholars will continue to exist because that’s more a personality trait than a function of the available materials. For example, I’m currently reading Armies of Heaven, by Jay Rubenstein. The depth of his knowledge about the First Crusade is astounding. Astounding. As more of the works he consulted come online, other scholars of similar temperament will find it easier to pursue their deep scholarship. They will read less and better not as a tactic but because that’s how the world beckons to them. But the Net will also support scholars who want to read faster and do more connecting. Finally (and to me most interestingly) the Net is already helping us to address the scaling problem by facilitating the move of knowledge from books to networks. Books don’t scale. Networks do. Although, yes, that fundamentally changes the nature of knowledge and scholarship.

[Note: My initial post embedded one draft inside another and was a total mess. Ack. I've cleaned it up - Oct. 26, 2011, 4:03pm edt.]


July 10, 2010

[2b2k] Understanding’s web

I’m on a mailing list that discusses the philosopher Martin Heidegger. Many years ago I was a fledgling Heidegger scholar, but now I am on the list strictly as a tourist.

Today someone posted: “If you don’t know German you don’t have a ghost of a chance of understanding Heidegger.” A few people posted immediately in reaction to the “dismissive” tone of the comment. I felt the same way, but then thought, hmm, this is an empirical question, isn’t it? List the people you think understand Heidegger best — or pick some other writer in some other language — and see how many of them don’t read him in his original language. There is something true about the dismissive remark.

But, there is something false as well. It draws too strong a line between understanding and not understanding. I obviously don’t understand Heidegger as well as the full-time scholars on the mailing list do. But, having studied Heidegger for several years of my life (I wrote my doctoral dissertation on him), I’m pretty sure I understand him better than most who haven’t studied him do. If we acknowledge that our understanding improves as we read and study more, we acknowledge that understanding doesn’t fall into only two buckets: understands or doesn’t have a ghost of a chance of understanding.

For the original comment to be empirically true, we’d either have to show that (a) there is a clear line between those who understand and those who do not (and that reading the original language is a requirement for getting into that first bucket). Or, (b) we could say that the commenter is actually talking about having professional standing as a scholar: You cannot claim to be a Heidegger scholar if you can only read him in translation. The first alternative seems to me to be ridiculous. The second seems far more plausible. The problems arise when someone applies the bright perimeter of professionalism to the messy web of understanding.

I certainly do believe that had my German been better — it was barely adequate at the time, and now has devolved into very basic travel glossary stuff — I could have understood Heidegger better. Likewise, better understanding the history of philosophy, knowing early 20th century German politics, reading Greek and Latin, and being conversant in German poetry all would have helped me understand Heidegger better. There is no end to what we need to know in order to understand the thought of another, because there is no such state as Understanding that excludes all doubt, excludes all errors, and excludes all others.

Finally, it’s not at all clear to me that if we list those whose understanding of a thinker we most respect, they will be in rank order based upon how many of the Professional Requirements they’ve mastered. Some of the best Heidegger scholars — and you can pick your own criteria of bestness — may be weak in Greek, weaker in German politics, but very strong in poetry. Others might have other sets of strengths and weaknesses. Not only doesn’t understanding necessarily correspond to the fields mastered, the community of scholars ameliorates the weaknesses of individuals by writing works that others read: A scholar weak on politics reads the work of scholars strong on politics. Understanding in this sense is a networked property, and a very messy one indeed.


April 13, 2009

New criteria for academic recognition

The University of Maine has approved new guidelines for tenuring and promoting academics [later:] in the New Media program (although see the comments for a complexification of this). The new guidelines allow crediting an academic for contributing to social media.

This is the right thing to do not only because it is a more realistic assessment of an academic’s worth. It’s also the right thing to do because it helps to build the value of the network. If knowledge and expertise are becoming properties of the network, it is the social responsibility of our institutions to encourage the enhancement of that network.


February 26, 2009

[berkman] Peter Suber on the future of open access

Peter Suber, Research Prof. of Philosophy at Earlham College, a visiting fellow at Yale Law’s Information Society Project, and blogger of open access news, is giving a full-house lecture at Harvard, sponsored by the Berkman Center. [Note: I'm live blogging, making mistakes, leaving things out, paraphrasing ineptly, etc. POSTED WITHOUT PROOFING or even with a basic re-reading. Speed over accuracy. Welcome to the Web :(]

Peter says he’s going to assume that we know what open access is, etc. But he does want to define Green Open Access (= open access through a repository) and Gold OA (= OA through a journal). There’s also Gratis OA (free of charge but there may be licensing restrictions) and Libre OA (free of charge and free of licensing restrictions).

Peter says he doesn’t know the future of OA. He likes Alan Kay’s comment that the future is easier to make than predict. He’s going to talk about 12 cross-over points in OA, in rough order of when they might occur:

1. For-pay journals allow green OA. About 63% of these journals already do this.

2. OA books: When there are more gratis OA books online than in the average university library. We crossed this a couple of years ago. “The permission problem is harder than digitization.” The next cross-over point here is getting more libre OA books online, which we are quite a distance from.

3. Funder policies: “When most publicly-funded research is subject to OA mandates.” This seems to be spreading, Peter says. Today, 32 public funders and more than 3 private funders have OA mandates.


4. Green OA deposits: “When most new peer-reviewed manuscripts are self-archived when accepted for publication.” In particle physics, this happens routinely. If 20% of researchers publish 80% of the articles, we could reach cross-over fairly quickly in some fields.

5. Author understanding: “When most publishing researchers have an accurate understanding of OA.” This is happening, but not very quickly.

6. University repositories: “When most universities have institutional repositories,” individually or as part of a consortium. This is happening slowly. In the absence of a universal repository, every university ought to have one. Universities will get to this point more slowly than funders because they move more slowly than funders. And we ought to ask why. Aren’t universities’ interests in line with OA?

7. Libre gold OA: “When most OA journals are libre OA.” Most OA journals are still merely gratis, but curb copying to drive traffic to their site. This crossover could happen overnight if the journals understood the issues. They’d lose a little traffic, but nothing else. There are grounds for optimism: Open Access Scholarly Publishers Association is an assoc of OA journal publishers and it requires libre gold OA. The SPARC Europe program sets standards for what a good OA journal is, and it recommends CreativeCommons attribution licenses. These two orgs are helpful because there’s no top-down org defining OA, so we rely on bottom-up orgs like these two to set the standards.

8. Journal backfiles: “When most TA journals have OA backfiles.” This is expensive to do. Google will do it, but Google’s terms are difficult: They don’t give the journal a copy of the digital files. (Libraries do get copies of the files of the books they let Google scan.) The OCA focuses on public domain literature. “Once digitized, the benefits of increased visibility and citations should outweigh the trickle of revenue.” Journals make most of their money from new issues, so having greater presence should help. In physics, almost 100% of articles are available OA but the publishers can’t see any dip in subscriptions.

9. Author addenda: “When most new research is covered by author addenda” (i.e., additions that grant OA permission, tacked onto standard publisher-author contracts). Now there are few adopters. It’d be good to standardize these. The cross over will come when universities or funders require it. If enough journals allow green OA, that’d make addenda unnecessary.

10. University policies: “When most university research is subject to university-level OA mandates.” Today, 27 universities and 4 depts have these mandates. It’d help to have the largest/most productive universities move on this first.

11. OA journals: “When most peer-reviewed journals are OA.” “I don’t expect this for a long time.” Now 15% are OA. Progress is slow, but there is progress. High prestige journals are likely to hold out for a long time.

12. Libre green OA: “When most green OA is libre OA.” Today, only a small fraction is libre OA because most OA repositories depend on permission from publishers. UKPMC Funders Group demands green libre. We will reach the cross-over “when it’s safe.” Harvard has taken the lead on this, Peter says, and it will spread as another large university takes this step, then another one … “It becomes self-fulfilling leadership.”

Q: Are we stuck with the Sonny Bono copyright extension act?
A: Yes. All copyright reform in the past few decades has been in the wrong direction. And it’s very hard to roll back copyright terms. The only silver lining is that when we have a consenting partner, we can bypass copyright via contract. The problem is that they’re not the default.

Q: Under libre OA, how are our scholarly attributions protected?
A: It’s a range. One end is public domain, which does not preserve attribution. But all the CreativeCommons licenses preserve attribution. Most scholars don’t want public domain; they want CC-attribution. “Creative Commons attribution license provides everything a scholar could want.”

Q: [me] Conyers!
A: The Conyers bill would tell agencies not to require OA for works they fund. [I've put this badly.] The OA advocates are fighting it, but the agencies affected are not yet. I think Conyers is serious about it. I think he introduced it early because we don’t have a Sect’y of Health and Human Services or of NIH. Conyers may be fighting a turf battle. [Paraphrasing!] “He’s motivated primarily to protect the jurisdiction of his committee.” Peter thinks it won’t pass, but it might be introduced into another bill. “We’d like to spread the NIH policy to the rest of the government.”

Q: Is the economic downturn accelerating the adoption of OA?
A: The NIH just got $10B in the stimulus, which means there will be more and more OA articles. NSF also, but not as much because NSF requires OA for reports generated by those they fund [may not have gotten this right]. But, Peter thinks the downturn strengthens the case for OA. Libraries are going to be canceling subscriptions. And the Stimulus’ emphasis on green research will be more valuable if it’s OA. Open access to research amplifies its value.

Q: [jpalfrey] You’ve noticed there’s no OSF equivalent for OA. But I’d argue that the people in this room — librarians — are your OA OSF. What do you say to these librarians to advance our common cause?
A: Librarians are among the most important allies in the OA movement. But put all the allies together and you still don’t have OSF. Libraries should be sending letters against the Conyers bill. When you negotiate subscriptions you should negotiate the right to put articles from your authors into an OA repository. Libraries are the only buyers of peer-reviewed journals. When you’re the only buyer, you can dictate your terms, subject to anti-trust. Obama says that we have the right to demand transformation from the banks we’re saving. Librarians can do the same thing for journals. Journals are not serving all of our interests and are acting against other interests. Use your bargaining power. Get the right of self-archive, and, when the time is right, get the right of libre self-archive. Network with one another when you launch repositories. And, btw, every school with an enlightened OA policy had librarians at the head of the charge.

Q: Can you give an example of an archive that works?
A: Universities that have mandatory language still have to supplement the language with incentives and education. As you go from unmandated to mandated, it goes from 15% toward 100%. (15% is the average for voluntary, spontaneous archives.) It works best in the Dutch universities that let the “cream” of the article rise to the top. Every week they feature good work in public. This gets academics to archive their work without a mandate.

Q: Might universities work with publishers collaboratively to create new business models?
A: Publishers differ in their attitudes toward OA. Some are experimenting in good faith. Some, in bad faith. Some who do OA are actively lobbying for the Conyers bill. Librarians understand the scholarly landscape better than publishers and could educate publishers. Society publishers [i.e., societies that publish] could be told that they’re threatened not by OA but by the “big deal” that brings in academic journals.

Q: Is Springer’s taking over of BioMed Central a good thing?
A: Yes. BMC is for-profit. BMC was the world’s largest OA publisher. Now Springer is. Springer says that OA is a sustainable part of their business. My reading is that Springer is preparing for an OA future.


February 21, 2009

Law libraries ask for open access

Directors of ten law school libraries, including Harvard’s John Palfrey, have signed an “aspirational” document, called the Durham Statement on Open Access, that “calls for all law schools to stop publishing their journals in print format and to rely instead on electronic publication coupled with a commitment to keep the electronic versions available in stable, open, digital formats.”

This is wonderful.

The statement calls for the end of paper versions of the journals, not merely supplementing them with electronic versions, because printing them costs so much and is bad for the environment. I don’t know if the drafters of the Statement were also thinking that going purely digital would help force a change in mindsets, but I suspect that that would be one of the most important consequences.


February 12, 2008

Harvard to vote on open access proposal

The NY Times reports that Harvard’s Faculty of Arts and Sciences will vote next week on a proposal that would require faculty to deposit a copy of their articles in an open access Harvard repository even as they submit those articles to academic journals.

I like this idea a lot. I only wish it went further. Faculty members will be allowed to opt out of the requirement pretty much at will (as I understand it), which could vitiate it: If a prestigious journal accepts an article but only if it’s not been made openly available, faculty members may well decide it’s more important for their careers to be published in the journal. I would prefer to see the Harvard proposal paired with some form of official encouragement to tenure committees to look favorably upon faculty members who make their work widely and freely available.

Nothing is without drawbacks. A well-run, reliable, thorough peer-review system costs money. But there’s also an expense to funding peer review by limiting access to the work that makes it through the process. Likewise, while the current publication system directs our attention efficiently, there’s a price to the very efficiency of such a system: innovation can arise from what looked like inefficiencies. There’s value in the long tail of research.

If we were today building a system for evaluating scholarly research and for making it maximally available, we would not build anything like the current paper-based system. Well, we are building such a system. The Harvard proposal will, in my opinion, help.

(Disclosure: I’m a fellow at the Berkman Center, which is part of the Law School, not the Faculty of Arts and Sciences, and I’m not a faculty member in any case. Stuart Shieber, one of the sponsors of the proposal, is a director of the Center.)
