Joho the Blog » dpla

June 11, 2012

DPLA West meeting online

The sessions from the DPLA Plenary meeting on April 27 in SF are now online. Here’s the official announcement:

…all media and work outputs from the two day-long events that made up DPLA West–the DPLA workstream meetings held on April 26, 2012 at the San Francisco Public Library, and the public plenary held on April 27, 2012 at the Internet Archive in San Francisco, CA–are now available online on the “DPLA West: Media and Outputs” page: http://dp.la/get-involved/events/dplawest/dpla-west-media-and-outputs/.

There you will find:

  • Key takeaways from the April 26, 2012 workstream meetings;

  • Notes from the April 27, 2012 Steering Committee meeting;

  • Complete video of the April 27, 2012 public plenary;

  • Photographs and graphic notes from the public plenary;

  • Video interviews with DPLA West participants;

  • And audio interviews with DPLA West scholarship recipients.

More information about DPLA West can be found online at http://dp.la/get-involved/events/dplawest/.

Folks from the Harvard Library Innovation Lab and the Berkman Center worked long and hard to create a prototype software platform for the DPLA in time for this event. The platform is up and gives live access to about 20M books and thousands of images and other items from various online collections. The session at which we introduced, explained, and demo’ed it is now available for your viewing pleasure. (I was interim head of the project.)

June 6, 2012

1,000 downloads

I learned yesterday from Robin Wendler (who worked mightily on the project) that Harvard’s library catalog dataset of 12.3M records has been bulk downloaded a thousand times, excluding the Web spiderings. That seems like an awful lot to me, and makes me happy.

The library catalog dataset comprises bibliographic records of almost all of Harvard Library’s gigantic collection. It’s available under a CC0 public domain dedication for bulk download, and can be accessed through an API via the DPLA’s prototype platform. More info here.

February 21, 2012

Joho: Culture is an echo chamber

After a couple of years, I’ve actually published another issue of my old ‘zine. Why so long between issues? Basically, blogging ate my zine.

Here’s the table of contents. The main article is, unsurprisingly, the first one:

Culture is an echo chamber: We all hate echo chambers in which a bunch of yahoos convince one another that they’re right. But, our fear of echo chambers can blind us to their important social role. Just take a look at Reddit.com…

In love with linked data: The Semantic Web requires a lot of engineering. So along comes this scrappy contender that says we ought to just make our data public and see what happens. Brilliant!

Too Big to Know: I worked on a book for a couple of years, and now it’s out. Yay?

Report from the DPLA platform: Surprisingly, I’m interim head of the project building the software platform for the Digital Public Library of America. Here’s what’s going on.

Bogus Contest: #Stories: If history were written in hashtags.

February 13, 2012

[2b2k] BibSoup is in beta

Congratulations to the Open Knowledge Foundation on the launch of BibSoup, a site where anyone can upload and share a bibliography. It’s a great idea, and an awesome addition to the developing knowledge ecosystem.

January 4, 2012

Starting on the platform for the Digital Public Library of America

For the past 1.5 years or so, I’ve been co-director, along with Kim Dulin, of the Harvard Library Innovation Lab. Among the projects we’ve been working on is LibraryCloud, a multi-library metadata server. (You can see it at work, running underneath ShelfLife, another of our projects, here.) Today the Digital Public Library of America announced that initial (and interim) development work on the DPLA platform will be done by the LibraryCloud team (Paul Deschner and Matthew Phillips) plus our Berkman friends, Daniel Collis-Puro and Sebastian Diaz. I’m the team leader, or whatever you call the person who knows the least. We’ll do this as openly as possible, relying upon the community to help at every phase, but this will be our core work during the first phase of the platform’s development, leading up to an April 26 DPLA Steering Committee meeting.

The DPLA platform will enable developers to write applications using the metadata (primarily about content hosted elsewhere) the DPLA will be aggregating.

We’re excited. Thrilled, actually.

November 29, 2011

[2b2k] Curation without trucks

If users of a physical library could see the thousands of ghost trucks containing all the works that the library didn’t buy backing away from the library’s loading dock, the idea of a library would seem much less plausible. Rather than seeming like a treasure trove, it would look like a relatively arbitrary reduction.

It’s not that users or librarians think there is some perfect set (although it wasn’t so long ago that picking a shelf’s worth of The Great Books seemed not only possible but laudable). Everyone is pragmatic about this. Users understand that libraries make decisions based on a mix of supporting popular tastes and educating to preferred tastes: The Iliad is going to survive being culled even though it has far fewer annual check-outs than The Girl with the Dragon Tattoo. Curating is a practical art and libraries are good at it. But curating into a single collection that happens to fit within a library-sized building increasingly looks like a response to the weaknesses of material goods, rather than as an appropriate appreciation of their cultural value. Curation has always meant identifying the exceptions, but with the new assumption of abundance, curators look for exceptions to be excluded, rather than to be included. In the Age of the Net, we’re coming to believe that just about everything deserves to be in the library for one reason or another.

It seems to me there are two challenges here. The first is redeploying the skills of curators within a hyper-abundant world that supports multiple curations without cullings. That seems to me eminently possible and valuable. The second is cultivating tastes when there are so many more paths of least cognitive and aesthetic resistance. And that is a far more difficult, even implausible, challenge.

That is, our technology makes it easy to have multiple curations equally available, but our culture wants (has wanted?) some particular curations to have priority. Unless trucks are physically removing the works outside the preferred collection, how are we going to enforce our cultural preferences?

The easy solution is to give up on the attempt. The Old White Man’s canon is dead, and good riddance. But you don’t have to love old white men to believe that culture requires education (despite what Nicolas Sarkozy believes, we don’t “naturally” love complex works of art without knowing anything about their history or context) and that education requires taking some harder paths, rather than always preferring the easier, more familiar roads. I won’t argue further for this because it’s a long discussion and I have nothing to say that you haven’t already thought. So, for the moment take it as a hypothesis.

This I think makes clear what one of the roles of the DPLA (Digital Public Library of America) should be.

Ed Summers has warned that the DPLA needs to be different from the Web. If it is simply an index of what is already available, then it has not done its job. It seems to me that even if it curates a collection of available materials it has not done its job. It is not enough to curate. It is not even enough to curate in a webby way that enables users to participate in the process. Rather, it needs to be (imo) a loosely curated assemblage that is rich in helping us not only to find what is of value, but to appreciate the value of what we find. It can do that in the traditional ways — including items in the collection, including them in special lists, providing elucidations and appreciations of the items — as well as in non-traditional, crowd-sourced, hyperlinked ways. The DPLA needs to be rich and ever richer in such tools. The curated works should become ever more embedded into a network of knowledge and appreciation.

So, yes, part of the DPLA should be that it is a huge curated collection of collections. But curation now only has reliable value if it can bring us to appreciate why those curatorial decisions were made. Otherwise, it can seem as if we’re simply looking at that which the trucks left behind.

November 19, 2011

[avignon] [2b2k] Robert Darnton on the history of copyright, open access, the dpla…

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

We begin with a report on a Ministerial meeting yesterday here on culture — a dialogue among the stakeholders on the Internet. [No users included, I believe.] All agreed on the principles proposed at Deauville: It is a multi-stakeholder ecosystem that complies with law. In this morning’s discussion, I was struck by the convergence: we all agree about remunerating copyright holders. [Selection effect. I favor copyright and remunerating rights holders, but not as the supreme or exclusive value.] We agree that there are more legal alternatives. We agree that the law needs to be enforced. No one argued with that. [At what cost?] And we all agree we need international cooperation, especially to fight piracy.

Now Robert Darnton, Harvard Librarian, gives an invited talk about the history of copyright.

Darnton: I am grateful to be here. And especially grateful you did not ask me to talk about the death of the book. The book is not dead. More books are being produced in print and online every year than in the previous year. This year, more than 1 million new books will be produced. China has doubled its production of books in the past ten years. Brazil has a booming book industry. Even old countries like the US find book production is increasing. We should not bemoan the death of the book.

Should we conclude that all is well in the world of books? Certainly not. Listen to the lamentations of authors, publishers, booksellers. They are clearly frightened and confused. The ground is shifting beneath their feet and they don’t know where to stake a claim. The pace of tech is terrifying. What took millennia, then centuries, then decades, now happens all the time. Homesteading in the new info ecology is made difficult by uncertainty about copyright and economics.

Throughout early modern Europe, publishing was dominated by guilds of booksellers and printers. Modern copyright did not exist, but booksellers accumulated privileges, which Condorcet objected to. These privileges (AKA patents) gave them the exclusive rights to reproduce texts, with the support of the state. The monarchy in the 17th century eliminated competitors, especially ones in the provinces, reinforcing the guild, thus gaining control of publishing. But illegal production throve. Avignon was a great center of piracy in the 18th century because it was not French. It was surrounded by police intercepting the illegal books. It took a revolution to break the hegemony of the Parisian guild. For two years after the Bastille, the French press enjoyed liberty. Condorcet and others had argued for the abolition of constraints on the free exchange of ideas. It was a utopian vision that didn’t last long.

Modern copyright began with the 1793 French copyright law that established a new model in Europe. The exclusive right to sell a text was limited to the author for lifetime + 10 years. Meanwhile, the British Statute of Anne in 1710 created copyright. Background: The stationers’ monopoly required booksellers (and all had to be members) to register. The oligarchs of the guild crushed their competitors through monopolies. They were so powerful that they provoked revolts even within the book trade. Parliament rejected the guild’s attempt to secure the licensing act in 1695. The British celebrate this as the beginning of the end of pre-publication censorship.

The booksellers lobbied for the modern concept of copyright. For new works: 14 years, renewable once. At its origin, copyright law tried to strike a balance between the public good and the private benefit of the copyright owner. According to a liberal view, Parliament got the balance right. But the publishers refused to comply, invoking a general principle inherent in common law: When an author creates work, he acquires an unlimited right to profit from his labor. If he sold it, the publisher owned it in perpetuity. This was Diderot’s position. The same argument occurred in France and England.

In England, the argument culminated in the 1774 decision in Donaldson vs. Beckett, which reaffirmed 14 years renewable once. Then we Americans followed in our Constitution and in the first copyright law in 1790 (“An act for the encouragement of learning”, echoing the British 1710 Act): 14 years renewable once.

The debate is still alive. The 1998 copyright extension act in the US was considerably shaped by Jack Valenti and the Hollywood lobby. It extended copyright to life + 70 (or, for corporate works, 95 years from publication). We are thus putting most literature out of the public domain and into copyright that seems perpetual. Valenti was asked if he favored perpetual copyright and said “No. Copyright should last forever minus one day.”

This history is meant to emphasize the interplay of two elements that go right through the copyright debate: A principle directed toward the public gain vs. self-interest for private gain. It would be wrong-headed and naive to only assert the former. But to assert only the latter would be cynical. So, do we have the balance right today?

Consider knowledge and power. We all agree that patents help, but no one would want the knowledge of DNA to be exploited as private property. The privatization of knowledge has become an enclosure movement. Consider academic periodicals. Most knowledge first appears in digitized periodicals. The journal article is the principal outlet for the sciences, law, philosophy, etc. Journal publishers therefore control access to most of the knowledge being created, and they charge a fortune. The price of academic journals rose ten times faster than the rate of inflation in the 1990s. The J of Comparative Neurology is $29,113/year. The Brain costs $23,000. The average list price in chemistry is over $3,000. Most of the research was subsidized by taxpayers. It belongs in the public domain. But commercial publishers have fenced off parts of that domain and exploited it. Their profit margins run as high as 40%. Why aren’t they constrained by the laws of supply and demand? Because they have crowded competitors out, and the demand is not elastic: Research libraries cannot cancel their subscriptions without an uproar from the faculty. Of course, professors and students produced the research and provided it for free to the publishers. Academics are therefore complicit. They advance their prestige by publishing in journals, but they fail to understand the damage they’re doing to the Republic of Letters.

How to reverse this trend? Open access journals. Journals that are subsidized at the production end and are made free to consumers. They get more readers, too, which is not surprising since search engines index them and it’s easy for readers to get to them. Open Access is easy access, and the ease has economic consequences. Doctors, journalists, researchers, housewives, nearly everyone wants information fast and costless. Open Access is the answer. It is a little simple, but it’s the direction we have to take to address this problem at least in academic journals.

But the Forum is thinking about other things. I admire Google for its technical prowess, but also because it demonstrated that free access to info can be profitable. But it ran into problems when it began to digitize books and make them available. It got sued for alleged breach of copyright. It tried to settle by turning it into a gigantic business and sharing the profits with the authors and publishers who sued them. Libraries had provided the books. Now they’d have to buy them back at a price set by Google. Google was fencing off access to knowledge. A federal judge rejected it because, among other points, it threatened to create a monopoly. By controlling access to books, Google occupied a position similar to that of the guilds in London and Paris.

So why not create a library as great as anything imagined by Google, but that would make works available to users free of charge? Harvard held a workshop on Oct. 1, 2010 to explore this. Like Condorcet, a utopian fantasy? But it turns out to be eminently reasonable. A steering committee, a secretariat, 6 workgroups were established. A year later we launched the Digital Public Library of America at a conference hosted by the major cultural institutions in DC, and in April 2013 we’ll have a preliminary version of it.

Let me emphasize two points. 1. The DPLA will serve a wide and varied constituency throughout the US. It will be a force in education, and will provide a stimulus to the economy by putting knowledge to work. 2. It will spread to everyone on the globe. The DPLA’s technical infrastructure is being designed to be interoperable with Europeana, which is aggregating the digital collections of 27 countries. National digital libraries are sprouting up everywhere, even Mongolia. We need to bring them together. Books have never respected boundaries. Within a few decades, we’ll have worldwide access to all the books in the world, and images, recordings, films, etc.

Of course a lot remains to be done. But, the book is dead? Long live the book!

Q: It is patronizing to think that the USA and Europe will set the policy here. India and China will set this policy.

A: We need international collaboration. And we need an infrastructure that is interoperable.

October 21, 2011

[dpla] second session

Maura Marx introduces Jill Cousins of Europeana who says that we all agree that we want to make the contents of libraries, museums and archives available for free. We agree on interoperability and open metadata. She encourages us to adopt the Europeana Data Model. Share our source code. Build our collections together. So, we’re starting with a virtual exhibition of migration of Europeans to America. The DPLA and Europeana will demonstrate the value of their combined collections (text and images) by digitizing material and making it available as an exhibition. (Maura thanks Bob Darnton for building European ties.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.


Maureen Sullivan, president-elect of the American Library Association, moderates a panel about visions of the DPLA. Each panelist gets 5-7 minutes.


John Palfrey: It’s a bridge we’re building as we walk over it. But it has 5 aspects. 1. Digitizing projects. It’ll be a collection of collections. We should be digitizing in common ways with common formats. But, DPLA will also be: 2. Code. SourceForge for Libraries. Anyone can take and reuse it, including public libraries. 3. Metadata. That’s what makes info findable and usable. It’s the special sauce of librarians. But we haven’t done it yet. We need open access to metadata. 4. Tools and services that ride on top of a common platform. E.g., extraMuros, Scannebagos. 5. Community.


Peggy Rudd, Texas State Library and Archives Commission. We want to see someone walking down the street with a cellphone who says, “I’m going to DPLA it.” We should take as a guiding idea that all people in the country ought to have access to the infrastructure of ideas. We have to think about access. Those of us in public libraries are going to be the digital literacy corps. Public libraries are going to be the institutions that can ensure that people can discover things and will help people evaluate what they find, ensuring what they find is relevant, and help people get the most out of the DPLA.


Brewster Kahle, The Internet Archive. I grew up in a paper world. But I believe the Archivist is right: If it’s not online, it doesn’t exist. There are now two large scale digital library projects in the US. Ten million books are available from a commercial source, and 2M that are public (at OpenLibrary.org). But let’s step back and see where we want to be: Lots of publishers and authors who are paid; a diversity of libraries; everyone can be a reader, no matter what language, proclivities, disabilities. Let’s go and get 10M ebooks. 2M public domain (free), 7M out of print (digitized to be lent), 1M in print (buy ebook and lend them). Libraries ought to buy ebooks and circulate them, one loan at a time per one book. DPLA ought to help libraries buy new eBooks to lend them, as well as scanning the core 10M book collection, and enable all libraries to get the digital collections. At this point, a 10M ebook collection requires about $30K of computers, which is within the budget of many libraries. For this, we would get universal access to all knowledge. How do we stay on track? Follow the money: is the money being well spent? And follow the bits: the bits should be put in many places. “Together we can build a digital America that is free to all.”


Amanda French begins with John Donne, “Sunrising.” [I am here heavily paraphrasing!] For most, the sun rising is a beginning, but for lovers it is an ending. The unruly sun of the digital text is rising, calling us to work, whereas I would rather snuggle in bed with a book. Love can exist in a commercial relationship, but that’s not ideal. I would like a library that supports me in all my moods, from contemplation to raucous sociality. We need proof of love. Physical libraries manifest that love. The DPLA must manifest itself as more than a web site, many quiet and generous services to readers, developers…technical and social. While I agree that if it isn’t online, it doesn’t exist, if it’s only online, it only half exists. And I want a physical building. Not just a server center. [Again: I've poorly paraphrased.]


Jill Cousins, Europeana. We want the DPLA because we get access to your stuff. [Laughter] But DPLA can improve on Europeana with open data, Open Source, Open Licensing. Also, we should be interoperable. Our new strategic plan has four aspects. 1. Aggregating content as a trusted source. 2. Facilitating, supporting cultural heritage. 3. Distributing: Wherever people are. 4. Engaging: New ways to participate in cultural heritage. Europeana currently has 20M items, multiple languages. I’m particularly interested in the APIs so material can be distributed to where people will use it. (She points to content about the US that is in their distributed collection.) To facilitate: Labeling content so users know it’s in the public domain. What’s in the PD in analog form ought to stay in the PD in digital form. Engage: Cultivate new ways for users to participate in their cultural heritage. One project: People are asked to bring their memorabilia from WWI. So, why DPLA: We are the generation that can give access to the analog past. If we don’t digitize it and put it online, will our kids?


Carl Malamud. When I think of the DPLA, I think of the Hoover Dam and the Golden Gate Bridge. There’s a tremendous reservoir of knowledge waiting to be tapped. Our Internet is flooded with only certain types of knowledge, and other types are not available to all. E.g., our law and policies (the operating system of our society) are not openly available because private fences have enclosed them. E.g., if you’re a creator, you draw on imagery that has accumulated over thousands of years. Creative workers must stand on the shoulders of giants. But much of that imagery is locked up in for-profit corps that have built walls around public domain material. Even the Smithsonian only allows its images to be used by paying for them. We already have beautiful museums and bottomless libraries. What if the DPLA created a common reservoir that we could tap into? What if the Hathi Trust put everything they have into a common pool? Another metaphor: A bridge that connects our capital to the rest of the country. DC is a vast storehouse. Most of the resources are hidden. We need public works projects for knowledge. A national digitization project, a decade long. Deploy the Internet Corps of Engineers. “If a self-appointed librarian in an old church can publish 2M books, why can’t our government do more?”


[I had to see a man about a dog, and missed a couple of questions.]


Q: How do we transform the use of public libraries?
Peggy: They have to evolve, and many are evolving already. E.g., user-created content. 46% of low-income families don’t have computers or Internet access.


Q: Bandwidth is a critical issue, particularly in rural areas. I hope that the DPLA realizes it’s going to have data-heavy materials. How are we going to build bandwidth to the public libraries?
Peggy: I’m happy to see the Gates Foundation here. They’ve worked with local libraries to provide and maintain bandwidth. 5mb is not enough when kids swarm in after school.


Q: Imagine an Ecuadoran American mother who is a part time student. She belongs to a lot of communities. I want to make sure that the coding of the DPLA recognizes that we each live in multiple communities.
Peggy: We all agree.


Q: First, in 1991 a White House conf was talking about not just scanning, but enabling people to send in their materials (e.g., super8 family movies) that could be digitized. Second, DPLA has a huge potential for freeing up resources at the local library so it can spend its resources on customizing content to what that community needs, or let the person customize the library for herself.


Q: How does an ordinary person get involved in DPLA right now? Lobbying?
John: Lots of ways. Mobilization counts. The effect on local libraries needs to be explained; no one here thinks or wants the DPLA to hurt local public libraries. That’s a crazy thought. But that needs to be explained. I would be so sorry if this project led to the closing of a single library. And, yes, I think we should have a way for individuals to donate. How can you get involved in the setting up of this project: Deciding what the DPLA is is an open process. There are six workstreams. Today is meant in part as an invitation to join in those workstreams. There will be meetings over the next 18 months; the meetings will be open. Come. We need people to build with what we create. We need people to think of new use cases. In April 2013 when we come together for the launch, if there are ten more people attending, that will be a sign of success.


Q: What do you have in the collection for children, 0-8? Why will a parent want to use the DPLA?
John: The DPLA needs to create a common infrastructure so people can create libraries and services out of the combined collection. But as a parent of a six- and a nine-year-old, we’ll keep buying paper books and reading to our kids. The DPLA is not a replacement.
Peggy: Univ. of Texas in Arlington did a study of what engages students in the study of the history of Texas. Students perform better on tests if they had a greater interaction with real documents. We’re bringing history to the classrooms.
Carl: The Encyclopedia of Life has pictures of bugs, etc. And the Smithsonian has a great online resource [didn't catch it], and the next thing the kid will want to do is visit the Smithsonian.
Amanda: If it isn’t online people don’t know it exists. If they know …[Ack. Lost the rest of this post. Noooooo]

[dpla] First session

Moderated by John Palfrey.

Deanna Marcum of the Library of Congress says the LC has 148M objects and has digitized 28M of them. [I may have gotten that last number wrong. Sorry.] The LC wants to make these resources as available as possible. “That is what brings us to the table of the DPLA. It seems to be the type of organization that will help us fulfill our mission in a very important way.” [Tying the DPLA to the LC's mission is a big deal.]

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Deanna says that from the beginning, the Librarian of Congress, James Billington, has asked how the LC can serve libraries better. The answer consistently has been: We want your content where we are, not where you are. This was pre-Net, so they looked to CD ROMs, digitizing collections starting in the early 1990s, beginning with materials useful to K-12. They checked in with the 44 pilots and were amazed to find it was useful all the way down to third grade; students were making “incredibly innovative” uses of this digital content. In 1995, Congress said that it’d match private funds 1:3 (Congress pays $1 for each 3 raised) for digitization efforts. The LC began to think about what in its collections should be digitized first. Sloan funded digitizing of public domain works. Those efforts continue.

Susan Hildreth is director of the Institute of Museum and Library Services, a federal grant-making agency. She wonders what resources already exist that the DPLA can use, and which resources need to be created. This is vital to the IMLS’ contribution to the effort. The IMLS already has invested heavily in digitization projects. Also: metadata collection and cleanup programs. Also: training librarians. Also: conversations on these topics. So, there are already digitized items, best practices and policies, etc. for digital collections. Also, IMLS has reports of 20 years of international discussions about what digital libraries can be. And, some lessons learned: 1. Collaboration is key to long success in digitization. 2. The traditional relation between info providers and consumers is changing. 3. Digital libraries can reduce administrative costs, although we’re just at the beginning of this.

Also, Susan says we should learn some lessons from the IMLS: Support interoperability and the preservation of digital resources. Make it sustainable. Find new ways to measure the impact. Ultimately how will this make a difference to the person going on the Web to find information? The IMLS can be strategic in the DPLA’s efforts. [We like the "strategic" commitment.]

John Palfrey reinforces her statement about the excitement this is generating among librarian students.

David Ferriero, the Archivist [coolest title ever], talks. He comes to this position after heading the NY Public Library. He explains that the National Archives is the nation’s record keeper. For all federal agencies, and “courtesy preservation” for Congress. It began only in 1935. The records go back to the Continental Congress, and include White House tweets. 12B pages of textual records. Billions of electronic records, which is the fastest growing area. 8M emails from Reagan, 200+M from the GW Bush era. And, as Bush tells David, “Not one of those is mine.” He wants every item in the Archives to be online. He remembers discussions with librarians in which they worried about how to get students to use paper. “Get over it.”

The massive amount of material they have has made the Archives “rather creative” in getting out. E.g., the Citizen Archivist program to give opportunities to the people to help digitize and process records. Docs Teach is online, loaded with lesson plans, etc.

When he was at NYPL, they worked with Google to digitize 1M works, and David saw how it has transformed scholarship. In Dec. 2009, Pres. Obama signed a declassification order requiring the Archives to review and declassify. They’ve gone through 1M pages and have released 91% to the public shelves. The CIA “finally caved on the oldest secret documents” (German docs on creating secret ink). This happened because the Archivist staff used Google Books to discover that the ink formulas had been published in 1931.

Q: Accessibility and findability? Not enough to simply put things online.
A: Deanna: It’s important. But you’re looking at three people who don’t know how to do this.
A: David: Josh Greenberg taught me that we should talk about where the people are and get our stuff out there. That’s why we use YouTube and Flickr. It’s a problem for the Archives because our records are so large and complex. Plus, kids today can’t read cursive. So we’re going to be creating ways for the public to help us transcribe cursive docs.
A: Susan: It’s a broad issue, including making our materials available to those with disabilities, in multiple languages, etc. IMLS is interested in supporting platforms for effective discovery.
A: David: Serendipity is important.

Q: Director of the Smithsonian Institutional Libraries: We also are very interested in participating in the DPLA with our 137M objects (although 124M are natural history specimens, so how many mosquitoes do we want in the DPLA?). But we have 6.4M digitized objects and are in a unique position to pull in museum, library and archive objects. We’re eager to continue to cooperate.

Q: Are there mechanisms in place to avoid reverse engineering of CIA documents?
A: The Archivist does not have the authority to release. We just facilitate the process.
Q: Are you going to do more?
A: We’ve done a million. There are 400M to go. We have a deadline in 2013. I hosted a meeting about the priorities and the room was evenly split between releasing the JFK assassination docs and UFOs.

Q: [British Library] One of the real challenges is the difference between a digital library and a wonderful but confusing random set of resources. Public-private partnerships are essential. And we have just opened up all our metadata on a CCEuro license. No one can know what this will be used for, and that is its value. Also, there’s a challenge finding and developing modern librarians/curators.

Q: John Mayer: Imagine it’s 2016 and all your collections have been digitized. How does society improve once that’s in place? What’s the sf scenario of the DPLA?
A: Deanna: If we assume people benefit from having access to info resources (better decisions, better understanding where they come from and where they’re going, understanding world cultures better), we want to make these resources available any way they want. That’s what librarians have always dreamed about and we finally have a mechanism for doing that. American citizens have paid for these resources with their tax dollars.
A: David: Better informed citizenry. Hold our government accountable. Understand our future by learning from our history.
A: Susan: If all is digitized, what happens to our physical facilities? By providing all that info, it will create a greater need and desire for people to work together, in the virtual and real worlds. It’s a very exciting and liberating future. And if we have all that data, we have to have strong connectivity to our homes, schools, libraries…

Q: Bob Darnton: Many of the questions have been testimonials. Wonderful! We rejected the name “National Digital Library” because there’s nothing national about it. Getting bigger means getting more international, and that is certainly going to happen. The national library director of France has expressed support. So has Europeana. This support is a movement that goes back to the international Republic of Letters. We’re getting the feeling we can make real a dream at the founding of this country.

[It is so ineffably cool and inspiriting to have these great institutions sharing a stage and a vision.]


[dpla] DPLA plenary

I’m at what is in effect the public launch of the Digital Public Library of America (“in effect” because the DPLA has been open to all from the beginning). But today we’re in the theater of the National Archives and have just been greeted by the Archivist of the United States, David Ferriero.

I spent yesterday at the “workstream” meetings of the DPLA. The openness of the DPLA has meant that there has been no moment at which all have agreed on precisely what the DPLA should be. Yesterday could have been a day that had people walking apart from one another or walking toward a center as yet to be fully located. It was a day of walking toward that emergent center. Given the continuing significant differences in the group, my sense is that the convergence was enabled by a shared sense of the value of what we could build, by shared interests and backgrounds (a bunch of librarians and admirers of librarians), and by the careful crafting of the day’s events and processes. (That last goes to the credit of the Berkman Center.)

I am very excited. (I’m also at maximum stress because I am giving an 8.5-minute demo this afternoon…talking to a screencast I did in my hotel room last night, leaving no room for temporal variance. You can see the live prototype here.)

Doron Weber of the Sloan Foundation is now briefly recounting the history of the DPLA, which started with a workshop a year ago. Doron today announced the beginning of a “two year grass roots effort” to build the DPLA. The DPLA is intended to be a platform for discovering our rich shared cultural heritage, he says (approximately). He sketches a very broad agenda, including discovering collections, building them, partnering with other nations, sharing metadata, and exploring doing some form of collective licensing of in-copyright material. (Excellent. I personally don’t want this to become the Digital Public Library of Jane Austen.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Doron announces that Sloan and Arcadia are each contributing $2.5M to support the DPLA over the next 18 months. Woohoo! Peter Baldwin from Arcadia gives a gracious short talk.
