Joho the Blog

October 21, 2011

[dpla] second session

Maura Marx introduces Jill Cousins of Europeana who says that we all agree that we want to make the contents of libraries, museums and archives available for free. We agree on interoperability and open metadata. She encourages us to adopt the Europeana Data Model. Share our source code. Build our collections together. So, we’re starting with a virtual exhibition of the migration of Europeans to America. The DPLA and Europeana will demonstrate the value of their combined collections, text and images, by digitizing material and making it available as an exhibition. (Maura thanks Bob Darnton for building European ties.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Maureen Sullivan, president-elect of the American Library Association, moderates a panel about visions of the DPLA. Each panelist gets 5-7 minutes.

John Palfrey: It’s a bridge we’re building as we walk over it. But it has 5 aspects. 1. Digitizing projects. It’ll be a collection of collections. We should be digitizing in common ways with common formats. But, DPLA will also be: 2. Code. SourceForge for Libraries. Anyone can take and reuse it, including public libraries. 3. Metadata. That’s what makes info findable and usable. It’s the special sauce of librarians. But we haven’t done it yet. We need open access to metadata. 4. Tools and services that ride on top of a common platform. E.g., extraMuros, Scannebagos. 5. Community.

Peggy Rudd, Texas State Library and Archives Commission. We want to see someone walking down the street with a cellphone who says, “I’m going to DPLA it.” We should take as a guiding idea that all people in the country ought to have access to the infrastructure of ideas. We have to think about access. Those of us in public libraries are going to be the digital literacy corps. Public libraries are going to be the institutions that can ensure that people can discover things and will help people evaluate what they find, ensuring what they find is relevant, and help people get the most out of the DPLA.

Brewster Kahle, The Internet Archive. I grew up in a paper world. But I believe the Archivist is right: If it’s not online, it doesn’t exist. There are now two large scale digital library projects in the US. Ten million books are available from a commercial source, and 2M are public. But let’s step back and see where we want to be: Lots of publishers and authors who are paid; a diversity of libraries; everyone can be a reader, no matter what language, proclivities, disabilities. Let’s go and get 10M ebooks. 2M public domain (free), 7M out of print (digitized to be lent), 1M in print (buy ebooks and lend them). Libraries ought to buy ebooks and circulate them, one loan at a time per one book. DPLA ought to help libraries buy new ebooks to lend them, as well as scanning the core 10M book collection, and enable all libraries to get the digital collections. At this point, a 10M ebook collection requires about $30K of computers, which is within the budget of many libraries. For this, we would get universal access to all knowledge. How do we stay on track? Follow the money: is the money being well spent? And follow the bits: the bits should be put in many places. “Together we can build a digital America that is free to all.”

Amanda French begins with John Donne, “Sunrising.” [I am here heavily paraphrasing!] For most, the sun rising is a beginning, but for lovers it is an ending. The unruly sun of the digital text is rising, calling us to work, whereas I would rather snuggle in bed with a book. Love can exist in a commercial relationship, but that’s not ideal. I would like a library that supports me in all my moods, from contemplation to raucous sociality. We need proof of love. Physical libraries manifest that love. The DPLA must manifest itself as more than a web site, as many quiet and generous services to readers, developers…technical and social. I agree that if it isn’t online, it doesn’t exist, but if it’s only online, it only half exists. And I want a physical building. Not just a server center. [Again: I’ve poorly paraphrased.]

Jill Cousins, Europeana. We want the DPLA because we get access to your stuff. [Laughter] But DPLA can improve on Europeana with open data, Open Source, Open Licensing. Also, we should be interoperable. Our new strategic plan has four aspects. 1. Aggregating content as a trusted source. 2. Facilitating, supporting cultural heritage. 3. Distributing: Wherever people are. 4. Engaging: New ways to participate in cultural heritage. Europeana currently has 20M items, in multiple languages. I’m particularly interested in the APIs so material can be distributed to where people will use it. (She points to content about the US that is in their distributed collection.) To facilitate: Labeling content so users know it’s in the public domain. What’s in the PD in analog form ought to stay in the PD in digital form. Engage: Cultivate new ways for users to participate in their cultural heritage. One project: People are asked to bring their memorabilia from WWI. So, why DPLA: We are the generation that can give access to the analog past. If we don’t digitize it and put it online, will our kids?

Carl Malamud. When I think of the DPLA, I think of the Hoover Dam and the Golden Gate Bridge. There’s a tremendous reservoir of knowledge waiting to be tapped. Our Internet is flooded with only certain types of knowledge, and other types are not available to all. E.g., our law and policies, the operating system of our society, are not openly available because private fences have enclosed them. E.g., if you’re a creator, you draw on imagery that has accumulated over thousands of years. Creative workers must stand on the shoulders of giants. But much of that imagery is locked up in for-profit corps that have built walls around public domain material. Even the Smithsonian only allows its images to be used by paying for them. We already have beautiful museums and bottomless libraries. What if the DPLA created a common reservoir that we could tap into? What if the Hathi Trust put everything they have into a common pool? Another metaphor: A bridge that connects our capital to the rest of the country. DC is a vast storehouse. Most of the resources are hidden. We need public works projects for knowledge. A national digitization project, a decade long. Deploy the Internet Corps of Engineers. “If a self-appointed librarian in an old church can publish 2M books, why can’t our government do more?”

[I had to see a man about a dog, and missed a couple of questions.]

Q: How do we transform the use of public libraries?
Peggy: They have to evolve, and many are evolving already. E.g., user-created content. 46% of low-income families don’t have computers or Internet access.

Q: Bandwidth is a critical issue, particularly in rural areas. I hope that the DPLA realizes it’s going to have data-heavy materials. How are we going to build bandwidth to the public libraries?
Peggy: I’m happy to see the Gates Foundation here. They’ve worked with local libraries to provide and maintain bandwidth. 5mb is not enough when kids swarm in after school.

Q: Imagine an Ecuadoran American mother who is a part time student. She belongs to a lot of communities. I want to make sure that the coding of the DPLA recognizes that we each live in multiple communities.
Peggy: We all agree.

Q: First, in 1991 a White House conf was talking about not just scanning, but enabling people to send in their materials (e.g., super8 family movies) that could be digitized. Second, DPLA has a huge potential for freeing up resources at the local library so it can spend its resources on customizing content to what that community needs, or let the person customize the library for herself.

Q: How does an ordinary person get involved in DPLA right now? Lobbying?
John: Lots of ways. Mobilization counts. The effect on local libraries needs to be explained; no one here thinks or wants the DPLA to hurt local public libraries. That’s a crazy thought. But that needs to be explained. I would be so sorry if this project led to the closing of a single library. And, yes, I think we should have a way for individuals to donate. How can you get involved in the setting up of this project? Deciding what the DPLA is is an open process. There are six workstreams. Today is meant in part as an invitation to join in those workstreams. There will be meetings over the next 18 months; the meetings will be open. Come. We need people to build with what we create. We need people to think of new use cases. In April 2013 when we come together for the launch, if there are ten more people attending, that will be a sign of success.

Q: What do you have in the collection for children, 0-8? Why will a parent want to use the DPLA?
John: The DPLA needs to create a common infrastructure so people can create libraries and services out of the combined collection. But as a parent of a six- and a nine-year-old, we’ll keep buying paper books and reading to our kids. The DPLA is not a replacement.
Peggy: Univ. of Texas in Arlington did a study of what engages students in the study of the history of Texas. Students perform better on tests if they have had greater interaction with real documents. We’re bringing history to the classrooms.
Carl: The Encyclopedia of Life has pictures of bugs, etc. And the Smithsonian has a great online resource [didn’t catch it], and the next thing the kid will want to do is visit the Smithsonian.
Amanda: If it isn’t online people don’t know it exists. If they know …[Ack. Lost the rest of this post. Noooooo]

[dpla] First session

Moderated by John Palfrey.

Deanna Marcum of the Library of Congress says the LC has 148M objects and has digitized 28M of them. [I may have gotten that last number wrong. Sorry.] The LC wants to make these resources as available as possible. “That is what brings us to the table of the DPLA. It seems to be the type of organization that will help us fulfill our mission in a very important way.” [Tying the DPLA to the LC’s mission is a big deal.]

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Deanna says that from the beginning, the Librarian of Congress, James Billington, has asked how the LC can serve libraries better. The answer consistently has been: We want your content where we are, not where you are. This was pre-Net, so they looked to CD ROMs, digitizing collections starting in the early 1990s, beginning with materials useful to K-12. They checked in with the 44 pilots and were amazed to find it was useful all the way down to third grade; students were making “incredibly innovative” uses of this digital content. In 1995, Congress said that it’d match private funds 1:3 (Congress pays $1 for each $3 raised) for digitization efforts. The LC began to think about what in its collections should be digitized first. Sloan funded digitizing of public domain works. Those efforts continue.

Susan Hildreth is director of the Institute of Museum and Library Services, a federal grant-making agency. She wonders what resources already exist that the DPLA can use, and which resources need to be created. This is vital to the IMLS’ contribution to the effort. The IMLS already has invested heavily in digitization projects. Also: metadata collection and cleanup programs. Also: training librarians. Also: conversations on these topics. So, there are already digitized items, best practices and policies, etc. for digital collections. Also, IMLS has reports of 20 years of international discussions about what digital libraries can be. And, some lessons learned: 1. Collaboration is key to long success in digitization. 2. The traditional relation between info providers and consumers is changing. 3. Digital libraries can reduce administrative costs, although we’re just at the beginning of this.

Also, Susan says we should learn some lessons from the IMLS: Support interoperability and the preservation of digital resources. Make it sustainable. Find new ways to measure the impact. Ultimately how will this make a difference to the person going on the Web to find information? The IMLS can be strategic in the DPLA’s efforts. [We like the “strategic” commitment.]

John Palfrey reinforces her statement about the excitement this is generating among librarian students.

David Ferriero, the Archivist [coolest title ever], talks. He comes to this position after heading the NY Public Library. He explains that the National Archives is the nation’s record keeper. For all federal agencies, and “courtesy preservation” for Congress. It began only in 1935. The records go back to the Continental Congress, and include White House tweets. 12B pages of textual records. Billions of electronic records, which is the fastest growing area. 8M emails from Reagan, 200+M from the GW Bush era. And, as Bush tells David, “Not one of those is mine.” He wants every item in the Archives to be online. He remembers discussions with librarians in which they worried about how to get students to use paper. “Get over it.”

The massive amount of material they have has made the Archives “rather creative” in getting out. E.g., the Citizen Archivist program to give opportunities to the people to help digitize and process records. Docs Teach is online, loaded with lesson plans, etc.

When he was at NYPL, they worked with Google to digitize 1M works, and David saw how it has transformed scholarship. In Dec. 2009, Pres. Obama signed a declassification order requiring the Archives to review and declassify. They’ve gone through 1M pages and have released 91% to the public shelves. The CIA “finally caved on the oldest secret documents”: German docs on creating secret ink. This happened because the Archivist staff used Google Books to discover that the ink formulas had been published in 1931.

Q: Accessibility and findability? Not enough to simply put things online.
A: Deanna: It’s important. But you’re looking at three people who don’t know how to do this.
A: David: Josh Greenberg taught me that we should talk about where the people are and get our stuff out there. That’s why we use Youtube and Flickr. It’s a problem for the Archives because our records are so large and complex. Plus, kids today can’t read cursive. So we’re going to be creating ways for the public to help us transcribe cursive docs.
A: Susan: It’s a broad issue, including making our materials available to those with disabilities, in multiple languages, etc. IMLS is interested in supporting platforms for effective discovery.
A: David: Serendipity is important.

Q: Director of the Smithsonian Institution Libraries: We also are very interested in participating in the DPLA with our 137M objects (although 124M are natural history specimens, so how many mosquitoes do we want in the DPLA?). But we have 6.4M digitized objects and are in a unique position to pull in museum, library and archive objects. We’re eager to continue to cooperate.

Q: Are there mechanisms in place to avoid reverse engineering of CIA documents?
A: The Archivist does not have the authority to release. We just facilitate the process.
Q: Are you going to do more?
A: We’ve done a million. There are 400M to go. We have a deadline in 2013. I hosted a meeting about the priorities and the room was evenly split between releasing the JFK assassination docs and UFOs.

Q: [British Library] One of the real challenges is the difference between a digital library and a wonderful but confusing random set of resources. Public-private partnerships are essential. And we have just opened up all our metadata on a CC0 license. No one can know what this will be used for, and that is its value. Also, there’s a challenge finding and developing modern librarians/curators.

Q: John Mayer: Imagine it’s 2016 and all your collections have been digitized. How does society improve once that’s in place? What’s the sf scenario of the DPLA?
A: Deanna: If we assume people benefit from having access to info resources (better decisions, better understanding of where they come from and where they’re going, understanding world cultures better), we want to make these resources available any way they want. That’s what librarians have always dreamed about and we finally have a mechanism for doing that. American citizens have paid for these resources with their tax dollars.
A: David: Better informed citizenry. Hold our government accountable. Understand our future by learning from our history.
A: Susan: If all is digitized, what happens to our physical facilities. By providing all that info, it will create a greater need and desire for people to work together, in the virtual and real worlds. It’s a very exciting and liberating future. And if we have all that data, we have to have strong connectivity to our homes, schools, libraries…

Q: Bob Darnton: Many of the questions have been testimonials. Wonderful! We rejected the name “National Digital Library” because there’s nothing national about it. Getting bigger means getting more international, and that is certainly going to happen. The national library director of France has expressed support. So has Europeana. This support is a movement that goes back to the international Republic of Letters. We’re getting the feeling we can make real a dream at the founding of this country.

[It is so ineffably cool and inspiriting to have these great institutions sharing a stage and a vision.]

[dpla] DPLA plenary

I’m at what is in effect the public launch of the Digital Public Library of America (“in effect” because the DPLA has been open to all from the beginning). But today we’re in the theater of the National Archives and have just been greeted by the Archivist of the United States, David Ferriero.

I spent yesterday at the “workstream” meetings of the DPLA. The openness of the DPLA has meant that there has been no moment at which all have agreed on precisely what the DPLA should be. Yesterday could have been a day that had people walking apart from one another or walking toward a center as yet to be fully located. It was a day of walking toward that emergent center. Given the continuing significant differences in the group, my sense is that the convergence was enabled by a shared sense of the value of what we could build, by shared interests and backgrounds (a bunch of librarians and admirers of librarians), and by the careful crafting of the day’s events and processes. (That last goes to the credit of the Berkman Center.)

I am very excited. (I’m also at maximum stress because I am giving an 8.5-minute demo this afternoon…talking to a screencast I did in my hotel room last night, leaving no room for temporal variance. You can see the live prototype here.)

Doron Weber of the Sloan Foundation is now briefly recounting the history of the DPLA, which started with a workshop a year ago. Doron today announced the beginning of a “two year grass roots effort” to build the DPLA. The DPLA is intended to be a platform for discovering our rich shared cultural heritage, he says (approximately). He sketches a very broad agenda, including discovering collections, building them, partnering with other nations, sharing metadata, and exploring some form of collective licensing of in-copyright material. (Excellent. I personally don’t want this to become the Digital Public Library of Jane Austen.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Doron announces that Sloan and Arcadia are each contributing $2.5M to support the DPLA over the next 18 months. Woohoo! Peter Baldwin from Arcadia gives a gracious short talk.
