April 6, 2016

In defense of personalization

Mills Baker defends personalization on the right grounds. In a brilliant and brilliantly written post, he maintains that the personalization provided by sites does at scale what we do in the real world to enable conversations: through multiple and often subtle signals, we let an interlocutor know where our interests and beliefs are similar enough that we are able to safely express our differences.

Digression: This is at the heart of our cultural fear of echo chambers, in my opinion. Conversation consists of iteration on small differences based on an iceberg of agreement. Every conversation inadvertently reinforces the beliefs that enable it to go forward. Likewise, understanding is contextual, assimilating the novel to the familiar, thus reinforcing that context by making it richer and more coherent. But our tradition has taught us that Reason requires us to be open to all ideas, ready to undo the entire structure of our beliefs. Reason, if applied purely, would thus make conversation, understanding, and knowledge impossible. In fearing echo chambers, we are running from the fact that understanding and conversation share the basic elements of echo chambers. I’ll return to this point in a later post sometime…

I love everything about Mills’ post except his under-valuing of concerns about the power personalization has over us on-line. Yes, personalization is a requirement in a scaled environment. Yes, the right comparison is between our new info flows and our old info trickles. But…

…Mills does not fully confront the main complaint: our interests and the interests of the commercial entities that are doing the personalizing do not fully coincide. Facebook has an economic motivation to get us to click more and to exit Facebook sessions eager to return for more. Facebook thus has an economic interest in showing us personalized clickbait, and in filtering our feeds toward happiness rather than hey-my-cat-died-yesterday posts.

In one sense, this is entirely Mills’ point. He wants designers to understand the positive role personalization has always played, so they can reinstate that role in software that works for us. He thinks that getting this right is the responsibility of the software for “Most users do not want the ‘control’ of RSS and Twitter lists and blocking, muting, and unfollowing their fellows.” Thus the software needs to learn from the clues left inadvertently by users. (I’d argue that there’s also room for better designed control systems. I bet Mills agrees, because how could anyone argue against better designed anything?)

But in my view he too casually dismisses the responsibility and culpability of some of the most important sites when he writes:

The idea that personalization is about corporate or political control is an emotionally satisfying but inaccurate one.

If we take “personalization” in the insightful and useful way he has defined it, then sure. But when people rail against personalization they are thinking about the algorithmic function performed by commercial entities. And those entities have a massive incentive—exercised by companies like Facebook—to personalize the flow of information toward users as consumers rather than as persons.

I think.



Thanks to Dave Birk for pointing me to Mills’ post.


March 8, 2016

Making library miscellaneousness awesome

Sitterwerk Art Library in St. Gallen, Switzerland, has 25,000 items on its shelves in no particular order. This video explains why that is a brilliant approach. And then the story just gets better and better.

Werkbank from Astrom / Zimmer on Vimeo.

That the shelves have no persistent order doesn’t mean they have no order. Rather, works are reshelved by users in the clusters the users have created for their research. All the items have RFID tags in them, and the shelves are automatically scanned so that the library can always tell users where items are located.

As a result, if you look up a particular item, you will see it surrounded by works that some other user thought were related to it in some way. This creates a richer browsing experience because it is shaped and reshaped by how its community of users sees the items’ inter-relationships.

The library has now installed Werkbank, which is a plain old table where you can spread out a pile of books and do your research. But, unlike truly plain old tables, this one combines RFID sensors and cameras with recognition software so it knows which works you’ve put on the table and how you’ve organized them. Werkbank notes those associations, and stores them, creating a rich network of related works.
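For the code-minded, here is a minimal sketch of how a network of associations like this could accumulate from table sessions. It is not Werkbank's actual software (the project's API is internal), and the class `AssociationNetwork`, its method names, and the item IDs are all hypothetical. It assumes only that each session yields the set of item IDs detected on the table.

```python
from collections import Counter
from itertools import combinations


class AssociationNetwork:
    """Hypothetical store of which works users have placed together."""

    def __init__(self):
        # Maps an unordered pair of item IDs (stored sorted) to the
        # number of sessions in which both items appeared together.
        self.pair_counts = Counter()

    def record_session(self, item_ids):
        # Each unordered pair of distinct items counts once per session.
        for a, b in combinations(sorted(set(item_ids)), 2):
            self.pair_counts[(a, b)] += 1

    def related(self, item_id):
        # Other items that have shared a session with item_id,
        # most frequently co-occurring first.
        scores = Counter()
        for (a, b), n in self.pair_counts.items():
            if a == item_id:
                scores[b] += n
            elif b == item_id:
                scores[a] += n
        return [other for other, _ in scores.most_common()]
```

Counting pairs per session is the simplest possible design. A real system would presumably also use the positions and groupings the cameras detect, which this sketch ignores.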

It also lets the individual save a research set, and even compile a booklet documenting those items, with notes. It can be printed on the spot and taken home … or put into the shelves as a user-generated lib guide.

This is awesome.

Here’s a bit more about it:

…the new table sports a grid of 12 antennas. It also has two cameras attached: one for scanning the tabletop and through custom image recognition software determine the exact position and rotation of books; one for making high-resolution scans of pages, notes or objects not yet in the Sitterwerk catalogue. Just like before, the new server and its interface provides a real-time digital rendering of the table and its contents, but in two dimensions instead of one. It also lets you attach scans, photos and texts to individual objects, and to the virtual table itself. Once you save your collection, it merges with a growing network of other collections, books, materials, thoughts and people

Anthon Astrom tells me that the project currently runs against an internal API, and they are planning to create a public API at some point. That way, the world can benefit from what Sitterwerk’s users are teaching it.



At the Harvard Library Innovation Lab, we wanted to do something that touches on some elements of this. With 73 libraries and 13 million items in Harvard Library it never even crossed our minds to install continuous RFID scanners in the stacks. So, our StackLife project and the LibraryCloud platform underneath it wanted simply to record which books were checked out with others, on the grounds that those clusters often have meaning. But, Harvard cyber-security researchers warned that this could be used to identify who took the books out. We thought about ways of smudging the data, and about making it opt-in, but it was not a fight we could win at that point. Werkbank might have the same issues when recording clusters but because it’s an art library, there may be less concern about the government demanding to know who was researching The Scream, Delacroix’s Liberty Leading the People, and Guernica because that person is clearly up to no good.

In any case the Sitterwerk library and Werkbank have far exceeded our imagination. More than that: it’s real. Awesomely real.


January 13, 2016

Perfect Eavesdropping

Suppose a laptop were found at the apartment of one of the perpetrators of last year’s Paris attacks. It’s searched by the authorities pursuant to a warrant, and they find a file on the laptop that’s a set of instructions for carrying out the attacks.

Thus begins Jonathan Zittrain’s consideration of an all-too-plausible hypothetical. Should Google respond to a request to search everyone’s Gmail inboxes to find everyone to whom the to-do list was sent? As JZ says, you can’t get a warrant to search an entire city, much less hundreds of millions of inboxes.

But, while this is a search that sweeps a good portion of the globe, it doesn’t “listen in” on any mail except for that which contains a precise string of words in a precise order. What happens next would depend upon the discretion of the investigators.

JZ points out that Google already does something akin to this when it searches for inboxes that contain known child pornography images.

JZ’s treatment is even-handed and clear. (He’s a renowned law professor. He knows how to do these things.) He discusses the reasons pro and con. He comes to his own personal conclusion. It’s a model of clarity of exposition and reasoning.

I like this article a lot on its own, but I find it especially fascinating because of its implications for the confused feeling of violation many of us have when it’s a computer doing the looking. If a computer scans your emails looking for a terrorist to-do list, has it violated your sense of privacy? If a robot looks at you naked, should you be embarrassed? Our sense of violation is separable from the question of our legal and moral right to privacy, but the two meanings often get mixed up in such discussions. Not in JZ’s, but often enough.

November 6, 2015

More cracks in the enormous dam in the river of scholarship [#blockThatMetaphor]

Here’s the TL;DR (also known as a well-written lead paragraph, by Scott Jaschik):

All six editors and all 31 editorial board members of Lingua, one of the top journals in linguistics, last week resigned to protest Elsevier’s policies on pricing and its refusal to convert the journal to an open-access publication that would be free online. As soon as January, when the departing editors’ noncompete contracts expire, they plan to start a new open-access journal to be called Glossa.

The article tries to explain how much it costs for a library to subscribe, but that’s not fully possible because Elsevier’s pricing structure pretty much requires libraries to buy inconsistently-priced “bundles.”

Elsevier has responded in a way that is likely to make no one happy, not even Elsevier.

Imagine a world in which the works of scholars are available to anyone who is interested. What a concept! A hearty thank you to the board of Lingua.


The tireless Peter Suber has a list of similar “Declarations of Independence” by journals.

October 23, 2015

Does the networking of meaning destroy meaning?

Donatella Della Ratta, at Copenhagen University and a Berkman fellow, has posted a remarkable essay and linked to another.

The link is to a Wired article by Andy Greenberg about the New Palmyra Project, an effort to reconstruct the ancient monuments ISIS is destroying, and a plea for action to free the project’s creator, Bassel Khartabil, from a Syrian prison.

The second is Donatella’s article in CyberOrient that considers efforts that, like the New Palmyra Project, reconstruct sites destroyed by war, but not with that project’s historical purpose. In the article she brings to light some of the profound and disturbing ways the Net is changing how meaning works.

Her focus is on what she calls “expanded places,” physical places that have been physically destroyed, but that “have been re-animated through multiple mediated versions circulating and re-circulating on the networks.” As she says in the article’s abstract:

Thriving on the techno-human infrastructure of the networks, and relying on the endless proliferation of images resulting from the loss of control of image-makers over their own production, expanded places are aggregators of new communities that add novel layers of signification to the empirical world, and create their own multiple realities and histories.

Her primary example is Damascene Village, a theme park on the outskirts of Damascus where she conducted ethnographic research in 2010. The brief story of the role that theme park played in Syrian popular media, and the multiple layers of unreality that it attracted to itself, is mind-blowing: “a physical replica of the historic 1920s rebel stronghold conceived as a TV set for a reenactment drama of that very struggle; which, historically speaking, took place exactly in the location where the fictional copy had been rebuilt for the sake of media consumption.” To complete the media hall of mirrors, in the recent conflict each side shot “video accounts narrating the seizure of the theme park using themes, symbols and characters borrowed from the TV series.”

Eventually the Damascene Village was destroyed; yet, the self-shot videos, once uploaded onto YouTube, continued to fuel the spread of clashing narratives and contradictory understandings of national resistance, which turned a physical site hosting a staged representation of a conflict into a conflict zone itself, endlessly reproduced through social networking sites.

The complexity of this place as real, symbolic, organic, and manipulated is mirrored in the nature of the platform. She argues that the Internet’s “circulation, reflexivity, anonymity, and decentralized authorship” lead to a type of violence against meaning: “…the endless circulation of messages that are shared, manipulated, and repeated over and over again in a loop where any possible meaning is lost.” Citing Jodi Dean, Donatella says: “…the uncontrollable speed and spread of contributions over the networks help prevent the formation of any sort of signification,” generating not “a plurality of visions” but “…a feeling of ‘constituent anxiety.’” This process is, she says, “inherent to the networks.”

A novel space has been created by the entanglement of warfare and technology, where lines are blurred between the physical, lived experiences of war and their media representations, which have gained a new existence by virtue of the endless circulation of the layering of times, spaces, and people enabled by the networks.

This new environment, defined around what I call “expanded places,” re-establishes the relationship between violence and visibility, and broadens the very idea of conflict. Here, mediated and symbolic languages are employed to perform and legitimize the violence perpetrated in physical spaces. At the same time, the large scale production and reproduction of this very violence through networked forms and formats serves to actualize and rationalize it, its viral circulation being endlessly nurtured and boosted by the techno-human structure of the networks.

But is Damascene Village too good an example? It came onto the Net with so many layers of contested meta-meta-meaning that perhaps its online life is atypical. Donatella confronts this question, arguing that the Net not only continues the alienation of images of violence from their actuality and from ethical responses, as noted by Susan Sontag in the 1970s, but adds a participatory level to this: the images of violence are hyperlinked and recirculated by the viewers themselves. This borderless remixing and recirculation “have all contributed to the expansion of the place formerly known as the Damascene Village.”

But what to make of this expansion? Here again I worry that Donatella’s example is too good:

As shown by the story of the Damascene Village, the same symbolic and visual reference (Bab al hara) can be employed simultaneously by opposing factions (the Syrian army and the armed rebels) to produce contrasting narratives of resistance, and clashing ideas of nationhood. It can both serve to evoke a seemingly inclusive multiculturalism promoted under al Asad’s leadership; and, at the same time, to remind us that an entire nation is being besieged, not by occupying foreign forces but by the Syrian regime.

She takes this as a type of fictionality, as described by Jacques Rancière: a rearrangement of something real into new political and aesthetic formats without regard to the truth of that something, blurring “the logic of facts and the logic of fiction” in multiple layers of meaning. She invokes Baudrillard, saying that “The story of the Damascene Village proves that it does not really matter” whether the various factions’ fantasies correspond to historical truth. Rather:

what it is important to reflect upon is that this very fantasy has been used to generate and reproduce violence from opposite armed factions, both of which have employed mediated and networked languages to claim legitimacy over their own idea of homeland and national resistance.

But hasn’t that statement been true of every intra-cultural conflict? The truth of historians has never much mattered to factions trying to rouse support for their side. Donatella uses Rancière’s thought to find the difference between how this worked before and after the Net. I have not read him (I know, I know) but am not fully convinced by the ideas she cites. In the modern era, “technology is not understood as a mere technique of reproduction and transmission.” Yes, but that’s hardly new to the Internet. Not only has it been well understood at least since the 1960s, but one could argue that the Net is in important ways moving us back to a simpler relation between image and reality through the posting of cellphone videos of police attacks, the proliferation of video surveillance, and the new insistence that the police wear video cameras. Also: Russian dash cams.

She cites Rancière further to make the case that the anonymity of Net postings and the ability to record just about everything “has given rise a new understanding of history as a continuous process of assigning meanings to material realities, of connecting signs and symbols in unprecedented ways. In this sense we can define history as a ‘new form of fiction’…”

I have a complex reaction to this. (This is one of the reasons I so like Donatella’s writing.)

1. Yes, this is exactly what’s happening.

2. It is what happens when we all have access to the materials of history, and the decisions about what counts as history are not made by handfuls of people who control the media, which includes highly qualified historians, the editorial staffs of (sometimes scurrilous) newspapers, and self-interested political leaders.

3. If we substitute “current events” for “history,” the situation seems somewhat less novel. The word “history” carries with it a weight that “current events” does not. (a) We do not yet know what history (as practiced by that discipline) will say about current events. It may become far more settled than the fracturing of interpretations of current events now suggests, which depends to a large degree on how education and authority evolve over the years. (b) History of course always is fractured along the lines that divide people; one side in the United States Civil War still sometimes insists slavery was not the issue the war was fought over.

I am not disagreeing with the dangerousness of the fragmenting of interpretations engendered by the Net. I find illuminating and helpful Donatella’s brilliant exposition of the way in which these are not shards so much as multiply reflecting mirrors in which meanings cannot be separated from the act of meaning, and that act of meaning is a performance that gets reflected, reappropriated, and reenacted without end and without the ability to see its source either in the actual world or in its initial expression — “the rise of the anonymous subject and decentralized authorship nurtured by virtue of the circularity and reflexivity of the networks.” Rancière says this creates “‘uncertain communities’” politically questioning “‘the distribution of roles, territories, and languages’.” That’s an important point, although these images also sometimes create powerful political communities, as was the case with images from Ferguson.

Donatella is admirably focused on what this means when the stakes are high:

…in expanded places that have been destroyed by violence and warfare, then have been re-born through a networked after-life, this process goes much further. Here, challenging the distribution of the sensible [Rancière’s term] is not only a matter of contentious politics, but of generating and regenerating violence and destruction through the endless circulation of formats of violence boosted by the inner techno-human structure of the networks.

Her presentation of the ways in which the Net leads not just to a fracturing of meaning but to an impossibly self-reflective entanglement of meaning is brilliant. Her drawing our attention to the direness of this when it comes to the most dire of human situations is crucial. Her concept of “expanded places” seems to me to be worth holding on to and exploring. In fact, it’s powerful enough that I don’t think it should be confined to places that have been destroyed, much less destroyed by war. It applies more broadly than that. Her discussion of places destroyed by violence seems to me to point to a case where the stakes are higher, but where the game is essentially the same.



I recognize I have not resolved the question posed in my title. You can thank Donatella for that :)


October 7, 2015

[liveblog] The future of libraries

I’m at a Hubweek event called “Libraries: The Next Generation.” It’s a panel hosted by the Berkman Center with Dan Cohen, the executive director of the DPLA; Andromeda Yelton, a developer who has done work with libraries; and Jeffrey Schnapp of metaLab.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Sue Kriegsman of the Center introduces the session by explaining Berkman’s interest in libraries. “We have libraries lurking in every corner…which is fabulous.” Also, Berkman incubated the DPLA. And it has other projects underway.

Dan Cohen speaks first. He says if he were to give a State of the Union Address about libraries, he’d say: “They are as beloved as ever and stand at the center of communities” here and around the world. He cites a recent Pew survey about perspectives on libraries: libraries have the highest approval rating of all American institutions. But, he warns, that’s fragile. There are many pressures, and libraries are chronically under-funded, which is hard to understand given how beloved they are.

First among the pressures on libraries: the move from print. E-book adoption hasn’t stalled, although the purchase of e-books from the Big Five publishers compared to print has slowed. But Overdrive is lending lots of ebooks. Amazon has 65% of the ebook market, “a scary number,” Dan says. In the Pew survey a couple of weeks ago, 35% said that libraries ought to spend more on ebooks even at the expense of physical books. But 20% thought the opposite. That makes it hard to be the director of a public library.

If you look at the ebook market, there’s more reading going on at places like the DPLA. (He mentions the StackLife browser they use, that came out of the Harvard Library Innovation Lab that I used to co-direct.) Many of the ebooks are being provided straight to a platform (mainly Amazon) by the authors.

There are lots of jobs public libraries do that are unrelated to books. E.g., the Boston Public Library is heavily used by the homeless population.

The way forward? Dan stresses working together, collaboration. “DPLA is as much a social, collaborative project as it is a technical project.” It is run by a community that has gotten together to run a common platform.

And digital is important. We don’t want to leave it to Jeff Bezos who “wants to drop anything on you that you want, by drone, in an hour.”

Andromeda: She says she’s going to talk about “libraries beyond Thunderdome,” echoing a phrase from Sue Kriegsman’s opening comments. “My real concern is with the skills of the people surrounding our crashed Boeing.” Libraries need better skills to evaluate and build the software they need. She gives some examples of places where we see a tension between library values and code.

1. The tension between access and privacy. Physical books leave no traces. With ebooks the reading is generally tracked. Overdrive did a deal so that library patrons who access ebooks get notices from Amazon when their loan period is almost up. Adobe does rights management, with reports coming page by page about what people are reading. “Unencrypted over the Internet,” she adds. “You need a fair bit of technical knowledge to see that this is happening,” she says. “It doesn’t have to be this way.” “It’s the DRM and the technology that have these privacy issues built in.”

She points to the NYPL Library Simplified program that makes it far easier for non-techie users. It includes access to Project Gutenberg. Libraries have an incentive to build open architectures that support privacy. But they need the funding and the technical resources.

She cites the Library Freedom Project that teaches librarians about anti-surveillance technologies. They let library users browse the Internet through TOR, preventing (or at least greatly inhibiting) tracking. They set up the first library TOR node in New Hampshire. Homeland Security quickly suggested that they stop. But there was picketing against this, and the library turned it back on. “That makes me happy.”

2. Metadata. She has us do an image search for “beautiful woman” at Google. They’re basically all white. Metadata is sometimes political. She goes through the 200s of the Dewey Decimal system: 90% Christian. “This isn’t representative of human knowledge. It’s representative of what Melvil Dewey thought maps to human knowledge.” Libraries make certain viewpoints more computationally accessible than others. Our ability to write new apps is only as good as the metadata under them. “As we go on to a more computational library world — which is awesome — we’re going to fossilize all these old prejudices. That’s my fear.”

“My hope is that we’ll have the support, conviction and empathy to write software, and to demand software, that makes our libraries better, and more fair.”

Jeffrey: He says his peculiar interest is in how we use space to build libraries as architectures of knowledge. “Libraries are one of our most ancient institutions.” “Libraries have constantly undergone change,” from mausoleums, to cloisters, to warehouses, places of curatorial practice, and civic spaces. “The legacy of that history…has traces of all of those historical identities.” We’ve always faced the question “What is a library?” What are its services? How does it serve its customers? Architects and designers have responded to this, assuming a set of social needs, opportunities, fantasies, and the practices by which knowledge is created, refined, shared. “These are all abiding questions.”

Contemporary architects and designers are often excited by library projects because they crystallize one of the most central questions of the day: “How do you weave together information and space?” We’re often not very good at that. The default for libraries has been: build a black box.

We have tended to associate libraries with collections. “If you ask what is a library?, the first answer you get is: a collection.” But libraries have also always been about the making of connections, i.e., how the collections are brought alive. E.g., the Alexandrian Library was a performance space. “What does this connection space look like today?” In his book with Matthew Battles, they argue that while we’ve thought of libraries as being a single institution, in fact today there are now many different types of libraries. E.g., the research library as an information space seems to be collapsing; the researchers don’t need reading rooms, etc. But civic libraries are expanding their physical practices.

We need to be talking about many different types of libraries, each with their own services and needs. The Library as an institution is on the wane. We need to proliferate and multiply the libraries to serve their communities and to take advantage of the new tools and services. “We need spaces for learning,” but the stack is just one model.


Dan: Mike O’Malley says that our image of reading is in a salon with a glass of port, but in grad school we’re taught to read a book the way a sous chef guts a fish. A study says that of academic ebooks, 75% of scholars read less than 50 pages of them. [I may have gotten that slightly wrong. Sorry.] Assuming a proliferation of forms, what can we do to address them?

Jeffrey: The presuppositions about how we package knowledge are all up for grabs now. There’s a vast proliferation of channels. “And that’s a design opportunity.” How can we create audiences that would never have been part of the traditional distribution models? “I’m really excited about getting scholars and creative practitioners involved in short-form knowledge and the spectrum of ways you can intersect” the different ways we use these different forms. “That includes print.” There’s “an extraordinary explosion of innovation around print.”

Andromeda: “Reading is a shorthand. Library is really about transforming people and one another by providing access to information.” Reading is not the only way of doing this. E.g., in maker spaces people learn by using their hands. “How can you support reading as a mode of knowledge construction?” Ten years ago she toured Olin College library, which was just starting. The library had chairs and whiteboards on castors. “This is how engineers think”: they want to be able to configure a space on the fly, and have toys for fidgeting. E.g., her eight year old has to be standing and moving if she’s asked a hard question. “We need to think of reading as something broader than dealing with a text in front of you.”

Jeffrey: The DPLA has a location in the name: America. The French National Library wants to collect “the French Internet.” But what does that mean? The Net seems to be beyond locality. What role does place play?

Dan: From the beginning we’ve partnered with Europeana. We reused Europeana’s metadata standard, enabling us to share items. E.g., Europeana’s 100th anniversary of the Great War web site was able to seamlessly pull in content from the DPLA via our API, and from other countries. “The DPLA has materials in over 400 languages,” and actively partners with other international libraries.

Dan points to Amy Ryan (the DPLA chairperson, who is in the audience) and points to the construction of glass walls to see into the Boston Public Library. This increases “permeability.” When she was head of the BPL, she lowered the stacks on the second floor so now you can see across the entire floor. Permeability “is a very smart architecture” for both physical and digital spaces.

Jeff: Rendering visible a lot of the invisible stuff that libraries do is “super-rich,” assuming the privacy concerns are addressed.

Andromeda: Is there scope in the DPLA metadata for users to address the inevitable imbalances in the metadata?

Dan: We collect data from 1,600 different sources. We normalize the data, which is essential if you want to enable it for collaboration. Our Metadata Application Profile v. 4 adds a field for annotation. Because we’re only a dozen people, we haven’t created a crowd-sourcing tool, but all our data is CC0 (public domain) so anyone who wants to can create a tool for metadata enhancement. If people do enhance it, though, we’ll have to figure out if we import that data into the DPLA.

Jeffrey: The politics of metadata and taxonomy has a long history. The Enlightenment fantasy is for a universal metadata schema. What does the future look like on this issue?

Andromeda: You can have extremely crowdsourced metadata, but then you’re subject to astroturfing and popularity boosting results for bad reasons. There isn’t a great solution except insofar as you provide frameworks for data that enable many points of view and actively solicit people to express themselves. But I don’t have a solution.

Dan: E.g., at DPLA there are lots of ways of entering dates. We don’t want to force a scheme down anyone’s throat. But the tension between crowdsourced and more professional curation is real. The Indianapolis Museum of Art allowed freeform tagging and compared the crowdsourced tags vs. professional. Crowdsourced: “sea” and “orange” were big, which curators generally don’t use.


Q: People structure knowledge differently. My son has ADHD. Or Nepal, where I visited recently.

A: Dan: It’s great that the digital can be reformatted for devices but also for other cultural views. “That’s one of the miraculous things about the digital.” E.g., digital book shelves like StackLife can reorder themselves depending on the query.

Jeff: Yes, these differences can be profound. “Designing for that is a challenge but really exciting.”

Andromeda: This is why it’s so important to talk with lots of people and to enable them to collaborate.

me: Linked data seems to resolve some of these problems with metadata.

Dan: Linked Data provides a common reference for entities. Allows harmonizing data. The DPLA has a slot for such IDs (which are URIs). We’re getting there, but it’s not our immediate priority. [Blogger’s prerogative: Having many references for an item linked via “sameAs” relationships can help get past the prejudice that can manifest itself when there’s a single canonical reference link. But mainly I mean that because Linked Data doesn’t have a single record for each item, new relationships can be added relatively easily.]
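
[A rough sketch of the “sameAs” point, added after the fact. It assumes sameAs links can be treated as undirected and transitive, and it groups identifiers into clusters without picking any one of them as the canonical record. The URIs are made up for illustration; nothing here is the DPLA’s actual data or API.]

```python
from collections import defaultdict


def same_as_clusters(links):
    """Group identifiers connected by sameAs links (undirected, transitive)."""
    parent = {}

    def find(x):
        # Register unseen identifiers as their own cluster root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for uri in list(parent):
        clusters[find(uri)].add(uri)
    return list(clusters.values())


# Hypothetical URIs, for illustration only.
links = [
    ("http://example.org/library-a/twain", "http://example.org/library-b/clemens"),
    ("http://example.org/library-b/clemens", "http://example.org/museum-c/mark-twain"),
    ("http://example.org/library-a/dickinson", "http://example.org/museum-c/emily-dickinson"),
]
clusters = same_as_clusters(links)
for cluster in clusters:
    print(sorted(cluster))
```

[Adding a new source means adding one more link, not rewriting a master record.]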

Q: How do business and industry influence libraries? E.g., Google has images for every place in the world. They have scanned books. “I can see a triangulation happening.” Virtual libraries? Virtual spaces?

Andromeda: (1) Virtual tech is written outside of libraries, almost entirely. So it depends on what libraries are able to demand and influence. (2) Commercial tech sets expectations for what user experiences should be like, which libraries may not be able to support. (3) People say “Why do we need libraries? It’s all online and I can pay for it.” No, it’s not, and no, not everyone can. Libraries should up their tech game, but there’s an existential threat.

Jeffrey: People use other spaces to connect to knowledge, e.g. coffee houses, which are now being incorporated into libraries. Some people are anxious about that loss of boundary. Being able to eat, drink, and talk is a strong “vision statement” but for some it breaks down the world of contemplative knowledge they want from a library.

Q: The National Science and Technology Library in China last week said they have the right to preserve all electronic resources. How can we do that?

Dan: Libraries have long been sites for preservation. In the 21st century we’re so focused on getting access now now now, we lose sight that we may be buying into commercial systems that may not be able to preserve this. This is the main problem with DRM. Libraries are in the forever business, but we don’t know where Amazon will be. We don’t know if we’ll be able to read today’s books on tomorrow’s devices. E.g., I had a subscription to Oyster ebook service, but they just went out of business. There go all my books. Open Access advocacy is going to play a critical role. Sure, Google is a $300B business and they’ll stick around, but they drop services. They don’t have a commitment like libraries and nonprofits and universities do to being in the forever business.

Jeff: It’s a huge question. It’s really important to remember that the oldest digital documents we have are 50 yrs old which isn’t even a drop in the bucket. There’s far from universal agreement about the preservation formats. Old web sites, old projects, chunks of knowledge, of mine have disappeared. What does it mean to preserve a virtual world? We need open standards, and practices [missed the word] “Digital stuff is inherently fragile.”

Andromeda: There are some good things going on in this space. The Rapid Response Social Media project is archiving (e.g., #Ferguson). Preserving software is hard: you need the software system, the hardware, etc.

Q: Disintermediation has stripped out too much value. What are your thoughts on the future of curation?

Jeffrey: There’s a high level of anxiety in the librarian community about their future roles. But I think their role comes away as reinforced. It requires new skills, though.

Andromeda: In one pottery class the assignment was to make one pot. In another, it was to make 50 pots. The best pots came out of the latter. When lots of people can author lots of stuff, it’s great. That makes curation all the more critical.

Dan: the DPLA has a Curation Core: librarians helping us organize our ebook collection for kids, which we’re about to launch with President Obama. Also: Given the growth in authorship, yes, a lot of it is Sexy Vampires, but even with that aside, we’ll need librarians to sort through that.

Q: How will Digital Rights Management and copyright issues affect ebooks and libraries? How do you negotiate that or reform that?

Dan: It’s hard to accession a lot of things now. For many ebooks there’s no way to extract them from their DRM and they won’t move into the public domain for well over 100 years. To preserve things like that you have to break the law — some scholars have asked the Library of Congress for exemptions to the DMCA to archive films before they decay.

Q: Lightning round: How do you get people and the culture engaged with public libraries?

Andromeda: Ask yourself: Who’s not here?

Jeffrey: Politicians.

Dan: Evangelism

September 29, 2015

BREAKING NEWS: The New Republic runs an article that does not bash the Internet!

Stop the presses!

The good news is that the New Republic seems to be making an effort to include articles about race that are not by white liberals (not that I have anything general against white liberals, since I am one). The even better news is that that article credits the Internet with enabling a flowering of African American intellectual thought, rather than the magazine once again (and again and again and again) thinking it’s being oh-so-daring by criticizing the Net as the source of all that is dumb and crass.

In “Think Out Loud,” Michael Eric Dyson argues:

Along with [Ta-Nehisi] Coates, a cohort of what I would like to call the “black digital intelligentsia” has emerged. They wrestle with ideas, stake out political territory, and lead, very much in the same way that my generation did, only without needing, or necessarily wanting, a home in the Ivy League—and by making their name online.

He describes how the Net enables these voices to be heard, and how it helps them to form and pursue their ideas through community and social engagement. (It’s a great example of what some of us would describe as the networking of knowledge.)

And, in a generous way that embodies the best of the Net, Dyson in this article is using his position as a well-established voice to give a boost to the upcoming cohort—one that notably includes many women.

Nicely done all around.

August 10, 2015

[2b2k] Sharing the credit when knowledge gets big

The Wall Street Journal has run an article by Robert Lee Hotz that gently ridicules scientists for including thousands of people as co-authors of some scientific publications. Sure, a list of 2,000 co-authors is risible. But the article misses some of the reasons why it’s not.

As Robert Lee points out, “experiments have gotten more complicated.” But not just by a little. How many people did it take to find the Higgs Boson particle? In fact, as Michael Nielsen (author of the excellent Reinventing Discovery) says, how many people does it take to know that it’s been found? That knowledge depends on deep knowledge in multiple fields, spread across many institutions and countries.

In 2012 I liveblogged a fantastic talk by Peter Galison on this topic. He pointed to an additional reason: it used to be that engineers were looked upon as mere technicians, an attitude mirrored in The Big Bang (the comedy show, not the creation of the universe—so easy to get those two confused!). Over time, the role of engineers has been increasingly appreciated. They are now often listed as co-authors.

In an age in which knowledge quite visibly is too big to be known by individuals, sharing credit widely more accurately reflects its structure.

In fact, it becomes an interesting challenge to figure out how to structure metadata about co-authors so that it captures more than name and institution and does so in ways that make it interoperable. This is something that my friend Amy Brand has been working on. Amy, recently named head of the MIT University Press, is going to be a Berkman Fellow this year, so I hope this topic will be a subject of discussion at the Center.

August 2, 2015

[2b2k][liveblog] Wayne Wiegand: Libraries beyond information

Wayne Wiegand is giving the lunchtime talk at the Library History Seminar XIII at Simmons College. He’s talking about his new book Part of Our Lives: A People’s History of the American Public Library.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He introduces himself as a humanist, which brings with it a curiosity about what it means to be a human in the world. He is flawed, born into a flawed culture. He exercises his curiosity in the field of library history. [He’s also the author of the best biography of Melvil Dewey.]

People love libraries, he says, citing the Pew Internet 2013 survey that showed that almost all institutions except libraries and first responders have fallen in public esteem. His new book traces the history of the public library by listening to people who have used them since the middle of the 19th century, a bottom-up perspective. He did much of his research by searching newspaper archives, finding letters to the editors as well as articles. People love their libraries for (1) the info they make accessible, (2) the public space, and (3) the stories they circulate that make sense of their world.

Thomas Edison spent as much time as possible in the library. The Wright Brothers came upon an ornithology book that kindled their interest in flight. Harry S. Truman cited the library as influential. Lily Tomlin, too. Bill Clinton, too, especially loving books about Native Americans. Barack Obama, too. “The first place I wanted to be was a library,” he said when he returned from overseas. He was especially interested in Kenya, the home of his father.

For most of its history, library info science discourse has focused on what was “useful knowledge” in the 19th century, “best books” in the 20th century, or what we now call “information.” Because people don’t have to use libraries (unlike, say, courts) users have greatly influenced the shape of libraries.

“To demonstrate library as place, let me introduce you to Ricky,” he says as he starts a video. She is an adult student who does her homework in the library. When she was broke, it was a warm place where she could apply for jobs. She has difficulty working through her emotions to express how much the library means to her.

Wayne reads a librarian’s account of the very young MLK’s regular attendance at his public library. James Levine learned to play piano there. In 1969 the Gary, Indiana, library held a talent contest; the Jackson brothers didn’t win, but Michael became a local favorite. [Who won???] In another library, a homeless man, Mr. Conrad, came in and set up a chess board. People listened and learned from him.

“To categorize these activities as information gathering fails to appreciate the richness” of the meaning of the library for these places.

Wayne plays another video. Maria is 95 years old. She started using the library when she was 12 or 13 after her family had immigrated from Russia. “That library was everything to me.” Her family could not afford to buy books “and there were so many other services, it was library library library all the time.” “I have seen many ugly things. You can’t live all the time with the bad.” The library was something beautiful.

Pete Seeger remembered all his life stories he read in the library.

The young Ronald Reagan read a popular Christian novel, declared himself saved, and had himself baptized. He went to his public library twice a week, mainly reading adventure stories.

Oprah Winfrey’s library taught her that there was a better world and that she could be a part of it.

Sonia Sotomayor buried herself in reading in the public library after her father died when she was nine. Nancy Drew was formative: paying attention, finding clues, reaching logical conclusions.

Wayne plays a video of Danny, a young man who learned about music from CDs in the library, and found a movie that “dropped an emotional anchor down so I didn’t feel like I was floundering” in his sexuality.

Public libraries have always played a role in making stories accessible to everyone. Communities insist that libraries stock a set of stories that the community responds to. Stories stimulate imagination, construct community through shared reading, and make manifest moral weightings.

In his book, Wayne gives story, people, and place equal weight. “Stories and libraries as place has been as important, and for many people, more important than information.” We need to look at how these activities produce human subjectivity as community-based. We lack a research base to comprehend the many ways libraries are used.

The death of libraries has been pronounced too early. In 2012, the US had more libraries than ever. Attendance in 2012 dipped because the hours libraries are open went down that year, but for the decade it was up 28%. [May have gotten the number wrong a bit.] In 2012, libraries circulated 2.2B items, up 28% from 2003. And more. [Too fast to capture.] The prophets of doom have too narrow a view of what libraries do and are. “We have to expand the boundaries of our professional discourse beyond information.”

Libraries fighting against budget cuts too often replicate the stereotypes. “Public libraries no longer are warehouses of books” gives credence to the falsehood that libraries ever were that.

He ends by introducing Dawn Logsdon, who is working on a film for 2017 titled Free for All: Inside the Public Library. (She’s been taping people at the conference and assures the audience that whatever doesn’t make it into the film will be available online.) She shows a few minutes of a prior documentary of hers: Faubourg Treme.


August 1, 2015

Restoring the Network of Bloggers

It’s good to have Hoder — Hossein Derakhshan— back. After spending six years in an Iranian jail, his voice is stronger than ever. The changes he sees in the Web he loves are distressingly real.

Hoder was in the cohort of early bloggers who believed that blogs were how people were going to find their voices and themselves on the Web. (I tried to capture some of that feeling in a post a year and a half ago.) Instead, in his great piece in Medium he describes what the Web looks like to someone extremely off-line for six years: endless streams of commercial content.

Some of the decline of blogging was inevitable. This was made apparent by Clay Shirky’s seminal post that showed that the scaling of blogs was causing them to follow a power law distribution: a small head followed by a very long tail.

Blogs could never do what I, and others, hoped they would. When the Web started to become a thing, it was generally assumed that everyone would have a home page that would be their virtual presence on the Internet. But home pages were hard to create back then: you had to know HTML, you had to find a host, you had to be so comfortable with FTP that you’d use it as a verb. Blogs, on the other hand, were incredibly easy. You went to one of the blogging platforms, got yourself a free blog site, and typed into a box. In fact, blogging was so easy that you were expected to do it every day.

And there’s the rub. The early blogging enthusiasts were people who had the time, skill, and desire to write every day. For most people, that hurdle is higher than learning how to FTP. So, blogging did not become everyone’s virtual presence on the Web. Facebook did. Facebook isn’t for writers. Facebook is for people who have friends. That was a better idea.

But bloggers still exist. Some of the early cohort have stopped, or blog infrequently, or have moved to other platforms. Many blogs now exist as part of broader sites. The term itself is frequently applied to professionals writing what we used to call “columns,” which is a shame since part of the importance of blogging was that it was a way for amateurs to have a voice.

That last value is worth preserving. It’d be good to boost the presence of local, individual, independent bloggers.

So, support your local independent blogger! Read what she writes! Link to it! Blog in response to it!

But, I wonder if a little social tech might also help. What follows is a half-baked idea. I think of it as BOAB: Blogger of a Blogger.

Yeah, it’s a dumb name, and I’m not seriously proposing it. It’s an homage to Libby Miller [twitter:LibbyMiller] and Dan Brickley’s [twitter:danbri] FOAF (Friend of a Friend) idea, which was both brilliant and well-named. While social networking sites like Facebook maintain a centralized, closed network of people, FOAF enables open, decentralized social networks to emerge. Anyone who wants to participate creates a FOAF file and hosts it on her site. Your FOAF file lists who you consider to be in your social network: your friends, family, colleagues, acquaintances, etc. It can also contain other information, such as your interests. Because FOAF files are typically open, they can be read by any application that wants to provide social networking services. For example, an app could see that Libby’s FOAF file lists Dan as a friend, and that Dan’s lists Libby, Carla and Pete. And now we’re off and running in building a social network in which each person owns her own information in a literal and straightforward sense. (I know I haven’t done justice to FOAF, but I hope I haven’t been inaccurate in describing it.)
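
Here is a minimal sketch of what such an app might do, using the Libby and Dan example. Real FOAF files are RDF documents fetched over the Web; the plain Python dicts, the `crawl` and `mutual` helpers, and the lookup-by-name “fetch” are stand-ins I’m assuming just to show the shape of the idea. Carla and Pete have no file here, which is a normal state of affairs in a decentralized network.

```python
from collections import deque

# Each entry stands in for a FOAF file hosted on its owner's own site.
# Real FOAF is RDF; dicts are used only to keep the sketch small.
foaf_files = {
    "libby": {"knows": ["dan"]},
    "dan": {"knows": ["libby", "carla", "pete"]},
}


def crawl(start, fetch):
    """Follow 'knows' links breadth-first from one person; return the edges found."""
    seen = {start}
    queue = deque([start])
    edges = []
    while queue:
        person = queue.popleft()
        doc = fetch(person)
        if doc is None:
            continue  # this person has not published a file
        for friend in doc["knows"]:
            edges.append((person, friend))
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return edges


def mutual(edges):
    """Return the pairs in which each person lists the other."""
    edge_set = set(edges)
    return sorted({tuple(sorted(e)) for e in edge_set if (e[1], e[0]) in edge_set})


edges = crawl("libby", foaf_files.get)
print(edges)
print(mutual(edges))
```

No central site holds the network. It exists only as the sum of what each person has chosen to publish.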

BOAB would do the same, except it would declare which bloggers I read and recommend, just as the old “blogrolls” did. This would make it easier for blogging aggregators to gather and present networks of bloggers. Add in some tags and now we can browse networks based on topics.
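
A sketch of the aggregator side, under loud assumptions: the `boab` structure, the blog names, and the tags are all invented for illustration, since BOAB is not a real format. Each blogger declares which blogs she recommends and optionally tags each recommendation. The aggregator filters the network down to one topic.

```python
from collections import defaultdict

# Hypothetical BOAB declarations: blog -> list of (recommended blog, tags).
boab = {
    "alice.example": [
        ("bob.example", {"libraries", "metadata"}),
        ("carol.example", {"politics"}),
    ],
    "bob.example": [
        ("carol.example", {"politics", "libraries"}),
    ],
}


def network_for_topic(declarations, topic):
    """Return recommender -> recommended blogs, restricted to one tag."""
    graph = defaultdict(list)
    for blogger, recommendations in declarations.items():
        for target, tags in recommendations:
            if topic in tags:
                graph[blogger].append(target)
    return dict(graph)


libraries_network = network_for_topic(boab, "libraries")
print(libraries_network)
```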

In the modern age, we’d probably want to embed BOAB information in the HTML of a blog rather than in a separate file hidden from human view, although I don’t know what the best practice would be. Maybe both. Anyway, I presume that the information embedded in HTML would be similar to what Schema.org does: information about what a page talks about is inserted into the HTML tags using a specified vocabulary. The great advantage of Schema.org is that the major search engines recognize and understand its markup, which means the search engines would be in a position to discover the initial blog networks.

In fact, Schema.org has a blog specification already. I don’t see anything like markup for a blogroll, but I’m not very good at reading specifications. In any case, how hard could it be to extend that specification? Mark a link as being to a blogroll pal, and optionally supply some topics? (Dan Brickley works on Schema.org.)

So, imagine a BOAB widget that any blogger can easily populate with links to her favorite blog sites. The widget can then be easily inserted into her blog. Hidden from the users in this widget is the appropriate markup. Not only could the search engines then see the blogger network, so could anyone who wanted to write an app or a service.
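
To make that a bit more concrete, here is a sketch of both halves: a function that renders the widget’s HTML, and a reader that any app could use to pull the links back out. The `boab` class name and the `data-boab-topics` attribute are made up for this example and are not part of any existing vocabulary or specification. A real version would use whatever markup the search engines agreed to recognize. Topics are assumed to be single words so they can be space-separated.

```python
from html import escape
from html.parser import HTMLParser


def render_blogroll(links):
    """Render (url, title, topics) triples as an HTML list with machine-readable attributes."""
    items = []
    for url, title, topics in links:
        items.append(
            '<li><a class="boab" href="{}" data-boab-topics="{}">{}</a></li>'.format(
                escape(url, quote=True),
                escape(" ".join(topics), quote=True),
                escape(title),
            )
        )
    return '<ul class="boab-blogroll">\n' + "\n".join(items) + "\n</ul>"


class BlogrollReader(HTMLParser):
    """Collect (href, topics) for every link marked with the hypothetical 'boab' class."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("class") == "boab":
            topics = (attributes.get("data-boab-topics") or "").split()
            self.found.append((attributes.get("href"), topics))


html_snippet = render_blogroll(
    [("https://blog.example/", "An Example Blog", ["libraries", "web"])]
)
reader = BlogrollReader()
reader.feed(html_snippet)
print(html_snippet)
print(reader.found)
```

The human reader sees an ordinary list of links. The extra attributes ride along unseen for anything that wants to crawl the network.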

I have 0.02 confidence that I’m getting the tech right here. But enhancing blogrolls so that they are programmatically accessible seems to me to be a good idea. So good that I have 0.98 confidence that it’s already been done, probably 10+ years ago, and probably by Dave Winer :)

Ironically, I cannot find Hoder’s personal site; is down, at least at the moment.

More shamefully than ironically, I haven’t updated this blog’s blogroll in many years.

My recent piece in The Atlantic about whether the Web has been irremediably paved touches on some of the same issues as Hoder’s piece.


