Joho the Blog » [preserve] Anil Dash on archiving the Internet

Anil Dash (one of my heroes, and also hilarious) is talking at a Library of Congress event on Digital Preservation, part of the National Digital Information Infrastructure and Preservation Program. Anil’s talk is called “Make a Copy.” (Anil is now at ThinkUp.)

Live Blogging

Getting things wrong. Making fluid talks sound choppy. Missing important points. Not running a spellpchecker. This is not a reliable report. You have been warned, people!

Anil says he’s a geek interested in the social impacts of tech on culture, govt, and more. He started Expert Labs a few years ago to enable tech to talk with policy makers. Expert Labs built ThinkUp. He wants to talk about the issues that this group of archivists confronts every day that the tech community doesn’t know about. He warns us that this means he’s starting with depressing stuff. So…

…Picture the wholesale destruction of your wedding photos, or other deeply personal mementos. They are being destroyed by an exclusive, private, ivy league club: Facebook. FB treats memories as disposable. “Maybe if I were a 25 year old billionaire, I’d think of these as disposable, too.” “The terms of service of digital social networks trump the Constitution in terms of what people can share and consume.” Our ordinary conversations are treated as disposable, at Facebook, Twitter, Microsoft, etc. They explicitly say that they can delete all of your content at any time for any reason. “100s of millions of Americans have accepted that. That should be troubling to those of us who care about preservation.”

You can opt out, but not without compromising your career and paying a severe social cost. And you can’t rely upon the rest of the Web, because “there’s a war raging against the open Web.” “The majority of time spent on the Web in the US is spent in an application,” not on pages. Yet we’re still archiving Web pages but not those applications. “They are gaslighting the Web,” Anil says, referring to the old movie. E.g., you can leave FB comments on Anil’s blog, but when you click from FB to his blog, FB gives you a warning that the site you’re going to is untrustworthy. “I don’t do that to them,” he says, even though they’ve consistently “moved the goal posts” on privacy, and he has registered his site with FB.

After blogging this, Anil got a message from a tech at FB saying that it was a bug that’s being fixed. But suppose he hadn’t blogged it, or FB had missed it? “The best case scenario is that we’re left fixing their bugs.” He adds, “That’s pretty awful, because they’re not fixing our bugs. And we’re helping them to extend their prisons over the Web.” And is the only way to get our words preserved to agree to Twitter’s ToS so that we’ll get archived by the Library of Congress, which has been archiving tweets? Anil says that he’s conscientiously tried to archive his own works for his new baby, but preservation shouldn’t rely on that much effort by an individual.

And, he says, that’s just the Web, not the apps. You can’t crawl his phone and preserve his photos. And when FB buys Instagram which has a billion photos, and only 5% of the content FB has bought has been preserved…? And yet the Instagram acquisition is considered a success by the Valley. If you’re a Pharaoh, your words are preserved. Anil is worried about the rest of the conversations.

“If I were to ask you what is the most watched form of video, what would you say?” Anil guesses that it’s animated gifs. And we don’t archive them. “We’re talking about the wrong things.” We’re arguing that we should be using Ogg Vorbis, but the proprietary forms are the ones that are most used. The standards ecology is getting more complicated. “We need to reflect back to the tech community that they have an obligation to think about preservation.” They’ve got money and resources. Shouldn’t they be contributing?

We’re losing metadata, he says. You can’t find Instagram photos because they have no Web presence and are short on metadata. Flickr, by contrast, has lots of metadata. The Instagram owners are now multi-millionaires and are undermotivated to fix this problem. Maybe we’ll get something in 5 years, but then we will have lost a full decade of people’s photos. There’s no way to assign Instagrams open licenses at this point.

Indeed, “they are bending the law to make archiving illegal.” You can’t hack your own phone. You can’t copy your own photos from one device to another.

“Content tied to devices dies when those devices become obsolete.” The obsolescence cycle is becoming faster every year.

So, what should we do?

The technologists building these devices don’t know about the work of archivists. They don’t know that what this group is doing is meaningful. Many are young and don’t yet have experiences they want to preserve. They may not have confronted their own mortality yet.

But, the Web at its base level is about making copies. So, if we get things on the Web as opposed to in apps, we win. Apps should be powered by, or connected to, a Web experience. How can we take advantage of the fact that every time you go to a Web page, you’re copying it? How can we take advantage of the CDNs, which are already doing a lot of the work needed for preservation?
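
To make the "every page view is already a copy" point concrete, here is a minimal sketch of what keeping that copy could look like. Nothing in it comes from Anil's talk: the function name `archive_copy`, the on-disk layout, and the `index.jsonl` log are all assumptions made up for illustration, and it takes the page bytes as given rather than fetching them.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def archive_copy(url: str, body: bytes, root: Path) -> Path:
    """Store one already-fetched page body and log when and where it came from.

    Hypothetical helper: the layout (one file per SHA-256 digest plus an
    append-only index.jsonl) is an illustrative choice, not a standard.
    """
    digest = hashlib.sha256(body).hexdigest()
    root.mkdir(parents=True, exist_ok=True)

    target = root / f"{digest}.bin"
    if not target.exists():  # identical content is stored only once
        target.write_bytes(body)

    record = {
        "url": url,
        "sha256": digest,
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }
    # Append-only log, so every capture is remembered even when the bytes repeat.
    with (root / "index.jsonl").open("a", encoding="utf-8") as index:
        index.write(json.dumps(record) + "\n")
    return target
```

Naming files by their hash means two visits to an unchanged page cost one stored copy, while the index still records both visits. A real archive would more likely use an established container format; this is only meant to show how little it takes to keep what the browser already downloaded.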

“There is also a growing class of apps that want to do the right thing.” E.g., TimeHop, which sends you an email reminding you of what you tweeted, etc., a year ago. This puts a user experience around the work of preservation. They’re marketing the value of the preservation community, but they don’t know it yet. Or Brewster, an iPhone address book that hooks up to all the address books you have on social services, reminding you to connect with people you haven’t touched in a while. This is a preservation app, although Brewster doesn’t know.
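
The "what did you post a year ago" idea is simple enough to sketch against a personal archive. This is not how TimeHop works (the talk gives no details on that); the function name `posts_from_a_year_ago` and the archive shape, a list of `(date, text)` pairs, are assumptions for illustration.

```python
from datetime import date


def posts_from_a_year_ago(posts, today):
    """Return the text of posts dated exactly one calendar year before `today`.

    `posts` is assumed to be an iterable of (date, text) pairs, e.g. loaded
    from an export of your own tweets.
    """
    try:
        target = today.replace(year=today.year - 1)
    except ValueError:
        # `today` is Feb 29 and the previous year had no such day.
        target = date(today.year - 1, 2, 28)
    return [text for posted_on, text in posts if posted_on == target]
```

The point of the sketch is that the lookup is trivial once the archive exists and is yours to read; the hard part, as the talk argues, is getting the copy in the first place.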

Then, how do we mine our personal archives? (He notes that his company’s tool, ThinkUp, is in this space.) His Nike fuel band captures data about his physical activity. The Quantified Self movement is looking at all sorts of data. “They too are preservationists, and they don’t know it.”

Then there are institutions. People revere the Library of Congress. Senior people at Twitter speak in a hushed voice when they say, “The tweets go to the LoC.” Take advantage of the institution’s authority. Don’t be shy. Meet them halfway. And say, “By the way, look at my cool email address.”

“PR trumps ToS.” ThinkUp archived the FB activity of the White House. At the time, FB’s ToS forbade archiving it for more than 24 hours. But WH policy requires it. Anil’s reaction: “Please, FB, please cut off the White House.” It turns out that FB was already planning on revising the policy. “What a great conversation we would have gotten to have.” You are our advocates, says Anil. You have an obligation to speak on our behalf.

The public is already violating “Intellectual Property” rules. “We don’t look at YouTube as the Million Mixers March, but that’s what it is.” It’s civil disobedience: People violating the law in public under their own names. These are people who recognize the value of preserving cultural works that otherwise would disappear. Sony won’t sell you a copy of Michael Jackson’s Thriller, but there are copies on YouTube. The heart and soul of those posting those videos is preservation. “All they want to do is what you do: make a copy of what matters to them.”
