Joho the Blog » liveblog

June 29, 2014

[aif] Re-imagining public libraries

I’m at an early Sunday morning (7:45am) session on re-imagining libraries with John Palfrey of the DPLA, Brian Bannon (Commissioner of the Chicago Public Library), and Tessie Guillermo (Zero Divide). It’s moderated by Sommer Mathis (editor of CityLab.com). My seat-mate tells me that many of the people here are from the local library and its board. The audience is overwhelmingly female.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

SM: Libraries are being more used even though people can download books. Are libraries shifting away from being book collections to becoming community centers?

BB: Our missions are so much bigger than our traditional format for distributing knowledge. Over the past 144 years of Chicago Library’s history, we’ve been innovating all along. How many 144-year-old institutions are experiencing record-breaking use?

TG: I was on the Aspen Institute’s sessions on libraries that wrote a report around three pillars of libraries: People, place and platform. Platforms are emerging now. They’re gathering up networks of people that can join together to continue to add value.

JP: I agree with Brian’s historical take and Tessie’s theoretical. I’m not as sanguine, though. Libraries are more important in the digital age, but support for libraries could erode. Turning into community centers is risky for libraries. A community center is an open space that can be anything, but libraries are specific: in access to knowledge, in what immigrants need to find their way into a new country, to people seeking jobs [and more]. And all these are bound to the specifics of the community.

BB: When I think of community space, we’re Chicago’s largest provider of access to open, free technology, helping new economies, etc., but we do it through the lens of the library. It starts with the idea that everyone should have free and open access to the leading ideas of the day. We think about our communities and how we can support these aspirations in very specific ways.

TG: Libraries are central to ideas that are shared across communities. We work with Web Junction as a content pusher to libraries around the enrollment of people in the Affordable Care Act. First, people need to enroll and can go to the library for the computer access. They need insurance literacy. Once you choose your plan and see a doctor, you might find out about a health problem, and you can come back to the library to get info curated for you, and then find out where to get community services. All at the library.

JP: Libraries should be the center of communities, but not be community centers.

BB: People are reading more today and in lots of different formats, and libraries have been great conveners of those conversations. On the other side, as the world of information changes, we’ve been experimenting with learning through experience. We wanted to explore the importance of manufacturing to the city. We opened up a lab that exposed people to ideas that would have been hard to understand simply through print.

JP: I saw your awesome innovation lab. Will you have 3D printers in there perpetually or always have the latest tech?

BB: It was supposed to be a 6 month experiment that’s been extended. We do not believe that Chicago Public Library [CPL] should be the city’s hub for 3D printing. We’re now starting to do experiments in data visualization to help people understand Big Data.

TG: It’s hard to talk about the future of libraries without talking about what places in the future will be like. Zoos, museums, etc., are all changing. There will be a lot of experimentation about how residents and community members organize themselves. At yesterday’s Market Future there was a lot of joking about librarians and the sense was that you can only get recommendations through algorithms. [Ack. That was my session. See this Atlantic post, and my comment there.]

Q: Atlanta libraries are helping people complete GEDs and LA libraries are going one step further and are granting HS diplomas. What innovative programming are you hearing about?

JP: Libraries helping people complete GEDs makes total sense. I like the model where libraries are connecting to learning — connected learning like at CPL. A lot of the learning that kids do is interstitial on mobile devices, and libraries can help with that. Hybrid spaces that connect what’s going on online to the real world is a great model.

TG: The use of libraries is increasing but not always the funding. Libraries have to find new sources of revenue…

SM: … not just revenues but being able to quantify the value they bring.

JP: CPL has led in this.

BB: We worked with Mission Measurements to do that. We looked at the core mission of the library. We’re about supporting democracy but also helping to make our city competitive. So we looked at how we’re supporting the local economy.

BB: We don’t always recognize that there’s a large portion of the world, and parts of Chicago, where people have limited or no access to tech. So we are experimenting with ways to bring the Internet home. We’re launching a program that will let you check out laptops and a hotspot. But that’s less about the tech than about the support to understand what programs are out there to sustain it and to gain the skills they need.

Q: Both CPL and NYPL won the Knight News Challenge to enable them to do this.

BB: We’ll be lending them for a three-week term. NYPL is lending for months. It’s an experiment. But it’s not just about shiny objects. CPL has been acknowledged for experiments, for R&D. The buzz is important to elevating your brand.

JP: There will have to be trade-offs. Maybe libraries will have to spend less on books, on the marginal acquisitions, in order to support these hardware lending programs. That’s controversial but we have to talk about the trade-offs.

BB: Our model for sharing knowledge is changing dramatically because of the law. Our ability to lend physical books vs. digital materials …

JP: In the physical realm we have the right of first sale that lets you do whatever you want with a book, including resell it or lend it. But for digital there’s no first sale. Libraries acquire the digital under a contract that may limit the number of lends. Libraries are in a less good position with e-works.

TG: I’m not in the library world, but maybe librarians become facilitators of networked learning. People are becoming networked through their library cards, which becomes a platform for creating and curating knowledge that’s shared across the library system. If you create a platform where card holders in the virtual space are able to come together to say, e.g., that there are transportation issues in the city that need solving, the librarian can facilitate the coming together of that conversation. The library can be a link to other institutions.

BB: Librarians are moving away from being the experts in finding stuff (research librarians excepted) and becoming more facilitators.

SM: What about curation? Is that more the job of the librarian than ever?

BB: In the traditional sense, no. Curating programs, etc.: yes.

SM: When you were in SF, you were involved in the renovation of 24 neighborhood libraries. What are the challenges?

BB: Part of it is flexibility. We renovated beautiful Carnegie libraries, but they’re not well designed for the modern flow. As the environment changes, so will the spaces. So we were concerned with designing both for today’s needs and for the future. In Chicago we’re designing spaces to support simultaneous activities. E.g., many people using our libraries are coming because they’re a single person running their own business out of the library. How do we support that? And we have huge usage by families and children, so we need to support that as well. So we’re trying to design spaces that support creative play.

TG: In one instance, a young parent kept hearing people saying they were going to the library. She was curious. It turns out that the local library has lots of family spaces, not little chairs and little books and someone reading to a group. Rather, it’s an extension of the neighborhood. She’s learning parenting and her children are learning how to play together.

JP: In St. Paul they set up a library space right off a basketball court. I think that’s a great idea.

JP: I was director of Harvard Law Library [Disclosure: where he was my boss] which had a reading room the size of a football stadium that was always filled, but I never saw a kid take a book off a shelf. They were there to study. They have good wifi in the dorms. There’s something about coming to a common space, with librarians there who could help them if they got in trouble. But they’re there using digital materials. We need to figure out how the physical and digital coalesce, but mainly we need to figure out how to build collaborative spaces. Boston Public Library is renovating the historic Johnson Building. They’re putting the teens and tweens on the second floor to make the space attractive to them but also to keep them a bit out of the way.

TG: We work with a teen center in the East Bay area of SF. When you walk into the teen center the first thing you see is the library within the center: the library’s services are embedded in the space that they think of as their space.

Q: [Fred Kent, Project for Public Spaces] Different African cultures are coming into Winnipeg. They put an African market outside the library. Richmond BC had to move out of their library into a large Wal-mart-like space along with other services. In Perth, the state library took all the library materials off the ground floor and put in cultural activities. The main library in Houston is sponsoring an activation event with SW Airlines. Libraries could become an integral part of the community services. The future of libraries may not be in their own buildings. The architecture of libraries may be very different.

JP: Yes. E.g., the basketball court example.

Q: I hear about the bond problems in Chicago. I don’t hear that in your comments, Brian.

BB: Chicago has been struggling financially and hopefully is coming out of it. CPL saw significant reductions in 2009 and 2011, resulting in a reduction in hours. We’ve brought many of those hours back through a restructuring. It costs about $100M to run the library, but it costs $6B to run the schools. We’re a tiny piece. That tiny investment in libraries as community anchors and for after-school learning has been an important argument for keeping funding in place. Our collections budget is a little less than what we had in SF and we’re three times the size. So, we definitely have issues.

JP: But you’re a cheap date. Our high school costs $100M to run and you’re running the entire library system on that.

Q: The Koolhaas-designed library in Seattle has the problem of being filled with homeless people. They’ve thought about relegating a space with showers and bathrooms and washing machines within the library. WDYT?

BB: Homelessness is part of the urban challenge. It’s important that we see libraries as public spaces open to all regardless of their background. We should not create rules to encourage some and discourage others. In SF we experimented with bringing in people to work with the homeless on finding services that can help them. So rather than creating a shelter within the library, I’d rather that we become a resource helping people to find resources.

Q: How can we make these presidential libraries less a monument and more a way to engage the populace?

BB: Presidential libraries are called libraries, but I’m very excited about the prospect of the Obama library aspiring to being a place to learn about democracy and see it in action. I think it’d be great if it happened in an urban space. We’ve been talking with all three organizations trying to bring the Obama Library to Chicago about what role the public library might play.

TG: It’s an opportunity to think about this as being more of a digital, virtual library. The discussion of democracy should not be confined to one physical place.

JP: I’d argue strongly for the blended approach especially with this president. His election combined beautifully the digital with knocking on doors. Also, the DPLA attempts to build a national digital library, backed by National Archives and the Smithsonian among others. We could do something incredibly cool by connecting the digital and the physical.

Q: In tough budgetary times how are acquisitions affected and how is that being used to shape publishers’ behaviors?

BB: Patron driven acquisitions has us buying books when users want them. The question of publishers is tough. Each library on its own doesn’t have much power. Some big city libraries have cut their own deals. We want to make materials available and also for the publishers to be successful.

JP: We haven’t talked here about the role libraries play in preserving knowledge. If all you were to do is provide what people want at that moment, you’d lose. Patron driven acquisition is a good idea in some respect, and libraries and publishers should be making common cause, but we also should recall that publishers go out of business: major publishers two or three times came to Harvard Law Library asking for copies of their books so they could digitize them; they didn’t have copies.

TG: That’s where you have to be careful about these decisions made by the analytics of usage.

June 28, 2014

[AIF] Beau Willimon on “House of Cards”

At the Aspen Ideas Festival, Michael Eisner is interviewing the creator of House of Cards, Beau Willimon. I’m not going to attempt to do comprehensive live-blogging.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

The point at which SPOILERS begin is clearly marked below.

Beau’s initial artistic expression was in painting. He was good but just not good enough. He wanted to try something he would fail at, and chose playwriting. He wrote a terrible play called “The Goat Herd.” But it won a prize at Columbia U., where he was a student. He still feels like a fraud as a writer “because you’re always grasping and never quite reaching what you’re after.”

He went to Estonia for a year, the East Village for a year, worked on the Sen. Schumer campaign doing whatever he was asked, worked on the Howard Dean campaign where he was head of press advance in Iowa. He was at the Dean Scream and explains that it was actually inaudible in the room because of all the screaming by Dean’s supporters. The media picked up on it because it confirmed their narrative that “Dean was a loose cannon and unelectable.”

Six months later, he wrote the play that became the movie The Ides of March. Originally it was about Philip Seymour Hoffman’s character, but then it became about Ryan Gosling’s character, which was based on Beau’s friend, Jay Carson. He says that he doesn’t care about whether his characters are likeable; he wants us to be attracted to them, “which is entirely different.” “I can’t write the characters if I think of them as good or evil.” He doesn’t want to judge them. “I put myself in their shoes” and no one thinks of themselves as evil.

Beau had no interest in writing another political movie, but David Fincher, the director, called. He watched the British version of House of Cards, which he lauds and says was more tongue in cheek. They worked for a year in complete secrecy on the first episode, and signed up the two stars. They went to HBO and asked for a full season guarantee. Then Netflix said they wanted House of Cards to be the first show they did, and they wanted two full seasons.

SPOILERS BEGIN HERE.

Beau says that House of Cards is quite tame compared to the language and violence on TV today. ([SPOILER:] He says internally they call the threesome scene with Agent Meechum “the Treechum.”) Eisner (who did Happy Days, Laverne and Shirley, Love Boat, etc.) says back in his day, all that counted was likability. He then cites a highly unlikeable action by Francis in House of Cards, involving a subway. But because it was being produced by Netflix, there was no censorship. Eisner recounts an example in which Netflix pushed to include a joke Eisner didn’t like in one of his own productions; that is, Netflix supported the writers against Eisner.

The third season is now being filmed. Half of the scripts are written.

“House of Cards has nothing to do with politics,” he says. “It’s about power.”

HBO has to please its subscribers. Netflix and other producers don’t have to reach all of their subscribers with any single show.

He explains that the shooting schedule has them editing the early episodes even while they’re filming later episodes. They’ll go back to fix or change earlier episodes in order to produce a better whole; you can’t do that when you’re shooting normal tv.

Q&A

Q: Was it hard to kill Zoe?

A: It was in the plan from the beginning. Beau had worked out the plot for the first two seasons. “It was important to stick to our guns on that because one of the themes of the show is how much Francis [Kevin Spacey] is capable of.” The prior murder of the Congressman had been opportunistic. So we said, “Ok, we’re going to do this. It could be a total huge mistake … but fuck it, let’s do it.” Similarly, they were warned not to kill the dog in the first episode, but they figured that if a viewer couldn’t handle that, this was not the show for them. (It was a fake dog, of course.)

Q: [missed it]

A: I do have ideas about how it will end. But you never know. E.g., Rachel started out as a minor character, but she was so good that her part was expanded.

Q: The show lets us empathize with the characters. By working through such complex characters, how has that affected your view of people in real life?

A: “When my friends turn to the empty air and start speaking, I get it.” [laughter] He says writing is narcissistic. He only wants to please himself. You hope to learn something about yourself. “I don’t presume to know anything more than others do.” “My life is just as wonderful and screwed up as anyone else’s. I don’t benefit from the investigation of the soul except that when my life is screwed up, I’m acutely aware of it.”

Q: Isn’t politics about power?

A: Politics can be used to achieve practical ends that have nothing to do with power. Everything is power, but not everything is about politics. Although I would say all works of art are about politics. My Fair Lady is political. Happy Days is political. But when you think of power, if you just think of it in terms of politics, you’re doing it a disservice. There are all sorts of power dynamics. Most have to do with our interpersonal relationships. … Unrequited love? Some of these moments are very small: if a little kid throws a snowball at your windshield and it cracks, what do you do? Do you pull over and speak to the parents, throw a snowball back, keep driving? In that moment a power dynamic is formed. And how you react establishes who is in power. All of our relationships are transactional… When you mix that up with characters whose job is to have mastery over power dynamics, it makes for great drama. But I’m far more interested in the power dynamics in Francis and Claire’s marriage than in Congress. What you remember are Francis and Claire sitting in the window smoking…

Q: Francis talks so poetically. What motivated you?

A: Because I didn’t want it to suck? Kevin had done 9 months of touring Richard III. I stole the BBC version’s direct address, and they stole it from Shakespeare. Done poorly (and we’ve done it poorly at times), it takes you out of the drama. Done right, it makes you complicit with your protagonist. Sometimes it’s heightened. Sometimes it’s a Gaffneyism that doesn’t even make sense: ‘Down South we say never slap a man while he’s chewin’ tobacco.’ What does that even mean? By turning to the camera, he’s made us his pal and we’re able to root for him.

 


A few stray points:

1. Beau is intensely likable.

2. I like House of Cards, even though making Francis a murderer shook my faith in the show. Regardless, my main beef with it is that it portrays all of politics as endemically more corrupt than we even think real world politics are. What lends the series such great drama therefore also discourages civic engagement. And since I am highly partisan, I also think it’s inaccurate. But Beau didn’t think to ask my opinion before writing this amazingly well-written and acted series.

3. I now expect to see a scene in Season 3 in which a kid breaks Francis’ windshield with a snowball.

May 21, 2014

[liveblog] Judith Donath on designing for sociality (“Social Machines”)

Judith Donath is giving a book talk to launch The Social Machine. I read it this weekend and it is a rich work that explores the ways in which good design can improve our online sociality. I’m a fan of Judith’s and am looking forward to seeing what 25-minutes’ worth of ideas she selects to talk about tonight, given the richness of her book.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Judith begins by saying that the theme of the book is the importance of online social interaction and designing for it. Our interfaces may look sophisticated but they’re primitive when it comes to enabling social interaction. She uses a Mark Twain story ["Was the World Made for Man?"] about an oyster’s point of view to remind us that online design isn’t really all that evolved. One big issue: We can’t see the interactions.

We like being with other people, Judith reminds us. We like seeing how they look, feeling the energy in a room, etc. This is hard to perceive when you’re looking at a screen. Our computers connect us to tremendous crowds, but we don’t see the level of activity or the patterns. She shows a work from 25 years ago when she spent a summer in Japan. Her friends were in Boston on computers. The “who” command let her see who was online and how active they were; it was an old-style computer print-out of a list. She came back from Japan trying to design a more useful display. In the early 1990s she came up with “Visual Who,” a text-based visualization of the people online, filterable by interests, etc. She shows some other ways of displaying social network maps, but such maps aren’t yet integrated into the social network interfaces. Maps like these would help manage Facebook’s privacy settings, she says. Or we could use them as an interface for keeping up with people we haven’t interacted with in a while, etc.

Legibility is a huge issue, she says. Information is non-spatial, so it can be hard to parse. Judith points to the Talk pages where Wikipedia pages are discussed and edited. Fernanda Viegas and Martin Wattenberg did a visualization (History Flow) of the edits on the Chocolate article. This lets you see what’s controversial and what isn’t. They then took the same data and looked not at every edit, but sampled it at fixed times. It’s a much smoother diagram. That shows the reader’s experience, while the first version showed the writers’ version.
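To make the difference between the two History Flow views concrete, here is a minimal Python sketch. It is only an illustration of the idea as described in the talk, not the actual History Flow code: the `sample_at_fixed_times` function, the revision list, and the timestamps are all made up for the example. The first view would plot every entry in `revisions`; the second keeps only whichever revision was live at each tick of a fixed clock.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def sample_at_fixed_times(revisions, step):
    """Return the revision that was live at each tick of a fixed clock.

    revisions: list of (timestamp, text) tuples sorted by timestamp.
    step: timedelta between samples.
    (Hypothetical helper, not part of History Flow.)
    """
    times = [t for t, _ in revisions]
    samples = []
    tick = times[0]
    while tick <= times[-1]:
        # Index of the last revision saved at or before this tick.
        i = bisect_right(times, tick) - 1
        samples.append((tick, revisions[i][1]))
        tick += step
    return samples


# Invented edit history: a vandalism edit that is reverted a minute later.
revisions = [
    (datetime(2004, 1, 1, 0, 0), "v1"),
    (datetime(2004, 1, 1, 0, 5), "v2 (vandalized)"),
    (datetime(2004, 1, 1, 0, 6), "v3 (reverted)"),
    (datetime(2004, 1, 3, 0, 0), "v4"),
]

daily = sample_at_fixed_times(revisions, timedelta(days=1))
print([text for _, text in daily])
```

In this toy history the short-lived vandalized version never shows up in the daily samples, which is roughly why the time-sampled diagram looks smoother: it is closer to what a reader dropping by would have seen than to what the editors went through.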

Now Judith talks about “Beyond Being There” (a paper by Hollan, Nielsen, Stornetta, et al.). We can do things with these tools that we can’t do face-to-face. (The fact that we’re in public looking at our cell phones indicates that we’re getting some meaningful social connection that way, she says.) Judith shows the interface to “Talking Circles,” [pdf] an interface for audio conferences. It consists of colored circles. When someone speaks, their circle’s inside moves with their voice. Circles that are near each other are able to hear each other. As they move away, they can’t hear each other. So you could have a private conversation over this digital medium.
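The proximity rule in Talking Circles can be sketched in a few lines of Python. This is a guess at the general mechanism, not the real implementation: the `audible_pairs` function, the `hearing_radius` parameter, and the hard cutoff are assumptions (the actual interface may well fade the audio gradually instead).

```python
import math


def audible_pairs(positions, hearing_radius):
    """Return the set of name pairs whose circles are close enough to hear each other.

    positions: dict mapping a participant's name to an (x, y) point.
    hearing_radius: assumed hard cutoff distance; a real system may attenuate instead.
    """
    names = sorted(positions)
    pairs = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(positions[a], positions[b]) <= hearing_radius:
                pairs.add((a, b))
    return pairs


# Invented layout: ann and bob are huddled together, cy is across the screen.
positions = {"ann": (0, 0), "bob": (1, 0), "cy": (10, 10)}
print(audible_pairs(positions, hearing_radius=2))

# cy drags their circle over and joins the conversation.
positions["cy"] = (1, 1)
print(audible_pairs(positions, hearing_radius=2))
```

The point of the sketch is that privacy here falls out of where people place themselves, much as it does in a room, rather than from a settings page.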

These interfaces change the social dynamics around a space. E.g., the “Like” economy induces some to use Instagram to try to gather more likes. Judith points to Karrie Karahalios and Viegas’s “Conversation Clock,” a table top that shows who spoke when and who overlapped (interrupted) another. E.g., the fact that we’re all being watched (or think we are; Judith references the Panopticon) shapes our behavior. She points to the EU’s decision that Google has to remove links upon user request.

Judith points to a portrait of Queen Elizabeth I, who looks young in a painting done when she was 65. If you think about data as portraying someone, you become aware of the triangle of subject, audience, and painter, each with their own interests. (She says that two years ago another portrait of Elizabeth from the same time and studio shows her looking very old indeed.)

When you think about doing portraits with data, you have to ask how to make something expressive. She points to “The Rhythm of Salience,” a project she created using an existing conversation database. She picked out words that she identified as being about the individuals. At heart, a portrait takes what’s representative of someone, exaggerates it, and shows the salience. She shows the Caricature Generator by Susan Brennan. You can do the same thing with words, e.g. Themail by Fernanda Viegas and Scott Golder. People save their email, but generally they don’t use their archives. People are more interested in keeping the patterns of relationships than in the individual emails. So, Themail shows a histogram of the month-to-month relationship with anyone in your archive. The column shows the volume of messages, but the words that compose the bars show you the dominant words. [I didn't get that exactly right. Sorry]
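The "take what's representative, exaggerate it" idea can be sketched in Python as pushing a subject's measurements further away from some norm. This is my own toy version under stated assumptions, not Susan Brennan's Caricature Generator: the feature names, the numbers, and the `exaggerate` and `most_salient` helpers are all invented for illustration.

```python
def exaggerate(subject, norm, factor=1.5):
    """Push each feature away from the norm by `factor` (1.0 leaves the subject unchanged)."""
    return {k: norm[k] + factor * (subject[k] - norm[k]) for k in subject}


def most_salient(subject, norm, n=1):
    """Return the n feature names that differ most from the norm."""
    return sorted(subject, key=lambda k: abs(subject[k] - norm[k]), reverse=True)[:n]


# Invented measurements for one face and for two different "average" faces.
subject = {"nose_length": 6.0, "eye_spacing": 3.0, "chin_width": 5.0}
norm_a = {"nose_length": 5.0, "eye_spacing": 3.0, "chin_width": 5.5}
norm_b = {"nose_length": 6.0, "eye_spacing": 3.0, "chin_width": 7.0}

print(exaggerate(subject, norm_a, factor=2.0))
print(most_salient(subject, norm_a))  # the nose stands out against norm_a
print(most_salient(subject, norm_b))  # the chin stands out against norm_b
```

The same arithmetic would work on word frequencies instead of face measurements. Note that what counts as salient depends entirely on which norm you compare against.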

She ends by showing Personas by Aaron Zinman (and Donath). You type in your name and it spits back a little portrait of you. It searches Google for mentions of your name and characterizes it.

All of these raise enormous questions, she says.

Q&A [extra special abbreviated version]

Q: [me] Is this change good? Or pathological? You show an incredibly fluid environment; is this changing our f2f relationships?

A: Jane Jacobs wrote The Death and Life of Great American Cities not to judge cities but to make them better. My book tries to show ways we can use design to make our social relationships better. Right now we deal with one another differently f2f and online. In 10 years, that distinction will be much less pronounced. E.g., Google Glass-type products and better interfaces will have much more important effects on f2f. That’s why it’s important that we think about these issues now.

Colin Maclay: And as danah boyd says, for the youth it’s not offline or online life. It’s just life.

Q: What’s the difference between info that you put up and info about you that others post and use?

A: There’s very little use of pseudonymity online. Usually it’s your real name or you’re anonymous. Judith shops online for most of her stuff, and she reads reviews. But she doesn’t write reviews in part because she doesn’t want her deodorant review to come up when people google her. That’s where pseudonyms come in. Pseudonyms don’t guarantee complete anonymity but for everyday use they enable us to gain control over our lives online.

Q: Nicholas Negroponte: You were doing social networking work decades ago. Why is it taking so long for the evolution we’re waiting for?

A: The Web set design back tremendously. The Web made it easy for everyone to participate, but one of the costs was that the simplicity of the interface of the Web made it hard to do design or to have identities online. It slowed down a lot of social design. Also, the world of design is extremely conservative because companies imitate one another.

Q: GPS is causing a generational difference in how we navigate space…

A: Tech is often designed subconsciously so that there are insiders and outsiders. [I've overly shortened this interchange.]

Q: Email vs. text messaging?

A: There are fashions. Also, IM has its uses…

Q: How can sites guarantee what they intend to provide, e.g., privacy? How can they ensure trust? E.g., people have figured out how to take screencaptures of snapchat, subverting the design.

A: Design doesn’t guarantee things. But we should have spaces where we have good enough privacy. We need better interfaces for this. Also, many things you see online don’t let you have a sense of how big your audience is or how permanent will be what you say. Some of the visualizations I’ve talked about give you a sense of the publicness of what you’re saying.

Q: Pseudonymity does reign supreme on Reddit. And whatever happened to Second Life, which seems to address some of the issues you talked about.

A: About every 7 years, a new avatar-based space comes out, so we’re about due for the next. Our original work with Chat Spaces was in response to The Palace. I’m not a big fan of that type of graphical chat space because they’re trying to reproduce the feeling of being f2f without going “beyond being there.” E.g., a student [?] wrote a paper on why there are chairs in Second Life. Good question.

Q: What about skeuomorphism? That metaphor holds things back. Is it just an art to come up with designs that break the old metaphors?

A: The first part of the book deals with that question. There’s a chapter on metaphor. If your metaphors are too heavy handed, they limit what you can do. E.g., if you use folders, you have to figure out which one to put your email in. If you used labels (tags), you wouldn’t have to make those decisions. A lot of the art of design is learning how to use metaphors so you can do something more abstract while still being legible, and how you can bend the metaphors without breaking them.

Q: How does Internet balkanization affect your viewpoint and affect designers?

A: How do we use language and images to bridge cultures? Designers have to understand what images mean. It’s an enormously difficult problem. It’s crucial to try to be always cognizant of one’s own cultural issues. E.g., caricatures look different depending on your cultural norms. In the book, I did not write about caricatures of Obama in white and black publications, but depending on what norm you use, you get different results about what’s salient.

Q: If you could give people a visualization of how they behave in negotiations, that could be useful when people get stuck.

A: The Conversation Clock’s design has done some work on this. Who’s saying no? Who’s interrupting? It’s difficult for people to notice.
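As a rough illustration of how "who's interrupting?" could be pulled out of a record of who spoke when, here is a small Python sketch. It is not how the Conversation Clock works internally; the `find_interruptions` function, the turn format, and the rule that "starting to talk while someone else is mid-turn counts as an interruption" are assumptions made for the example.

```python
from collections import Counter


def find_interruptions(turns):
    """Return (interrupter, interrupted, time) for each turn that starts mid-way through another's.

    turns: list of (speaker, start, end) in seconds, sorted by start time.
    """
    interruptions = []
    for i, (speaker, start, _end) in enumerate(turns):
        for other, o_start, o_end in turns[:i]:
            # The earlier speaker was still talking when this turn began.
            if other != speaker and o_start < start < o_end:
                interruptions.append((speaker, other, start))
    return interruptions


# Invented negotiation: B and C each cut in on A once.
turns = [("A", 0, 10), ("B", 8, 15), ("A", 15, 20), ("C", 18, 22)]

found = find_interruptions(turns)
print(found)
print(Counter(interrupter for interrupter, _, _ in found))
```

A tally like this is the kind of thing a visualization could surface for people who are stuck, since, as Judith says, it is hard to notice in the moment.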

Q: The iPhone has just moved away from skeuomorphism. Do you know how long it takes for us to move away from this?

A: Much of this has to do with style and fashion.

April 25, 2014

[nextweb] Ancilla Tilia on how we lost our privacy

Ancilla Tilia [twitter: ncilla] is introduced as a former model. She begins by pointing out that last year, when this audience was asked if they were worried about the privacy implications of Google Glass, only two people said they were. One was her. We have not heard enough from people like Bruce Schneier, she says. She will speak to us as a concerned citizen.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Knowledge is power, she says. Do we want to give away info about ourselves that will be available in perpetuity, that can be used by future governments and corporations? The theme of this conf is “Power to the people,” so let’s use our power.

She says she had a dream. She was an old lady talking with her grand-daughter. “What’s this ‘freedom’ thing I’ve been hearing about? The kids at school say the old people used to have it.” She answered, “It’s hard to define. You don’t realize what it is until you stop having it. And you stop having it when you stop caring about privacy.” We lost it step by step, she says. By paying with our bank cards, every transaction was recorded. She didn’t realize the CCD’s were doing face recognition. She didn’t realize when they put RFID chips in everything. And license plate scanners were installed. Fingerprint scanners. Mandatory ID cards. DNA data banks. Banning burqas meant that you couldn’t keep your face covered during protests. “I began to think that ‘anonymous’ was a dirty word.” Eye scanners for pre-flight check. Biometrics. Wearables monitoring brainwaves. Smart TVs watching us. 2013′s mandatory pet chipping. “And little did I know that our every interaction would be forever stored.” “When journalists started dying young, I didn’t feel like being labeled a conspiracy nut.” “I didn’t know what a free society was until I realized it was gone, or that we have to fight for it.”

Her granddaughter looks at her doe-eyed, and Ancilla can’t explain any further.


[nextweb] Marc Smith on the shape of networks

This is a very bare overview of Marc Smith’s talk at The NextWeb [twitter: thenextwebEurope].

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Marc Smith wants to understand how social power works. The Social Media Research Foundation wants to build the equivalent of the Kodak Brownie, which made photography into an amateur activity. What would a snapshot of a hashtag look like? Twitter doesn’t show you the crowd as it actually is. Crowds are happy, or angry, or whatever. “We’re interested in revealing the shape of the crowd.” That’s what NodeXL does.

Marc would like to make a browser that shows not pages but webs. They have Open Source tools heading this way. See some at NodeXLGraphGallery.org, “the Flickr for networks.” They are aiming at Social Scholarship so scholars can navigate social media and understand it. One obstacle: social data are largely owned by the commercial vendors providing the social tools.

“Who’s the mayor of your hashtag?” Social network maps show you who are the key influencers, what are the subgroups, and, crucially, who bridges the divides.
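To make the “mayor” and “bridge” ideas concrete, here is a minimal sketch in plain Python. It is not how NodeXL works; the accounts, edges, and subgroup labels are all made up, and “mayor” is simply taken to mean the account with the most connections.

```python
from collections import defaultdict

# Hypothetical mention/reply edges between accounts (not real Twitter data).
edges = [
    ("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("ann", "hal"),
    ("eve", "fay"), ("eve", "gus"),
    ("dan", "eve"),  # the only tie between the two clusters
]

# Hypothetical subgroup labels; a real tool would find these by clustering.
group = {"ann": "A", "bob": "A", "cat": "A", "dan": "A", "hal": "A",
         "eve": "B", "fay": "B", "gus": "B"}

def neighbors(edge_list):
    """Build an undirected adjacency map from a list of (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edge_list:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def top_by_degree(adj):
    """Return the account with the most distinct connections."""
    return max(sorted(adj), key=lambda n: len(adj[n]))

def bridging_accounts(adj, labels):
    """Return accounts with at least one neighbor in a different subgroup."""
    return sorted(n for n in adj if any(labels[m] != labels[n] for m in adj[n]))

adj = neighbors(edges)
mayor = top_by_degree(adj)
bridge_nodes = bridging_accounts(adj, group)
```

With this toy data, `mayor` is the best-connected account and `bridge_nodes` lists the accounts that tie the two clusters together.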

He points to six different types of nets at Twitter. [I missed them. Sorry.] The network of people talking about tax policy is very divided, as opposed to a community of friends. Paul Krugman’s broadcast pattern (Krugman at the center) is very different from the First Lady’s, which consists of a set of communities talking about her. If you know about these six patterns, you can ask what you want and how you can get there.

You can see the Twitter network for The Next Web here.


[nextweb] The Open Source Bank of Brewster

I’m at the Next Web conference in Amsterdam. A large cavern is full of entrepreneurs and Web marketing folks, mainly young. (From my end of the bell curve, most crowds are young.) 2,500 attendees. The opening music is overwhelmingly loud; I can feel the bass as an extra beat in my heart, which from my end of the bell curve is not a good feeling. But the message is of Web empowerment, so I’ll stop my whinging.

Boris Veldhuijzen van Zanten recaps the conference’s 30-hour hackathon. 28 apps. One plays music the tempo of which is based upon how fast you’re driving.

First up is Brewster Kahle [twitter: brewster_kahle], founder of the Internet Archive. [I am a huge Brewster fan, of course.]


NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Brewster begins by saying that the tech world is in a position to redefine how the economy works.

We are now in a position to talk about all of things. We can talk about all species, or all books, etc. Can we make universal access to all knowledge? “That’s the Internet dream I signed on for.” A lot of material isn’t on the Internet yet. Internet Archive is a non-profit “but it’s probably the most successful business I’ve run.” IA has all programs for the Apple II, the Atari, Commodore, etc. IA has 1.5M physical books. “Libraries are starting to throw away books at a velocity.” They’re aiming for 10M books. They have about 1.5M moving images online. “A lot of the issues are working through the rights issues and keeping everyone calm.” 2M audio recordings, mainly live music collections, not CD’s that have been sold. Since 2000 they’ve been recording live TV, 24×7, multiple channels, international. 3M hours of television. They’re making US TV news searchable. “We want to enable everyone to be a Jon Stewart research department.” 3.7M ebooks (1,500/day). When they digitize a copy that is under copyright, they lend it to one person at a time. “And everyone’s stayed calm.” Brewster thinks 20th century books will never be widely available. And 400B pages available through the Wayback Machine.

So for knowledge, “We’re getting there.”

“We have an opportunity to build on earlier ideas in the software area to build societies that work better.” E.g., the 0.1% in the US sees its wealth grow but it’s flat for everyone else. Our political and economic systems aren’t working for most people. So, we have to “invent around it.” We have “over-propertized” (via Pam Samuelson). National parks pull back from this. The Nature Conservancy is a private effort to protect land from over-propertization. The NC has more acres than the National Park system.

Brewster wants to show us how to build on free and open software. Brewster worked with Richard Stallman on the LISP Machine. “People didn’t even sign code. That was considered arrogant.” In 1976 Congress made copyright opt out rather than opt in: everything written became copyrighted for life + 50. “These community projects suddenly became property.” MIT therefore sold the LISP Machine to Symbolics, forking the code. Stallman tried to keep the open code feature-compatible, but it couldn’t be done. Instead, he created the Free Software GNU system. It was a community license, a distributed system that anyone could participate in just by declaring their code to be free software. “I don’t think [this] has happened before. It’s building law structure based on licenses. It’s licenses rather than law.”

It was a huge win, but where do we go from there? Corporate fanaticism about patents, copyright, etc., locked down everything. Open Source doesn’t work well there. We ended up with high-tech non-profits supporting the new sharing infrastructure. The first were about administrating free software: e.g., Free Software Foundation, Linux Foundation, LibreOffice, Apache. Then there were advocacy organizations, e.g., EFF. Now we’re seeing these high-tech non-profits going operational, e.g., Wikipedia ($50M), Mozilla ($300M), Internet Archive ($12M), PLoS ($45M). This model works. They give away their product, and they use a community structure under 501(c)(3) so that it can’t be bought.

This works. They’ve lasted for more than 20 years, whereas even successful tech companies get mashed and mangled if they last 20 years. So, can we build a free and open ecosystem that works better than the current one? Can we define new rules within it?

At Internet Archive, the $12M goes largely to people. The people at IA spend most of their salaries on housing, up to 60%. Housing costs so much because of debt: 2/3s of the rent you pay goes to pay off the mortgage of the owner. So, how can we make debt-free housing? Then IA wouldn’t have to raise as much money. So, they’ve made a non-profit that owns an apartment building to provide affordable housing for non-profit workers. The housing has a community license so the building can’t be sold again. “It pulls it out of the market, like stamping software as Open Source.”

Now he’s trying it for banking. About 40% of profits in corporations in the US goes to financial services. So, they built the Internet Credit Union, a non-profit credit union. They opened bitcoin accounts and were immediately threatened by the government. The credit union closed those accounts but the government is still auditing them every month. The Internet Credit Union is non-profit, member-run, it helps foundation housing, and it’s not acquirable.

In sum: We can use communities that last via licenses rather than the law.

Q&A

Boris: If you’re a startup, how do you apply this?

A: Many software companies push hard against the status quo. The days are gone when you can just write code and sell it. You have to hack the system. Think about doing non-profit structures. They’ll trust you more.


March 23, 2014

[wef] Web Tourism

I’m at the first Web Economy Forum, in Cesena, Italy. It is, unfortunately, terribly under-attended, which is a shame since the first session I’ve gone to was quite good. But it’s being webcast, so we can hope that there are people listening who are not in the room.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Note that because of the translation, these notes are especially rough and choppy.

The first speaker is Prof. Dr. Wolfgang Georg Arlt from the China Outbound Tourism Research Institute in Germany. Chinese travel is increasing: one out of ten world travelers is from China. The Net and online media are highly significant to travelers figuring out where to go. Some celebrities who blog when they travel have 50M followers. The biggest online travel agency has recently changed its characterization from online to mobile travel agency. It’s social media, not Web sites, that get people interested; people want to hear from their social group. China already has twice as many people online as the US does.

He takes the local area as an example. He suggests that for a town like Cesena, the customers are not the busloads of travelers but those who have been around Italy, and are looking to move from sightseeing to experience. A single tourist who discovers a local shop can drive more visitors, but a new deal (about which he cannot yet speak) lets a visitor set up an online shop in China through which the Chinese can buy from the Italian shop. [Nice combination of the social, personal, and mercantile.] He gives an example of a Chinese film star driving lots of traffic to a Tasmanian stuffed bear.

The next speaker, Aurkene Alzua-Sorzabal, says that international markets have grown remarkably, but how much has that benefited local regions? We need new analytics “to support the intelligent monitoring of visitors, in order to anticipate and improve their performance,” so that we can get new insights in complex industries such as the “hospitality field.” Behind all this is Big Data, but that’s just the raw material. How can we use this data for our businesses?

She talks about some tools her group has developed. First they use Big Data to explore pricing. Every 24 hours, they crawl the data on accommodation prices (12,000 hotels in Spain, 14K in France, etc.). They can then ask questions such as what is the average rate for 3 star hotels in Bilbao on a given day, or what is the most economical hotel in Paris for Easter. They can forecast pricing for special events in a locality and its surroundings. They can see the weekend effect in Ireland and across countries. They can see the effect of availability on price. She gives more examples and asks how we can better use the digital world to understand the physical world.
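The two example questions can be sketched as simple filters over crawled price rows. This is only an illustration, not her group’s tool: the field names, hotel IDs, dates, and prices below are invented.

```python
from statistics import mean

# Hypothetical crawl results: one row per hotel per night (all values made up).
rates = [
    {"city": "Bilbao", "stars": 3, "date": "2014-04-18", "hotel": "H1", "price": 80.0},
    {"city": "Bilbao", "stars": 3, "date": "2014-04-18", "hotel": "H2", "price": 100.0},
    {"city": "Bilbao", "stars": 4, "date": "2014-04-18", "hotel": "H3", "price": 150.0},
    {"city": "Paris", "stars": 3, "date": "2014-04-20", "hotel": "H4", "price": 120.0},
    {"city": "Paris", "stars": 2, "date": "2014-04-20", "hotel": "H5", "price": 95.0},
]

def average_rate(rows, city, stars, date):
    """Average nightly price for hotels of a given star rating in a city on a date."""
    prices = [r["price"] for r in rows
              if r["city"] == city and r["stars"] == stars and r["date"] == date]
    return mean(prices) if prices else None

def cheapest_hotel(rows, city, date):
    """ID of the lowest-priced hotel in a city on a date, or None if no rows match."""
    matches = [r for r in rows if r["city"] == city and r["date"] == date]
    return min(matches, key=lambda r: r["price"])["hotel"] if matches else None

bilbao_avg = average_rate(rates, "Bilbao", 3, "2014-04-18")
paris_easter = cheapest_hotel(rates, "Paris", "2014-04-20")
```

Forecasting or measuring a weekend effect would need a time series of such rows; this sketch covers only the point-in-time lookups.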

Q: People only trust user-generated content that comes from other travelers.

Q: Italy is the 8th destination for travel in the world. Tourism accounts for 10% of the Italian GDP. We need to find the next big way that tourists book their travel. TripAdvisor is an example of how tourism is changing. Tourism is not just about finding a hotel. And Airbnb, too.

Wolfgang: When the Chinese come to Venice, they’re looking for Marco Polo. Aside from the airport, there’s nothing there. So, they’ve learned through social media that there’s nothing there about Marco Polo, so they stay away. The Chinese are proud that their culture came to Italy. You should be catering to this need.

Q: We have a great UNESCO heritage in this country. What should we do?


Q: Maybe cultural goods aren’t the way to sell tourism in emerging countries. In China, Marco Polo is unknown. Young people in America know Rome only because they’ve played Assassin’s Creed. They know our cars and clothes, not our culture. Culture works in a few countries.


A: Wolfgang: That’s not entirely true. It depends on the segments. Marco Polo is taught as part of Chinese history as bringing Chinese culture to Europe. When we surveyed younger Chinese people, Italy is seen as the home of beautiful men, maybe from the statue of David and soccer players. For travel to Europe the main attraction is blue skies, no pollution.


A: Aurkene: People go somewhere because they have a narrative, perhaps from history or movies. But now they lack narratives. These narratives tell them what they’re looking for in a place. It’s not about places but about narratives.


A: Wolfgang: Yes. Cesena has been the home of three Popes. It’s not about history but about power. This is an image you can build on. This place has inspired people to become powerful.


Q: We can’t sell our homes as a product or as an experience. The relation between the people who come and the people who host are the real opportunity and the next big thing: peer to peer. If you get too many people, you lose the relationships.

Q: We should be demanding open data about tourism.

Q: Are we still welcoming?

A: Wolfgang: It’s not enough to say the customer is king without knowing that you have to greet the Japanese man first and the woman all the way at the end, whereas in China it’s a matter of hierarchy, not gender. So you can’t be welcoming without training.

Wolfgang: The broadest segment isn’t nation but language. If you want peer to peer, you have to share a language. And it’s probably going to turn out to be English.


March 18, 2014

Dean Krafft on the Linked Data for Libraries project

Dean Krafft, Chief Technology Strategist for Cornell University Library, is at Harvard to talk about the Mellon-funded Linked Data for Libraries (LD4L) project he leads. The grantees include Cornell, Stanford, and the Harvard Library Innovation Lab (which is co-sponsoring the talk with ABCD). (I provide nominal leadership for the Harvard team working on this.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Dean will talk about the LD4L project by talking about its building blocks. [Dean had lots of information and a lot on the slides. I did a particularly bad job of capturing it.]

LD4L

Mellon last December put up $1M for a 2-year project that will end in Dec. 2015. The participants are Cornell, Stanford, and the Harvard Library Innovation Lab.

Cornell: Dean Krafft, Jon Corso-Rickert, Brian Lowe, Simeon Warner

Stanford: Tom Cramer, Lynn McRae, Naomi Dushay, Philip Schreur

Harvard: Paul Deschner, Paolo Ciccarese, me

Aim: Create a Scholarly Resource Semantic Info Store model that works within and across institutions to create a network of Linked Open Data to capture the intellectual value that librarians and other domain experts add to info, patterns of usage, and more.

LD4L wants to have a common language for talking about scholarly materials.

Outcomes:

– Create a SRSIS ontology sufficiently expressive to encompass catalog metadata and other contextual elements

– Create a SRSIS semantic editing, display, and discovery system based on Vitro to support the incremental ingest of semantic data from multiple info sources

– Create a Project Hydra-compatible interface to SRSIS, an active triples software component to facilitate easy use of the data

Why use Linked Data?

LD puts the emphasis on the relationships. Everything is related.

Benefits: The connections have meaning. And it supports “many dimensions of nearness”

Dean explains RDF triples. They connect subjects with objects via a consistent set of relationships.
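As a rough illustration of the triple idea (not Dean’s example, and not a real RDF library), triples can be modeled as plain tuples and queried by pattern. Every identifier here, including the `ex:` prefix and the predicate names, is invented for the sketch.

```python
# Each triple is (subject, predicate, object). All identifiers are made up.
triples = [
    ("ex:book1", "ex:hasAuthor", "ex:person1"),
    ("ex:book1", "ex:hasTitle", "An Example Book"),
    ("ex:book2", "ex:hasAuthor", "ex:person1"),
    ("ex:person1", "ex:hasName", "A. Scholar"),
]

def match(store, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Follow the relationship backwards: which subjects have person1 as author?
books_by_person1 = [subj for subj, _, _ in match(triples, p="ex:hasAuthor", o="ex:person1")]
```

Because every statement has the same three-part shape, the same `match` helper answers questions about books, people, or anything else added to the store.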

A nice feature of LOD is that the same URL that points to a human-readable page can also be taken as a query to show the machine-readable data.
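One common way this works is HTTP content negotiation: the client sends the same URL but asks for a different format in the `Accept` header. Dean did not say which mechanism he meant, so treat this as an assumption. The sketch below only builds the two requests and sends nothing; the URL is a placeholder, and `text/turtle` is just one possible RDF serialization.

```python
from urllib.request import Request

def build_request(url, want_rdf):
    """Build (but do not send) a request for either the HTML page or RDF data."""
    accept = "text/turtle" if want_rdf else "text/html"
    return Request(url, headers={"Accept": accept})

url = "http://example.org/individual/n123"  # hypothetical resource URL

human_request = build_request(url, want_rdf=False)
machine_request = build_request(url, want_rdf=True)
```

Passing either request to `urllib.request.urlopen` would perform the fetch; whether a given server honors the header depends on how that server is configured.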

There’s commonality among references: shared types, shared relationships, shared instances defined as types and linked by relationships.

LOD is great for sharing data. There’s a startup cost, but as you share more data repositories and types, the costs/effort goes up linearly, not at the steeper rate of traditional approaches.

Dean shows the mandatory graphic of a cloud of LOD sources.

Building Blocks

VIVO: Vivo was the inspiration for LD4L. It makes info about researchers discoverable. It’s software, data, a standard, and a community. It connects scientists and scholars through their research and scholarship. It provides self-describing data via shared ontologies. It provides search results enhanced by what it knows. And it does simple reasoning.

Vivo is built on the VIVO/Vitro platform. It has ingest tools, ontology editing tools, instance editing tools, and a display system. It models people, organizations, grants, etc., the relationships among them, and links to URIs elsewhere. It describes people in the process of doing research. It’s discipline-neutral. It uses existing domain terminology to describe the content of research. It’s modular, flexible, and extensible.

VIVO harvests much of its data automatically from verified sources.

It takes a complexity of inputs and makes them discoverable and usable.

All the data in VIVO is public and visible.

Dean shows us a page, and then traverses the network of interrelated authors.

He points out that other institutions are able to mash up their data with VIVO. E.g., the ICTS has info about 1.2M publications that they’ve integrated with VIVO’s data. E.g., you can see research papers created with federal funding but not deposited in PubMed Central.

VIVO is extensible. LASP extended VIVO to include spacecraft. Brown U. is extending it to support the humanities and artistic works, adding “performances,” for example.

The LD4L ontology will use components of the VIVO-ISF ontology. When new ontologies are needed, it will draw upon VIVO design patterns. The basis for SRSIS implementations will be Vitro plus LD4L ontologies. The multi-institution LD4L demo search will adapt VIVOsearch.org.

The 8M items at Cornell have generated billions of triples.

Project Hydra. Hydra is a tech suite and a partnership. You put your data there and can have many different apps. 22 institutions are collaborating.

Fundamental assumption: No single system can provide the full range of repository-based solutions for a given institution’s needs, yet sustainable solutions do require a common repository. Hydra is now building a set of “heads” (UI’s) for media, special collections, archives, etc.

Fundamental assumption: No single institution can build the full range of what it needs, so you need to work with others.

Hydra has an open architecture with many contributors to a common core. There are collaboratively built solution bundles.

Fedora, Ruby on Rails for Blacklight, Solr, etc.

LD4L will create an ActiveTriples Hydra component to mimic ActiveFedora.

Our Lab’s LibraryCloud/ShelfRank is another core element. It provides a model for access to library data. It provides a concrete example for creating an ontology for usage.

LD4L – the project

We’re now developing use cases. We have 32 on the wiki. [See the wiki for them]

We’re identifying data sources: Biblio, person (VIVO), usage (LibCloud, circ data, BorrowDirect circ), collections (EAD, IRs, SharedShelf, Olivia, arbitrary OAI-PMH), annotations (CuLLR, Stanford DMW, Bloglinks, DBpedia LibGuides), subjects and authorities (external sources). Imagine being able to look at usage across 50 research libraries…

Assembling the Ontology:

VIVO, Open Annotation, SKOS

BibFrame, BIBO, FaBIO

PROV-O, PAV

FOAF, PROVE, Schema.org

CreativeCommons, Dublin Core

etc.

Whenever possible the project will use existing ontologies

Timeline: By the end of the year we hope to be piloting initial ingests.

Workshop: Jan. 2015. 10-12 institutions. Aim: get feedback, make a “sales pitch” to other organizations to join in.

June 2015: Pilot SRSIS instances at Harvard and Stanford. Pilot gather info across all three instances.

Dec. 2015: Instances implemented.

wiki: http://wiki.duraspace.org/display/ld4l

Q&A

Q: Who anointed VIVO a standard?

A: It’s a de facto standard.

Q: SKOS is considered a great start, but to do anything real with it you have to modify it, and if it changes you’re screwed.

A: (Paolo) I think VIVO uses SKOS mainly for terms, not hierarchies. But I’m not sure.

Q: What are ActiveTriples?

A: ActiveFedora is a Ruby gem that serves as an interface for Hydra into a Fedora repository. ActiveTriples will serve the same function for a backend triple store. So you can swap different triple stores into the Fedora repository. This is Simeon Warner’s project.

Q: Does this mean you wouldn’t have to have a Fedora backend to take advantage of Hydra?

A: Yes, that’s part of it.

Q: Are you bringing in GIS linked data?

A: Yes, to the extent that we can and it makes sense to.

A: David Siegel: We have 6M data points from 1.1M Hollis records. LibraryCloud is ingesting them.

Q: What’s the product at the end?

A: We promised Mellon the ontology and instances of LOD based on the ontology at each of the 3 institutions, and search across the three.

Q: Harvard doesn’t have a Fedora backend…

A: We’d like to pull from non-catalog sources. That might well be an OAI-PMH ingest, or some other non-Fedora source.

Q: What is Simeon interested in with regard to Arxiv.org?

A: There isn’t a direct relationship.

Q: He’s also working on ORCID.

A: We have funding to do some level of integration of ORCID and VIVO.

Q: What is the bibliographic scope? BibFrame isn’t really defining items, etc. They’ve pushed it into annotations.

A: We’re interested in capturing some of that. BibFrame is offering most of what we need, but we have to look at each case. Then we communicate with them and hope that BibFrame does most of the work.

Q: Do any of your use cases posit tagging of contents, including by users perhaps with a controlled vocabulary?

A: We’ll be doing tagging at the object level. I’m unsure whether we’re willing to do tagging within the object.

A: [paolo] We assume we don’t have access to the full text.

A: You could always point into our data.

Q: How can we help?

A: We’re accumulating use cases and data sources. If you’re aware of any, let us know.

Q: It’s been hard for libraries to put enough effort into authority control, to associate values comparable across different subject schemes…there’s a lot of work to make things work together. What sort of vocabulary or semantic links will you be using? The hard part is getting values to work across domains.

A: One way to deal with that is to bring together the disparate info. By pulling together enough info, you can sometimes use the network to help you figure that out. But in general the disambiguation challenge (and text fields are even worse) is not something we’re going to solve.

Q: Are the working groups institutionally based?

A: No. They’re cross-institution.

[I'm very excited about this project, and about the people working on it.]


March 5, 2014

[berkman] Karim Lakhani on disclosure policies and innovation

Karim Lakhani of Harvard Business School (and a Berkman associate, and a member of the Harvard Institute for Quantitative Social Science) is giving a talk called “How disclosure policies impact search in open innovation,” a topic he has researched with Kevin Boudreau of the London Business School.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Karim has been thinking about how crowds can contribute to innovation for 17 years, since he was at GE. There are two ways this happens:

1. Competitions and contests at which lots of people work on the same problem. Karim has asked who wins and why, motives, how they behave, etc.

2. Communities/Collaboration. E.g., open source software. Here the questions are: Motives? Costs and benefits? Self-selection and joining scripts? Partner selection?

More fundamentally, he wants to know why both of these approaches work so well.

He works with NASA, using topcoder.com: 600K users world wide [pdf]. He also works with Harvard Medical School [more] to see how collaboration works there where (as with Open Source) people choose their collaborators rather than having them chosen top-down.

Karim shows a video about a contest to solve an issue with the International Space Station, having to do with the bending of bars (longerons) in the solar collectors when they are in the shadows. NASA wanted a sophisticated algorithm. (See www.topcoder.com/iss.) It was a two-week contest with a $30K prize. Two thousand signed up for it; 459 submitted solutions. The winners came from around the globe. Many of the solutions replicated or slightly exceeded what NASA had developed with its contractors, but this was done in just two weeks simply for the price of the contest prize.

Karim says he’ll begin by giving us the nutshell version of the paper he will discuss with us today. Innovation systems create incentives to exert innovative effort and encourage the disclosure of knowledge. The timing and the form of the disclosures differentiate systems. E.g., Open Science tends to publish when near done, while Open Source tends to be more iterative. The paper argues that intermediate disclosures (as in open source) dampen incentives and participation, yet lead to higher performance. There’s more exploration and experimentation when there’s disclosure only at the end.

Karim’s TL;DR: Disclosure isn’t always helpful for innovation, depending on the conditions.

There is a false debate between closed and open innovation. Rather, what differentiates regimes is when the disclosure occurs, and who has the right to use those disclosures. Intermediate disclosure [i.e., disclosure along the way] can involve a range of outputs. E.g., the Human Genome Project enshrined intermediate disclosure as part of an academic science project; you had to disclose discoveries within 24 hours.

Q: What constitutes disclosure? Would talking with another mathematician at a conference count as disclosure?

A: Yes. It would be intermediate disclosure. But there are many nuances.

Karim says that Allen, Meyer and Nuvolari have shown that historically, intermediate disclosure has been an important source of technological progress. E.g., the Wright brothers were able to invent the airplane because of a vibrant community. [I'm using the term "invent" loosely here.]

How do you encourage continued innovation while enabling early re-use of it? “Greater disclosure requirements will degrade incentives for upstream innovators to undertake risky investment.” (Green & Scotchmer; Bessen & Maskin.) We see compensating mechanisms under regimes of greater disclosure: E.g., priority and citations in academia; signing and authorship in Open Source. You may also attract people who have a sharing ethos; e.g., Linus Torvalds.

Research confirms that the more access you provide, the more reuse and sharing there will be. (Cf. Eric von Hippel.) Platforms encourage reuse of core components. (Cf. Boudreau 2010; Rysman and Simcoe 2008) [I am not getting all of Karim's citations. Not even close.]

Another approach looks at innovation as a problem-solving process. And that entails search. You need to search to find the best solutions in an uncertain space. Sometimes innovators use “novel combinations of existing knowledge” to find the best solutions. So let’s look at the paths by which innovators come up with ideas. There’s a line of research that assumes that the paths are the essential element to understand the innovation process.

Mathematical formulations of this show you want lots of people searching independently. The broader the better for innovation outcomes. But there is a tendency of the researchers to converge on the initially successful paths. These are affected by decisions about when to disclose.

So, Karim and Kevin Boudreau implemented a field experiment. They used TopCoder, offering $6K, to set up a Med School project involving computational biology. The project let them get fine-grained info about what was going on over the two weeks of the contest.

700 people signed up. They matched them on skills and randomized them into three different disclosure treatments:

1. Standard contest format, with a prize at the end of each week. (Submissions were automatically scored, and the first week prizes went to the highest at that time.)

2. Submitted code was instantly posted to a wiki where anyone could use it.

3. In the first week you work without disclosure, but in the second week submissions were posted to the wiki.
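Karim did not spell out how the matching and randomization were done. One standard way to do “match on skill, then randomize” is blocked randomization: rank people by skill, cut the ranking into blocks of three, and shuffle the three treatments within each block. The sketch below assumes that approach; the treatment labels, coder names, and skill ratings are invented.

```python
import random

# Hypothetical labels for the three disclosure treatments described in the talk.
TREATMENTS = ["no_disclosure", "intermediate_disclosure", "mixed"]

def assign_treatments(coders, seed=0):
    """Blocked randomization: coders is a list of (name, skill) pairs.

    Coders are ranked by skill, cut into blocks of len(TREATMENTS), and each
    block receives the treatments in a freshly shuffled order, so every
    treatment gets coders of similar skill.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    ranked = sorted(coders, key=lambda c: c[1], reverse=True)
    assignment = {}
    for start in range(0, len(ranked), len(TREATMENTS)):
        block = ranked[start:start + len(TREATMENTS)]
        arms = TREATMENTS[:]
        rng.shuffle(arms)
        for (name, _skill), arm in zip(block, arms):
            assignment[name] = arm
    return assignment

# Nine made-up coders with made-up skill ratings.
coders = [(f"coder{i}", 1000 + 37 * i) for i in range(9)]
assignment = assign_treatments(coders)
```

With nine coders, each treatment ends up with three, and the three highest-rated coders are split across all three treatments.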

For those whose work is disclosed: You can find and see the most successful. You can get money if your code is reused. In the non-disclosure regime you cannot observe solutions and all communications are barred. In both cases, you can see market signals and who the top coders are.

Of the 733 signups from 69 different countries, 122 coders submitted 654 submissions, with 89 different approaches. 44% were professionals; 56% were students. They skewed very young. 98% men. They spent about 10 hours a week, which is typical of Open Source. (There’s evidence that women choose not to participate in contests like this.) The results beat the NIH’s approach to the problem, which was developed at great cost over years. “This tells me that across our economy there are lots of low-performing” processes in many institutions. “This works.”

What motivated the participants? Extrinsic motives matter (cash, job market signals) and intrinsic motives do too (fun, etc.). But so do prosocial motives (community belonging, identity). Other research Karim has done shows that there’s no relation between skills and motives. “Remember that in contests most people are losing, so there have to be things other than money driving them.”

Results from the experiment: More disclosure meant lower participation. Also, more disclosure correlated with the hours worked going down. The incentives and efforts are lower when there’s intermediate disclosure. “This is contrary to my expectations,” Karim says.

Q: In the intermediate disclosure regime is there an incentive to hold your stuff back until the end when no one else can benefit from it?

A: One guy admitted to this, and said he felt bad about it. He won top prize in the second week, but was shamed in the forums.

In the intermediate disclosure regime, you get better performance (i.e., better submission score). In the mixed experiment, performance shot up in the second week once the work of others was available.

They analyzed the ten canonical approaches and had three Ph.D.s tag the submissions with those approaches. The solutions were combinations of those ten techniques.

With no intermediate disclosures, the search patterns are chaotic. With intermediate disclosures, there is more convergence and learning. Intermediate disclosure resulted in 30% fewer different approaches. The no-disclosure folks were searching in the lower-performance end of the pool. There was more exploration and experimentation in their searches when there was no intermediate disclosure, and more convergence and collaboration when there was.

Increased reuse comes at the cost of incentives. The overall stock of knowledge created is low, although the quality is higher. More convergent behavior comes with intermediate disclosures, which relies on the stock of knowledge available. The fear is that with intermediate disclosure, people will get stuck on local optima: path dependence is a real risk in intermediate disclosure.

There are comparative advantages of the two systems. Where there is a broad stock of knowledge, intermediate disclosure works best. Plus the diversity of participants may overcome local optima lock-in. Final disclosure [i.e., disclosure only at the end] is useful where there’s broad-based experimentation. “Firms have figured out how to play both sides.” E.g., Apple is closed but also a heavy participant in Open Source.

Q&A

Q: Where did the best solutions come from?

A: From intermediate disclosure. The winner came from there, and then the next five were derivative.

Q: How about with the mixed?

A: The two weeks tracked the results of the final and intermediate disclosure regimes.

Q: [me] How confident are you that this applies outside of this lab?

A: I think it does, but even this platform is selecting on a very elite set of people who are used to competing. One criticism is that we’re using a platform that attracts competitors who are not used to sharing. But rank-order based platforms are endemic throughout society. SATs, law school tests: rank order is endemic in our society. In that sense we can argue that there’s a generalizability here. Even in Wikipedia and Open Source there is status-based ranking.

Q: Can we generalize this to systems where the outputs of innovation aren’t units of code, but, e.g., educational systems or municipal govts?

A: We study coders because we can evaluate their work. But I think there are generalizations about how to organize a system for innovation, even if the outcome isn’t code. What inputs go into your search processes? How broad do you go?

Q: Does it matter that you have groups that are more or less skilled?

A: We used the Topcoder skill ratings as a control.

Q: The guy who held back results from the Intermediate regime would have won in real life without remorse.

A: Von Hippel’s research says that there are informal norms-based rules that prevent copying. E.g., chefs frown on copying recipes.

Q: How would you reform copyright/patent?

A: I don’t have a good answer. My law professor friends say the law has gone too far to protect incentives. There’s room to pull that back in order to encourage reuse. You can ask why the Genome Project’s Bermuda Rules (pro disclosure) weren’t widely adopted among academics. Academics’ incentives are not set up to encourage automatic posting and sharing.

Q: The Human Genome Project resulted in a splintering that set up a for-profit org that does not disclose. How do you prevent that?

A: You need the right contracts.

This was a very stimulating talk. I am a big fan of Karim and his work.


Afterwards Karim and I chatted briefly about whether the fact that 98% of Topcoder competitors are men raises issues about generalizing the results. Karim pointed to the general pervasiveness of rank-ordered systems like the one at TopCoder. That does suggest that the results are generalizable across many systems in our culture. Of course, there’s a risk that optimizing such systems might result in less innovation (using the same measures) than trying to open those systems up to people averse to them. That is, optimizing for TopCoder-style systems for innovation might create a local optima lock-in. For example, if the site were about preparing fish instead of code, and Japanese chefs somehow didn’t feel comfortable there because of its norms and values, how much could you conclude about optimizing conditions for fish innovation? Whereas, if you changed the conditions, you’d likely get sushi-based innovation that the system otherwise inadvertently optimized against.


[Note: 1. Karim's point in our after-discussion was purely about the generalizability of the results, not about their desirability. 2. I'm trying to make a narrow point about the value of diversity of ideas for innovation processes, and not otherwise comparing women and Japanese chefs.]

December 3, 2013

[berkman] Jérôme Hergeux on the motives of Wikipedians

Jérôme Hergeux is giving a Berkman lunch talk on “Cooperation in a peer production economy: experimental evidence from Wikipedia.” He lists as co-authors: Yann Algan, Yochai Benkler, and Mayo Fuster-Morell.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Jérôme explains the broader research agenda behind the paper. People are collaborating on the Web, sometimes on projects that compete with or replace major products from proprietary businesses and institutions. Standard economic theory doesn’t have a good way of making sense of this with its usual assumptions of behavior guided by perfect rationality and self-interest. Instead, Jérôme will look at Wikipedia where people are not paid and their contributions have no signaling value on the labor market. (Jérôme quotes Kizor: “The problem with Wikipedia is that it only works in practice. In theory it can never work.”)

Instead we should think of contributing to Wikipedia as a Public Goods dilemma: contributing has personal cost and not enough countervailing personal benefit, but it has a social benefit higher than the individual cost. The literature has mainly focused on the “prosocial preferences” that lead people to include the actions/interests of others, which leads them to overcome the Public Goods dilemma.

There are three classes of models commonly used by economists to explain prosocial behavior:

First, the altruism motive. Second, reciprocity: you respond in kind to kind actions of others. Third, “social image”: contributing to the public good signals something that brings you other utility. (He cites Napoleon: “Give me enough medals and I will win you any war.”)

His research’s method: Elicit the social prefs of a representative sample of Wikipedia contributors via an online experiment, and use those preferences to predict subjects’ field contributions to the Wikipedia project.

To check the reciprocity motive, they ran a simple public goods game. Four people in a group. Each has $10. Each has to decide how much to invest in a public project. You get some money back, but the group gets more. You can condition your contribution on the contributions of the other group members. This enables the researchers to measure how much the reciprocity motive matters to you. [I know I’m not getting this right. Hard to keep up. Sorry.] They also used a standard online trust game: You get some money from a partner, and can respond in kind.
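Since I didn't get the details down, here is a minimal sketch of how a standard linear public goods game pays out, to make the dilemma concrete. The four players and $10 endowment come from the talk; the 0.4 return per dollar in the pot (`MPCR`) and the function name are my assumptions, not the researchers' actual parameters.

```python
ENDOWMENT = 10.0  # dollars per player, as described in the talk
MPCR = 0.4        # ASSUMED: what each player gets back per dollar in the shared pot

def payoffs(contributions):
    """Each player keeps what they did not contribute, plus a share of the pot.
    With four players and MPCR = 0.4, a dollar contributed returns $0.40 to the
    contributor but $1.60 to the group as a whole."""
    pot = sum(contributions)
    return [ENDOWMENT - c + MPCR * pot for c in contributions]

all_in = payoffs([10, 10, 10, 10])    # everyone contributes everything
free_ride = payoffs([0, 10, 10, 10])  # the first player contributes nothing
```

Under these assumed numbers, full cooperation leaves everyone with about $16, while a lone free rider walks away with about $22 and leaves the others with about $12 each. That gap is the dilemma: contributing is good for the group and bad for the individual.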

Q: Do these tests correlate with real world behavior?

A: That’s the point of this paper. This is the first comprehensive test of all three motives.

For studying altruism, the dictator game is the standard. The dictator can give as much as s/he wants to the other person. The dictator has no reason to transfer the money. This thus measures altruism. But people might contribute to Wikipedia out of altruism just to their own Wikipedia in-group, not general altruism (“directed altruism”). So they ran another game to measure in-group altruism.

Social image is hard to measure experimentally, so they relied on observational data. “Consider as ‘social signalers’ subjects who have a Wikipedia user page whose size is bigger than the median in the sample.” You can be a quite engaged contributor to Wikipedia and not have a personal user page. But a bigger page means more concern with social image. Second, they looked at Barnstars data. Barnstars are a “social rewarding practice” that’s mainly restricted to heavy contributors: contribute well to a Wikipedia article and you might be given a barnstar. These show up on Talk pages. About half of the people move them to their user page where they are more visible. If you move one of those awards manually to your user page, Jérôme will count you as a social signaler, i.e., someone who cares about his/her image.
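The user-page rule quoted above is simple enough to sketch. This is only an illustration of the "bigger than the median in the sample" cutoff; the function name and the page sizes are made up, and the talk did not say what unit page size was measured in.

```python
from statistics import median

def social_signalers(page_size_by_user):
    """Return the users whose user page is strictly larger than the sample median."""
    cutoff = median(page_size_by_user.values())
    return {user for user, size in page_size_by_user.items() if size > cutoff}

# Hypothetical page sizes; a size of 0 stands for a contributor with no user page.
sample = {"A": 0, "B": 1200, "C": 300, "D": 9000, "E": 450}
signalers = social_signalers(sample)
```

Note that a median split labels roughly half the sample as signalers by construction, whatever the page sizes happen to be.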

He talks about some of the practical issues they faced in doing this experiment online. They illustrated the working of each game by using some simple Flash animations. And they provided calculators so you could see the effect of your decisions before you make them.

The subject pool came from registered Wikipedia users, and they looked at the number of edits each user has made. (The number of contributions at Wikipedia follows a strong power law distribution.) 200,000 people register a Wikipedia account each month (2011) but only 2% make ten contributions in their first month, and only 10% make one contribution or more within the next year. So, they recruited the cohort of new Wikipedia contributors (190,000 subjects), the group of engaged Wikipedia contributors (at least 300 edits) (18,989), and Wikipedia administrators (1,388 subjects). To recruit people, they teamed up with the Wikimedia Foundation to put a banner up on a Wikipedia page if the user met the criteria as a subject. The banner asked the reader to help with research. If readers click through, they go to the experiment page where they are paid in real money if they complete the 25-minute experiment within eight hours.

The demographics of the experiment’s subjects (1,099) matched quite closely the overall demographics of those subject pools. (The pool had 9% women, and the experiment had 8%).

Jérôme shows the regression tables and explains them. Holding the demographics steady, what is the relation between the three motives and the number of contributions? For the altruistic motive, there is no predictive power. Reciprocity in both games (public and trust) is a highly significant predictor. This tells us that reciprocal preference can lead you from being a non-contributor to being an engaged contributor; once you’re an engaged contributor, it doesn’t predict how far you’re going to go. Social image is correlated with the number of contributions; 81% of people who have received barnstars are super-contributors. Being a social signaler is associated with a 130% rise in the number of contributions you make. By both user-page length and barnstar, social image motivates more contributions even among super-contributors.

Reciprocity incentivizes contributions only for those who are not concerned about their social image. So, reciprocity and social image are both at play among the contributors, but among separate groups. I.e., if you’re motivated by reciprocity, you are likely not motivated by social image, and vice versa.

Now Jérôme focuses on Wikipedia administrators. Altruism has no predictive value. But Wikipedia participation is negatively associated with reciprocity; perhaps this is because admins have to have thick skins to deal with disruptive users. For social image, the user page has significant relevance for admins, but not barnstars. Social image is less strong among admins than among other contributors.

Jérôme now explores his “thick skin hypothesis” to explain the admin results. In the trust game, look at how much the trustor decides to give to the stranger/partner. Jérôme’s hypothesis: Among admins, those who decide to perform more of their policing role will be less trusting of strangers. There’s a negative correlation among admins between the results from the trust game and their contributions. The more time they say they spend on admin edits, the less trusting they are of strangers in the tests. That sort of makes sense, says Jérôme. These admins are doing a valuable job for which they have self-selected, but it requires dealing with irritating people.

QA

Q: Maybe an admin is above others and is thus not being reciprocated by the group.

A: Perfectly reasonable explanation, and it is not ruled out by the data.

Q: Did you come into this with an idea of what might motivate the Wikipedians?

A: These are the three theories that are prevalent. We wanted to see how well they map onto actual field behavior.

Q: Maybe the causation goes the other way: working in Wikipedia is making people more concerned about social image or reciprocity?

A: The correlations could go in either direction. But we want to know if those explanations actually match what people do in the field.

Q: Heather Ford looks at why articles are deleted for non-Western topics. She found the notability criteria change for people not close to the topics. Maybe the motives change depending on how close you are to the event.

A: Sounds fascinating.

Q: Admins have an inherent bias in that they focus on the small percentage of contributors who are annoying jerks. If you spend your time working with jerks, it affects your sense of trust.

A: Good point. I don’t have the data to answer it.

Q: [me] If I’m a journalist I’m likely to take away the wrong conclusions from this talk, so I want to make sure I’m understanding. For example, I might conclude that Wikipedia admins are not motivated by altruism, whereas the right conclusion is (isn’t it?) that the standard altruism test doesn’t really measure altruism. Why not ask for self-reports to see?

A: Economists are skeptical about self-reports. If the reciprocity game predicts a correlation, that’s significant.

Yochai Benkler: Altruism has a special meaning among economists. It refers to any motivation other than “What’s in it for me?” [Because I asked the question, I didn’t do a good job recording the answers. Sorry.]

Q: Aren’t admins control freaks?

A: I wouldn’t say that. But control is not a pro-social motive, and I wanted to start with the theories that are current.

Q: You use the number of words someone writes on a user page as a sign of caring about social image, but this is in a context where people are there to write. And you’re correlating that to how much they write as editors and contributors. Maybe people at Wikipedia like to write. And maybe they write in those two different places for different reasons. Also, what do you do with these findings? Economists like to figure out which levers we pull if we’re not getting enough contributors.

Q: This sort of data seems to work well for large platforms with lots of users. What’s the scope of the methods you’re using? Only the top 100 web sites in the world?

A: I’d like to run this on all the peer production platforms in the world. Wikipedia is unusual if only because it’s been so successful. We’re already working on another project with 1,000 contributors at SourceForge especially to look at the effects of money, since about half of Open Source contributions are for money.


Fascinating talk. But it makes me want to be very dumb about it, because, well, I have no choice. So, here goes.

We can take this research as telling us something about Wikipedians’ motivations, about whether economists have picked the right three prosocial motivations, or about whether the standard tests of those motivations actually correlate to real-world motivations. I thought the point had to do with the last two alternatives and not so much the first. But I may have gotten it wrong.

So, suppose instead of talking about altruism, reciprocity, and social image we instead talk about the correlation between the six tests the researchers used and Wikipedia contributions. We would then have learned that Test #1 is a good predictor of the contribution levels of beginner Wikipedians, Test #2 predicts contributions by admins, Test #3 has a negative correlation with contributions by engaged Wikipedians, etc. But that would be of no interest, since we have (ex hypothesi) not made any assumptions about what the tests are testing for. Rather, the correlation would be a provocation to more research: why the heck does playing one of these odd little games correlate to Wikipedian productivity? It’d be like finding out that Wikipedian productivity is correlated to being a middle child or to wearing rings on both hands. How fascinating!… because these correlations have no implied explanatory power.

Now let’s plug back in the English terms that indicate some form of motivation. So now we can say that Test #3 shows that scoring high in altruism (in the game) does not correlate with being a Wikipedia admin. From this we can either conclude that Wikipedia admins are not motivated by altruism, or that the game fails to predict the existing altruism among Wikipedia admins. Is there anything else we can conclude without doing some independent study of what motivates Wikipedia admins? Because it flies in the face of both common sense and my own experience of Wikipedia admins; I’m pretty convinced one reason they work so hard is so everyone can have a free, reliable, neutral encyclopedia. So my strong inclination – admittedly based on anecdote and “common sense” (= “I believe what I believe!”) – is to conclude that any behavioral test that misses altruism as a component of the motivation of someone who spends thousands of hours working for free on an open encyclopedia…well, there’s something hinky about that behavioral test.

Even if the altruism tests correlate well with people engaged in activities we unproblematically associate with altruism – volunteering in a soup kitchen, giving away much of one’s income – I’d still not conclude from the lack of correlation with Wikipedia admins that those admins are not motivated by altruism, among other motivations. It just doesn’t correlate with the sort of altruism the game tests for. Just ask those admins if they’d put in the same amount of time creating a commercial encyclopedia.

So, I come out of Jérôme’s truly fascinating talk feeling like I’ve learned more about the reliability of the tests than about the motivations of Wikipedians. Based on Jérôme’s and Yochai’s responses, I think that’s what I’m supposed to have learned, but the paper also seems to be putting forward interesting conclusions (e.g., admins are not trusting types) that rely upon the tests not just correlating with the quantity of edits, but also being reliable measures of altruism, self-image, and reciprocity as motives. I assume (and thus may be wrong) that’s why Jérôme offered an hypothesis to explain the lack-of-trust result, rather than discounting the finding that admins lack trust (to oversimplify it).

(Two concluding comments: 1. Yochai’s The Leviathan and the Penguin uses behavioral tests like these, as well as case studies and observation, to make the case that we are a cooperative species. Excellent, enjoyable book. (Here’s a podcast interview I did with him about it.) 2. I’m truly sorry to be this ignorant.)
