
March 5, 2014

[berkman] Karim Lakhani on disclosure policies and innovation

Karim Lakhani of Harvard Business School (and a Berkman associate, and a member of the Harvard Institute for Quantitative Social Science) is giving a talk called “How disclosure policies impact search in open innovation,” a topic he has researched with Kevin Boudreau of the London Business School.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

Karim has been thinking about how crowds can contribute to innovation for 17 years, since he was at GE. There are two ways this happens:

1. Competitions and contests at which lots of people work on the same problem. Karim has asked who wins and why, motives, how they behave, etc.

2. Communities/Collaboration. E.g., open source software. Here the questions are: Motives? Costs and benefits? Self-selection and joining scripts? Partner selection?

More fundamentally, he wants to know why both of these approaches work so well.

He works with NASA, using topcoder.com: 600K users worldwide [pdf]. He also works with Harvard Medical School [more] to see how collaboration works there, where (as with Open Source) people choose their collaborators rather than having them chosen top-down.

Karim shows a video about a contest to solve an issue with the International Space Station, having to do with the bending of bars (longerons) in the solar collectors when they are in the shadows. NASA wanted a sophisticated algorithm. (See www.topcoder.com/iss.) It was a two-week contest with a $30K prize. Two thousand people signed up for it; 459 submitted solutions. The winners came from around the globe. Many of the solutions replicated or slightly exceeded what NASA had developed with its contractors, but this was done in just two weeks, for no more than the cost of the contest prize.

Karim says he’ll begin by giving us the nutshell version of the paper he will discuss with us today. Innovation systems create incentives to exert innovative effort and encourage the disclosure of knowledge. The timing and the form of the disclosures differentiates systems. E.g., Open Science tends to publish when near done, while Open Source tends to be more iterative. The paper argues that intermediate disclosures (as in open source) dampen incentives and participation, yet lead to higher performance. There’s more exploration and experimentation when there’s disclosure only at the end.

Karim’s TL;DR: Disclosure isn’t always helpful for innovation, depending on the conditions.

There is a false debate between closed and open innovation. Rather, what differentiates regimes is when the disclosure occurs, and who has the right to use those disclosures. Intermediate disclosure [i.e., disclosure along the way] can involve a range of outputs. E.g., the Human Genome Project enshrined intermediate disclosure as part of an academic science project; you had to disclose discoveries within 24 hours.

Q: What constitutes disclosure? Would talking with another mathematician at a conference count as disclosure?

A: Yes. It would be intermediate disclosure. But there are many nuances.

Karim says that Allen, Meyer and Nuvolari have shown that historically, intermediate disclosure has been an important source of technological progress. E.g., the Wright brothers were able to invent the airplane because of a vibrant community. [I’m using the term “invent” loosely here.]

How do you encourage continued innovation while enabling early re-use of it? “Greater disclosure requirements will degrade incentives for upstream innovators to undertake risky investment.” (Green & Scotchmer; Bessen & Maskin.) We see compensating mechanisms under regimes of greater disclosure: E.g., priority and citations in academia; signing and authorship in Open Source. You may also attract people who have a sharing ethos; e.g., Linus Torvalds.

Research confirms that the more access you provide, the more reuse and sharing there will be. (Cf. Eric von Hippel.) Platforms encourage reuse of core components. (Cf. Boudreau 2010; Rysman and Simcoe 2008.) [I am not getting all of Karim’s citations. Not even close.]

Another approach looks at innovation as a problem-solving process. And that entails search. You need to search to find the best solutions in an uncertain space. Sometimes innovators use “novel combinations of existing knowledge” to find the best solutions. So let’s look at the paths by which innovators come up with ideas. There’s a line of research that assumes that the paths are the essential element to understand the innovation process.

Mathematical formulations of this show you want lots of people searching independently. The broader the search, the better for innovation outcomes. But searchers tend to converge on the initially successful paths, and those paths are affected by decisions about when to disclose.

So, Karim and Kevin Boudreau implemented a field experiment. They used TopCoder, offering $6K, to set up a Med School project involving computational biology. The project let them get fine-grained info about what was going on over the two weeks of the contest.

700 people signed up. They matched them on skills and randomized them into three different disclosure treatments:

1. Standard contest format, with a prize at the end of each week. (Submissions were automatically scored, and the first week’s prizes went to the highest scores at that time.)

2. Submitted code was instantly posted to a wiki where anyone could use it.

3. In the first week you work without disclosure, but in the second week submissions were posted to the wiki.
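[To make “matched them on skills and randomized” concrete, here’s a minimal sketch of one standard way to do it: sort participants by skill rating, treat each consecutive triple as a matched stratum, and randomly assign one member of each triple to each treatment. Karim didn’t describe the exact mechanism, so the procedure and names below are assumptions.]

```python
import random

# Hypothetical sketch of skill-matched randomization into three disclosure
# treatments. Each consecutive triple in the skill ranking is a matched
# stratum; the three treatments are shuffled within it.
TREATMENTS = ["final_disclosure", "intermediate_disclosure", "mixed"]

def assign_treatments(participants):
    """participants: list of (user_id, skill_rating) tuples."""
    ranked = sorted(participants, key=lambda p: p[1], reverse=True)
    assignments = {}
    for i in range(0, len(ranked) - len(ranked) % 3, 3):
        triple = ranked[i:i + 3]
        for (user_id, _), treatment in zip(triple, random.sample(TREATMENTS, 3)):
            assignments[user_id] = treatment
    return assignments

coders = [("coder%d" % i, random.randint(800, 3000)) for i in range(9)]
print(assign_treatments(coders))
```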

For those whose work is disclosed: you can find and see the most successful submissions, and you can get money if your code is reused. In the non-disclosure regime you cannot observe solutions, and all communications are barred. In both cases, you can see market signals and who the top coders are.

Of the 733 signups from 69 different countries, 122 coders made 654 submissions, using 89 different approaches. 44% were professionals; 56% were students. They skewed very young. 98% were men. They spent about 10 hours a week, which is typical of Open Source. (There’s evidence that women choose not to participate in contests like this.) The results beat the NIH’s approach to the problem, which had been developed at great cost over years. “This tells me that across our economy there are lots of low-performing” processes in many institutions. “This works.”

What motivated the participants? Extrinsic motives matter (cash, job market signals) and intrinsic motives do too (fun, etc.). But so do prosocial motives (community belonging, identity). Other research Karim has done shows that there’s no relation between skills and motives. “Remember that in contests most people are losing, so there have to be things other than money driving them.”

Results from the experiment: more disclosure meant lower participation. More disclosure also correlated with fewer hours worked. Incentives and effort are lower when there’s intermediate disclosure. “This is contrary to my expectations,” Karim says.

Q: In the intermediate disclosure regime is there an incentive to hold your stuff back until the end when no one else can benefit from it?

A: One guy admitted to this, and said he felt bad about it. He won top prize in the second week, but was shamed in the forums.

In the intermediate disclosure regime, you get better performance (i.e., better submission score). In the mixed experiment, performance shot up in the second week once the work of others was available.

They analyzed the ten canonical approaches and had three Ph.D.s tag the submissions with those approaches. The solutions were combinations of those ten techniques.

With no intermediate disclosures, the search patterns are chaotic. With intermediate disclosures, there is more convergence and learning. Intermediate disclosure resulted in 30% fewer different approaches. The no-disclosure folks were searching in the lower-performance end of the pool. There was more exploration and experimentation when there was no intermediate disclosure, and more convergence and collaboration when there was.
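[Here’s a minimal sketch of how a “30% fewer approaches” comparison could be computed from the tagged submissions, assuming each submission is labeled with a subset of the ten canonical techniques. The paper’s actual metric may differ, and the tags and data below are invented for illustration.]

```python
# Invented illustration: measure convergence by counting the distinct
# technique combinations appearing in each treatment's submissions.
def distinct_approaches(submissions):
    """submissions: list of sets of technique tags."""
    return len({frozenset(tags) for tags in submissions})

no_disclosure = [{"dp", "greedy"}, {"annealing"}, {"dp"},
                 {"genetic", "local_search"}, {"annealing", "restarts"}]
intermediate = [{"dp", "greedy"}, {"dp", "greedy"},
                {"dp", "greedy", "annealing"}, {"dp", "greedy"}]

print(distinct_approaches(no_disclosure))  # 5: wide, chaotic search
print(distinct_approaches(intermediate))   # 2: convergence on what scores well
```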

Increased reuse comes at the cost of incentives. The overall stock of knowledge created is lower, although its quality is higher. Intermediate disclosure brings more convergent behavior, which relies on the stock of knowledge available. The fear is that with intermediate disclosure, people will get stuck on local optima — path dependence is a real risk in intermediate disclosure.

There are comparative advantages of the two systems. Where there is a broad stock of knowledge, intermediate disclosure works best. Plus the diversity of participants may overcome local optima lock-in. Final disclosure [i.e., disclosure only at the end] is useful where there’s broad-based experimentation. “Firms have figured out how to play both sides.” E.g., Apple is closed but also a heavy participant in Open Source.

Q&A

Q: Where did the best solutions come from?

A: From intermediate disclosure. The winner came from there, and then the next five were derivative.

Q: How about with the mixed?

A: The two weeks tracked the results of the final and intermediate disclosure regimes.

Q: [me] How confident are you that this applies outside of this lab?

A: I think it does, but even this platform is selecting on a very elite set of people who are used to competing. One criticism is that we’re using a platform that attracts competitors who are not used to sharing. But rank-order platforms are endemic throughout our society: SATs, law school tests, and so on. In that sense we can argue that there’s a generalizability here. Even in Wikipedia and Open Source there is status-based ranking.

Q: Can we generalize this to systems where the outputs of innovation aren’t units of code, but, e.g., educational systems or municipal govts?

A: We study coders because we can evaluate their work. But I think there are generalizations about how to organize a system for innovation, even if the outcome isn’t code. What inputs go into your search processes? How broad do you go?

Q: Does it matter that you have groups that are more or less skilled?

A: We used the TopCoder skill ratings as a control.

Q: The guy who held back results from the Intermediate regime would have won in real life without remorse.

A: Von Hippel’s research says that there are informal norms-based rules that prevent copying. E.g., chefs frown on copying recipes.

Q: How would you reform copyright/patent?

A: I don’t have a good answer. My law professor friends say the law has gone too far to protect incentives. There’s room to pull that back in order to encourage reuse. You can ask why the Genome Project’s Bermuda Rules (pro disclosure) weren’t widely adopted among academics. Academics’ incentives are not set up to encourage automatic posting and sharing.

Q: The Human Genome Project resulted in a splintering that set up a for-profit org that does not disclose. How do you prevent that?

A: You need the right contracts.

This was a very stimulating talk. I am a big fan of Karim and his work.


Afterwards Karim and I chatted briefly about whether the fact that 98% of TopCoder competitors are men raises issues about generalizing the results. Karim pointed to the general pervasiveness of rank-ordered systems like the one at TopCoder. That does suggest that the results are generalizable across many systems in our culture. Of course, there’s a risk that optimizing such systems might result in less innovation (using the same measures) than trying to open those systems up to people averse to them. That is, optimizing TopCoder-style systems for innovation might create a local-optimum lock-in. For example, if the site were about preparing fish instead of code, and Japanese chefs somehow didn’t feel comfortable there because of its norms and values, how much could you conclude about optimizing conditions for fish innovation? Whereas, if you changed the conditions, you’d likely get sushi-based innovation that the system otherwise inadvertently optimized against.


[Note: 1. Karim’s point in our after-discussion was purely about the generalizability of the results, not about their desirability. 2. I’m trying to make a narrow point about the value of diversity of ideas for innovation processes, and not otherwise comparing women and Japanese chefs.]


December 3, 2013

[berkman] Jérôme Hergeux on the motives of Wikipedians

Jérôme Hergeux is giving a Berkman lunch talk on “Cooperation in a peer production economy: experimental evidence from Wikipedia.” He lists as co-authors: Yann Algan, Yochai Benkler, and Mayo Fuster-Morell.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

Jérôme explains the broader research agenda behind the paper. People are collaborating on the Web, sometimes on projects that compete with or replace major products from proprietary businesses and institutions. Standard economic theory doesn’t have a good way of making sense of this with its usual assumptions of behavior guided by perfect rationality and self-interest. Instead, Jérôme will look at Wikipedia where people are not paid and their contributions have no signaling value on the labor market. (Jérôme quotes Kizor: “The problem with Wikipedia is that it only works in practice. In theory it can never work.”)

Instead we should think of contributing to Wikipedia as a Public Goods dilemma: contributing has a personal cost and not enough countervailing personal benefit, but it has a social benefit higher than the individual cost. The literature has mainly focused on the “prosocial preferences” that lead people to take the actions/interests of others into account, which lets them overcome the Public Goods dilemma.

There are three classes of models commonly used by economists to explain prosocial behavior:

First, the altruism motive. Second, reciprocity: you respond in kind to kind actions of others. Third, “social image”: contributing to the public good signals something that brings you other utility. (He cites Napoleon: “Give me enough medals and I will win you any war.”)

His research’s method: Elicit the social prefs of a representative sample of Wikipedia contributors via an online experiment, and use those preferences to predict subjects’ field contributions to the Wikipedia project.

To check the reciprocity motive, they ran a simple public goods game. Four people in a group. Each has $10. Each has to decide how much to invest in a public project. You get some money back, but the group gets more. You can condition your contribution on the contributions of the other group members. This enables the researchers to measure how much the reciprocity motive matters to you. [I know I’m not getting this right. Hard to keep up. Sorry.] They also used a standard online trust game: You get some money from a partner, and can respond in kind.
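[To make the game mechanics concrete, here’s a minimal sketch of the standard linear public goods payoff. The multiplier and other parameters of Jérôme’s experiment aren’t given in the talk, so the numbers below are assumptions for illustration. In the conditional variant he describes, each player additionally submits a schedule mapping the other players’ contributions to their own.]

```python
# Minimal sketch of a standard linear public goods game: four players, $10
# endowment each; contributions go into a pot that is multiplied and split
# equally. With a multiplier of 1.6 (an assumption), each contributed dollar
# returns only $0.40 to the contributor but $1.60 to the group as a whole --
# that is the Public Goods dilemma.
ENDOWMENT = 10.0
MULTIPLIER = 1.6
N_PLAYERS = 4

def payoffs(contributions):
    """contributions: one amount in [0, ENDOWMENT] per player."""
    share = sum(contributions) * MULTIPLIER / N_PLAYERS
    return [ENDOWMENT - c + share for c in contributions]

print(payoffs([10, 10, 10, 10]))  # full cooperation: $16.00 each
print(payoffs([0, 10, 10, 10]))   # free rider gets $22; contributors get $12
```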

Q: Do these tests correlate with real world behavior?

A: That’s the point of this paper. This is the first comprehensive test of all three motives.

For studying altruism, the dictator game is the standard. The dictator can give as much as s/he wants to the other person. The dictator has no reason to transfer the money. This thus measures altruism. But people might contribute to Wikipedia out of altruism just to their own Wikipedia in-group, not general altruism (“directed altruism”). So they ran another game to measure in-group altruism.

Social image is hard to measure experimentally, so they relied on observational data. “Consider as ‘social signalers’ subjects who have a Wikipedia user page whose size is bigger than the median in the sample.” You can be a quite engaged contributor to Wikipedia and not have a personal user page, but a bigger page means more concern with social image. Second, they looked at barnstars data. Barnstars are a “social rewarding practice” that’s mainly restricted to heavy contributors: contribute well to a Wikipedia article and you might be given a barnstar. These show up on Talk pages. About half of the people move them to their user pages, where they are more visible. If you move one of those awards manually to your user page, Jérôme will count you as a social signaler, i.e., someone who cares about his/her image.
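[A minimal sketch of the two “social signaler” codings as described, assuming per-user data with a user-page size and a flag for having moved a barnstar. The field names are hypothetical.]

```python
from statistics import median

# Sketch of the two "social signaler" codings described above. Field names
# are hypothetical; real data would come from Wikipedia user and talk pages.
def signaler_by_page_size(user, median_size):
    """Coding 1: user page bigger than the sample median."""
    return user["user_page_bytes"] > median_size

def signaler_by_barnstar(user):
    """Coding 2: manually moved a barnstar from the talk page to the user page."""
    return user["moved_barnstar_to_user_page"]

users = [
    {"name": "A", "user_page_bytes": 12000, "moved_barnstar_to_user_page": False},
    {"name": "B", "user_page_bytes": 300,   "moved_barnstar_to_user_page": True},
    {"name": "C", "user_page_bytes": 900,   "moved_barnstar_to_user_page": False},
]
m = median(u["user_page_bytes"] for u in users)
print([u["name"] for u in users if signaler_by_page_size(u, m)])  # ['A']
print([u["name"] for u in users if signaler_by_barnstar(u)])      # ['B']
```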

He talks about some of the practical issues they faced in doing this experiment online. They illustrated the working of each game by using some simple Flash animations. And they provided calculators so you could see the effect of your decisions before you make them.

The subject pool came from registered Wikipedia users, filtered by the number of edits each user has made. (The number of contributions at Wikipedia follows a strong power-law distribution.) 200,000 people register a Wikipedia account each month (2011), but only 2% make ten contributions in their first month, and only 10% make one contribution or more within the next year. So, they recruited the cohort of new Wikipedia contributors (190,000 subjects), the group of engaged Wikipedia contributors (at least 300 edits; 18,989 subjects), and Wikipedia administrators (1,388 subjects). To recruit people, they teamed up with the Wikimedia Foundation to put a banner up on a Wikipedia page if the user met the criteria as a subject. The banner asked the reader to help with research. If readers click through, they go to the experiment page, where they are paid in real money if they complete the 25-minute experiment within eight hours.

The demographics of the experiment’s subjects (1,099) matched quite closely the overall demographics of those subject pools. (The pool had 9% women, and the experiment had 8%).

Jérôme shows the regression tables and explains them. Holding the demographics steady, what is the relation between the three motives and the number of contributions? The altruistic motive has no predictive power. Reciprocity in both games (public goods and trust) is a highly significant predictor. This tells us that reciprocal preferences can take you from being a non-contributor to being an engaged contributor; once you’re an engaged contributor, it doesn’t predict how far you’re going to go. Social image is correlated with the number of contributions; 81% of people who have received barnstars are super-contributors. Being a social signaler is associated with a 130% rise in the number of contributions you make. By both user-page length and barnstars, social image motivates more contributions even among super-contributors.

Reciprocity incentivizes contributions only for those who are not concerned about their social image. So, reciprocity and social image are both at play among the contributors, but among separate groups. I.e., if you’re motivated by reciprocity, you are likely not motivated by social image, and vice versa.

Now Jérôme focuses on Wikipedia administrators. Altruism has no predictive value. But Wikipedia participation is negatively associated with reciprocity; perhaps this is because admins have to have thick skins to deal with disruptive users. For social image, the user page has significant relevance for admins, but not barnstars. Social image is less strong among admins than among other contributors.

Jérôme now explores his “thick skin hypothesis” to explain the admin results. In the trust game, look at how much the trustor decides to give to the stranger/partner. Jérôme’s hypothesis: among admins, those who perform more of their policing role will be less trusting of strangers. There’s a negative correlation among admins between the results of the trust game and their contributions. The more time they say they spend on admin edits, the less trusting they are of strangers in the tests. That sort of makes sense, says Jérôme. These admins are doing a valuable job for which they have self-selected, but it requires dealing with irritating people.

Q&A

Q: Maybe an admin is above others and is thus not being reciprocated by the group.

A: Perfectly reasonable explanation, and it is not ruled out by the data.

Q: Did you come into this with an idea of what might motivate the Wikipedians?

A: These are the three theories that are prevalent. We wanted to see how well they map onto actual field behavior.

Q: Maybe the causation goes the other way: working in Wikipedia is making people more concerned about social image or reciprocity?

A: The correlations could go in either direction. But we want to know if those explanations actually match what people do in the field.

Q: Heather Ford looks at why articles are deleted for non-Western topics. She found the notability criteria change for people not close to the topics. Maybe the motives change depending on how close you are to the event.

A: Sounds fascinating.

Q: Admins have an inherent bias in that they focus on the small percentage of contributors who are annoying jerks. If you spend your time working with jerks, it affects your sense of trust.

A: Good point. I don’t have the data to answer it.

Q: [me] If I’m a journalist I’m likely to take away the wrong conclusions from this talk, so I want to make sure I’m understanding. For example, I might conclude that Wikipedia admins are not motivated by altruism, whereas the right conclusion is (isn’t it?) that the standard altruism test doesn’t really measure altruism. Why not ask for self-reports to see?

A: Economists are skeptical about self-reports. If the reciprocity game predicts a correlation, that’s significant.

Yochai Benkler: Altruism has a special meaning among economists. It refers to any motivation other than “What’s in it for me?” [Because I asked the question, I didn’t do a good job recording the answers. Sorry.]

Q: Aren’t admins control freaks?

A: I wouldn’t say that. But control is not a pro-social motive, and I wanted to start with the theories that are current.

Q: You use the number of words someone writes on a user page as a sign of caring about social image, but this is in a context where people are there to write. And you’re correlating that to how much they write as editors and contributors. Maybe people at Wikipedia like to write. And maybe they write in those two different places for different reasons. Also, what do you do with these findings? Economists like to figure out which levers we pull if we’re not getting enough contributors.

Q: This sort of data seems to work well for large platforms with lots of users. What’s the scope of the methods you’re using? Only the top 100 web sites in the world?

A: I’d like to run this on all the peer production platforms in the world. Wikipedia is unusual if only because it’s been so successful. We’re already working on another project with 1,000 contributors at SourceForge especially to look at the effects of money, since about half of Open Source contributions are for money.


Fascinating talk. But it makes me want to be very dumb about it, because, well, I have no choice. So, here goes.

We can take this research as telling us something about Wikipedians’ motivations, about whether economists have picked the right three prosocial motivations, or about whether the standard tests of those motivations actually correlate to real-world motivations. I thought the point had to do with the last two alternatives and not so much the first. But I may have gotten it wrong.

So, suppose instead of talking about altruism, reciprocity, and social image we talk about the correlation between the six tests the researchers used and Wikipedia contributions. We would then have learned that Test #1 is a good predictor of the contribution levels of beginner Wikipedians, Test #2 predicts contributions by admins, Test #3 has a negative correlation with contributions by engaged Wikipedians, etc. But that would be of no interest, since we have (ex hypothesi) not made any assumptions about what the tests are testing for. Rather, the correlation would be a provocation to more research: why the heck does playing one of these odd little games correlate to Wikipedian productivity? It’d be like finding out that Wikipedian productivity is correlated to being a middle child or to wearing rings on both hands. How fascinating!… because these correlations have no implied explanatory power.

Now let’s plug back in the English terms that indicate some form of motivation. So now we can say that Test #3 shows that scoring high in altruism (in the game) does not correlate with being a Wikipedia admin. From this we can either conclude that Wikipedia admins are not motivated by altruism, or that the game fails to predict the existing altruism among Wikipedia admins. Is there anything else we can conclude without doing some independent study of what motivates Wikipedia admins? Because it flies in the face of both common sense and my own experience of Wikipedia admins; I’m pretty convinced one reason they work so hard is so everyone can have a free, reliable, neutral encyclopedia. So my strong inclination – admittedly based on anecdote and “common sense” (= “I believe what I believe!”) – is to conclude that any behavioral test that misses altruism as a component of the motivation of someone who spends thousands of hours working for free on an open encyclopedia…well, there’s something hinky about that behavioral test.

Even if the altruism tests correlate well with people engaged in activities we unproblematically associate with altruism – volunteering in a soup kitchen, giving away much of one’s income – I’d still not conclude from the lack of correlation with Wikipedia admins that those admins are not motivated by altruism, among other motivations. It just doesn’t correlate with the sort of altruism the game tests for. Just ask those admins if they’d put in the same amount of time creating a commercial encyclopedia.

So, I come out of Jérôme’s truly fascinating talk feeling like I’ve learned more about the reliability of the tests than about the motivations of Wikipedians. Based on Jérôme’s and Yochai’s responses, I think that’s what I’m supposed to have learned, but the paper also seems to be putting forward interesting conclusions (e.g., admins are not trusting types) that rely upon the tests not just correlating with the quantity of edits, but also being reliable measures of altruism, self-image, and reciprocity as motives. I assume (and thus may be wrong) that’s why Jérôme offered an hypothesis to explain the lack-of-trust result, rather than discounting the finding that admins lack trust (to oversimplify it).

(Two concluding comments: 1. Yochai’s The Penguin and the Leviathan uses behavioral tests like these, as well as case studies and observation, to make the case that we are a cooperative species. Excellent, enjoyable book. (Here’s a podcast interview I did with him about it.) 2. I’m truly sorry to be this ignorant.)


November 20, 2013

[liveblog][2b2k] David Eagleman on the brain as networks

I’m at re comm 13, an odd conference in Kitzbühel, Austria: 2.5 days of talks to 140 real estate executives, but the talks are about anything except real estate. David Eagleman, a neuroscientist at Baylor and a well-known author, is giving a talk. (Last night we had one of those compressed conversations that I can’t wait to be able to continue.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

How do we know your thinking is in your brain? If you damage your finger, you don’t change, but damage to your brain can change basic facets of your life. “The brain is the densest representation of who you are.” We’re the only species trying to figure out our own programming language. We’ve discovered the most complicated device in the universe: our own brains. Ten billion neurons. Every single neuron contains the entire human genome and thousands of proteins doing complicated computations. Each neuron is connected to tens of thousands of its neighbors, meaning there are hundreds of trillions of connections. These numbers “bankrupt the language.”

Almost all of the operations of the brain are happening at a level invisible to us. Taking a drink of water requires a “lightning storm” of activity at the neural level. This leads us to a concept of the unconscious. The conscious part of you is the smallest bit of what’s happening in the brain. It’s like a stowaway on a transatlantic journey that’s taking credit for the entire trip. When you think of something, your brain’s been working on it for hours or days. “It wasn’t really you that thought of it.”

About the unconscious: Psychologists gave photos of women to men and asked them to evaluate how attractive they are. Some of the photos were the same women, but with dilated eyes. The men rated them as being more attractive but none of them noticed the dilation. Dilated eyes are a sign of sexual readiness in women. Men made their choices with no idea of why.

More examples: In the US, if your name is Dennis or Denise, you’re more likely to become a dentist. These dentists have a conscious narrative about why they became dentists that misses the trick their brain has played on them. Likewise, people are statistically more likely to marry someone whose first name begins with the same first letter as theirs. And, if you are holding a warm mug of coffee, you’ll describe the relationship with your mother as warmer than if you’re holding an iced cup. There is an enormous gap between what you’re doing and what your conscious mind is doing.

“We should be thankful for that gap.” There’s so much going on under the hood, that we need to be shielded from the details. The conscious mind gets in trouble when it starts paying attention to what it’s doing. E.g., try signing your name with both hands in opposite directions simultaneously: it’s easy until you think about it. Likewise, if you now think about how you steer when making a lane change, you’re likely to enact it wrong. (You actually turn left and then turn right to an equal measure.)

Know thyself, sure. But neuroscience teaches us that you are many things. The brain is not a computer with a single output. It has many networks that are always competing. The brain is like a parliament that debates an action. When deciding between two sodas, one network might care about the price, another about the experience, another about the social aspect (cool or lame), etc. They battle. David looks at three of those networks:

1. How does the brain make decisions about valuation? E.g., people will walk 10 mins to save 10 € on a 20 € pen but not on a 557 € suit. Also, we have trouble making comparisons of worth among disparate items unless they are in a shared context. E.g., Williams Sonoma had a bread-baking machine for $275 that did not sell. Once they added a second model for $370, the $275 machine started selling. In real estate, if a customer is trying to decide between two homes, one modern and one traditional, and you want them to buy the modern one, show them another modern one. That gives them the context by which they can decide to buy it.

Everything is associated with everything else in the brain. (It’s an associative network.) Coffee used to be $0.50. When Starbucks started, they had to unanchor it from the old model so they made the coffee houses arty and renamed the sizes. Having lost the context for comparison, the price of Starbucks coffee began to seem reasonable.

2. Emotional experience is a big part of decision making. If you’re in a bad-smelling room, you’ll make harsher moral decisions. The trolley dilemma: 5 people have been tied to the tracks. A trolley is approaching rapidly. You can switch the trolley to a track with only one person tied to it. Everyone would switch the trolley. But now instead, you can push a fat man onto the tracks to stop the trolley. Few would. In the second scenario, touching someone engages the emotional system. The first scenario is just a math problem. The logic and emotional systems are always fighting it out. The Greeks viewed the self as someone steering a chariot drawn by the white horse of reason and the black horse of passion. [From Plato’s Phaedrus]

3. A lot of the machinery of the brain deals with other brains. We use the same circuitry to think about people and corporations. When a company betrays us, our brain responds the way it would if a friend betrayed us. Traditional economics says customer interactions are short-term, but the brain takes a much longer-range view. Breaches of trust travel fast. (David plays “United Breaks Guitars.”) Smart companies use social media that make you believe that the company is your friend.

The battle among these three networks drives decisions. “Know thyselves.”

This is unsettling. The self is not at the center. It’s like when Galileo repositioned us in the universe. That seemed like a dethroning of man. The upside is that we’ve discovered the Cosmos is much bigger, more subtle, and more magnificent than we thought. As we sail into the inner cosmos of the brain, we find the brain is much more subtle and magnificent than we ever considered.

“We’ve found the most wondrous thing in the universe, and it’s us.”

Q: Won’t this let us be manipulated?

A: Neural science is just catching up with what advertisers have known for 100 years.

Q: What about free will?

A: My labs and others have done experiments, and there’s no single experiment in neuroscience that proves that we do or do not have free will. But if we have free will, it’s a very small player in the system. We have genetics and experiences, and they make brains very different from one another. I argue for a legal system that recognizes a difference between people who may have committed the same crime. There are many different types of brains.


November 15, 2013

[liveblog] Noam Chomsky and Bart Gellman at Engaging Data

I’m at the Engaging Data 2013 conference, where Noam Chomsky and Pulitzer Prize winner (twice!) Barton Gellman are going to talk about Big Data in the Snowden Age, moderated by Ludwig Siegele of the Economist. (Gellman is one of the three journalists to whom Snowden entrusted his documents.) The conference aims at having us rethink how we use Big Data and how it’s used.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

LS: Prof. Chomsky, what’s your next book about?

NC: Philosophy of mind and language. I’ve been writing articles that are pretty skeptical about Big Data. [Please read the orange disclaimer: I’m paraphrasing and making errors of every sort.]

LS: You’ve said that Big Data is for people who want to do the easy stuff. But shouldn’t you be thrilled as a linguist?

NC: When I got to MIT in 1955, I was hired to work on a machine translation program. But I refused to work on it. “The only way to deal with machine translation at the current stage of understanding was by brute force, which after 30-40 years is how it’s being done.” A principled understanding based on human cognition is far off. Machine translation is useful but you learn precisely nothing about human thought, cognition, language, anything else from it. I use the Internet. Glad to have it. It’s easier to push some buttons on your desk than to walk across the street to use the library. But the transition from no libraries to libraries was vastly greater than the transition from libraries to the Internet. [Cool idea and great phrase! But I think I disagree. It depends.] We can find lots of data; the problem is understanding it. And a lot of the data around us goes through a filter so it doesn’t reach us. E.g., the foreign press reports that Wikileaks released a chapter about the secret TPP (Trans-Pacific Partnership). It was front-page news in Australia and Europe. You can learn about it on the Net but it’s not news. The chapter was on intellectual property rights, which means higher prices for less access to pharmaceuticals, and rams through what SOPA tried to do, restricting use of the Net and access to data.

LS: For you Big Data is useless?

NC: Big data is very useful. If you want to find out about biology, e.g. But why no news about TPP? As Sam Huntington said, power remains strongest in the dark. [approximate] We should be aware of the long history of surveillance.

LS: Bart, as a journalist what do you make of Big Data?

BG: It’s extraordinarily valuable, especially in combination with shoe-leather, person-to-person reporting. E.g., a colleague used traditional reporting skills to get the entire data set of applicants for presidential pardons. Took a sample. More reporting. Used standard analytics techniques to find that white people are 4x more likely to get pardons, and that campaign contributors are also more likely. The same would likely apply in urban planning [which is Senseable City Lab’s remit]. But all this leads to more surveillance. E.g., I could make the case that if I had full data about everyone’s calls, I could do some significant reporting, but that wouldn’t justify it. We’ve failed to have the debate we need because of the claim of secrecy by the institutions in power. We become more transparent to the gov’t and to commercial entities while they become more opaque to us.

LS: Does the availability of Big Data and the Internet automatically mean we’ll get surveillance? Were you surprised by the Snowden revelations?

NC: I was surprised at the scale, but it’s been going on for 100 years. We need to read history. E.g., the counter-insurgency “pacification” of the Philippines by the US. See the book by McCoy [maybe this]. The operation used the most sophisticated tech at the time to get info about the population to control and undermine them. That tech was immediately used by the US and Britain to control their own populations, e.g., Woodrow Wilson’s Red Scare. Any system of power — the state, Google, Amazon — will use the best available tech to control, dominate, and maximize their power. And they’ll want to do it in secret. Assange, Snowden and Manning, and Ellsberg before them, are doing the duty of citizens.

BG: I’m surprised how far you can get into this discussion without assuming bad faith on the part of the government. For the most part what’s happening is that these security institutions genuinely believe most of the time that what they’re doing is protecting us from big threats that we don’t understand. The opposition comes when they don’t want you to know what they’re doing because they’re afraid you’d call it off if you knew. Keith Alexander said that he wishes that he could bring all Americans into this huddle, but then all the bad guys would know. True, but he’s also worried that we won’t like the plays he’s calling.

LS: Bruce Schneier says that the NSA is copying what Google and Yahoo, etc. are doing. If the tech leads to snooping, what can we do about it?

NC: Govts have been doing this for a century, using the best tech they had. I’m sure Gen. Alexander believes what he’s saying, but if you interviewed the Stasi, they would have said the same thing. Russian archives show that these monstrous thugs were talking very passionately to one another about defending democracy in Eastern Europe from the fascist threat coming from the West. Forty years ago, RAND released Japanese docs about the invasion of China, showing that the Japanese had heavenly intentions. They believed everything they were saying. I believe these are universals. We’d probably find it for Genghis Khan as well. I have yet to find any system of power that thought it was doing the wrong thing. They justify what they’re doing for the noblest of objectives, and they believe it. The CEOs of corporations as well. People find ways of justifying things. That’s why you should be extremely cautious when you hear an appeal to security. It literally carries no information, even in the technical sense: it’s completely predictable and thus carries no info. I don’t doubt that the US security folks believe it, but it is without meaning. The Nazis had their own internal justifications.

BG: The capacity to rationalize may be universal, but you’ll take the conversation off track if you compare what’s happening here to the Stasi. The Stasi were blackmailing people, jailing them, preventing dissent. As a journalist I’d be very happy to find that our govt is spying on NGOs or using this power for corrupt self-enriching purposes.

NC: I completely agree with that, but that’s not the point: The same appeal is made in the most monstrous of circumstances. The freedom we’ve won sharply restricts state power to control and dominate, but they’ll do whatever they can, and they’ll use the same appeals that monstrous systems do.

LS: Aren’t we all complicit? We use the same tech. E.g., Prof. Chomsky, you’re the father of natural language processing, which is used by the NSA.

NC: We’re more complicit because we let them do it. In this country we’re very free, so we have more responsibility to try to control our govt. If we do not expose the plea of security and separate out the parts that might be valid from the vast amount that’s not valid, then we’re complicit because we have the opportunity and the freedom.

LS: Does it bug you that the NSA uses your research?

NC: To some extent, but you can’t control that. Systems of power will use whatever is available to them. E.g., they use the Internet, much of which was developed right here at MIT by scientists who wanted to communicate freely. You can’t prevent the powers from using it for bad goals.

BG: Yes, if you use a free online service, you’re the product. But if you use a for-pay service, you’re still the product. My phone tracks me and my social network. I’m paying Verizon about $1,000/year for the service, and VZ is now collecting and selling my info. The NSA couldn’t do its job as well if the commercial entities weren’t collecting and selling personal data. The NSA has been tapping into the links between their data centers. Google is racing to fix this, but a cynical way of putting this is that Google is saying “No one gets to spy on our customers except us.”

LS: Is there a way to solve this?

BG: I have great faith that transparency will enable the development of good policy. The more we know, the more we can design policies to keep power in its place. Before this, you couldn’t shop for privacy. Now a free market for privacy is developing as the providers are telling us more about what they’re doing. Transparency allows legislation and regulation to be debated. The House Repubs came within 8 votes of prohibiting call data collection, which would have been unthinkable before Snowden. And there’s hope in the judiciary.

NC: We can do much more than transparency. We can make use of the available info to prevent surveillance. E.g., we can demand the defeat of TPP. And now hardware in computers is being designed to detect your every keystroke, leading some Americans to be wary of Chinese-made computers, but the US manufacturers are probably doing it better. And manufacturers for years have been trying to design fly-sized drones to collect info; those will be around soon. Drones are a perfect device for terrorists. We can learn about this and do something about it. We don’t have to wait until it’s exposed by Wikileaks. It’s right there in mainstream journals.

LS: Are you calling for a political movement?

NC: Yes. We’re going to need mass action.

BG: A few months ago I noticed a small gray box with an EPA logo on it outside my apartment in NYC. It monitors energy usage, which is useful for preventing brownouts. But it measures down to the apartment level, which could be useful to the police trying to establish your personal patterns. There’s no legislation or judicial review of the use of this data. We can’t turn back the clock. We can try to draw boundaries, and then have sufficient openness so that we can tell if they’ve crossed those boundaries.

LS: Bart, how do you manage the flow of info from Snowden?

BG: Snowden does not manage the release of the data. He gave it to three journalists and asked us to use our best judgment — he asked us to correct for his bias about what the most important stories are — and to avoid direct damage to security. The documents are difficult. They’re often incomplete and can be hard to interpret.

Q&A

Q: What would be a first step in forming a popular movement?

NC: Same as always. E.g., the women’s movement began in the 1960s (at least in the modern movement) with consciousness-raising groups.

Q: Where do we draw the line between transparency and privacy, given that we have real enemies?

BG: First you have to acknowledge that there is a line. There are dangerous people who want to do dangerous things, and some of these tools are helpful in preventing that. I’ve been looking for stories that elucidate big policy decisions without giving away specifics that would harm legitimate action.

Q: Have you changed the tools you use?

BG: Yes. I keep notes encrypted. I’ve learned to use the tools for anonymous communication. But I can’t go off the grid and be a journalist, so I’ve accepted certain trade-offs. I’m working much less efficiently than I used to. E.g., I sometimes use computers that have never touched the Net.

Q: In the women’s movement, at least 50% of the population stood to benefit. But probably a large majority of today’s population would exchange their freedom for convenience.

NC: The trade-off is presented as being for security. But if you read the documents, the security issue is how to keep the govt secure from its citizens. E.g., Ellsberg kept a volume of the Pentagon Papers secret to avoid affecting the Vietnam negotiations, although I thought the volume really only would have embarrassed the govt. Security is in fact not a high priority for govts. The US govt is now involved in the greatest global terrorist campaign that has ever been carried out: the drone campaign. Large regions of the world are now being terrorized. If you don’t know if the guy across the street is about to be blown away, along with everyone around, you’re terrorized. Every time you kill an Al Qaeda terrorist, you create 40 more. It’s just not a concern to the govt. In 1950, the US had incomparable security; there was only one potential threat: the creation of ICBM’s with nuclear warheads. We could have entered into a treaty with Russia to ban them. See McGeorge Bundy’s history. It says that he was unable to find a single paper, even a draft, suggesting that we do something to try to ban this threat of total instantaneous destruction. E.g., Reagan tested Russian nuclear defenses that could have led to horrible consequences. Those are the real security threats. And it’s true not just of the United States.


October 25, 2013

[dplafest] Advanced Research and the DPLA

I’m at a DPLAfest session. Jean Bauer (Digital Humanities Librarian, Brown U.), Jim Egan (English Prof, Brown), Kathryn Shaughnessy (Assoc. Prof, University Libraries, St. John’s U), and David Smith (Ass’t Prof, CS, Northeastern).

Rather than liveblogging in this blog, I contributed to the collaboratively-written Google Doc designated for the session notes. It’s here.


[dplafest] Dan Cohen opens DPLA meeting

Dan Cohen has some announcements in his welcome to the DPLAfest.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

The collection now has 5M items. These come from partner hubs (large institutions) and service hubs (aggregations of smaller providers). Three new hubs have joined, bringing the total to nine, from NY, North Carolina, and Texas. Dan stresses the diversity of contributors.

The DPLA sends visitors back to the contributing organizations. E.g., Minnesota Reflections is up 55% in visitors and 62% in unique visitors over the year since it joined the DPLA.

He also announces the DPLA Bookshelf, which is a contribution from the Harvard Library Innovation Lab that I co-direct. It’s an embedded version of the Stacklife browser, which you can see by going to DP.LA and searching for a book. (You can use the Harvard version here.)

Dan announces a $1M grant from the Bill & Melinda Gates Foundation to help local libraries curate material in the DPLA and start scanning in local collections. Also, an anonymous donor gave $450,000. [I don’t want to say who it was, but, well, you’re welcome.] Dan Cohen suggests we become a sponsor at http://www.dp.la/donate. T-shirts and, yes, tote bags.

There have been 1.7M uses of the DPLA API as of September 2013. He points to examples of work already done.

Dan talks about DPLA Local, an idea that would enable local communities to use the services the DPLA provides.

Dan says that all of the sessions have Google Docs already set up for collaborative note-taking [an approach I’m very fond of].


October 20, 2013

[templelib] Temple Univ. library symposium

On Friday I had the pleasure and honor of attending a symposium about libraries as part of the inaugural festivities welcoming Temple University’s new president, Neil Theobald.

The event, put together by Joe Lucia, the Dean of Temple U. Library, featured an amazing set of library folks. It was awesome to have some time to hang out with such an accomplished group of people who not only share values, but share values that are so core to our culture.

I liveblogged the talks, with my usual unreliable haphazardness and cavalier attitude toward accuracy and comprehension. Here are the links, in chronological order (which of course is the reverse of blogological order):

  1. James Neal: 26 truths about libraries

  2. Siobhan Reardon: Renewing Philadelphia’s public libraries

  3. Nancy Kranich: Engaging the academic community

  4. Rachel Frick: Innovation outward

  5. Anne Kenney: Cornell’s hiphop collection

  6. Bryn Geffert: Libraries as publishers

  7. Charles Watkinson: Univ. press partnerships

  8. Craig Dykers: Architecting libraries

I led off the session with a talk about why the networking of knowledge and ideas, especially in college communities, should encourage libraries to develop themselves as platforms in addition to being portals.


[templelib] Charles Watkinson: “The Library in the Digital Age”

At Temple University’s symposium in honor of the inauguration of the University’s new president, on Oct. 18, 2013.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

Charles Watkinson is Director, Purdue Univ. Press. He says he wishes everyone were like Bryn [see prior post]. But univ. presses generally only receive 15% of their income from the university. So, Bryn’s model isn’t generally applicable.

His toddlers watch Dinosaur Train. “I know you perceive university presses as dinosaurs” but as in the show, some dinosaurs are different from others.

John Thompson in Books in the Digital Age talks about “publishing fields.” He says it’s complex but not without order. We’re seeing the emergence of several different mission-driven publishers: university presses, scholarly societies, library presses. He will talk about university and library presses. (He points to Envisioning Emancipation as a univ. press at its best.) He goes through some of the similarities and differences between the two kinds of presses.

He takes as a case study the Purdue U Press and Purdue Scholarly Publishing Services as an example of how these types of presses can be complementary. (He mentions Anne Kenney’s partnering of Cornell Library with Duke U Press on Project Euclid.)

The aim, Charles says, is to meet the full spectrum of needs, ranging from pre-print to published books. He points to the differences in brand styles of the two and how they can be merged.

So, “What can we do together that we couldn’t do apart?”

“We can serve campus needs better.” He points to the Journal of Purdue Undergraduate Research, which combines library skills (instruction, assessment, institutional outreach) with publisher skills (solicitation for content, project management, editing, design).

Also, together they can support disciplines. E.g., HABRI Central. Library skills: bibliographic research, taxonomy, metadata, licensing, preservation. Publisher skills: financial management, acquisition of original content, marketing.

Also, solve issues in the system. E.g., the underlying data behind tech reports, e.g., JTRP. Library skills: digitization, metadata, online hosting, linked data, preservation. Publisher skills: peer review administration, process redesign, project management.

Questions for these merged entities: What disciplines can best be served together? How to build credibility? How to turn projects into programs? What is the future role of earned revenues? Will all products be Open Access? What is the sustainability plan for OA?

Maybe libraries should turn to university presses for advice and help with engagement since “that’s what university presses do.”


Bryn Geffert: Libraries as publishers

At Temple University’s symposium in honor of the inauguration of the University’s new president, on Oct. 18, 2013.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

Bryn Geffert is College Librarian at Amherst.

Imagine a biologist at Amherst who writes a science article. Who paid for her to write that article? Amherst. But who paid Amherst? Students. Alumni and donors. US funds.

Now it’s accepted by Elsevier. The biologist gives it to Elsevier as a gift, in effect. Elsevier charges Amherst $24,000/year for a subscription to this particular journal. It’s Looney Tunes, Bryn says. There isn’t a worse imaginable model.

Since 1986, serial [= journal] prices have increased 400%. Why? Because a few publishers have a monopoly: Wiley, Elsevier, Springer. With increasing prices for serials, libraries have less money for books. In 1986, academic libraries spent 46% of their budgets on books. Now it’s down to 22%. And the effect on book publishers is even worse: when they can’t sell books to libraries, they shut down publishing in entire disciplinary fields. The average sale per academic book is now 200 copies. Since 1993, 5 disciplines have lost presses. E.g., the number of presses serving British Lit has dropped by about half. More and more academic works are going to bad commercial presses — bad in that they don’t improve what they get.

And these are just the problems of wealthy institutions. How about the effect on developing countries? He gives three examples of work of direct relevance to local cultures where the local culture cannot afford to buy the work.

University presses are dying. Money to purchase anything except journals is dying. Academic presses are dying. And we’re paying no attention to the world around us.

Why does Amherst care? Their motto is “terras irradient”: light the world. But nothing in this publishing model supports that mission.

What do we have to do? He goes through these quickly because, he says, we are familiar with them:

  1. Open Access policies
  2. Legislation that mandates that federally supported research be Open Access
  3. Go after the monopolies that are violating anti-trust
  4. Libraries have to boycott offenders.

But even so, we need to design a new system.

Amherst is asking what the mission of a university press is. Part of it: make good work even better and make it as widely available as possible.

What is the mission of the academic libraries? Make good info as widely available as possible.

So, combine forces. U of Mich put its press under the library. This inspired Amherst. But Amherst doesn’t have a press. So, they’re creating one.

  • Everything will be online, Open Access (Creative Commons)

  • They will hustle to get manuscripts

  • All will be peer reviewed and rigorously edited

But how will they pay for it? Amherst’s Frost Library is giving two positions to the press. In return, those editors will solicit manuscripts. The President will raise money to endow a chair of the editor of the press. They’ll take some money from the Library to pay freelancers for copy-editing. Some other units at Amherst are kicking in other services, including design and building an online platform.

People say this is too small to make a difference. But other schools are starting to do similar things. This means that Amherst is a recipient of free content from them. Bryn can imagine a time when there’s so much OA content that the savings realized offset the costs of publishing OA content.

The goal is to move away from individual presses looking out for their own interests to one in which there’s free sharing. “I want to see a world in which the students at a university in Nairobi have access to the same information as students at Columbia.”


[templelib] Rachel Frick: Digital Library Federation

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

At Temple University’s symposium in honor of the inauguration of the University’s new president, on Oct. 18, 2013.

[I came in late. Sorry!!]

Rachel Frick is talking about the importance of the Commons. Too often, she says, librarians come into the conversation as if they’re from a bounded place. We keep producing the same solutions to different problems. (She recommends Steven Johnson’s Where Good Ideas Come From. She earlier recommended Networked by Lee Rainie and Barry Wellman. [I concur with these recommendations!])

Rachel says she likes SxSW for idea sharing. She was talking with Bonnie Tijerina and they came up with the idea of the Idea Drop house for librarians at SxSW for livestreaming conversations. [I did one last year! It was a very cool venue: an AirBnB residence with librarians and refreshments. What more could you want?] They had 800+ visitors. [This is even more impressive since the house was not on the main campus of SxSW.]

She worked with DPLA, Europeana and OpenGLAM on “Culture Hack”: use our data! Also meetups at SxSW. Also, LibraryBox: an instant wifi distribution point run on a battery for distribution of library content. They used it to distribute tons of open content at the conference. It was a great way to engage people in conversation about libraries.

Jason Griffey wanted to upgrade the LibraryBoxes. He needed about $3K, and he needed to make a case for the project. So what are some non-library-centric use cases? Health care info in remote areas. Unmonitored conversations. He raised $13K in 4 days on Kickstarter. At the end of 30 days, he’d raised $33K. Because he could reach beyond the library space, and because it spoke to open access to info, it succeeded.

Now is the time for creators and makers, she says. Bess Sadler talks about the hacker epistemology: adopt a problem solving mindset, the truth is what works, solve for interesting. Bethany Nowviskie at Code4Lib a few years ago talked about the creative mindset: meticulous, practical, an impulse to build and maintain, and to suffer fools gladly. Kathy Sierra talks about how you get over The Big Frickin’ Wall between incremental changes and transformation. John Voss, who works for HistoryPin [and organizer of LODLAM], says you get over the wall by connecting what we do to a greater purpose.

“The mission of librarians is to improve society through facilitating knowledge creation in their communities” David Lankes, Atlas of New Librarianship. This is how Linked Data will be made real, Rachel says. She cites the LODLAM conference, and DPLA: intracommunity conversation.

