Joho the Blog: social science Archives

January 2, 2014

[2b2k] Social Science in the Age of Too Big to Know

Gary King [twitter:kinggarry], Director of Harvard’s Institute for Quantitative Social Science, has published an article (Open Access!) on the current status of this branch of science. Here’s the abstract:

The social sciences are undergoing a dramatic transformation from studying problems to solving them; from making do with a small number of sparse data sets to analyzing increasing quantities of diverse, highly informative data; from isolated scholars toiling away on their own to larger scale, collaborative, interdisciplinary, lab-style research teams; and from a purely academic pursuit focused inward to having a major impact on public policy, commerce and industry, other academic fields, and some of the major problems that affect individuals and societies. In the midst of all this productive chaos, we have been building the Institute for Quantitative Social Science at Harvard, a new type of center intended to help foster and respond to these broader developments. We offer here some suggestions from our experiences for the increasing number of other universities that have begun to build similar institutions and for how we might work together to advance social science more generally.

In the article, Gary argues that Big Data requires Big Collaboration to be understood:

Social scientists are now transitioning from working primarily on their own, alone in their offices – a style that dates back to when the offices were in monasteries – to working in highly collaborative, interdisciplinary, larger scale, lab-style research teams. The knowledge and skills necessary to access and use these new data sources and methods often do not exist within any one of the traditionally defined social science disciplines and are too complicated for any one scholar to accomplish alone.

He begins by giving three excellent examples of how quantitative social science is opening up new possibilities for research.

1. Latanya Sweeney [twitter:LatanyaSweeney] found “clear evidence of racial discrimination” in the ads served up by newspaper websites.

2. A study of all 187M registered voters in the US showed that a third of those listed as “inactive” in fact cast ballots, “and the problem is not politically neutral.”

3. A study of 11M social media posts from China showed that the Chinese government is not censoring speech but is censoring “attempts at collective action, whether for or against the government…”

Studies such as these “depended on IQSS infrastructure, including access to experts in statistics, the social sciences, engineering, computer science, and American and Chinese area studies.”

Gary also points to “the coming end of the quantitative-qualitative divide” in the social sciences, as new techniques enable massive amounts of qualitative data to be quantified, enriching purely quantitative data and extracting additional information from the qualitative reports.

Instead of quantitative researchers trying to build fully automated methods and qualitative researchers trying to make do with traditional human-only methods, now both are heading toward using or developing computer-assisted methods that empower both groups.

We are seeing a redefinition of social science, he argues:

We instead use the term “social science” more generally to refer to areas of scholarship dedicated to understanding, or improving the well-being of, human populations, using data at the level of (or informative about) individual people or groups of people.

This definition covers the traditional social science departments in faculties of schools of arts and science, but it also includes most research conducted at schools of public policy, business, and education. Social science is referred to by other names in other areas but the definition is wider than use of the term. It includes what law school faculty call “empirical research,” and many aspects of research in other areas, such as health policy at schools of medicine. It also includes research conducted by faculty in schools of public health, although they have different names for these activities, such as epidemiology, demography, and outcomes research.

The rest of the article reflects on pragmatic issues, including what this means for the sorts of social science centers to build, since community is “by far the most important component leading to success…” “If academic research became part of the X-games, our competitive event would be ‘extreme cooperation.’”


November 2, 2010

[berkman] Dave Rand on social science on the Net

Dave Rand is giving a lunchtime Berkman talk titled “The Online Laboratory: Taking Experimental Social Science onto the Internet.” It is based on a paper with John Horton and Richard Zeckhauser called “The Online Laboratory: Conducting Experiments in a Real Labor Market,” which is available online (go to Dave’s homepage).

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He begins by warning against mistaking correlation for causation. But, he says, you of course need more than that. For one thing, you need subjects, time and money. “In social psych, they use smart tricks.” In experimental economics (Dave’s field), they use monetary incentives. In field studies, they use surprise. The Internet can help in all these fields: Easy recruitment, many subjects, little effort. But few economists are using online experiments because it’s harder to pay people to participate … until online labor markets came along that make it easy to recruit and pay subjects.

Dave is going to focus on Amazon’s Mechanical Turk as an online labor market, although there are many others out there. The name comes from an 18th-century chess-playing robot that actually had a person hidden inside. Amazon’s MT farms out tasks to lots of people who are paid relatively little. They pay small amounts of money for short tasks that are easy for humans but hard for computers: labeling photos, completing surveys, etc. “Human Intelligence Tasks” = HITs. (Dave notes that he has no affiliation with Amazon.) The minimum reservation (= min amt you’re willing to work for) is $1.38. Dave usually pays a $0.20-0.40 flat rate per task, and a bonus of up to a dollar for each 5 min task. Last time he did this, he got 1,500 people in 2.5 days.

Q: What about selection bias?
A: Most of this talk is about why using the Net is a reasonable approach to economics research.

Who are the Turkers? Modal age is around 30, with a tail going out through the 40s and 60s. 35% from US, 45% from India, and the rest from everywhere else. (Locations are self-reported.) You can limit respondents by country, over 18, and satisfactory previous work at MT. It’s completely anonymous to the experimenter. Amazon takes 10%. The motivation overwhelmingly is to make money, which Dave likes because you have a way of controlling the size of the incentive and because that’s how experimental economics works. Of course, you don’t know if they’re watching TV at the same time (or whatever).

Q: How much data do you have to throw away?
A: A fair bit. At the end we often ask a survey question that depends upon people having read the instructions. About 30% of the people fail.

Q: Is there going to be legislation or IRB regs to prevent paying people under the minimum wage?
A: I think most of what I’m running pay around min wage, but it’s a good question.
Q: Miriam Cherry is saying we ought to pay min wage.
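[An aside, not from the talk: a rough sketch of the hourly rate implied by the pay figures mentioned above. It assumes a Turker does 5-minute tasks back to back, which is my simplification, and the function name is made up for illustration. Amounts are in cents to keep the arithmetic simple.]

```python
def implied_hourly_rate(flat_cents: int, bonus_cents: int, task_minutes: int) -> float:
    """Return the hourly pay in dollars implied by one task's pay.

    Assumes tasks are done back to back with no idle time (a simplification).
    """
    tasks_per_hour = 60 / task_minutes
    return (flat_cents + bonus_cents) * tasks_per_hour / 100


# Flat rate only, at the low and high ends of the $0.20-0.40 range.
low_flat = implied_hourly_rate(20, 0, 5)
high_flat = implied_hourly_rate(40, 0, 5)

# High flat rate plus the maximum $1 bonus.
high_with_bonus = implied_hourly_rate(40, 100, 5)

print(low_flat, high_flat, high_with_bonus)
```

On these assumptions the flat rate alone works out to a few dollars an hour, so whether a study pays around minimum wage depends heavily on how much of the bonus people actually earn.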

Dave points out it’s really hard to use MT to do experiments that require feedback.

Education level of the Turkers: Most in the US and elsewhere have a bachelor’s degree. Most people not from the US are making less than $15,000/yr.

What’s great about it: Fast, cheap, easy, incentive-compatible, cross-cultural (with some strange bias — US Turkers may have diff motives than Indian Turkers, etc.) and great potential for field studies.

Isn’t this a biased sample? Yes, but most experimental economic studies are done on college undergrads. You get much more age, SES, geographic variation. You do lose control over what else people are doing while doing the experiment. So, you need replication studies.

He looks at an aggregation of results about the dictator game and the trust game, for a dollar or for nothing, and it turns out that stakes don’t matter as much as economists thought.

Dave’s group has replicated various experiments. E.g., they found that priming works (read a religious passage before playing the prisoner’s dilemma), although it depends on the person (cooperation goes down for those who don’t believe in God and goes up for those who do).

Dave’s IRB does not require consent from the subjects since what he’s doing is so consistent with what goes on at MT.

Overall, Dave thinks it’s great. It opens the door to anyone with a research idea. OTOH, it may result in the “file drawer” problem: Only positive results get published. Journals ought to require scientists to say how many times they’ve run the experiment.

Dave goes through a few cool studies, which I will not get right. But I see Ethan Zuckerman liveblogging. Go there now.

Pitfalls: Turkers don’t pay attention. So, put in attention-checking questions. Likewise for understanding instructions. Also, non-random attrition, i.e., people who leave after the first couple of questions because they don’t like them. So, check for it, and give them an initial “hook task,” such as transcribing text, but they don’t get paid until they’ve done the second part.
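[Another aside, not from the talk: a minimal sketch of what those two checks might look like once the data is in hand. The record layout, field names, and sample rows are all invented for illustration. This is not Dave’s actual pipeline, and a real study would follow up with a statistical test rather than eyeballing the rates.]

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Response:
    worker_id: str
    condition: str          # experimental arm the worker was assigned to
    passed_attention: bool  # answered the attention-check question correctly
    completed: bool         # finished the whole task


def keep_attentive(responses):
    """Keep only workers who finished and passed the attention check."""
    return [r for r in responses if r.completed and r.passed_attention]


def attrition_by_condition(responses):
    """Share of workers in each condition who started but did not finish.

    Very different rates across conditions suggest non-random attrition.
    """
    started = Counter(r.condition for r in responses)
    dropped = Counter(r.condition for r in responses if not r.completed)
    return {cond: dropped[cond] / started[cond] for cond in started}


# Made-up sample data, purely for illustration.
sample = [
    Response("w1", "control", True, True),
    Response("w2", "control", False, True),
    Response("w3", "control", True, True),
    Response("w4", "control", True, False),
    Response("w5", "treatment", True, True),
    Response("w6", "treatment", True, False),
    Response("w7", "treatment", False, False),
    Response("w8", "treatment", True, True),
]

usable = keep_attentive(sample)
print(len(usable), "usable responses out of", len(sample))
print(attrition_by_condition(sample))
```

In this toy data the treatment arm loses twice the share of workers that the control arm does, which is the kind of gap that should make you suspicious of any comparison between the arms.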

People ask how it feels to run a sweatshop, but Dave hasn’t lost sleep about it. For one thing, he pays generously compared to MT wages.

Q: Does this crowd out people willing to donate work for free, e.g., at Wikipedia?

Q: Is it easier to believe you’re anonymous on Turk than if you’re doing real world experiments?
A: Don’t know. Could be an empirical study of that.