Joho the Blog: May 2017

May 29, 2017

The Internet is an agreement

Jaap van Till has posted an aggregation of thoughts and links to remind us of what it seems we have so much trouble remembering: The Internet is not a thing but an agreement.

An internet, network of networks, is a voluntary agreement among network operators to exchange traffic for their mutual benefit. (The Internet is a prototype internet.) That’s all — it’s an agreement.

That’s from an earlier post by Jaap, which along the way links out to the World of Ends post that Doc Searls and I wrote in 2003 that aimed at explaining the Internet to legislators.

I sense that we are due for a shift in tides, maybe over the next two years, in which the point that needs making is not that the Internet is dangerous and sucks, but that it is dangerous and sucks and is the greatest invention in the history of our species. (Cf. Virginia Heffernan, Magic and Loss.)

This pendulum swing can’t come soon enough.

May 18, 2017

Indistinguishable from prejudice

“Any sufficiently advanced technology is indistinguishable from magic,” said Arthur C. Clarke famously.

It is also the case that any sufficiently advanced technology is indistinguishable from prejudice.

Especially if that technology is machine learning. ML creates algorithms to categorize stuff based upon data sets that we feed it. Say “These million messages are spam, and these million are not,” and ML will take a stab at figuring out what are the distinguishing characteristics of spam and not spam, perhaps assigning particular words particular weights as indicators, or finding relationships between particular IP addresses, times of day, lengths of messages, etc.
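To make the "particular words, particular weights" idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any real spam filter is built: the function names and the four toy messages are made up, and the weighting scheme (smoothed log-odds of word frequencies, roughly what a naive Bayes classifier does) is just one simple choice among many.

```python
import math
from collections import Counter

def train_word_weights(spam_msgs, ham_msgs):
    """Return a weight per word: positive leans spam, negative leans not-spam."""
    spam_counts = Counter(w for m in spam_msgs for w in m.lower().split())
    ham_counts = Counter(w for m in ham_msgs for w in m.lower().split())
    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    weights = {}
    for word in vocab:
        # Add-one smoothing keeps words seen in only one pile from producing log(0).
        p_spam = (spam_counts[word] + 1) / (spam_total + len(vocab))
        p_ham = (ham_counts[word] + 1) / (ham_total + len(vocab))
        weights[word] = math.log(p_spam / p_ham)
    return weights

def spam_score(message, weights):
    """Sum the weights of the words in a message; unknown words count as 0."""
    return sum(weights.get(w, 0.0) for w in message.lower().split())

def is_spam(message, weights):
    return spam_score(message, weights) > 0

# Hypothetical toy data standing in for "these million messages."
spam = ["win free money now", "free prize claim now"]
ham = ["lunch meeting at noon", "see you at the meeting"]
weights = train_word_weights(spam, ham)
print(is_spam("free money now", weights))
print(is_spam("lunch meeting at noon", weights))
```

Note that the weights come entirely from the examples we hand over. If the labeled piles carry our assumptions, so will the weights.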

Now complicate the data and the request, run this through an artificial neural network, and you have Deep Learning that will come up with models that may be beyond human understanding. Ask DL why it made a particular move in a game of Go or why it recommended increasing police patrols on the corner of Elm and Maple, and it may not be able to give an answer that human brains can comprehend.

We know from experience that machine learning can re-express human biases built into the data we feed it. Cathy O’Neil’s Weapons of Math Destruction contains plenty of evidence of this. We know it can happen not only inadvertently but subtly. With Deep Learning, we can be left entirely uncertain about whether and how this is happening. We can certainly adjust DL so that it gives fairer results when we can tell that it’s going astray, as when it only recommends white men for jobs or produces a freshman class with 1% African Americans. But when the results aren’t that measurable, we can be using results based on bias and not know it. For example, is anyone running the metrics on how many books by people of color Amazon recommends? And if we use DL to evaluate complex tax law changes, can we tell if it’s based on data that reflects racial prejudices?[1]
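For the cases that are measurable, "running the metrics" can be as simple as comparing each group's share of the recommendations with its share of the pool the recommendations were drawn from. Here is a rough sketch in Python. The function names, the group labels and the numbers are all invented for illustration, and it assumes we already have a group label for every item, which is often the hard part.

```python
from collections import Counter

def group_shares(groups):
    """Fraction of items carrying each group label."""
    counts = Counter(groups)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def representation_ratios(recommended, population):
    """Each group's share of the recommendations divided by its share of the pool.

    1.0 means proportional; well below 1.0 means under-represented.
    """
    rec = group_shares(recommended)
    pop = group_shares(population)
    return {g: rec.get(g, 0.0) / share for g, share in pop.items()}

# Hypothetical numbers: a pool that is 60% group "a" and 40% group "b",
# and a set of ten recommendations containing only one "b".
population = ["a"] * 60 + ["b"] * 40
recommended = ["a"] * 9 + ["b"] * 1
print(representation_ratios(recommended, population))
```

A check like this only tells us that a skew exists, not where in the model it came from, and it does nothing for outcomes that have no obvious count to compare against.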

So this is not to say that we shouldn’t use machine learning or deep learning. That would remove hugely powerful tools. And of course we should and will do everything we can to keep our own prejudices from seeping into our machines’ algorithms. But it does mean that when we are dealing with literally inexplicable results, we may well not be able to tell if those results are based on biases.

In short: Any sufficiently advanced technology is indistinguishable from prejudice.[2]

[1] We may not care, if the result is a law that achieves the social goals we want, including equal and fair treatment of taxpayers regardless of race.

[2] Please note that that does not mean that advanced technology is prejudiced. We just may not be able to tell.

May 15, 2017

[liveblog][AI] AI and education lightning talks

Sara Watson, a BKC affiliate and a technology critic, is moderating a discussion at the Berkman Klein/Media Lab AI Advance.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Karthik Dinakar at the Media Lab points out that what we see in the night sky is in fact distorted by the way gravity bends light, which Einstein called a “gravity lens.” Same for AI: The distortion is often in the data itself. Karthik works on how to help researchers recognize that distortion. He gives an example of how to capture both cardiologist and patient lenses to better diagnose women’s heart disease.

Chris Bavitz is the head of BKC’s Cyberlaw Clinic. To help Law students understand AI and tech, the Clinic encourages interdisciplinarity. They also help students think critically about the roles of the lawyer and the technologist. The clinic prefers early relationships among them, although thinking too hard about law early on can diminish innovation.

He points to two problems that represent two poles. First, IP and AI: running AI against protected data. Second, issues of fairness, rights, etc.

Leah Plunkett is a professor at Univ. New Hampshire Law School and is a BKC affiliate. Her topic: How can we use AI to teach? She points out that if Tom Sawyer were real and alive today, he’d be arrested for what he does just in the first chapter. Yet we teach the book as a classic. We think we love a little mischief in our lives, but we apparently don’t like it in our kids. We kick them out of schools. E.g., of 49M students in public schools in 2011, 3.45M were suspended, and 130,000 students were expelled. These punishments disproportionately affect children from marginalized segments.

Get rid of the BS safety justification and the govt ought to be teaching all our children without exception. So, maybe have AI teach them?

Sara: So, what can we do?

Chris: We’re thinking about how we can educate state attorneys general, for example.

Karthik: We are so far from getting users, experts, and machine learning folks together.

Leah: Some of it comes down to buy-in and translation across vocabularies and normative frameworks. It helps to build trust to make these translations better.

[I missed the QA from this point on.]

[liveblog][AI] Perspectives on community and AI

Chelsea Barabas is moderating a set of lightning talks at the AI Advance, at Berkman Klein and MIT Media Lab.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Lionel Brossi recounts growing up in Argentina and the assumption that all boys care about football. He moved to Chile, which is split between people who do and do not watch football. “Humans are inherently biased.” So, our AI systems are likely to be biased. Cognitive science has shown that the participants in its studies tend to be WEIRD: Western, educated, industrialized, rich and democratic. Also straight and white. He references Kate Crawford’s “AI’s White Guy Problem.” We need not only diverse teams of developers, but also to think about how data can be more representative. We also need to think about the users. One approach is to work on goal-centered design.

If we ever get to unbiased AI, Borges’ statement, “The original is unfaithful to the translation” may apply.

Chelsea: What is an inclusive way to think of cross-border countries?

Lionel: We need to co-design with more people.

Madeline Elish is at Data and Society and an anthropology of technology grad student at Columbia. She’s met designers who thought it might be good to make a phone run faster if you yell at it. But this would train children to yell at things. What’s the context in which such designers work? She and Tim Hwang set about to build bridges between academics and businesses. They asked what designers see as their responsibility for the social implications of their work. They found four core challenges:

1. Assuring users perceive good intentions
2. Protecting privacy
3. Long term adoption
4. Accuracy and reliability

She and Tim wrote An AI Pattern Language [pdf] about the frameworks that guide design. She notes that none of them were thinking about social justice. The book argues that there’s a way to translate between the social justice framework and, for example, the accuracy framework.

Ethan Zuckerman: How much of the language you’re seeing feels familiar from other hype cycles?

Madeline: Tim and I looked at the history of autopilot litigation to see what might happen with autonomous cars. We should be looking at Big Data as the prior hype cycle.

Yarden Katz is at the BKC and at the Dept. of Systems Biology at Harvard Medical School. He talks about the history of AI, starting with a 1958 claim about a translation machine. 1966: Minsky. Then there was an AI funding winter, but now it’s big again. “Until recently, AI was a dirty word.”

Today we use it schizophrenically: for Deep Learning or in a totally diluted sense as something done by a computer. “AI” now seems to be a branding strategy used by Silicon Valley.

“AI’s history is diverse, messy, and philosophical.” If complexity is embraced, “AI” might not be a useful category for policy. So we should go back to the politics of technology:

1. Who controls the code/frameworks/data?
2. Is the system inspectable/open?
3. Who sets the metrics? Who benefits from them?

The media are not going to be the watchdogs because they’re caught up in the hype. So who will be?

Q: There’s a qualitative difference in the sort of tasks now being turned over to computers. We’re entrusting machines with tasks we used to only trust to humans with good judgment.

Yarden: We already do that with systems that are not labeled AI, like “risk assessment” programs used by insurance companies.

Madeline: Before AI got popular again, there were expert systems. We are reconfiguring our understanding, moving it from a cognition frame to a behavioral one.

Chelsea: I’ve been involved in co-design projects that have backfired. These projects have sometimes been somewhat extractive: going in, getting lots of data, etc. How do we do co-design projects that are not extractive but that also aren’t prohibitively expensive?

Nathan: To what degree does AI change the dimensions of questions about explanation, inspectability, etc.?

Yarden: The promoters of the Deep Learning narrative want us to believe you just need to feed in lots and lots of data. DL is less inspectable than other methods. DL is not learning from nothing. There are open questions about their inductive power.


Amy Zhang and Ryan Budish give a pre-alpha demo of the AI Compass being built at BKC. It’s designed to help people find resources exploring topics related to the ethics and governance of AI.

[liveblog] AI Advance opening: Jonathan Zittrain and lightning talks

I’m at a day-long conference/meet-up put on by the Berkman Klein Center’s and MIT Media Lab’s “AI for the Common Good” project.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Jonathan Zittrain gives an opening talk. Since we’re meeting at Harvard Law, JZ begins by recalling the origins of what has been called “cyber law,” which has roots here. Back then, the lawyers got to the topic first, and thought that they could just think their way to policy. We are now at another signal moment as we are in a frenzy of building new tech. This time we want instead to involve more groups and think this through. [I am wildly paraphrasing.]

JZ asks: What is it that we intuitively love about human judgment, and are we willing to insist on human judgments that are worse than what a machine would come up with? Suppose for utilitarian reasons we can cede autonomy to our machines — e.g., autonomous cars — shouldn’t we? And what do we do about maintaining local norms? E.g., “You are now entering Texas where your autonomous car will not brake for pedestrians.”

“Should I insist on being misjudged by a human judge because that’s somehow artisanal?” when, ex hypothesi, an AI system might be fairer.

Autonomous systems are not entirely new. They’re bringing to the fore questions that have always been with us. E.g., we grant a sense of discrete intelligence to corporations. E.g., “McDonald’s is upset and may want to sue someone.”

[This is a particularly bad representation of JZ’s talk. Not only is it wildly incomplete, but it misses the through-line and JZ’s wit. Sorry.]

Lightning Talks

Finale Doshi-Velez is particularly interested in interpretable machine learning (ML) models. E.g., suppose you have ten different classifiers that give equally predictive results. Should you provide the most understandable, all of them…?

Why is interpretability so “in vogue”? Suppose non-interpretable AI can do something better? In most cases we don’t know what “better” means. E.g., someone might want to control her glucose level, but perhaps also to control her weight, or other outcomes? Human physicians can still see things that are not coded into the model, and that will be the case for a long time. Also, we want systems that are fair. This means we want interpretable AI systems.

How do we formalize these notions of interpretability? How do we do so for science and beyond? E.g., what does a legal “right to explanation” mean? She is working with Sam Gershman on how to more formally ground AI interpretability in the cognitive science of explanation.

Vikash Mansinghka leads the eight-person Probabilistic Computing project at MIT. They want to build computing systems that can be our partners, not our replacements. We have assumed that the measure of success of AI is that it beats us at our own game, e.g., AlphaGo, Deep Blue, Watson playing Jeopardy! But games have clearly measurable winners.

His lab is working on augmented intelligence that gives partial solutions, guidelines and hints that help us solve problems that neither system could solve on its own. The need for these systems is most obvious in large-scale human interest projects, e.g., epidemiology, economics, etc. E.g., should a successful nutrition program in SE Asia be tested in Africa too? There are many variables (including cost). BayesDB, developed by his lab, is “augmented intelligence for public interest data science.”

In traditional computer science, computing systems are built up from circuits to algorithms. Engineers can trade off performance for interpretability. Probabilistic systems have some of the same considerations. [Sorry, I didn’t get that last point. My fault!]

John Palfrey is a former Exec. Dir. of BKC, chair of the Knight Foundation (a funder of this project) and many other things. Where can we, BKC and the Media Lab, be most effective as a research organization? First, we’ve had the most success when we merge theory and practice. And building things. And communicating. Second, we have not yet defined the research question sufficiently. “We’re close to something that clearly relates to AI, ethics and government” but we don’t yet have the well-defined research questions.

The Knight Foundation thinks this area is a big deal. AI could be a tool for the public good, but it also might not be. “We’re queasy” about it, as well as excited.

Nadya Peek is at the Media Lab and has been researching “machines that make machines.” She points to the first computer-controlled machine (“Teaching Power Tools to Run Themselves”) where the aim was precision. People controlled these CCMs: programmers, CAD/CAM folks, etc. That’s still the case but it looks different. Now the old jobs are being done by far fewer people. But the spaces between don’t always work so well. E.g., Apple can define an automatable workflow for milling components, but if you’re a student doing a one-off project, it can be very difficult to get all the integrations right. The student doesn’t much care about a repeatable workflow.

Who has access to an Apple-like infrastructure? How can we make precision-based one-offs easier to create? (She teaches a course at MIT called “How to create a machine that can create almost anything.”)

Nathan Matias, MIT grad student with a newly-minted Ph.D. (congrats, Nathan!), and BKC community member, is facilitating the discussion. He asks how we conceptualize the range of questions that these talks have raised. And, what are the tools we need to create? What are the social processes behind that? How can we communicate what we want to machines and understand what they “think” they’re doing? Who can do what, where that raises questions about literacy, policy, and legal issues? Finally, how can we get to the questions we need to ask, how to answer them, and how to organize people, institutions, and automated systems? Scholarly inquiry, organizing people socially and politically, creating policies, etc.? How do we get there? How can we build AI systems that are “generative” in JZ’s sense: systems that we can all contribute to on relatively equal terms and share with others?

Nathan: Vikash, what do you do when people disagree?

Vikash: When you include the sources, you can provide probabilistic responses.

Finale: When a system can’t provide a single answer, it ought to provide multiple answers. We need humans to give systems clear values. AI things are not moral, ethical things. That’s us.

Vikash: We’ve made great strides in systems that can deal with what may or may not be true, but not in terms of preference.

Nathan: An audience member wants to know what we have to do to prevent AI from repeating human bias.

Nadya: We need to include the people affected in the conversations about these systems. There are assumptions about the independence of values that just aren’t true.

Nathan: How can people not close to these systems be heard?

JP: Ethan Zuckerman, can you respond?

Ethan: One of my colleagues, Joy Buolamwini, is working on what she calls the Algorithmic Justice League, looking at computer vision algorithms that don’t work on people of color. In part this is because the tests used to train CV systems are 70% white male faces. So she’s generating new sets of facial data that we can retest on. Overall, it’d be good to use test data that represents the real world, and to make sure a representation of humanity is working on these systems. So here’s my question: We find co-design works well: bringing in the affected populations to talk with the system designers?

[Damn, I missed Yochai Benkler’s comment.]

Finale: We should also enable people to interrogate AI when the results seem questionable or unfair. We need to be thinking about the processes for resolving such questions.

Nadya: It’s never “people” in general who are affected. It’s always particular people with agendas, from places and institutions, etc.

May 11, 2017

[liveblog] St. Goodall

I’m in Rome at the National Geographic Science Festival, co-produced by Codice Edizioni which, not entirely coincidentally, published the Italian version of my book Too Big to Know. Jane Goodall is giving the opening talk to a large audience full of students. I won’t try to capture what she is saying because she is talking without notes, telling her personal story.

She embodies an inquiring mind capable of radically re-framing our ideas simply by looking at the phenomena. We may want to dispute her anthropomorphizing of chimps but it is a truth that needed to be uncovered. For example, she says that when she got to Oxford to get a graduate degree, even though she had never been to college, she was told that she shouldn’t have given the chimps names. But this, she says, was because at the time science believed humans were unique. Since then genetics has shown how close we are to them, but even before that her field work had shown the psychological and behavioral similarities. So, her re-framing was fecund and, yes, true.

At a conference in America in 1986, every report from Africa was about the decimation of the chimpanzee population and the abuse of chimpanzees in laboratories. “I went to this conference as a scientist, ready to continue my wonderful life, and I left as an activist.” Her Tacare Institute works with and for Africans. For example, local people are equipped with tablets and phones and mark chimp nests, downed trees, and the occasional leopard. (Tacare provides scholarships to keep girls in school, “and some boys too.”)

She makes a totally Dad joke about “the cloud.”

It is a dangerous world, she says. “Our intellects have developed tremendously.” “Isn’t it strange that this most intellectual creature ever is destroying its home.” She calls out the damage done to our climate by our farming of animals. “There are a lot of reasons to avoid eating a lot of meat or any, but that’s one of them.”

There is a disconnect between our beautiful brains and our hearts, she says. Violence, domestic violence, greed… “we don’t think ‘Are we having a happy life?’” She started “Roots and Shoots” in 1991 in Tanzania, and now it’s in 99 countries, from kindergartens through universities. It’s a program for young people. “We do not tell the young people what to do.” They decide what matters to them.

Her reasons for hope: 1. The reaction to Roots and Shoots. 2. Our amazing brains. 3. The resilience of nature. 4. Social media, which, if used right, can be a “tremendous tool for change.” 5. “The indomitable human spirit.” She uses Nelson Mandela as an example, but also refugees making lives in new lands.

“It’s not only humans that have an indomitable spirit.” She shows a brief video of the release of a chimp that left at least some wizened adults in tears.

She stresses making the right ethical choices, a phrase not heard often enough.

If in this audience of 500 students she has not made five new scientists, I’ll be surprised.

May 7, 2017

Predicting the tides based on purposefully false models

Newton showed that the tides are produced by the gravitational pull of the moon and the Sun. But, as a 1914 article in Scientific American pointed out, if you want any degree of accuracy, you have to deal with the fact that “the earth is not a perfect sphere, it isn’t covered with water to a uniform depth, it has many continents and islands and sea passages of peculiar shapes and depths, the earth does not travel about the sun in a circular path, and earth, sun and moon are not always in line. The result is that two tides are rarely the same for the same place twice running, and that tides differ from each other enormously in both times and in amplitude.”

So, we instead built a machine of brass, steel and mahogany. And instead of trying to understand each of the variables, Lord Kelvin postulated “a very respectable number” of fictitious suns and moons in various positions over the earth, moving in unrealistically perfect circular orbits, to account for the known risings and fallings of the tide, averaging readings to remove unpredictable variations caused by weather and “freshets.” Knowing the outcomes, he would nudge a sun or moon’s position, or add a new sun or moon, in order to get the results to conform to what we know to be the actual tidal measurements. If adding sea serpents would have helped, presumably Lord Kelvin would have included them as well.
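In modern terms, each fictitious sun or moon in a perfect circular orbit contributes one smooth wave of a fixed period, and the prediction is the sum of those waves. Here is a minimal sketch of that idea in Python. It is emphatically not a description of Kelvin's machine: the two periods (12.42 and 12.0 hours, the approximate principal lunar and solar half-day tides), the synthetic "measurements," and the function names are all assumptions for illustration, and ordinary least squares stands in for nudging the suns and moons by hand.

```python
import math

def design_row(t, periods):
    """One constant term plus a cosine and a sine for each fictitious body."""
    row = [1.0]
    for period in periods:
        w = 2 * math.pi / period
        row += [math.cos(w * t), math.sin(w * t)]
    return row

def solve(A, b):
    """Solve a small square linear system by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_tide(times, heights, periods):
    """Pick the wave sizes that best reproduce the known heights (least squares)."""
    rows = [design_row(t, periods) for t in times]
    n = len(rows[0])
    # Normal equations: (X^T X) coeffs = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * h for r, h in zip(rows, heights)) for i in range(n)]
    return solve(xtx, xty)

def predict_tide(t, coeffs, periods):
    return sum(c * x for c, x in zip(coeffs, design_row(t, periods)))

# Synthetic hourly "measurements" over 30 days, made from two known waves.
periods = [12.42, 12.0]  # hours; assumed values, not taken from the article
def measured(t):
    return (2.0 + 1.0 * math.cos(2 * math.pi * t / 12.42)
            + 0.3 * math.sin(2 * math.pi * t / 12.0))
times = [float(h) for h in range(720)]
coeffs = fit_tide(times, [measured(t) for t in times], periods)
print(predict_tide(1000.0, coeffs, periods), measured(1000.0))
```

The fitted numbers say nothing about why the water moves. They are just whatever makes the sum of circles match the record, which is the sense in which the model is purposefully false.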

The first mechanical tide-predicting machines using these heuristics were made in England. In 1881, one was created in the United States that was used by the Coast and Geodetic Survey for twenty-seven years.

Then, in 1914, it was replaced by a 15,000-piece machine that took “account of thirty-seven factors or components of a tide” (I wish I knew what that means) and predicted the tide at any hour. It also printed out the information rather than requiring a human to transcribe it from dials. “Unlike the human brain, this one cannot make a mistake.”

This new model was more accurate, with greater temporal resolution. But it got that way by giving up on predicting the actual tide, which might vary because of the weather. We simply accept the unpredictability of what we shall for the moment call “reality.” That’s how we manage in a world governed by uniform laws operating on unpredictably complex systems.

It is also a model that uses the known major causes of average tides — the gravitational effects of the sun and moon — but that feels fine about fictionalizing the model until it provides realistic results. This makes the model incapable of being interrogated about the actual causes of the tide, although we can tinker with it to correct inaccuracies. In this there is a very rough analogy — and some disanalogies — with some instances of machine learning.
