ai Archives - Joho the Blog

May 18, 2017

Indistinguishable from prejudice

“Any sufficiently advanced technology is indistinguishable from magic,” Arthur C. Clarke famously said.

It is also the case that any sufficiently advanced technology is indistinguishable from prejudice.

Especially if that technology is machine learning. ML creates algorithms to categorize stuff based upon data sets that we feed it. Say “These million messages are spam, and these million are not,” and ML will take a stab at figuring out the distinguishing characteristics of spam and not spam, perhaps assigning particular words particular weights as indicators, or finding relationships among particular IP addresses, times of day, lengths of messages, etc.
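
To make that concrete, here is a minimal sketch of the kind of training described above, using scikit-learn and a handful of invented messages; a real system would learn from millions of labeled examples.

```python
# A minimal sketch of the spam / not-spam training described above.
# Messages and labels are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

messages = [
    "WIN a FREE cruise now, click here",
    "Lowest prices on pills, limited offer",
    "Lunch on Tuesday still good for you?",
    "Here are the minutes from today's meeting",
]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()           # represent each message as word counts
X = vectorizer.fit_transform(messages)

model = MultinomialNB()                  # learn per-word weights from the labels
model.fit(X, labels)

test = vectorizer.transform(["free cruise offer, click now"])
print(model.predict(test))               # -> [1], i.e., classified as spam
```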

Now complicate the data and the request, run this through an artificial neural network, and you have Deep Learning that will come up with models that may be beyond human understanding. Ask DL why it made a particular move in a game of Go or why it recommended increasing police patrols on the corner of Elm and Maple, and it may not be able to give an answer that human brains can comprehend.
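
A toy illustration of that opacity, under the assumption that even a small network makes the point: after training, the model's internals are just matrices of learned weights, and printing them gives nothing like a human-readable reason for any particular output. The data and the hidden rule below are invented.

```python
# Toy sketch of the opacity described above: the trained model is arrays of numbers.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))            # invented features
y = (X[:, 0] * X[:, 3] > 0).astype(int)   # invented rule for the net to learn

net = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000)
net.fit(X, y)

# The "explanation" available by inspection: weight matrices, layer by layer.
# Nothing here reads as "increase patrols at Elm and Maple because..."
for i, w in enumerate(net.coefs_):
    print(f"layer {i}: weights of shape {w.shape}")
```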

We know from experience that machine learning can re-express human biases built into the data we feed it. Cathy O’Neil’s Weapons of Math Destruction contains plenty of evidence of this. We know it can happen not only inadvertently but subtly. With Deep Learning, we can be left entirely uncertain about whether and how this is happening. We can certainly adjust DL so that it gives fairer results when we can tell that it’s going astray, as when it only recommends white men for jobs or produces a freshman class with 1% African Americans. But when the results aren’t that measurable, we can be using results based on bias and not know it. For example, is anyone running the metrics on how many books by people of color Amazon recommends? And if we use DL to evaluate complex tax law changes, can we tell whether it’s based on data that reflects racial prejudices?[1]

So this is not to say that we shouldn’t use machine learning or deep learning. That would remove hugely powerful tools. And of course we should and will do everything we can to keep our own prejudices from seeping into our machines’ algorithms. But it does mean that when we are dealing with literally inexplicable results, we may well not be able to tell if those results are based on biases.

In short: Any sufficiently advanced technology is indistinguishable from prejudice.[2]

[1] We may not care, if the result is a law that achieves the social goals we want, including equal and fair treatment of taxpayers regardless of race.

[2] Please note that this does not mean that advanced technology is prejudiced. We just may not be able to tell.


May 15, 2017

[liveblog][AI] AI and education lightning talks

Sara Watson, a BKC affiliate and a technology critic, is moderating a discussion at the Berkman Klein/Media Lab AI Advance.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

Karthik Dinakar at the Media Lab points out that what we see in the night sky is in fact distorted by the way gravity bends light, which Einstein called a “gravitational lens.” The same goes for AI: the distortion is often in the data itself. Karthik works on how to help researchers recognize that distortion. He gives an example of capturing both the cardiologist’s and the patient’s lenses to better diagnose women’s heart disease.

Chris Bavitz is the head of BKC’s Cyberlaw Clinic. To help law students understand AI and tech, the Clinic encourages interdisciplinarity. It also helps students think critically about the roles of the lawyer and the technologist. The Clinic prefers early relationships among them, although thinking too hard about law early on can diminish innovation.

He points to two problems that represent two poles. First, IP and AI: running AI against protected data. Second, issues of fairness, rights, etc.

Leah Plunkett is a professor at the Univ. of New Hampshire Law School and a BKC affiliate. Her topic: How can we use AI to teach? She points out that if Tom Sawyer were real and alive today, he’d be arrested for what he does just in the first chapter. Yet we teach the book as a classic. We think we love a little mischief in our lives, but we apparently don’t like it in our kids: we kick them out of schools. E.g., of the 49M students in public schools in 2011, 3.45M were suspended and 130,000 were expelled. These punishments disproportionately affect children from marginalized groups.

Get rid of the BS safety justification and the govt ought to be teaching all our children without exception. So, maybe have AI teach them?

Sara: So, what can we do?

Chris: We’re thinking about how we can educate state attorneys general, for example.

Karthik: We are so far from getting users, experts, and machine learning folks together.

Leah: Some of it comes down to buy-in and translation across vocabularies and normative frameworks. It helps to build trust to make these translations better.

[I missed the QA from this point on.]


[liveblog][AI] Perspectives on community and AI

Chelsea Barabas is moderating a set of lightning talks at the AI Advance at Berkman Klein and the MIT Media Lab.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

Lionel Brossi recounts growing up in Argentina and the assumption that all boys care about football. He moved to Chile, which is split between people who do and do not watch football. “Humans are inherently biased.” So our AI systems are likely to be biased. Cognitive science has shown that the participants in its studies tend to be WEIRD: western, educated, industrialized, rich, and democratic. Also straight and white. He references Kate Crawford‘s “AI’s White Guy Problem.” We need not only diverse teams of developers but also to think about how data can be more representative. We also need to think about the users. One approach is to work on goal-centered design.

If we ever get to unbiased AI, Borges‘ statement, “The original is unfaithful to the translation” may apply.

Chelsea: What is an inclusive way to think of cross-border countries?

Lionel: We need to co-design with more people.

Madeline Elish is at Data & Society and an anthropology-of-technology grad student at Columbia. She’s met designers who thought it might be a good idea to make a phone run faster if you yell at it. But this would train children to yell at things. What’s the context in which such designers work? She and Tim Hwang set out to build bridges between academics and businesses. They asked what designers see as their responsibility for the social implications of their work. They found four core challenges:

1. Assuring users perceive good intentions
2. Protecting privacy
3. Long term adoption
4. Accuracy and reliability

She and Tim wrote An AI Pattern Language [pdf] about the frameworks that guide design. She notes that none of them were thinking about social justice. The book argues that there’s a way to translate between the social justice framework and, for example, the accuracy framework.

Ethan Zuckerman: How much of the language you’re seeing feels familiar from other hype cycles?

Madeline: Tim and I looked at the history of autopilot litigation to see what might happen with autonomous cars. We should be looking at Big Data as the prior hype cycle.

Yarden Katz is at the BKC and at the Dept. of Systems Biology at Harvard Medical School. He talks about the history of AI, starting with a 1958 claim about a translation machine. 1966: Minsky. Then there was an AI funding winter, but now it’s big again. “Until recently, AI was a dirty word.”

Today we use it schizophrenically: for Deep Learning or in a totally diluted sense as something done by a computer. “AI” now seems to be a branding strategy used by Silicon Valley.

“AI’s history is diverse, messy, and philosophical.” If complexity is embraced, “AI” might not be a useful category for policy. So we should go back to the politics of technology:

1. Who controls the code/frameworks/data?
2. Is the system inspectable/open?
3. Who sets the metrics? Who benefits from them?

The media are not going to be the watchdogs because they’re caught up in the hype. So who will be?

Q: There’s a qualitative difference in the sort of tasks now being turned over to computers. We’re entrusting machines with tasks we used to trust only to humans with good judgment.

Yarden: We already do that with systems that are not labeled AI, like “risk assessment” programs used by insurance companies.

Madeline: Before AI got popular again, there were expert systems. We are reconfiguring our understanding, moving it from a cognition frame to a behavioral one.

Chelsea: I’ve been involved in co-design projects that have backfired. These projects have sometimes been somewhat extractive: going in, getting lots of data, etc. How do we do co-design projects that are not extractive but also aren’t prohibitively expensive?

Nathan: To what degree does AI change the dimensions of questions about explanation, inspectability, etc.?

Yarden: The promoters of the Deep Learning narrative want us to believe you just need to feed in lots and lots of data. DL is less inspectable than other methods. DL is not learning from nothing. There are open questions about their inductive power.


Amy Zhang and Ryan Budish give a pre-alpha demo of the AI Compass being built at BKC. It’s designed to help people find resources exploring topics related to the ethics and governance of AI.


[liveblog] AI Advance opening: Jonathan Zittrain and lightning talks

I’m at a day-long conference/meet-up put on by the Berkman Klein Center‘s and MIT Media Lab‘s “AI for the Common Good” project.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellchecker. Mangling other people’s ideas and words. You are warned, people.

Jonathan Zittrain gives an opening talk. Since we’re meeting at Harvard Law, JZ begins by recalling the origins of what has been called “cyber law,” which has roots here. Back then, the lawyers got to the topic first, and thought that they could just think their way to policy. We are now at another signal moment as we are in a frenzy of building new tech. This time we want instead to involve more groups and think this through. [I am wildly paraphrasing.]

JZ asks: What is it that we intuitively love about human judgment, and are we willing to insist on human judgments that are worse than what a machine would come up with? Suppose for utilitarian reasons we can cede autonomy to our machines — e.g., autonomous cars — shouldn’t we? And what do we do about maintaining local norms? E.g., “You are now entering Texas where your autonomous car will not brake for pedestrians.”

“Should I insist on being misjudged by a human judge because that’s somehow artisanal?” when, ex hypothesi, an AI system might be fairer.

Autonomous systems are not entirely new. They’re bringing to the fore questions that have always been with us. E.g., we grant a sense of discrete intelligence to corporations. E.g., “McDonald’s is upset and may want to sue someone.”

[This is a particularly bad representation of JZ’s talk. Not only is it wildly incomplete, but it misses the through-line and JZ’s wit. Sorry.]

Lightning Talks

Finale Doshi-Velez is particularly interested in interpretable machine learning (ML) models. E.g., suppose you have ten different classifiers that give equally predictive results. Should you provide the most understandable one, all of them…?

Why is interpretability so “in vogue”? Suppose non-interpretable AI can do something better? In most cases we don’t know what “better” means. E.g., someone might want to control her glucose level, but perhaps also to control her weight, or other outcomes? Human physicians can still see things that are not coded into the model, and that will be the case for a long time. Also, we want systems that are fair. This means we want interpretable AI systems.

How do we formalize these notions of interpretability? How do we do so for science and beyond? E.g., what does a legal “right to explanation” mean? She is working with Sam Gershman on how to more formally ground AI interpretability in the cognitive science of explanation.
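
A sketch of the “equally predictive but differently interpretable” situation she describes, on synthetic data invented here for illustration: two models can score about the same on held-out data, yet only one exposes weights a person can read off directly.

```python
# Two models with similar accuracy; only one is straightforwardly interpretable.
# Synthetic data, for illustration only.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

simple = LogisticRegression(max_iter=1000).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

print("logistic regression accuracy:", simple.score(X_test, y_test))
print("random forest accuracy:      ", forest.score(X_test, y_test))

# Only the simpler model yields per-feature weights one can read directly.
print("per-feature coefficients:", simple.coef_[0])
```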

Vikash Mansinghka leads the eight-person Probabilistic Computing project at MIT. They want to build computing systems that can be our partners, not our replacements. We have assumed that the measure of success of AI is that it beats us at our own game, e.g., AlphaGo, Deep Blue, Watson playing Jeopardy! But games have clearly measurable winners.

His lab is working on augmented intelligence that gives partial solutions, guidelines, and hints that help us solve problems that neither we nor the machines could solve on our own. The need for these systems is most obvious in large-scale human-interest projects, e.g., epidemiology, economics, etc. E.g., should a successful nutrition program in SE Asia be tested in Africa too? There are many variables (including cost). BayesDB, developed by his lab, is “augmented intelligence for public interest data science.”

In traditional computer science, computing systems are built up from circuits to algorithms. Engineers can trade off performance for interpretability. Probabilistic systems have some of the same considerations. [Sorry, I didn’t get that last point. My fault!]

John Palfrey is a former Exec. Dir. of BKC, chair of the Knight Foundation (a funder of this project) and many other things. Where can we, BKC and the Media Lab, be most effective as a research organization? First, we’ve had the most success when we merge theory and practice. And building things. And communicating. Second, we have not yet defined the research question sufficiently. “We’re close to something that clearly relates to AI, ethics and government” but we don’t yet have the well-defined research questions.

The Knight Foundation thinks this area is a big deal. AI could be a tool for the public good, but it also might not be. “We’re queasy” about it, as well as excited.

Nadya Peek is at the Media Lab and has been researching “machines that make machines.” She points to the first computer-controlled machine (“Teaching Power Tools to Run Themselves“), where the aim was precision. People controlled these CCMs: programmers, CAD/CAM folks, etc. That’s still the case, but it looks different. Now the old jobs are being done by far fewer people. But the spaces in between don’t always work so well. E.g., Apple can define an automatable workflow for milling components, but if you’re a student doing a one-off project, it can be very difficult to get all the integrations right. The student doesn’t much care about a repeatable workflow.

Who has access to an Apple-like infrastructure? How can we make precision-based one-offs easier to create? (She teaches a course at MIT called “How to create a machine that can create almost anything.”)

Nathan Matias, an MIT grad student with a newly minted Ph.D. (congrats, Nathan!) and a BKC community member, is facilitating the discussion. He asks how we conceptualize the range of questions that these talks have raised. And what are the tools we need to create? What are the social processes behind that? How can we communicate what we want to machines and understand what they “think” they’re doing? Who can do what, and where does that raise questions about literacy, policy, and legal issues? Finally, how can we get to the questions we need to ask, how do we answer them, and how do we organize people, institutions, and automated systems? Scholarly inquiry, organizing people socially and politically, creating policies, etc.? How do we get there? How can we build AI systems that are “generative” in JZ’s sense: systems that we can all contribute to on relatively equal terms and share with others.

Nathan: Vikash, what do you do when people disagree?

Vikash: When you include the sources, you can provide probabilistic responses.

Finale: When a system can’t provide a single answer, it ought to provide multiple answers. We need humans to give systems clear values. AI things are not moral, ethical things. That’s us.

Vikash: We’ve made great strides in systems that can deal with what may or may not be true, but not in terms of preference.

Nathan: An audience member wants to know what we have to do to prevent AI from repeating human bias.

Nadya: We need to include the people affected in the conversations about these systems. There are assumptions about the independence of values that just aren’t true.

Nathan: How can people not close to these systems be heard?

JP: Ethan Zuckerman, can you respond?

Ethan: One of my colleagues, Joy Buolamwini, is working on what she calls the Algorithmic Justice League, looking at computer vision algorithms that don’t work on people of color. In part this is because the data sets used to train CV systems are 70% white male faces. So she’s generating new sets of facial data that we can retest on. Overall, it’d be good to use training data that represents the real world, and to make sure a representative slice of humanity is working on these systems. So here’s my question: we find co-design works well, so should we be bringing the affected populations in to talk with the system designers?
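
A hypothetical sketch of the kind of check that work points toward: report a system's accuracy separately for each demographic group instead of one aggregate number. The groups and results below are invented for illustration.

```python
# Hypothetical sketch: disaggregate a classifier's accuracy by demographic group.
import pandas as pd

results = pd.DataFrame({
    "group":   ["A", "A", "A", "A", "B", "B", "B", "B"],
    "correct": [1,   1,   1,   1,   1,   0,   0,   1],
})

overall = results["correct"].mean()
per_group = results.groupby("group")["correct"].mean()

print("overall accuracy:", overall)   # 0.75 -- looks fine in aggregate
print(per_group)                      # group A: 1.0, group B: 0.5
```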

[Damn, I missed Yochai Benkler‘s comment.]

Finale: We should also enable people to interrogate AI when the results seem questionable or unfair. We need to be thinking about the processes for resolving such questions.

Nadya: It’s never “people” in general who are affected. It’s always particular people with agendas, from places and institutions, etc.


April 19, 2017

Alien knowledge

Medium has published my long post about how our idea of knowledge is being rewritten, as machine learning is proving itself to be more accurate than we can be, in some situations, but achieves that accuracy by “thinking” in ways that we can’t follow.

This is from the opening section:

We are increasingly relying on machines that derive conclusions from models that they themselves have created, models that are often beyond human comprehension, models that “think” about the world differently than we do.

But this comes with a price. This infusion of alien intelligence is bringing into question the assumptions embedded in our long Western tradition. We thought knowledge was about finding the order hidden in the chaos. We thought it was about simplifying the world. It looks like we were wrong. Knowing the world may require giving up on understanding it.


December 7, 2014

[2b2k] Agre on minds and hands

I recently published a column at KMWorld pointing out some of the benefits of having one’s thoughts share a context with people who build things. Today I came across an article by Jethro Masis titled “Making AI Philosophical Again: On Philip E. Agre’s Legacy.” Jethro points to a 1997 work by the greatly missed Philip Agre that says it so much better:

…what truly founds computational work is the practitioner’s evolving sense of what can be built and what cannot” (1997, p. 11). The motto of computational practitioners is simple: if you cannot build it, you do not understand it. It must be built and we must accordingly understand the constituting mechanisms underlying its workings. This is why, on Agre’s account, computer scientists “mistrust anything unless they can nail down all four corners of it; they would, by and large, rather get it precise and wrong than vague and right” (Computation and Human Experience, 1997, p. 13).

(I’m pretty sure I read Computation and Human Experience many years ago. Ah, the Great Forgetting of one in his mid-60s.)

Jethro’s article overall attempts to adopt Agre’s point that “The technical and critical modes of research should come together in this newly expanded form of critical technical consciousness,” and to apply this to Heidegger’s idea of Zuhandenheit: how things show themselves to us as useful to our plans and projects; for Heidegger, that is the normal, everyday way most things present themselves to us. This leads Jethro to take us through Agre’s criticisms of AI modeling, its failure to represent context except as vorhanden [pdf] (Heidegger’s term for how things look when they are torn out of the context of our lived purposes), and the need to thoroughly rethink the idea of consciousness as consisting of representations of an external world. Agre wants to work out “on a technical level” how this can apply to AI. Fascinating.


Here’s another bit of brilliance from Agre:

For Agre, this is particularly problematic because “as long as an underlying metaphor system goes unrecognized, all manifestations of trouble in technical work will be interpreted as technical difficulties and not as symptoms of a deeper, substantive problem.” (p. 260 of CHE)


June 8, 2014

Will a Google car sacrifice you for the sake of the many? (And Networked Road Neutrality)

Google self-driving cars are presumably programmed to protect their passengers. So, when a traffic situation gets nasty, the car you’re in will take all the defensive actions it can to keep you safe.

But what will robot cars be programmed to do when there’s lots of them on the roads, and they’re networked with one another?

We know what we as individuals would like. My car should take as its Prime Directive: “Prevent my passengers from coming to harm.” But when the cars are networked, their Prime Directive well might be: “Minimize the amount of harm to humans overall.” And such a directive can lead a particular car to sacrifice its humans in order to keep the total carnage down. Asimov’s Three Laws of Robotics don’t provide enough guidance when the robots are in constant and instantaneous contact and have fragile human beings inside them.

It’s easy to imagine cases. For example, a human unexpectedly darts into a busy street. The self-driving cars around it rapidly communicate and algorithmically devise a plan that saves the pedestrian at the price of causing two cars to engage in a Force 1 fender-bender and three cars to endure Force 2 minor collisions…but only if the car I happen to be in intentionally drives itself into a concrete piling, with a 95% chance of killing me. All other plans result in worse outcomes, where “worse” refers to some scale that weighs monetary damages, human injuries, and human deaths.

Or, a broken run-off pipe creates a dangerous pool of water on the highway during a flash storm. The self-driving cars agree that unless my car accelerates and rams into a concrete piling, all other joint action results in a tractor trailer jackknifing, causing lots of death and destruction. Not to mention The Angelic Children’s Choir school bus that would be in harm’s way. So, the swarm of robotic cars makes the right decision and intentionally kills me.

In short, the networking of robotic cars will change the basic moral principles that guide their behavior. Non-networked cars are presumably programmed to be morally-blind individualists trying to save their passengers without thinking about others, but networked cars will probably be programmed to support some form of utilitarianism that tries to minimize the collective damage. And that’s probably what we’d want. Isn’t it?
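
A sketch of the utilitarian selection rule described above, with invented plans, harm categories, and weights. The point is that the code is trivial; the hard part is agreeing on the weights.

```python
# Sketch: pick the joint plan that minimizes a weighted harm score.
# Plans, categories, and weights are invented for illustration.
candidate_plans = [
    {"name": "swerve-and-brake",      "deaths": 0, "injuries": 3, "damage_usd": 40_000},
    {"name": "protect-own-passenger", "deaths": 1, "injuries": 0, "damage_usd": 5_000},
    {"name": "do-nothing",            "deaths": 1, "injuries": 2, "damage_usd": 60_000},
]

# These weights encode the moral judgments the post says we rarely agree on.
WEIGHTS = {"deaths": 1_000_000, "injuries": 50_000, "damage_usd": 1}

def harm(plan):
    return sum(WEIGHTS[key] * plan[key] for key in WEIGHTS)

best = min(candidate_plans, key=harm)
print(best["name"])  # -> "swerve-and-brake" under these invented weights
```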

But one of the problems with utilitarianism is that there turns out to be little agreement about what counts as a value and how much it counts. Is saving a pedestrian more important than saving a passenger? Is it always right to try to preserve human life, no matter how unlikely it is that the action will succeed and no matter how many other injuries it is likely to result in? Should the car act as if its passenger has seat-belted him/herself in because passengers should do so? Should the cars be more willing to sacrifice the geriatric than the young, on the grounds that the young have more of a lifespan to lose? And won’t someone please think of the kids, those cute choir kids?

We’re not good at making these decisions, or even at having rational conversations about them. Usually we don’t have to, or so we tell ourselves. For example, many of the rules that apply to us in public spaces, including roads, optimize for fairness: everyone waits at the same stop lights, and you don’t get to speed unless something is relevantly different about your trip: you are chasing a bad guy or are driving someone who urgently needs medical care.

But when we are better able to control the circumstances, fairness isn’t always the best rule, especially in times of distress. Unfortunately, we don’t have a lot of consensus around the values that would enable us to make joint decisions. We fall back on fairness, or pretend that we can have it all. Or we leave it to experts, as with the rules that determine who gets organ transplants. It turns out we don’t even agree about whether it’s morally right to risk soldiers’ lives to rescue a captured comrade.

Fortunately, we don’t have to make these hard moral decisions. The people programming our robot cars will do it for us.

 


Imagine a time when the roadways are full of self-driving cars and trucks. There are some good reasons to think that that time is coming, and coming way sooner than we’d imagined.

Imagine that Google remains in the lead, and the bulk of the cars carry their brand. And assume that these cars are in networked communication with one another.

Can we assume that Google will support Networked Road Neutrality, so that all cars are subject to the same rules, and there is no discrimination based on contents, origin, destination, or purpose of the trip?

Or would Google let you pay a premium to take the “fast lane”? (For reasons of network optimization the fast lane probably wouldn’t actually be a designated lane but well might look much more like how frequencies are dynamically assigned in an age of “smart radios.”) We presumably would be ok with letting emergency vehicles go faster than the rest of the swarm, but how about letting the rich go faster by programming the robot cars to give way when a car with its “Move aside!” bit is on?

Let’s say Google supports a strict version of Networked Road Neutrality. But let’s assume that Google won’t be the only player in this field. Suppose Comcast starts to make cars, and programs them to get ahead of the cars that choose to play by the rules. Would Google cars take action to block the Comcast cars from switching lanes to gain a speed advantage — perhaps forming a cordon around them? Would that be legal? Would selling a virtual fast lane on a public roadway be legal in the first place? And who gets to decide? The FCC?

One thing is sure: It’ll be a golden age for lobbyists.


April 12, 2014

[2b2k] Protein Folding, 30 years ago

Simply in terms of nostalgia, this 1985 video called “Knowledge Engineering: Artificial Intelligence Research at the Stanford Heuristic Programming Project” from the Stanford archives is charming right down to its Tron-like digital soundtrack.

But it’s also really interesting if you care about the way we’ve thought about knowledge. The Stanford Heuristic Programming Project under Edward Feigenbaum did groundbreaking work in how computers represent knowledge, emphasizing the content and not just the rules. (Here is a 1980 article about the Project and its projects.)

And then at the 8:50 mark, it expresses optimism that an expert system would be able to represent not only every atom of proteins but how they fold.

Little could it have been predicted that protein folding even 30 years later would be better recognized by the human brain than by computers, and that humans playing a game — Fold.It — would produce useful results.

It’s certainly the case that we have expert systems all over the place now, from Google Maps to the Nest thermostat. But we also see another type of expert system that was essentially unpredictable in 1985. One might think that the domain of computer programming would be susceptible to being represented in an expert system because it is governed by a finite set of perfectly knowable rules, unlike the fields the Stanford project was investigating. And there are of course expert systems for programming. But where do the experts actually go when they have a problem? To StackOverflow where other human beings can make suggestions and iterate on their solutions. One could argue that at this point StackOverflow is the most successful “expert system” for computer programming in that it is the computer-based place most likely to give you an answer to a question. But it does not look much like what the Stanford project had in mind, for how could even Edward Feigenbaum have predicted what human beings can and would do if connected at scale?

(Here’s an excellent interview with Feigenbaum.)


February 16, 2011

Watson, AI, Jeopardy, Toronto

Watson’s Final Jeopardy answer has created a need for a new word:

Schadenrobot: The joy humans feel watching robots fail.

[Minutes later:] I tweeted this, and Alistair Steele [twitter: alistairsteele] immediately riposted with the far cleverer term: Schadendroid. Well played, suh!
