Joho the Blogfairness Archives - Joho the Blog

September 14, 2018

Five types of AI fairness

Google PAIR (People + AI Research) has just posted my attempt to explain what fairness looks like when it’s operationalized for a machine learning system. It’s pegged around five “fairness buttons” on the new Google What-If tool, a resource for developers who want to try to figure out what factors (“features” in machine learning talk) are affecting an outcome.

Note that there are far more than five ways to operationalize fairness. The point of the article is that once we are forced to decide exactly what we’re going to count as fair — exactly enough that a machine learning system can implement it — we realize just how freaking complex fairness is. OMG. I broke my brain trying to figure out how to explain some of those ideas, and it took several Google developers (especially James Wexler) and a fine mist of vegetarian broth to restore it even incompletely. Even so, my explanations are less clear than I (or you, I’m sure) would like. But at least there’s no math in them :)

I’ll write more about this at some point, but for me the big take-away is that fairness has had value as a moral concept so far because it is vague enough to allow our intuition to guide us. Machine learning is going to force us to get very specific about it. But we are not yet adept enough at it — e.g., we don’t have a vocabulary for talking about the various varieties — plus we don’t agree about them enough to be able to navigate the shoals. It’s going to be a big mess, but something we have to work through. When we do, we’ll be better at being fair.

Now, about the fact that I am a writer-in-residence at Google. Well, yes I am, and have been for about 6 weeks. It’s a 6 month part-time experiment. My role is to try to explain some of machine learning to people who, like me, lack the technical competence to actually understand it. I’m also supposed to be reflecting in public on what the implications of machine learning might be on our ideas. I am expected to be an independent voice, an outsider on the inside.

So far, it’s been an amazing experience. I’m attached to PAIR, which has developers working on very interesting projects. They are, of course, super-smart, but they have not yet tired of me asking dumb questions that do not seem to be getting smarter over time. So, selfishly, it’s been great for me. And isn’t that all that really matters, hmmm?

Comments Off on Five types of AI fairness

April 27, 2018

[liveblog][ai] Ben Green: The Limits of "Fair" Algorithms

Ben Green is giving a ThursdAI talk on “The Limits, Perils, and Challenges of ‘Fair’ Algorithms for Criminal Justice Reform.”

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

In 2016, the COMPAS algorithm
became a household name (in some households) when ProPublica showed that it predicted that black men were twice as likely as white men to jump bail. People justifiably got worried that algorithms can be highly biased. At the same time, we think that algorithms may be smarter than humans, Ben says. These have been the poles of the discussion. Optimists think that we can limit the bias to take advantage of the added smartness.

There have been movements to go toward risk assessments for bail, rather than using money bail. E.g., Rand Paul and Kamala Harris have introduced the Pretrial Integrity and Safety Act of 2017. There have also been movements to use scores only to reduce risk assessments, not to increase them.

But are we asking the right questions? Yes, the criminal justice system would be better if judges could make more accurate and unbiased predictions, but it’s not clear that machine learning can do this. So, two questions: 1. Is ML an appropriate tool for this. 2. Is implementing MK algorithms an effective strategy for criminal justice reform?

#1 Is ML and appropriate tool to help judges make more accurate and unbiased predictions?

ML relies on data about the world. This can produce tunnel vision by causing us to focus on particular variables that we have quantified, and ignore others. E.g., when it comes to sentencing, a judge balances deterrence, rehabilitation, retribution, and incapacitating a criminal. COMPAS predicts recidivism, but none of the other factors. This emphasizes incapacitation as the goal of sentencing. This might be good or bad, but the ML has shifted the balance of factors, framing the decision without policy review or public discussion.

Q: Is this for sentencing or bail? Because incapacitation is a more important goal in sentencing than in bail.

A: This is about sentencing. I’ll be referring to both.

Data is always about the past, Ben continues. ML finds statistical correlations among inputs and outputs. It applies those correlations to the new inputs. This assumes that those correlations will hold in the future; it assumes that the future will look like the past. But if we’re trying reform the judicial system, we don’t want the future to look like the past. ML can thus entrench historical discrimination.

Arguments about the fairness of COMPAS are often based on competing mathematical definitions of fairness. But we could also think about the scope of what we couint as fair. ML tries to make a very specific decision: among a population, who recidivates? If you take a step back and consider the broader context of the data and the people, you would recognize that blacks recidivate at a higher rate than whites because of policing practices, economic factors, racism, etc. Without these considerations, you’re throwing away the context and accepting the current correlations as the ground truth. Even if we were to change the base data, the algorithm wouldn’t make the change, unless you retrain it.

Q: Who retrains the data?

A: It depends on the contract the court system has.

Algorithms are not themselves a natural outcome of the world. Subjective decisions go into making them: which data to input, choosing what to predict, etc. The algorithms are brought into court as if they were facts. Their subjectivity is out of the frame. A human expert would be subject to cross examination. We should be thinking of algorithms that way. Cross examination might include asking how accurate the system is for the particular group the defendant is in, etc.

Q: These tools are used in setting bail or a sentence, i.e., before or after a trial. There may not be a venue for cross examination.

In the Loomis case, an expert witness testified that the algorithm was misused. That’s not exactly what I’m suggesting; they couldn’t get to all of it because of the trade secrecy of the algorithms.

Back to the framing question. If you can make the individual decision points fair we sometimes think we’ve made the system fair. But technocratic solutions tend to sanitize rather than alter. You’re conceding the overall framework of the system, overlooking more meaningful changes. E.g., in NY, 71% of voters support ending pre-trial jail for misdemeanors and non-violent felonies. Maybe we should consider that. Or, consider that cutting food stamps has been shown to increases recidivism. Or perhaps we should be reconsidering the wisdom of preventative detention, which was only introduced in the 1980s. Focusing on the tech de-focuses on these sorts of reforms.

Also, technocratic reforms are subject to political capture. E.g., NJ replaced money bail with a risk assessment tool. After some of the people released committed crimes, they changed the tool so that certain crimes were removed from bail. What is an acceptable risk level? How to set the number? Once it’s set, how is it changed?

Q: [me] So, is your idea that these ML tools drive out meaningful change, so we ought not to use them?

A: Roughly, yes.

[Much interesting discussion which I have not captured. E.g., Algorithms can take away the political impetus to restore bail as simply a method to prevent flight. But sentencing software is different, and better algorithms might help, especially if the algorithms are recommending sentences but not imposing them. And much more.]

2. Do algorithms actually help?

How do judges use algorithms to make a decision? Even if the algorithm were perfect, would it improve the decisions judges make? We don’t have much of an empirical answer.

Ben was talking to Jeremy Heffner at Hunch Lab. They make predictive policing software and are well aware of the problem of bias. (“If theres any bias in the system it’s because of the crime data. That’s what we’re trying to address.” — Heffner) But all of the suggestions they give to police officers are called “missions,” which is in the military/jeopardy frame.

People are bad at incorporating quantitative data into decisions. And they filter info through their biases. E.g., the “ban the box” campaign to remove the tick box about criminal backgrounds on job applications actually increased racial discrimination because employers assumed the white applicants were less likely to have arrest records. (Agan and Starr 2016) Also, people have been shown to interpret police camera footage according to their own prior opinions about the police. (sommers 2016)

Evidence from Kentucky (Stevenson 2018): after mandatory risk assessments for bail only made a small increase in pretrial release, and these changes eroded over time as judges returned to their previous habits.

So, we need to be asking the empirical question of how judges actual use these decisions. And should judges incorporate these predictions into their decisions?

Ben’s been looking at the first question:L how do judges use algorithmic predictions? He’s running experiments on Mechanical Turk showing people profiles of defendants — a couple of sentences about the crime, race, previous record arrest record. The Turkers have to give a prediction of recidivism. Ben knows which ones actually recidivated. Some are also given a recommendation based on an algorithmic assessment. That risk score might be the actual one, random, or biased; the Turkers don’t know that about the score.

Q: It might be different if you gave this test to judges.

A: Yes, that’s a limitation.

Q: You ought to give some a percentage of something unrelated, e.g., it will rain, just to see if the number is anchoring people.

A: Good idea

Q: [me] Suppose you find that the Turkers’ assessment of risk is more racially biased than the algorithm…

A: Could be.

[More discussion until we ran out of time. Very interesting.]

3 Comments »

April 5, 2018

[liveblog] Neil Gaikwad Human-AI Collaboration for Sustainable Market Design

I’m at a ThursdAI talk (Harvard’s Berkman Klein Center for Internet & Society and MIT Media Lab) being given by Neil Gaikwad (Twitter: @neilthemathguy, a Ph.D. at the MediaLab, in the Space Enabled Group.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Markets and institutions are parts of complex ecosystem, Neil says. His research looks at data from satellites that show how the Earth is changing: crops, water, etc. Once you’ve gathered the data, you can use machine learning to visualize the changes. There are ecosystems, including of human behavior, that are affected by this. It affects markets and institutions. E.g., a drought may require an institutional response, and affect markets.

Traditional markets, financial markets, and gig economies all share characteristics. Farmers markets are complex ecosystems of people with differing information and different amounts of it, i.e. asymmetric info. Same for financial markets. Same for gig economies.

Indian markets have been failing; there have been 300,000 suicides in the last 30 years. Stock markets have crashed suddenly due to blackbox marketing; in some cases we still don’t know why. And London has banned Uber. So, it doesn’t matter which markets or institutions we look at, they’re losing our trust.

An article in New Scientist asked what we can do to regain this trust. For black box AI, there are questions of fairness and equity. But what would human-machine collaboration be like? Are there design principles for markets.?

Neil stops for us to discuss.

Q: How do you define the justice?

A: Good question. Fairness? Freedom? The designer has a choice about how to define it.

Q: A UN project created an IT platform that put together farmers and direct consumers. The pricing seemed fairer to both parties. So, maybe avoid intermediaries, as a design principle?

Neil continues. So, what is the concept of justice here?

1. Rawls and Kant: Transcendental institutionalism. It’s deontological: follow a principle for perfect justice. Use those principles to define a perfect institution. The properties are defined by a social contract. But it doesn’t work, as in the examples we just saw. What is missing. People and society. [I.e., you run the institution according to principles, but that doesn’t guarantee that the outcome will be fair and just. My example: Early Web enthusiasts like me thought the Web was an institution built on openness, equality, creative anarchy, etc., yet that obviously doesn’t ensure that the outcome will share those properties.]

2. Realized-focused institutionalism (Sen
2009): How to reverse this trend. It is consequentialist: what will be the consequences of the design of an institution. It’s a comparative assessment of different forms of institutions. Instead of asking for the perfectly justice society, Sen asks how justice can be advanced. The most critical tool for evaluating any institution is to look at how it actually realizes how people’s lives change.

Sen argues that principles are important. They can be expressed by “niti,” Sanskrit for rules and institutions. But you also need nyaya: a form of social arrangement that makes sure that those rules are obeyed. These rules come from social choice, not social contract.

Example: Gig economies. The data comes from mechanical turk, upwork, crowdflower, etc. This creates employment for many people, but it’s tough. E.g., identifying images. Use supervised learning for this. The Turkers, etc., do the labelling to train the image recognition system. The Turkers make almost no money at this. This is the wicked problem of market design: The worker can have identifications rejected, sometimes with demeaning comments.

The Market for Lemons” (Akerlog, et al., 1970): all the cars started to look alike and now all gig-workers look alike to those who hire them: there’s no value given to bringing one’s value to the labor.

So, who owns the data? Who has a stake in the models? In the intellectual property?

If you’re a gig worker, you’re working with strangers. You don’t know the reputation of the person giving me data. Or renting me the Airbnb apartment. So, let’s put a rule: reputation is the backbone. In sharing economies, most of the ratings are the highest. Reputation inflation. So, can we trust reputation? This happens because people have no incentive to rate. There’s social pressure to give a positive rating.

So, thinking about Sen, can we think about an incentive for honest reputation? Neil’s group has been thinking about a system [I thought he said Boomerang, but I can’t find that]. It looks at the workers’ incentives. It looks at the workers’ ratings of each other. If you’re a requester, you’ll see the workers you like first.

Does this help AI design?

MoralMachine has had 1.3M voters and 18M pairwise comparisons (i.e., people deciding to go straight or right). Can this be used as a voting based system for ethical decision making (AAAI 2018)? You collect the pairwise preferences, learn the model of preference, come to a collective preference, and have voting rules for collective decision.

Q: Aren’t you collect preferences, not normative judgments? The data says people would rather kill fat people than skinny ones.

A: You need the social behavior but also rules. For this you have to bring people into the loop.

Q: How do we differentiate between what we say we want and what we really want?

A: There are techniques, such as “Bayesian Truth Serum”nomics.mit.edu/files/1966”>Bayesian Truth Serum.

Conclusion: The success of markets, institutions or algorithms, is highly dependent on how this actually affects people’s lives. This thinking should be central to the design and engineering of socio-technical systems.

1 Comment »

December 2, 2017

[liveblog] Doaa Abu-Elyounes on "Bail or Jail? Judicial vs. Algorithmic decision making"

I’m at a weekly AI talk put on by Harvard’s Berkman Klein Center for Internet & Society and the MIT Media Lab. Doaa Abu-Elyounes is giving a talk called “Bail or Jail? Judicial vs. Algorithmic decision making”.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Doaa tells us that this talk is a work in progress.

We’ve all heard now about AI-based algorithms that are being used to do risk assessments in pretrial bail decisions. She thinks this is a good place to start using algorithms, although it’s not easy.

The pre-trial stage is supposed to be very short. The court has to determine if the defendant, presumed innocent, will be released on bail or jailed. The sole considerations are supposed to be whether the def is likely to harm someone else or flee. Preventive detention has many efffects, mostly negative for the defendant.
(The US is a world leader in pre-trial detainees. Yay?)

Risk assessment tools have been used for more than 50 years. Actuarial tools have shown greater predictive power than clinical judgment, and can eliminate some of the discretionary powers of judges. Use of these tools have long been controversy What type of factors to include in the power? Is the use of demographic factors to make predictions fair to individuals?

Existing tools use regression analysis. Now machine learning can learn from much more data. Mechanical predictions [= machine learning] are more accurate than statistical predictions, but may not be explicable.

We think humans can explain their decisions and we want machines to be able to as well. But look at movie reviews. Humans can tell if a review is positive. We can teach which words are positive or negative, getting 60% accuracy. Or we can have a human label the reviews as positive or negative and let the machine figure out what the factor are — via machine leaning — in which case we get 80% accuracy but may lose explicability.

With pretrial situations, what is the automated task is that the machine should be performing?

There’s a tension between accuracy and fairness. Computer scientists are trying to quantify these questions What does a fair algorithm look like? John Kleinberg and colleagues did a study of this [this one?]. Their algorithms reduced violent crime by 25% with no change in jailing rates, without increasing racial disparities. In short, the algorithm seems to have done a more accurate job with less bias.

Doaa lists four assessment tools she will be looking at: the Pretrial Risk Assessment [this one?], the Public Safety Assessment, the Virginia Pretrial Risk assessment Instrument and the Colorado Pretrial Assessment Tool.

Doaa goes through questions that should be asked of these tools, beginning with: Which factors are considered in each? [She dives into the details for all four tools. I can’t capture it. Sorry.]

What are the sources of data? (3 out of 4 rely on interviews and databases.)

What is the quality of the data? “This is the biggest problem jurisdictions are dealing with when using such a tool.” “Criminal justice data is notoriously poor.” And, of course, if a machine learning system is trained on discriminatory data, its conclusions are likely to reflect those biases.

The tools neeed to be periodically validated using data from its own district’s population. Local data matters.

There should be separate scores for flight risk and public safety All but the PSA provide only a single score. This is important because there are separate remedies for the two concerns. E.g., you might want to lock up someone who is a risk to public safety, but take away the passport of someone who is a flight risk.

Finally, the systems should discriminate among reasons for flight risk. E.g., because the defendant can’t afford the cost of making it to court or because she’s fleeing?

Conclusion: Pretrial is the front door of the criminal justice system and affects what happens thereafter. Risk assessment tools should not replace judges, but they bring benefits. They should be used, and should be made as transparent as possible. There are trade offs. The tool will not eliminate all bias but might help reduce it.

Q&A

Q: Do the algorithms recognize the different situations of different defendants?

A: Systems do recognize this, but not in sophisticated ways. That’s why it’s important to understand why a defendant might be at risk of missing a court date. Maybe we could provide poor defendants with a Metro card.

Q: Could machine learning be used to help us be more specific in the types of harm? What legal theories might we drawn on to help with this?

A: [The discussion got too detailed for me to follow. Sorry.]

Q: There are different definitions of recidivism. What do we do when there’s a mismatch between the machines and the court?

A: Some states give different weights to different factors based on how long ago the prior crimes were committed. I haven’t seen any difference in considering how far ahead the risk of a possible next crime is.

Q: [me] While I’m very sympathetic to allowing machine learning to be used without always requiring that the output be explicable, when it comes to the justice system, do we need explanations so not only is justice done, but we can have trust that it’s being done?

A: If we can say which factors are going into a decision — and it’s not a lot of them — if the accuracy rate is much higher than manual systems, then maybe we can give up on always being able to explain exactly how it came to its decisions. Remember, pre-trial procedures are short and there’s usually not a lot of explaining going on anyway. It’s unlikely that defendants are going to argue over the factors used.

Q: [me] Yes, but what about the defendant who feels that she’s being treated differently than some other person and wants to know why?

A: Judges generally don’t explain how they came to their decisions anyway. The law sets some general rules, and the comparisons between individuals is generally within the framework of those rules. The rules don’t promise to produce perfectly comparable results. In fact, you probably can’t easily find two people with such similar circumstances. There are no identical cases.

Q: Machine learning, multilevel regression level, and human decision making all weigh data and produce an outcome. But ML has little human interaction, statistical analysis has some, and the human decision is all human. Yet all are in fact algorithmic: the judge looks at a bond schedule to set bail. Predictability as fairness is exacerbated by the human decisions since the human cannot explain her model.

Q: Did you find any logic about why jurisdictions picked which tool? Any clear process for this?

A: It’s hard to get that information about the procurement process. Usually they use consultants and experts. There’s no study I know of that looks at this.

Q: In NZ, the main tool used for risk assessment for domestic violence is a Canadian tool called ODARA. Do tools work across jurisdictions? How do you reconcile data sets that might be quite different?

A: I’m not against using the same system across jurisdictions — it’s very expensive to develop one from scratch — but they need to be validated. The federal tool has not been, as far as I know. (It was created in 2009.) Some tools do better at this than others.

Q: What advice would you give to a jurisdiction that might want to procure one? What choices did the tools make in terms of what they’re optimized for? Also: What about COMPAS?

A: (I didn’t talk about COMPAS because it’s notorious and not often used in pre-trial, although it started out as a pre-trial tool.) The trade off seems to be between accuracy and fairness. Policy makers should define more strictly where the line should be drawn.

Q: Who builds these products?

A: Three out of the four were built in house.

Q: PSA was developed by a consultant hired by the Arnold Foundation. (She’s from Luminosity.) She has helped develop a number of the tools.

Q: Why did you decide to research this? What’s next?

A: I started here because pre-trial is the beginning of the process. I’m interested in the fairness question, among other things.

Q: To what extent are the 100+ factors that the Colorado tool considers available publicly? Is their rationale for excluding factors public? Because they’re proxies for race? Because they’re hard to get? Or because back then 100+ seemed like too many? And what’s the overlap in factors between the existing systems and the system Kleinberg used?

A: Interviewing defendants takes time, so 100 factors can be too much. Kleinberg only looked at three factors. Another tool relied on six factors.

Q: Should we require private companies to reveal their algorithms?

A: There are various models. One is to create an FDA for algorithms. I’m not sure I support that model. I think private companies need to expose at least to the govt the factors that they’re including. Others would say I’m too optimistic about the government.

Q: In China we don’t have the pre-trial part, but there’s an article saying that they can make the sentencing more fair by distinguishing among crimes. Also, in China the system is more uniform so the data can be aggregated and the system can be made more accurate.

A: Yes, states are different because they have different laws. Exchanging data between states is not very common and may not even be possible.

Comments Off on [liveblog] Doaa Abu-Elyounes on "Bail or Jail? Judicial vs. Algorithmic decision making"

November 19, 2017

[liveblog][ai] A Harm-reduction framework for algorithmic accountability

I’m at one of the weekly Harvard’s Berkman Klein Center for Internet & Society and MIT Media Lab talks. Alexandra Wood and Micah Altman are talking about “A harm reduction framework for algorithmic accountability over personal information” — a snapshot of their ongoing research at the Privacy Tools Project. The PTP is an interdisciplinary project that investigates tools for sharing info while preserving privacy.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Alexandra says they’ve been developing frameworks for assessing privacy risk when collecting personal data, and have been looking at the controls that can be used to protect individuals. They’ve found that privacy tools address a narrow slice of the problem; other types of misuse of data require other approaches.

She refers to some risk assessment algorithms used by the courts that have turned out to be racially biased, to have unbalanced error rates (falsely flagging black defendants as future criminals at twice the rate as white defendents), and are highly inaccurate. “What’s missing is an analysis of harm”What’s missing is an analysis of harm. “Current approaches to regulating algorithmic classification and decision-making largely elide harm,” she says. “The ethical norms in the law point to the broader responsibilities of the algorithms’ designers.”

Micah says there isn’t a lot of work mapping the loss privacy to other harms. The project is taking an interdisciplinary approach to this. [I don’t trust my blogging here. Don’t get upset with Micah for my ignorance-based reporting!]

Social science treats harm pragmatically. It finds four main dimensions of well-being: wealth, health, life satisfaction and the meaningful choices available to people. Different schools take different approaches to this, some emphasizing physical and psychological health, others life satisfaction, etc.

But to assess harm, you need to look at people’s lives over time. E.g., how does going to prison affect people’s lives? Being sentenced decreases your health, life-satisfaction, choices, and loer income. “The consequences of sentencing are persistent and individually catastrophic.”

He shows a chart from ProPublica based on Broward County data that shows that the risk scores for white defendants skews heavily toward lower scores while the scores for black defendants is more evenly distributed. This by itself doesn’t prove that it’s unfair. You have to look at the causes of the differences in those distributions.

Modern inference theory says something different about harm. A choice is harmful if the causal effect of that outcome is worse, and the causal effect is measured by potential outcomes. “The causal impact of smoking is not simply that you may get cancer, but includes the impact of not smoking”The causal impact of smoking is not simply that you may get cancer, but includes the impact of not smoking, such as possibly gaining weight. You have to look at the counter-factuals

The COMPAS risk assessment tool that has been the subject of much criticism is affected by which training data you use, choice of algorithm, the application of it to the individual, and the use of the score in sentencing. Should you exclude information about race? Or exclude any info that might violate people’s privacy? Or make it open or not? And how to use the outcomes?

Can various protections reduce harm from COMPAS? Racial features were not explicitly included in the COMPAS model. But there are proxies for race. Removing the proxies could lead to less accurate predictions, and make it difficult to study and correct for bias. That is, removing that data (features) doesn’t help that much and might prevent you from applying corrective measures.

Suppose you throw out the risk score. Judges are still biased. “The adverse impact is potentially greater when the decision is not informed by an algorithm’s prediction.” A recent paper by John Kleinberg showed that “algorithms predicting pre-trial assessments were less biased than decisions made by human judges”algorithms predicting pre-trial assessments were less biased than decisions made by human judges. [I hope I got that right. It sounds like a significant finding.]

There’s another step: going from the outcomes to the burdens these outcomes put on people. “An even distribution of outcomes can produce disproportionate burdens.” E.g. juvenile defendants have more to lose — more of their years will be negatively affected by a jail sentence — so having the same false positive and negatives for adults and juveniles would impost a greater burden on the juveniles. When deciding it an algorithmic decision is unjust, you can’t just look at the equality of error rates.

A decision is unjust when it is: 1. Dominated (all groups pay a higher burden for the same social benefit); 2. Unprogressive (higher relative burdens on members of classes who are less well off); 3. Individually catastrophic (wrong decisions are so harmful that it reduces the well being of individuals in members of a known class); 4) Group punishment (an effect on an entire disadvantaged class.)

For every decision, theere are unavoidable constraints: a tradeoff between the individual and the social group; a privacy cost; can’t be equally accurate in all categories; can’t be fair without comparing utility across people; it’s impossible to avoid constraints by adding human judgment because the human is still governed by these constraints.

Micah’s summary for COMPAS: 1. Some protections would be irrelevant (inclusion of sensitive characteristics and and protection of indvidual information). Other protections would be insufficient (no intention to discriminate, open source/open data/FCRA).

Micah ends with a key question about fairness that has been too neglected: “Do black defendants bear a relatively higher cost than whites from bad decisions that prevent the same social harms?”

Comments Off on [liveblog][ai] A Harm-reduction framework for algorithmic accountability

April 6, 2015

Culture is unfair

At Jonathan Zittrain‘s awesome lecture upon the occasion of his ascending to the Bemis Chair at Harvard Law (although shouldn’t you really descend into a chair?), he made the point that through devices like Microsoft Kinect, our TVs are on the verge of knowing how many people are in the room watching. After all, your camera (= phone) already can identify the faces in a photo.

This will inevitably lead to the claim that if five people are watching a for-pay movie on a TV, we ought to be paying 5x what a single person does. After all, it’s delivering five times the value. What are you, a bunch of pirates?

There is some fairness to that claim. We’d pay for five tickets if we saw it in a theater.

But it also feels wrong. Very wrong. And not just because it costs us more.

For example, I’m told that if you buy a subscription to the NY Times it comes with one license for online access. So, if you’re having the old roll o’ stories thrown onto your porch every morning, your spouse is free to read it too, but you’re going to have to buy a separate online subscription if s/he wants to read it online. That doesn’t feel right.

The pay-per-use argument may be fair but it flies in the face of how we all know culture works. Culture only exists if we share what matters to us. There is no culture without this. That’s why it’s so important I can share a physical book with you, or can send you a copy of a magazine article that I think you’ll like. Culture is the sharing of creative works and the conversations we have about them.

That’s why the creators of the US Constitution put a time limit on copyright. Yes, it feels unfair if after fourteen years (the original length of copyright protection) someone publishes my book without my permission and doesn’t give me any of the profits. Sure. But fairness is not the only criterion.

Culture cannot flourish or perhaps even exist when everything has a fair price.

1 Comment »