machine learning Archives - Joho the Blog

October 27, 2017

[liveblog] Nathan Matias on the social impact of real-time algorithm decisions

J. Nathan Matias is giving a talk at the weekly AI session held by the MIT Media Lab and Harvard’s Berkman Klein Center for Internet & Society. The title is: Testing the social impact of real-time algorithm decisions. (SPOILER: Nate is awesome.) Nathan will be introducing CivilServant.io to us, a service for researching the effects of tech and how it can be better directed toward the social outcomes we (the civil society “we”) desire. (That’s my paraphrase.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

In 2008, the French government approved a law against Web sites that encourage anorexia and bulimia. In 2012, Instagram responded to pressure to limit hashtags that “actively promote self-harm.” Instagram had 40M users, almost as many as France’s 55M active Net users. Researchers at Georgia Tech found several years later that some self-harm sites on Instagram had higher engagement after Instagram’s actions. If your algorithm reliably detects people who are at risk of committing suicide, what next? If the intervention isn’t helpful, your algorithm is doing harm.

Nathan shows a two-axis grid for evaluating algorithms: fair-unfair and benefits-harms. Accuracy should be considered to be on the same axis as fairness because it can be measured mathematically. But you can’t test the social impact without putting it into the field. “I’m trying to draw attention to the vertical axis [harm-benefit].”

We often have in mind a particular pipeline: training > model > prediction > people. Sometimes there are rapid feedback loops where the decisions made by people feed back into the model. A judicial system’s risk prediction scores may have no such loop. But the AI that manages a news feed is probably getting the readers’ responses as data that tunes the model.

We have organizations that check the quality of items we deal with: UL for electrical products, etc. But we don’t have that sort of consumer protection for social tech. The results are moral panics, bad policies, etc. This is the gap Nate is trying to fill with CivilServant.io, a project supported by the Media Lab and GlobalVoices.

Here’s an example of one of CivilServant’s projects:

Managing fake news is essential for democracy. The social sciences have been dealing with this for quite a while, doing research on individual perception and beliefs, on how social context and culture influence beliefs, and now on algorithms that make autonomous decisions that affect us as citizens, e.g., newsfeeds. Newsfeeds work this way: someone posts a link. People react to it, e.g., upvote it, discuss it, etc. The feed service watches that behavior and uses it to promote or demote the item. And then that feeds back in.
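
[To make the mechanics concrete, here’s a minimal sketch of that kind of engagement-driven ranking loop. The scoring weights and time decay are my own assumptions for illustration, not any platform’s actual algorithm.]

```python
# Minimal sketch of an engagement-driven feed ranking loop. The weights and
# the time decay are illustrative assumptions, not any platform's real formula.
import math
import time

class Post:
    def __init__(self, post_id, created_at):
        self.post_id = post_id
        self.created_at = created_at
        self.upvotes = 0
        self.comments = 0

def score(post, now):
    """Engagement pushes a post up; age pushes it down."""
    engagement = post.upvotes + 2 * post.comments      # assumed weights
    age_hours = (now - post.created_at) / 3600.0
    return engagement / math.pow(age_hours + 2, 1.5)   # assumed decay

def react(post, upvote=False, comment=False):
    """User behavior feeds back into the next ranking pass."""
    post.upvotes += int(upvote)
    post.comments += int(comment)

def rank_feed(posts, now=None):
    now = now if now is not None else time.time()
    return sorted(posts, key=lambda p: score(p, now), reverse=True)
```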

We’ve seen lots of examples of pernicious outcomes of this. E.g., at Reddit an early upvote can have a dramatic impact on an item’s ranking over time.

What can we do to govern online misinfo? We could surveil and censor. We could encourage counter-speech. We can imagine some type of algorithmic governance. We can use behavioral nudges, e.g., Facebook tagging articles as “disputed.” But all of these assume that these interventions change behaviors and beliefs. Those assumptions are not always tested.

Nate was approached by /r/worldnews at Reddit, a subreddit with 14M subscribers and 70 moderators. At Reddit, moderating can be a very time-consuming effort. (Nate spoke to a Reddit mod who had stopped volunteering at a children’s hospital in order to be a mod because she thought she could do more good that way.) This subreddit’s mods wanted to know if they could question the legitimacy of an item without causing it to surge on the platform. Fact-checking a post could nudge Reddit’s AI to boost its presence because of the increased activity.

So, they ran an experiment asking people to fact check an article, or to fact check it and downvote it if they couldn’t verify it. They monitored the ranking of the articles by Reddit for 3 months. [Nate now gives some math. Sorry I can’t capture (or understand) it.] The result, to his surprise: encouraging fact checking reduced the average rank position of an article. Encouraging fact checking and down-voting reduced the spread of inaccurate news by Reddit’s algorithms. [I’m not confident I’m getting that right.]
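
[This is not Nathan’s actual analysis, which I couldn’t capture; it’s only a sketch of the comparison being described, the average rank position of articles under each condition, with made-up numbers.]

```python
# Sketch of the comparison described above: average rank position per
# experimental condition. NOT Nathan's actual analysis; numbers are made up.
from collections import defaultdict
from statistics import mean

# Each observation: (condition, rank_position), where a lower rank = more visible.
observations = [
    ("control", 14), ("control", 9), ("control", 22),
    ("fact_check", 31), ("fact_check", 27),
    ("fact_check_and_vote", 18), ("fact_check_and_vote", 25),
]  # fabricated for illustration only

by_condition = defaultdict(list)
for condition, rank in observations:
    by_condition[condition].append(rank)

for condition, ranks in by_condition.items():
    print(f"{condition}: mean rank position = {mean(ranks):.1f} (n={len(ranks)})")
```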

Why did encouraging fact checking reduce rankings, but fact checking and voting did not? The mods think this might be because it gave users a constructive way to handle articles from reviled sources, reducing the number of negative comments about them. [I hope I’m getting this right.] Also, “reactance” may have nudged people to upvote just to spite the instructions. Also, users may have mobilized friends to vote on the articles. Also, encouraging two tasks (fact check and then vote) rather than one may have influenced the timing of the algorithm, making the down-votes less impactful.

This is what Nate calls an “AI-Nudge”: a “second-order effect of influencing human behavior on the behavior of an algorithmic system.” It means you have to think about how humans interact with AI.

Often when people are working on AI, they’re starting from computer science and math. The question is: how can we use social science methods to research the effects of AI? Paluck and Cialdini see a cycle of pilot/lab experiments > qualitative methods > field experiments > theory / policy / design. In the Reddit example, Nathan spent considerable time with the community to understand their issues and how they interact with the AI.

Another example of a study: identifying and reducing side-effects of automated copyright law enforcement on Twitter. When people post something to Twitter, bots monitor it to see if it violates copyright, resulting in a DMCA takedown notice being issued. Twitter then takes it down. The Lumen Project from BKC archives these notices. The CivilServant project observes those notices in real time to study the effects. E.g., a user’s tweets per day tend to drop after they receive a takedown notice, and then continue dropping throughout the 42-day period they researched. Why this long-term decrease in posting? Maybe fear and risk. Maybe awareness of surveillance.
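
[Again, not the CivilServant code, just a rough sketch of the kind of before/after comparison being described: average tweets per day in the 42 days before and after a takedown notice. The data layout is an assumption.]

```python
# Rough sketch of the before/after comparison described above. The data layout
# (a list of tweet timestamps plus a notice date) is an assumption.
from datetime import datetime, timedelta

def tweets_per_day(tweet_times, start, end):
    days = max((end - start).days, 1)
    count = sum(1 for t in tweet_times if start <= t < end)
    return count / days

def takedown_effect(tweet_times, notice_date, window_days=42):
    window = timedelta(days=window_days)
    before = tweets_per_day(tweet_times, notice_date - window, notice_date)
    after = tweets_per_day(tweet_times, notice_date, notice_date + window)
    return before, after

# Fabricated timestamps, purely to show the call:
notice = datetime(2017, 6, 1)
times = [notice - timedelta(days=d) for d in range(1, 30)] + \
        [notice + timedelta(days=d) for d in range(1, 10)]
print(takedown_effect(times, notice))
```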

So, how can these chilling effects be reduced? The CivilServant project automatically sends users info about their rights and about surveillance. The results of this intervention are not in yet. The project hopes to find ways to lessen the public’s needless withdrawal from social media. The research can feed empirical legal studies. Policymakers might find it useful. Civil rights orgs as well. And the platforms themselves.

In the course of the Q&As, Nathan mentions that he’s working on ways to explain social science research so that non-experts can understand it. CivilServant works with user communities and has developed a set of ways for communicating openly with the users.

Q: You’re trying to make AI more fair…

A: I’m doing consumer protection, so as experts like you work on making AI more fair, we can see the social effects of interventions. But there are feedback loops among them.

Q: What would you do with a community that doesn’t want to change?

A: We work with communities that want our help. In the 1970s, Campbell wrote an essay, “The Experimenting Society.” He asked whether, by doing behavioral research, we’re becoming an authoritarian society, because we’re putting power in the hands of the people who can afford to do the research. He proposed enabling communities to do their own studies and research. He proposed putting data scientists into towns across the US, pooling their research, and challenging their findings. But this was before the PC. Now it’s far more feasible.

Q: What sort of pushback have you gotten from communities?

A: Some decide not to work with us. In others, there’s contention about the shape of the project. Platforms have changed how they view this work. Three years ago, the platforms felt under siege and wounded. That’s why I decided to create an independent organization. The platforms have a strong incentive to protect their reputations.

October 19, 2017

[liveblog] AI and Education session

Jenn Halen, Sandra Cortesi, Alexa Hasse, and Andres Lombana Bermudez of the Berkman Klein Youth and Media team are leading a discussion about AI and Education at the MIT Media Lab as part of the Ethics and Governance of AI program run jointly by Harvard’s Berkman Klein Center for Internet & Society and the MIT Media Lab.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Sandra gives an introduction to the BKC Youth and Media project. She points out that their projects are co-designed with the groups that they are researching. From the AI folks they’d love ideas and a better understanding of AI, for they are just starting to consider the importance of AI to education and youth. They are creating a Digital Media Literacy Platform (which Sandra says they hope to rename).

They show an intro to AI designed to be useful for a teacher introducing the topic to students. It defines, at a high level, AI, machine learning, and neural networks. They also show “learning experiences” (= “XP”) that Berkman Klein summer interns came up with, including AI and well-being, AI and news, autonomous vehicles, and AI and art. They are committed to working on how to educate youth about AI not only in terms of particular areas, but also privacy, safety, etc., always with an eye towards inclusiveness.

They open it up for discussion by posing some questions. 1. How to promote inclusion? How to open it up to the most diverse learning communities? 2. Did we spot any errors in their materials? 3. How to reduce the complexity of this topic? 4. Should some of the examples become their own independent XPs? 5. How to increase engagement? How to make it exciting to people who don’t come into it already interested in the topic?

[And then it got too conversational for me to blog…]

October 10, 2017

[liveblog][bkc] Algorithmic fairness

I’m at a special Berkman Klein Center Tuesday lunch, a panel on “Programming the Future of AI: Ethics, Governance, and Justice” with Cynthia Dwork, Christopher L. Griffin, Margo I. Seltzer, and Jonathan L. Zittrain, in a discussion moderated by Chris Bavitz.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

They begin with brief intros of their interests:

Chris Griffin: One of the big questions for the use of algorithms in the justice system is: what is the alternative? Human decision making has its own issues.

Margo Seltzer: She’s been working on transparent models. She would always prefer to be able to get an index card’s worth of explanation of how a machine learning system has come up with its output.

Cynthia Dwork: What is our definition of fairness, and how might we evaluate the fairness of our machine systems? She says she’s not that big a fan of insisting on explanations.

Jonathan Zittrain: What elements of this ought to be contracted out? Can we avoid the voting machine problem of relying on a vendor we don’t necessarily trust? Also, it may be that explanations don’t help us that much. Also, we have to be very wary of biases built into the data. Finally, AI might be able to shed light on interventions before problems arise, e.g., city designs that might lower crime rates.

Chris Bavitz: Margo, say more about transparency…

Seltzer: Systems ought to be designed so that if you ask why it came up with that conclusion, it can tell you in a way that you can understand. Not just a data dump.

Bavitz: The legal system generally expects that, but is that hard to do?

Seltzer: It seems that in some cases you can achieve higher accuracy with models that are not explicable. But not always.

Dwork: Yes.

Zittrain: People like Cynthia Rudin have been re-applying techniques from the 1980s that are explainable. But I’ve been thinking about David Weinberger’s recent work [yes, me] suggesting that reality may depend on factors that are deeply complex and that don’t reduce down to understandable equations.

Dwork: Yes. But back to Margo. Rule lists have antecedents and probabilities. E.g., you’re trying to classify mushrooms as poisonous or not. There are features you notice: shape of the head, odor, texture, etc. You can generate rule lists that are fairly simple: if the stalk is like this and the smell is like that, then it’s likely poisonous. But you can also have “if/else” conditions. The conclusions can be based on very complex dependencies among these factors. So, the question of why something was classified some way can be much more complicated than meets the eye.
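
[Here’s a tiny sketch of what such a rule list looks like in code; the features and rules are invented for illustration, not learned from actual mushroom data.]

```python
# Tiny sketch of a rule list for the mushroom example. The rules are invented
# for illustration; they are not learned from real data.
def classify_mushroom(odor, cap_shape, stalk_texture):
    if odor in ("foul", "pungent"):
        return "poisonous"
    elif cap_shape == "conical" and stalk_texture == "silky":
        return "poisonous"
    elif odor == "none" and stalk_texture == "smooth":
        return "edible"
    else:
        # The fallthrough depends on the interaction of the earlier rules,
        # which is why "why was this classified X?" can be subtle.
        return "edible"

print(classify_mushroom(odor="none", cap_shape="flat", stalk_texture="smooth"))
```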

Seltzer: I agree. Let’s say you were turned down for the loan. You might not be able to understand the complex of factors, but you might be able to find a factor you can address.

Dwork: Yes, but the question “Is there a cheap and easy path that would lead to a different outcome?” is a very different question than “Why was I classified some particular way?”

Griffin: There’s a multi-level approach to assessing transparency. We can’t expect the public to understand the research by which a model is generated. But how is that translated into scoring mechanisms? What inputs are we using? If you’re assessing risk from 1 to 6, does the decision-maker understand the difference between, say, a 2 and 3?

Zittrain: The data going in often is very reductive. You do an interview with a prisoner who doesn’t really answer so you take a stab at it … but the stabbiness of that data is not itself input. [No, Zittrain did not say “stabbiness”].

Griffin: The data quality issue is widespread. In part this is because the data sets are discrete. It would be useful to abstract ID’s so the data can be aggregated.

Zittrain: Imagine you can design mushrooms. You could design a poisonous one with the slightest variation from edible ones to game the system. A real life example: the tax system. I think I’d rather trust machine learning than a human model that can be more easily gamed.

Bavitz: An interviewer who doesn’t understand the impact of the questions she’s asking might be a feature, not a bug, if you want to get human bias out of the model…

Seltzer: The suspicion around machine algorithms stems from a misplaced belief that humans are fair and unbiased. The combination of a human and a machine, if the human can understand the machine’s model, might result in less biased decisions than either on their own.

Bavitz: One argument for machine learning tools is consistency.

Griffin: The ethos of our system would be lost. We rely on a judicial official to use her or his wisdom, experience, and discretion to make decisions. “Bias could be termed as the inability to perceive with sufficient clarity.” [I missed some of this. Sorry.]

Bavitz: If the data is biased, can the systems be trained out of the bias?

Dwork: Generally, garbage in, garbage out. There are efforts now, but they’re problematic. Maybe you can combine unbiased data with historical data, and use that to learn models that are less biased.

Griffin: We’re looking for continuity in results. With the prisoner system, the judge gets a list of the factors lined up with the prisoner’s history. If the judge wants to look at that background and discard some of the risk factors because they’re so biased, s/he can ignore the machine’s recommendation. There may be some anchoring bias, but I’d argue that that’s a good thing.

Bavitz: How about the private, commercial actors who are providing this software? What if these companies don’t want to make their results interpretable so as not to give away their special sauce?

Dwork: When Facebook is questioned, I like to appeal to the miracle of modern cryptography that lets us prove that secrets have particular properties without decrypting them. This can be applied to algorithms so you can show that one has a particular property without revealing that algorithm itself. There’s a lot of technology out there that can be used to preserve the secrecy of the algorithm, if that were the only problem.

Zittrain: It’d be great to be able to audit a tech while keeping the algorithm secret, but why does the company want to keep it secret? Especially if the results of the model are fed back in, increasing lock-in. I can’t see why we’d want to farm this out to commercial entities. But that hasn’t been on the radar because entrepreneurial companies are arising to do this for municipalities, etc.

Seltzer: First, the secrecy of the model is totally independent from the business model. Second, I’m fine with companies building these models, but it’s concerning if they’re keeping the model secret. Would you take a pill if you had no idea how it worked?

Zittrain: We do that all the time.

Dwork: That’s an example of relying on testing, not transparency.

Griffin: Let’s say we can’t get the companies to reveal the algorithms or the research. The public doesn’t want to know the reasoning behind the decision (unless there’s litigation over a particular case); it wants to know whether it works.

Zittrain: Assume re-arrest rates are influenced by factors that shouldn’t count. The algorithm would reflect that. What can we do about that?

Griffin: The evidence is overwhelming about the disparity in stops by race and ethnicity. The officers are using the wrong proxies for making these decisions. If you had these tools throughout the lifespan of such a case, you might be able to change this. But these are difficult issues.

Seltzer: Every piece of software has bugs. The thought of software being used in a way where I don’t know what it thinks it’s doing or what it’s actually doing gives me a lot of pause.

Q&A

Q: The government keeps rehiring the same contractors who fail at their projects. The US Digital Service insists that contractors develop their software in public. They fight this. Second, many engineering shops don’t think about the bias in the data. How do we infuse that into companies?

Dwork: I’m teaching it in a new course this semester…

Zittrain: The syllabus is secret. [laughter]

Seltzer: We inject issues of ethics into our every CS course. You have to consider the ethics while you’re designing and building the software. It’s like considering performance and scalability.

Bavitz: At the Ethics and Governance of AI project at the Berkman Klein Center, we’ve been talking about the point of procurement: what do the procurers need to be asking?

Q: The panel has talked about justice, augmenting human decision-making, etc. That makes it sound like we have an idea of some better decision-making process. What is it? How will we know if we’ve achieved it? How will models know if they’re getting it right, especially over time as systems get older?

Dwork: Huge question. Exactly the right question. If we knew who ought to be treated similarly to whom for any particular classification class, everything would become much easier. A lot of AI’s work will be discovering this metric of who is similar to whom, and how similar. It’s going to be an imperfect but improving situation. We’ll be doing the best guess, but as we do more and more research, our idea of what is the best guess will improve.

Zittrain: Cynthia, your work may not always let us see what’s fair, but it does help us see what is unfair. [This is an important point. We may not be able to explain what fairness is exactly, but we can still identify unfairness.] We can’t ask machine learning pattern recognition to come up with a theory of justice. We have to rely on judges, legislators, etc. to do that. But if we ease the work of judges by only presenting the borderline cases, do we run the risk of ossifying the training set on which the judgments by real judges were made? Will the judges become de-skilled? Do you keep some running continuously in artisanal courtrooms…? [laughter]

Griffin: I don’t think that any of these risk assessments can solve any of these optimization problems. That takes a conversation in the public sphere. A jurisdiction has to decide what its tolerance for risk is, what its tolerance is for the cost of incarceration, etc. The tool itself won’t get you to that optimized outcome. It will be the interaction of the tool and the decision-makers. That’s what gets optimized over time. (There is some baseline uniformity across jurisdictions.)

Q: Humans are biased. Assume a normal distribution across degrees of bias. AI can help us remove the outliers, but it may rely on biased data.

Dwork: I believe this is the bias problem we discussed.

Q: Wouldn’t it be better to train it on artificial data?

Seltzer: Where does that data come from? How do we generate realistic but unbiased data?

September 26, 2017

[liveblog][pair] Blaise Agüera y Arcas on the source of bias

At the PAIR Symposium, Google’s Blaise Agüera y Arcas is providing some intellectual and historical perspective on AI issues.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

[Note: This is a talk tough to live-blog because it is carefully structured intellectually. My apologies.]

He says neural networks have been part of the computing environment from the beginning. E.g., he thinks that the loop at the end of the logic gate symbol in fact comes from a 1943 symbolization of biological neural networks. There are indications of neural networks in Turing’s early papers. So these ideas go way back. Blaise thinks that the majority of computing processes in a few years will be running on processors designed for running neural networks.

ML has raised anxiety reminiscent of Walter Benjamin’s concern — he cites The Work of Art in the Age of Mechanical Reproduction — about the mass reproduction of art that strips it of its aura. Now there’s the same kind of moral panic about art and human exceptionalism and existence. (Cf. Nick Bostrom’s Superintelligence.) It reminds him of Jakob Mohr’s 1910 The Influencing Machine, in which schizophrenics believe they’re being influenced by an external machine. (They always thought men were managing the machine.) He points to what he calls Bostrom’s ultimate colonialism, in which we are able to populate the universe with 10^58 human minds. [Sorry, but I didn’t get this. My fault.] He ties this to Bacon’s reverence for the domination of nature. Blaise prefers a feminist view, citing Kember & Zylinska’s Life After New Media.

Many say we have a value alignment problem, he says: how do we make AI that embeds human values? But AI systems do have human values because they’re trained on human data. The problem is that our human values are off. He references a paper on judging criminality based on faces. The paper claims it’s free of human biases. But it’s based on data that is biased. Nevertheless, this sort of tech is being commercialized. E.g., Faception claims to classify people based on their faces: High IQ, Pedophile, etc.

Also, there’s the recent paper about an ML system that classifies one’s gender preferences based on faces. Blaise ran a test on Mechanical Turk asking about some of the features in the composite gay and straight faces in that paper. He found that people attracted to the same sex were more likely to wear glasses. There were also significant differences in facial hair, use of makeup, and face tan, features also in the composite faces. Thus, the ML system might have been using social markers, not physiognomy: “There are a lot of tells.”

In conclusion, none of these are arguments against ML. On the contrary. The biases and prejudices, and the social signalling, are things ML lets us hold a mirror up to.

[liveblog][pair] Golan Levin

At the PAIR Symposium, Golan Levin of CMU is talking about ML and art.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

The use of computers for serendipitous creativity has been a theme of computer science since its beginning, Golan says. The job of AI should be serendipity and creativity. He gives examples of his projects.

Put your hand up to a scanner and it shows your hand with an extra finger. Or with extra hands at the end of your fingers.

Augmented Hand Series (v.2), Live Screen Recordings from Golan Levin on Vimeo.

[He talks very very quickly. I’ll have to let the project videos talk for themselves. Sorry.]

Terrapattern provides orbital info about us. It’s an open source neural network tool which offers similar-image search for satellite imagery. It’s especially good at finding “soft” structures often not noted on maps. E.g., click on a tennis court and it will find you all of them in the area. Click on crossroads, same thing.

Terrapattern (Overview & Demo) from STUDIO for Creative Inquiry on Vimeo.

This is, he says, an absurdist tool of serendipity. But it also democratizes satellite intelligence. His favorite example: finding all the rusty boats floating in NYC harbor.

Next he talks about our obsession with “masterpieces.” Will a computer ever be able to create a masterpiece, he keeps getting asked. But artworks are not things in themselves. They exist in relationship to their audience. (He recommends When the Machine Made Art by Grant D. Taylor.)

Optical illusions get us to see things that aren’t there. “Print on paper beats brain.” We see faces in faucets and life in tree trunks. “This is us deep dreaming.” The people who understand this best are animators. See The Illusion of Life, a Disney book about how to make things seem alive.

The observer is not separate from the object observed. Artificial intelligence occurs in the mind as well as in the machine.

He announces a digression: “Some of the best AI-enabled art is being made by engineers,” as computer art was made by early computer engineers.

He points to the color names ML-generated by Janelle Shane. And Gabriel Goh’s synthetic porn. It uses Yahoo’s porn detector and basically runs it in reverse starting with white noise. “This is conceptual art of the highest order.”

“I’m frankly worried, y’all,” he says. People do awful things using imaging technology. E.g., face tracking can be abused by governments and others. These apps are developed to make decisions. And those are the thoughtless explicit abuses, not to mention implicit biases like HP’s face-scanning software that doesn’t recognize black faces. He references Zeynep Tufekci’s warnings.

A partial, tiny, and cost-effective solution: integrate artists into your research community. [He lists sensible reasons too fast for me to type.]

[liveblog][PAIR] Rebecca Fiebrink on how machines can create new things

At the PAIR symposium, Rebecca Fiebrink of Goldsmiths University of London asks how machines can create new things.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

She works with sensors. ML can allow us to build new interactions from examples of human action and computer response. E.g., recognize my closed fist and use it to play some notes. Add more gestures. This is a conventional supervised training framework. But suppose you want to build a new gesture recognizer?

The first problem is the data set: there isn’t an obvious one to use. Also, would a 99% recognition rate be great or not so much? It depends on what is happening. If it goes wrong, you modify the training examples.

She gives a live demo — the Wekinator — using a very low-res camera (10×10 pixels maybe) image of her face to control a drum machine. It learns to play stuff based on whether she is leaning to the left or right, and immediately learns to change if she holds up her hand. She then complicates it, starting from scratch again, training it to play based on her hand position. Very impressive.
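
[This isn’t Wekinator itself, but here’s a hedged sketch of the idea it demonstrates: learn a mapping from a low-resolution input vector, say a flattened 10×10 camera frame, to sound-control parameters, from a handful of examples. The nearest-neighbor regressor and the shapes are my assumptions.]

```python
# Not Wekinator itself -- just a sketch of the idea it demonstrates: map a
# low-res input vector (e.g., a 10x10 frame flattened to 100 numbers) to
# sound-control parameters, learned from a few examples. Data is fake.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)

# Fake training examples (assumed shapes): 6 frames -> 6 drum-machine settings.
frames = rng.random((6, 100))       # stand-in for 10x10 grayscale frames
controls = rng.random((6, 3))       # e.g., [tempo, volume, pattern] in 0..1

model = KNeighborsRegressor(n_neighbors=1)
model.fit(frames, controls)

new_frame = rng.random((1, 100))    # a new camera frame
print(model.predict(new_frame))     # predicted control parameters
```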

Ten years ago Rebecca began with the thought that ML can help unlock the interactive potential of sensors. She plays an early piece by Anne Hege using Playstation golf controllers to make music:

Others make music with instruments that don’t look normal. E.g., Laetitia Sonami uses springs as instruments.

She gives other examples. E.g., a facial expression to meme system.

Beyond building new things, what are the consequences, she asks?

First, faster creation means more prototyping and wider exploration, she says.

Second, ML opens up new creative roles for humans. For example, Sonami says, playing an instrument now can be a bit wild, like riding a bull.

Third, ML lets more people be creators and use their own data.

Rebecca teaches a free MOOC on Kadenze: Machine learning for artists and musicians.

[liveblog][PAIR] Doug Eck on creativity

At the PAIR Symposium, Doug Eck, a research scientist at Google Magenta, begins by playing a video:

Douglas Eck – Transforming Technology into Art from Future Of StoryTelling on Vimeo.

Magenta is part of Google Brain that explores creativity.
By the way:

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He talks about three ideas Magenta has come to for “building a new kind of artist.”

1. Get the right type of data. It’s important to get artists to share and work with them, he says.

Magenta has been trying to get neural networks to compose music. They’ve learned that rather than trying to model musical scores, it’s better to model performances captured as MIDI. They have tens of thousands of performances. From this they were able to build a model that tries to predict the piano-roll view of the music. At any moment, should the AI stay at the same time, stacking up notes into chords, or move forward? What are the next notes? Etc. They are not yet capturing much of the “geometry” of, say, Chopin: the piano-roll-ish vision of the score. (He plays music created by an ML model trained on scores and one trained on performances. The score-based one is clipped. The other is far more fluid and expressive.)
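
[A hedged sketch of the representational point: encode the performance as timed MIDI-like events rather than as a score. The event vocabulary below is an assumption loosely in the spirit of Magenta’s performance models, not their actual code; a sequence model would then be trained to predict the next event.]

```python
# Sketch of modeling the *performance* (timed MIDI-like events), not the score.
# The event vocabulary here is an assumption for illustration.
from dataclasses import dataclass

@dataclass
class NoteEvent:
    kind: str      # "note_on", "note_off", or "time_shift"
    value: int     # pitch for notes, milliseconds for time shifts

def performance_to_events(notes):
    """notes: list of (pitch, start_ms, end_ms). Returns a flat event sequence."""
    boundaries = sorted({t for _, s, e in notes for t in (s, e)})
    events, clock = [], 0
    for t in boundaries:
        if t > clock:
            events.append(NoteEvent("time_shift", t - clock))
            clock = t
        for pitch, start, end in notes:
            if start == t:
                events.append(NoteEvent("note_on", pitch))
            if end == t:
                events.append(NoteEvent("note_off", pitch))
    return events

# A C-major dyad followed by an overlapping G:
print(performance_to_events([(60, 0, 500), (64, 0, 500), (67, 250, 750)]))
```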

He talks about training ML to draw based on human drawings. He thinks running human artists’ work through ML could point out interesting facets of them.

He points to the playfulness in the drawings created by ML from simple human drawings. ML trained on pig drawings interpreted a drawing of a truck as pig-like.

2. Interfaces that work. Guitar pedals are the perfect interface: they’re indestructible, clear, etc. We should do that for AI musical interfaces, but the software is technically complex. He points to the NSynth sound maker and AI Duet from Google Creative Lab. (He also touts deeplearn.js.)

3. Learning from users. Can we use feedback from users to improve these systems?

He ends by pointing to the blog, datasets, discussion list, and code at g.co/magenta.

[liveblog][PAIR] Antonio Torralba on machine vision, human vision

At the PAIR Symposium, Antonio Torralba asks why image identification has traditionally gone so wrong.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

If we train on Google Images of bedrooms, we’re training on idealized photos, not the real world. It’s a biased set. Likewise for mugs, where the handles in images are almost all on the right side, not the left.

Another issue: the Canny edge detector (for example) detects edges and throws a black-and-white reduction to the next level. “All the information is gone!” he says, showing that a messy set of white lines on black is in fact an image of a palace. [Maybe the White House?]
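
[A minimal OpenCV sketch of the Canny reduction he’s describing; the filename and thresholds are arbitrary.]

```python
# Minimal sketch of the Canny reduction described above: a full image is
# collapsed to a black-and-white edge map. Filename and thresholds are arbitrary.
import cv2

image = cv2.imread("palace.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file
assert image is not None, "replace with a real image path"

edges = cv2.Canny(image, threshold1=100, threshold2=200)
cv2.imwrite("palace_edges.png", edges)
```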

Deep neural networks work well, and can be trained to recognize places in images, e.g., beach, hotel room, street. You train your neural net and it becomes a black box. E.g., how can it recognize that a bedroom is in fact a hotel room? Maybe it’s the lamp? But you trained it to recognize places, not objects. It works, but we don’t know how.

When training a system on place detection, we found some units in some layers were in fact doing object detection. It was finding the lamps. Another unit was detecting cars, another detected roads. This lets us interpret the neural networks’ work. In this case, you could put names to more than half of the units.

How to quantify this? How is the representation being built? For this: network dissection. This shows that when you train a network on places, object detectors emerge. The network may be doing something more interesting than your task: object detection is harder than place detection.
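
[Not Network Dissection itself, which quantifies how units align with labeled concepts, but a sketch of the step it rests on: capturing per-unit activations from a hidden layer so you can see which images drive each unit. The model and layer here are arbitrary.]

```python
# Sketch of the first step behind this kind of analysis: capture per-unit
# activations in a hidden layer so you can inspect what each unit responds to.
# Model choice and layer are arbitrary placeholders.
import torch
import torchvision.models as models

model = models.resnet18()          # untrained here; a places-trained net in the talk
model.eval()

activations = {}

def hook(module, inputs, output):
    # output shape: (batch, channels, h, w); keep the per-unit mean activation
    activations["layer4"] = output.mean(dim=(2, 3)).detach()

model.layer4.register_forward_hook(hook)

batch = torch.randn(4, 3, 224, 224)     # stand-in for real images
with torch.no_grad():
    model(batch)

# activations["layer4"][i, u] = how strongly unit u fired on image i;
# ranking images per unit is how you start to see "this unit finds lamps".
print(activations["layer4"].shape)      # torch.Size([4, 512])
```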

We currently train systems by gathering labeled data. But small children learn without labels. Children are self-supervised systems. So, take in the RGB values of the frames of a movie, and have the system predict the sounds. When you train a system this way, it kind of works. If you want to predict the ambient sounds of a scene, you have to be able to recognize the objects, e.g., the sound of a car. To solve this, the network has to do object detection. That’s what they found when they looked into the system. It was doing face detection without having been trained to do that. It also detects baby faces, which make a different type of sound. It detects waves. All through self-supervision.

Other examples: On the basis of one segment, predict the next in the sequence. Colorize images. Fill in an empty part of an image. These systems work, and do so by detecting objects without having been trained to do so.

Conclusions: 1. Neural networks build representations that are sometimes interpretable. 2. The representation might solve a task that’s even more interesting than the primary task. 3. Understanding how these representations are built might allow new approaches for unsupervised or self-supervised training.

[liveblog][PAIR] Maya Gupta on controlling machine learning

At the PAIR symposium, Maya Gupta is talking about how we can control machine learning to do what we want. She runs Glass Box at Google, which looks at black box issues.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

The core idea of machine learning is its role models, i.e., its training data. That’s the best way to control machine learning. She’s going to address this by looking at the goals of controlling machine learning.

A simple example of monotonicity: let’s say we’re trying to recommend nearby coffee shops. So we use data about the happiness of customers and distance from the shop. We can fit a linear model. Or we can fit a curve, which works better for nearby shops but goes wrong for distant shops. That’s fine for Tokyo but terrible for Montana because it’ll be sending people many miles away. A monotonic constraint says we don’t want to do that. This controls ML to make it more useful. Conclusion: the best ML has the right examples and the right kinds of flexibility. [Hard to blog this without her graphics. Sorry.] See “Deep Lattice Networks for Learning Partial Monotonic Models,” NIPS 2017; it will soon be on the TensorFlow site.
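
[Her approach uses deep lattice networks (see the paper and TensorFlow Lattice). As a much simpler stand-in for the same constraint, here’s a sketch with scikit-learn’s isotonic regression that forces predicted happiness to be non-increasing in distance. The data is made up.]

```python
# Simple stand-in for the monotonicity constraint described above (not the
# deep lattice network approach): fit a non-increasing curve of happiness vs.
# distance, so the model can never say that farther away is better. Fake data.
import numpy as np
from sklearn.isotonic import IsotonicRegression

distance_km = np.array([0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 40.0, 120.0])
happiness   = np.array([0.9, 0.85, 0.8, 0.82, 0.6, 0.5, 0.3, 0.1])  # noisy, made up

monotone = IsotonicRegression(increasing=False)   # enforce: farther => not happier
monotone.fit(distance_km, happiness)

print(monotone.predict(np.array([0.3, 3.0, 60.0])))
```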

The best way to do things for practitioners is to work next to them.

A fairness goal: e.g., we want to make sure that accuracy in India is the same as accuracy in the US. So, add a constraint that says what accuracy levels we want. Math lets us do that.

Another fairness goal: the rate of positive classifications should be the same in India as in the US, e.g., the rate of students being accepted to a college. In one example, there is an accuracy trade-off in order to get fairness. Her attitude: just tell us what you want and we’ll do it.
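
[A small sketch of measuring the two goals just described, accuracy parity and positive-rate parity across groups, from a set of predictions. Her method constrains training to satisfy such goals; this only measures them, and the data is made up.]

```python
# Small sketch of measuring the two fairness goals described above:
# (1) equal accuracy across groups, (2) equal positive-classification rates.
# The talk's method *constrains* training; this only measures. Data is made up.
from collections import defaultdict

def group_metrics(records):
    """records: iterable of (group, y_true, y_pred) with 0/1 labels."""
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "positive": 0})
    for group, y_true, y_pred in records:
        s = stats[group]
        s["n"] += 1
        s["correct"] += int(y_true == y_pred)
        s["positive"] += int(y_pred == 1)
    return {g: {"accuracy": s["correct"] / s["n"],
                "positive_rate": s["positive"] / s["n"]}
            for g, s in stats.items()}

data = [("IN", 1, 1), ("IN", 0, 1), ("IN", 0, 0),
        ("US", 1, 1), ("US", 1, 0), ("US", 0, 0)]
print(group_metrics(data))
```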

Fairness isn’t always relative. E.g., minimize classification errors differently for different regions. You can’t always get what you want, but you sometimes can, or can get close. [paraphrase!] See fatml.org.

It can be hard to state what we want, but we can look at examples. E.g., someone hand-labels 100 examples. That’s not enough as training data, but we can train the system so that it classifies those 100 at something like 95% accuracy.

Sometimes you want to improve an existing ML system. You don’t want to retrain because you like the old results. So, you can add in a constraint such as keep the differences from the original classifications to less than 2%.

You can put all of the above together. See “Satisfying Real-World Goals with Dataset Constraints,” NIPS, 2016. Look for tools coming to TensorFlow.

Some caveats about this approach.

First, to get results that are the same for men and women, the data needs to come with labels. But sometimes there are privacy issues about that. Can we make these fairness goals work without labels? Research so far says the answer is messy. E.g., if we make ML more fair for gender (because you have gender labels), it may also make it fairer for race.

Second, this approach relies on categories, but individuals don’t always fit into categories. But if you get things right on the categories, it usually works out well in the blended examples.

Maya is an optimist about ML. “But we need more work on the steering wheel.” We’re not always sure we want to go with this technology. And we need more human-usable controls.

[liveblog][PAIR] Karrie Karahalios

At the Google PAIR conference, Karrie Karahalios is going to talk about how people make sense of their world and lives online. (This is an information-rich talk, and Karrie talks quickly, so this post is extra special unreliable. Sorry. But she’s great. Google her work.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Today, she says, people want to understand how the information they see comes to them. Why does it vary? Why do you get different answers depending on your wifi network? These algorithms also affect our personal feeds, e.g., Instagram and Twitter; Twitter articulates this, but doesn’t tell you how it decides what you will see.

In 2012, Christian Sandvig and [missed first name] Holbrook were wondering why they were getting odd personalized ads in their feeds. Most people were unaware that their feeds are curated: only 38% were aware of this in 2012. Those who were aware became aware through “folk theories”: non-authoritative explanations that let them make sense of their feed. Four theories:

1. Personal engagement theory: The more you like and click on someone’s posts, the more of that person you’ll see in your feed. Some people were liking their friends’ baby photos, but got tired of it.

2. Global population theory: If lots of people like something, it will show up in more people’s feeds.

3. Narcissist: You’ll see more from people who are like you.

4. Format theory: Some types of things get shared more, e.g., photos or movies. But people didn’t get …

Kempton studied thermostats in the 1980s. People thought of it either as a switch providing feedback or as a valve. He looked at their usage patterns. Regardless of which theory they held, they made it work for them.

She shows an Orbitz page that spits out flights. You see nothing under the hood. But someone found out that if you used a Mac, your prices were higher. People started using designs that show the seams. So, Karrie’s group created a view that showed the feed and all the content from their network, which was three times bigger than what they saw. For many, this was like awakening from the Matrix. More important, they realized that their friends weren’t “liking” or commenting because the algorithm had kept their friends from seeing what they posted.

Another tool shows who you are seeing posts from and who you are not. This was upsetting for many people.

After going through this process people came up with new folk theories. E.g., they thought it must be FB’s wisdom in stripping out material that’s uninteresting one way or another. [paraphrasing].

They let people configure whom they saw, which led many to say that FB’s algorithm is actually pretty good; there was little to change.

Are these folk theories useful? Only two: personal engagement and control panel, because these let you do something. But the tools for tweaking are poor.

How to embrace folk theories: 1. Algorithm probes, to poke and prod. (It would be great, Karrie says, to have open APIs so people could create tools. FB deprecated it.) 2. Seamful interfaces to generate actionable folk theories. Tuning to revert or borrow?

Another control panel UI, built by Eric Gilbert, uses design to expose the algorithms.

She ends with a quote from Richard Dyer: “All technologies are at once technical and also always social…”
