machine learning Archives - Joho the Blog

August 13, 2017

Machine learning cocktails

Inspired by the fabulously wrong paint colors that Janelle Shane generated by running existing paint names through a machine learning system, and then by a hilarious experiment in dog breed names by my friend Matthew Battles, I decided to run some data through a beginner’s machine learning algorithm by karpathy.

I fed a list of cocktail names as data into an unaltered copy of karpathy’s code. After several hundred thousand iterations, here’s a highly curated list of results (a sketch of the setup follows the list):

  • French Connerini Mot
  • Freside
  • Rumibiipl
  • Freacher
  • Agtaitane
  • Black Silraian
  • Brack Rickwitr
  • Hang
  • boonihat
  • Tuxon
  • Bachutta B
  • My Faira
  • Blamaker
  • Salila and Tonic
  • Tequila Sou
  • Iriblon
  • Saradise
  • Ponch
  • Deiver
  • Plaltsica
  • Bounchat
  • Loner
  • Hullow
  • Keviy Corpse der
  • KreckFlirch 75
  • Favoyaloo
  • Black Ruskey
  • Avigorrer
  • Anian
  • Par’sHance
  • Salise
  • Tequila slondy
  • Corpee Appant
  • Coo Bogonhee
  • Coakey Cacarvib
  • Srizzd
  • Black Rosih
  • Cacalirr
  • Falay Mund
  • Frize
  • Rabgel
  • FomnFee After
  • Pegur
  • Missoadi Mangoy Rpey Cockty e
  • Banilatco
  • Zortenkare
  • Riscaporoc
  • Gin Choler Lady or Delilah
  • Bobbianch 75
  • Kir Roy Marnin Puter
  • Freake
  • Biaktee
  • Coske Slommer Roy Dog
  • Mo Kockey
  • Sane
  • Briney
  • Bubpeinker
  • Rustin Fington Lang T
  • Kiand Tea
  • Malmooo
  • Batidmi m
  • Pint Julep
  • Funktterchem
  • Gindy
  • Mod Brandy
  • Kkertina Blundy Coler Lady
  • Blue Lago’sil
  • Mnakesono Make
  • gizzle
  • Whimleez
  • Brand Corp Mook
  • Nixonkey
  • Plirrini
  • Oo Cog
  • Bloee Pluse
  • Kremlin Colone Pank
  • Slirroyane Hook
  • Lime Rim Swizzle
  • Ropsinianere
  • Blandy
  • Flinge
  • Daago
  • Tuefdequila Slandy
  • Stindy
  • Fizzy Mpllveloos
  • Bangelle Conkerish
  • Bnoo Bule Carge Rockai Ma
  • Biange Tupilang Volcano
  • Fluffy Crica
  • Frorc
  • Orandy Sour
  • The candy Dargr
  • SrackCande
  • The Kake
  • Brandy Monkliver
  • Jack Russian
  • Prince of Walo Moskeras
  • El Toro Loco Patyhoon
  • Rob Womb
  • Tom and Jurr Bumb
  • She Whescakawmbo Woake
  • Gidcapore Sling
  • Mys-Tal Conkey
  • Bocooman Irion anlis
  • Ange Cocktaipopa
  • Sex Roy
  • Ruby Dunch
  • Tergea Cacarino burp Komb
  • Ringadot
  • Manhatter
  • Bloo Wommer
  • Kremlin Lani Lady
  • Negronee Lince
  • Peady-Panky on the Beach
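For the record, the recipe really is short. Here’s a minimal sketch of the data prep, assuming karpathy’s min-char-rnn.py gist (which reads its training text from a file named input.txt); the handful of names below are placeholders for the full list:

```python
# Minimal data prep for karpathy's min-char-rnn.py, which reads
# its training corpus from a file named input.txt.
# The short list here is a placeholder; the real run used a long
# list of cocktail names.

cocktails = [
    "French 75",
    "Corpse Reviver",
    "Moscow Mule",
    "Singapore Sling",
    "Black Russian",
    "Tom and Jerry",
]

with open("input.txt", "w") as f:
    # One name per line: the newline is a character the network
    # learns like any other, so samples come out line-delimited.
    f.write("\n".join(cocktails) + "\n")

# Then run the unaltered script and let it iterate:
#   python min-char-rnn.py
# It periodically prints its loss and a sampled chunk of text;
# after a few hundred thousand iterations, harvest names from
# the samples.
```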

Then I added to the original list of cocktails a list of Western philosophers. After about 1.4 million iterations, here’s a curated list:

  • Wotticolus
  • Lobquidibet
  • Mores of Cunge
  • Ruck Velvet
  • Moscow Muáred
  • Elngexetas of Nissone
  • Johkey Bull
  • Zoo Haul
  • Paredo-fleKrpol
  • Whithetery Bacady Mallan
  • Greekeizer
  • Frellinki
  • Made orass
  • Wellis Cocota
  • Giued Cackey-Glaxion
  • Mary Slire
  • Robon Moot
  • Cock Vullon Dases
  • Loscorins of Velayzer
  • Adg Cock Volly
  • Flamanglavere Manettani
  • J.N. tust
  • Groscho Rob
  • Killiam of Orin
  • Fenck Viele Jeapl
  • Gin and Shittenteisg Bura
  • buzdinkor de Mar
  • J. Apinemberidera
  • Nickey Bull
  • Fishomiunr Slmester
  • Chimio de Cuckble Golley
  • Zoo b Revey Wiickes
  • P.O. Hewllan o
  • Hlack Rossey
  • Coolle Wilerbus
  • Paipirista Vico
  • Sadebuss of Nissone
  • Sexoo
  • Parodabo Blazmeg
  • Framidozshat
  • Almiud Iquineme
  • P.D. Sullarmus
  • Baamble Nogrsan
  • G.W.J. . Malley
  • Aphith Cart
  • C.G. Oudy Martine ram
  • Flickani
  • Postine Bland
  • Purch
  • Caul Potkey
  • J.O. de la Matha
  • Porel
  • Flickhaitey Colle
  • Bumbat
  • Mimonxo
  • Zozky Old the Sevila
  • Marenide Momben Coust Bomb
  • Barask’s Spacos Sasttin
  • Th mlug
  • Bloolllamand Royes
  • Hackey Sair
  • Nick Russonack
  • Fipple buck
  • G.W.F. Heer Lach Kemlse Male

Yes, we need not worry about human bartenders, cocktail designers, or philosophers being replaced by this particular algorithm. On the other hand, this algorithm consists of a handful of lines of code and was applied blindly by a person dumber than it. Presumably SkyNet — or the next version of Microsoft Clippy — will be significantly more sophisticated than that.


May 15, 2017

[liveblog][AI] AI and education lightning talks

Sara Watson, a BKC affiliate and a technology critic, is moderating a discussion at the Berkman Klein/Media Lab AI Advance.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Karthik Dinakar at the Media Lab points out that what we see in the night sky is in fact distorted by the way gravity bends light, which Einstein called a “gravity lens.” The same is true for AI: the distortion is often in the data itself. Karthik works on how to help researchers recognize that distortion. He gives an example of capturing both the cardiologist’s and the patient’s lenses to better diagnose women’s heart disease.

Chris Bavitz is the head of BKC’s Cyberlaw Clinic. To help law students understand AI and tech, the Clinic encourages interdisciplinarity. It also helps students think critically about the roles of the lawyer and the technologist. The Clinic prefers that those relationships form early, although thinking too hard about law early on can diminish innovation.

He points to two problems that represent two poles. First, IP and AI: running AI against protected data. Second, issues of fairness, rights, etc.

Leah Plunkett is a professor at the Univ. of New Hampshire Law School and a BKC affiliate. Her topic: how can we use AI to teach? She points out that if Tom Sawyer were real and alive today, he’d be arrested for what he does just in the first chapter. Yet we teach the book as a classic. We think we love a little mischief in our lives, but we apparently don’t like it in our kids: we kick them out of schools. E.g., of the 49M students in public schools in 2011, 3.45M were suspended and 130,000 were expelled. These punishments disproportionately affect children from marginalized segments.

Get rid of the BS safety justification: the government ought to be teaching all our children without exception. So, maybe have AI teach them?

Sara: So, what can we do?

Chris: We’re thinking about how we can educate state attorneys general, for example.

Karthik: We are so far from getting users, experts, and machine learning folks together.

Leah: Some of it comes down to buy-in and translation across vocabularies and normative frameworks. It helps to build trust to make these translations better.

[I missed the Q&A from this point on.]


[liveblog] AI Advance opening: Jonathan Zittrain and lightning talks

I’m at a day-long conference/meet-up put on by the Berkman Klein Center’s and MIT Media Lab’s “AI for the Common Good” project.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Jonathan Zittrain gives an opening talk. Since we’re meeting at Harvard Law, JZ begins by recalling the origins of what has been called “cyber law,” which has roots here. Back then, the lawyers got to the topic first, and thought that they could just think their way to policy. We are now at another signal moment as we are in a frenzy of building new tech. This time we want instead to involve more groups and think this through. [I am wildly paraphrasing.]

JZ asks: What is it that we intuitively love about human judgment, and are we willing to insist on human judgments that are worse than what a machine would come up with? Suppose for utilitarian reasons we can cede autonomy to our machines — e.g., autonomous cars — shouldn’t we? And what do we do about maintaining local norms? E.g., “You are now entering Texas where your autonomous car will not brake for pedestrians.”

“Should I insist on being misjudged by a human judge because that’s somehow artisanal?” when, ex hypothesi, an AI system might be fairer.

Autonomous systems are not entirely new. They’re bringing to the fore questions that have always been with us. E.g., we grant a sense of discrete intelligence to corporations. E.g., “McDonald’s is upset and may want to sue someone.”

[This is a particularly bad representation of JZ’s talk. Not only is it wildly incomplete, but it misses the through-line and JZ’s wit. Sorry.]

Lightning Talks

Finale Doshi-Velez is particularly interested in interpretable machine learning (ML) models. E.g., suppose you have ten different classifiers that give equally predictive results. Should you provide the most understandable one, all of them…?

Why is interpretability so “in vogue”? Suppose non-interpretable AI can do something better? In most cases we don’t know what “better” means. E.g., someone might want to control her glucose level, but perhaps also to control her weight, or other outcomes? Human physicians can still see things that are not coded into the model, and that will be the case for a long time. Also, we want systems that are fair. This means we want interpretable AI systems.
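[To make that first point concrete, here’s a toy illustration of my own, not Finale’s: two classifiers that are equally predictive can attribute their decisions to entirely different features, so accuracy alone can’t pick between them.]

```python
# My illustration (not from the talk): two equally accurate models
# can tell completely different stories about "why," which is why
# interpretability matters as a tiebreaker.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
x1 = rng.normal(size=n)                    # feature A
x2 = x1 + rng.normal(scale=0.1, size=n)    # feature B, nearly a copy of A
y = (x1 > 0).astype(int)                   # label driven by the shared signal

pred_a = x1 > 0   # a "classifier" that looks only at feature A
pred_b = x2 > 0   # a "classifier" that looks only at feature B

print("accuracy using A:", (pred_a == y).mean())  # 1.00
print("accuracy using B:", (pred_b == y).mean())  # ~0.97
# Near-identical accuracy, different explanations: a user asking
# "why was I classified this way?" gets two different answers.
```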

How do we formalize these notions of interpretability? How do we do so for science and beyond? E.g., what does a legal “right to explanation” mean? She is working with Sam Gershman on how to more formally ground AI interpretability in the cognitive science of explanation.

Vikash Mansinghka leads the eight-person Probabilistic Computing project at MIT. They want to build computing systems that can be our partners, not our replacements. We have assumed that the measure of success of AI is that it beats us at our own game, e.g., AlphaGo, Deep Blue, Watson playing Jeopardy! But games have clearly measurable winners.

His lab is working on augmented intelligence that gives partial solutions, guidelines, and hints that help us solve problems that neither humans nor machines could solve on their own. The need for these systems is most obvious in large-scale human interest projects, e.g., epidemiology, economics, etc. E.g., should a successful nutrition program in SE Asia be tested in Africa too? There are many variables (including cost). BayesDB, developed by his lab, is “augmented intelligence for public interest data science.”

In traditional computer science, computing systems are built up from circuits to algorithms. Engineers can trade off performance for interpretability. Probabilistic systems have some of the same considerations. [Sorry, I didn’t get that last point. My fault!]

John Palfrey is a former Exec. Dir. of BKC, chair of the Knight Foundation (a funder of this project) and many other things. Where can we, BKC and the Media Lab, be most effective as a research organization? First, we’ve had the most success when we merge theory and practice. And building things. And communicating. Second, we have not yet defined the research question sufficiently. “We’re close to something that clearly relates to AI, ethics and government” but we don’t yet have the well-defined research questions.

The Knight Foundation thinks this area is a big deal. AI could be a tool for the public good, but it also might not be. “We’re queasy” about it, as well as excited.

Nadya Peek is at the Media Lab and has been researching “machines that make machines.” She points to the first computer-controlled machine (“Teaching Power Tools to Run Themselves”), where the aim was precision. People controlled these CCMs: programmers, CAD/CAM folks, etc. That’s still the case, but it looks different: the old jobs are now done by far fewer people. But the spaces in between don’t always work so well. E.g., Apple can define an automatable workflow for milling components, but if you’re a student doing a one-off project, it can be very difficult to get all the integrations right. The student doesn’t much care about a repeatable workflow.

Who has access to an Apple-like infrastructure? How can we make precision-based one-offs easier to create? (She teaches a course at MIT called “How to create a machine that can create almost anything.”)

Nathan Matias, an MIT grad student with a newly-minted Ph.D. (congrats, Nathan!) and a BKC community member, is facilitating the discussion. He asks how we conceptualize the range of questions these talks have raised. What are the tools we need to create? What are the social processes behind them? How can we communicate what we want to machines and understand what they “think” they’re doing? Who can do what, and where does that raise questions of literacy, policy, and law? Finally, how can we get to the questions we need to ask, how to answer them, and how to organize people, institutions, and automated systems? Scholarly inquiry, organizing people socially and politically, creating policies, etc.? How do we get there? How can we build AI systems that are “generative” in JZ’s sense: systems that we can all contribute to on relatively equal terms and share with others?

Nathan: Vikash, what do you do when people disagree?

Vikash: When you include the sources, you can provide probabilistic responses.

Finale: When a system can’t provide a single answer, it ought to provide multiple answers. We need humans to give systems clear values. AI things are not moral, ethical things. That’s us.

Vikash: We’ve made great strides in systems that can deal with what may or may not be true, but not in terms of preference.

Nathan: An audience member wants to know what we have to do to prevent AI from repeating human bias.

Nadya: We need to include the people affected in the conversations about these systems. There are assumptions about the independence of values that just aren’t true.

Nathan: How can people not close to these systems be heard?

JP: Ethan Zuckerman, can you respond?

Ethan: One of my colleagues, Joy Buolamwini, is working on what she calls the Algorithmic Justice League, looking at computer vision algorithms that don’t work on people of color. In part this is because the datasets used to train CV systems are 70% white male faces. So she’s generating new sets of facial data that we can retest on. Overall, it’d be good to use test data that represents the real world, and to make sure a representative cross-section of humanity is working on these systems. So here’s my question: we find that co-design works well. Would bringing the affected populations in to talk with the system designers help here?

[Damn, I missed Yochai Benkler‘s comment.]

Finale: We should also enable people to interrogate AI when the results seem questionable or unfair. We need to be thinking about the processes for resolving such questions.

Nadya: It’s never “people” in general who are affected. It’s always particular people with agendas, from places and institutions, etc.


May 7, 2017

Predicting the tides based on purposefully false models

Newton showed that the tides are produced by the gravitational pull of the moon and the Sun. But, as a 1914 article in Scientific American pointed out, if you want any degree of accuracy, you have to deal with the fact that “the earth is not a perfect sphere, it isn’t covered with water to a uniform depth, it has many continents and islands and sea passages of peculiar shapes and depths, the earth does not travel about the sun in a circular path, and earth, sun and moon are not always in line. The result is that two tides are rarely the same for the same place twice running, and that tides differ from each other enormously in both times and in amplitude.”

So, we instead built a machine of brass, steel and mahogany. And instead of trying to understand each of the variables, Lord Kelvin postulated “a very respectable number” of fictitious suns and moons in various positions over the earth, moving in unrealistically perfect circular orbits, to account for the known risings and fallings of the tide, averaging readings to remove unpredictable variations caused by weather and “freshets.” Knowing the outcomes, he would nudge a sun or moon’s position, or add a new sun or moon, in order to get the results to conform to what we know to be the actual tidal measurements. If adding sea serpents would have helped, presumably Lord Kelvin would have included them as well.

The first mechanical tide-predicting machines using these heuristics were made in England. In 1881, one was created in the United States that was used by the Coast and Geodetic Survey for twenty-seven years.

Then, in 1914, it was replaced by a 15,000-piece machine that took “account of thirty-seven factors or components of a tide” (I wish I knew what that means) and predicted the tide at any hour. It also printed out the information rather than requiring a human to transcribe it from dials. “Unlike the human brain, this one cannot make a mistake.”
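My best guess at what those “components” are (a gloss of mine, not the article’s): modern tidal analysis calls them harmonic constituents. The predicted height is a mean level plus a sum of cosines, h(t) = H0 + Σ Ai cos(ωi t − φi), one term per fictitious sun or moon, each with its own speed, amplitude, and phase. A sketch, with invented amplitudes and phases (the constituent speeds are the standard published ones):

```python
import math

# Harmonic tide prediction: height(t) = mean level + sum of
# sinusoidal constituents. The speeds (degrees per hour) are the
# standard published ones; the amplitudes and phases below are
# invented for illustration -- real ones are fitted per station.
constituents = [
    # (name, speed deg/hr, amplitude m, phase deg)
    ("M2", 28.9841042, 1.20, 110.0),  # principal lunar semidiurnal
    ("S2", 30.0000000, 0.40, 140.0),  # principal solar semidiurnal
    ("N2", 28.4397295, 0.25, 95.0),   # larger lunar elliptic
    ("K1", 15.0410686, 0.15, 200.0),  # lunisolar diurnal
]

MEAN_LEVEL = 2.0  # meters above the chart datum (illustrative)

def tide_height(hours):
    """Predicted height `hours` after the reference epoch."""
    h = MEAN_LEVEL
    for _name, speed, amp, phase in constituents:
        h += amp * math.cos(math.radians(speed * hours - phase))
    return h

for t in range(0, 25, 6):
    print(f"t={t:2d}h  height={tide_height(t):5.2f} m")
```

Kelvin’s machine computed just such a sum mechanically, one pulley-and-gear assembly per constituent, with the “nudging” amounting to adjusting each term’s amplitude and phase until the sum matched the observed record.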

This new model was more accurate, with greater temporal resolution. But it got that way by giving up on predicting the actual tide, which might vary because of the weather. We simply accept the unpredictability of what we shall for the moment call “reality.” That’s how we manage in a world governed by uniform laws operating on unpredictably complex systems.

It is also a model that uses the known major causes of average tides — the gravitational effects of the sun and moon — but that feels fine about fictionalizing the model until it provides realistic results. This makes the model incapable of being interrogated about the actual causes of the tide, although we can tinker with it to correct inaccuracies. In this there is a very rough analogy — and some disanalogies — with some instances of machine learning.


April 19, 2017

Alien knowledge

Medium has published my long post about how our idea of knowledge is being rewritten, as machine learning is proving itself to be more accurate than we can be, in some situations, but achieves that accuracy by “thinking” in ways that we can’t follow.

This is from the opening section:

We are increasingly relying on machines that derive conclusions from models that they themselves have created, models that are often beyond human comprehension, models that “think” about the world differently than we do.

But this comes with a price. This infusion of alien intelligence is bringing into question the assumptions embedded in our long Western tradition. We thought knowledge was about finding the order hidden in the chaos. We thought it was about simplifying the world. It looks like we were wrong. Knowing the world may require giving up on understanding it.


October 11, 2016

[liveblog] First panel: Building intelligent applications with machine learning

I’m at the PAPIs conference. The opening panel is about building intelligent apps with machine learning. The panelists are all representing companies. It’s Q&A with the audience; I will not be able to keep up well.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

The moderator asks one of the panelists (Snejina Zacharia from Insurify) how AI can change a heavily regulated industry such as insurance. She replies that the insurance industry gets low marks for customer satisfaction, which is an opportunity. Also, they can leverage the existing platforms and build modern APIs on top of them. Also, they can explore how to use AI in existing functions, e.g., chatbots, systems that let users just confirm their identification rather than enter all the data. They also let users pick from an AI-filtered list of carriers that are right for them. Also, personalization: predicting risk and adjusting the questionnaire based on the user’s responses.

Another panelist is working on mapping for a company that is not Google and that is owned by three car companies. So, when an Audi goes over a bump and then a Mercedes goes over it, they will record the same data. On personalization: it’s ripe for change. People are talking about 100B devices being connected by 2020. People think that RFID tags didn’t live up to their early hype, but 10 billion RFID tags are going to be sold this year. These can provide highly personalized, highly relevant data. This will be the base for the next wave of apps. We need a standards-body effort, and governments addressing privacy and security. Some standards bodies are working on it, e.g., Global Standards 1, which manages the barcode standard.

Another panelist: Why is marketing such a good opportunity for AI and ML? Marketers used to have a specific skill set. It’s an art: writing, presenting, etc. Now they’re being challenged by tech and have to understand data. In fact, now they have to think like scientists: hypothesize, experiment, redo the hypothesis… And now marketers are responsible for revenue. Being a scientist responsible for predictable revenue is driving interest in AI and ML. This panelist’s company uses data about companies and people to segment leads for follow-up, etc. [Wrong place for a product pitch, IMO, which is a tad ironic, isn’t it?]

Another panelist: The question is: how can we use predictive intelligence to make our applications better? Layer input intelligence on top of input-programming-output. For this we need a platform that provides services and is easy to attach to existing processes.

Q: Should we develop cutting edge tech or use what Google, IBM, etc. offer?

A: It depends on whether you’re an early adopter or straggler. Regulated industries have to wait for more mature tech. But if your bread and butter is based on providing the latest and greatest, then you should use the latest tech.

A: It also depends on whether you’re doing a vertically integrated solution or something broader.

Q: What makes an app “smart”? Is it: Dynamic, with rapidly changing data?

A: Marketers use personas, e.g., a handful of types. They used to be written in stone, just about. Smart apps update the personas after every campaign, every time you get new info about what’s going on in the market, etc.

Q: In B-to-C marketing, many companies have built the AI piece for advertising. Are you seeing any standardization or platforms on top of the advertising channels to manage the ads going out on them?

A: Yes, some companies focus on omni-channel marketing.

A: Companies are becoming service companies, not product companies. They no longer hand off to retailers.

A: It’s generally harder to automate non-digital channels. It’s harder to put a revenue number on, say, TV ads.
