Joho the Blogai Archives - Joho the Blog

May 6, 2018

[liveblog][ai] Primavera De Filippi: An autonomous flower that merges AI and Blockchain

Primavera De Filippi is an expert in blockchain-based tech. She is giving a ThursdAI talk on Plantoid, an event held by Harvard’s Berkman Klein Center for Internet & Society and the MIT Media Lab. Her talk is officially on operational autonomy vs. decisional autonomy, but it’s really about how weird things become when you build a computerized flower that merges AI and the blockchain. For me, a central question of her talk was: Can we have autonomous robots that have legal rights and can own and spend assets, without having to resort to conferring personhood on them the way we have with corporations?

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Autonomy and liability

She begins by pointing to the 3 industrial revolutions so far: Steam led to mechanized production ; Electricity led to mass production; Electronics led to automated production. The fourth — AI — is automating knowledge production.

People are increasingly moving into the digital world, and digital systems are moving back into the physical worlds, creating cyber-physical systems. E.g., the Internet of Things senses, communicates, and acts. The Internet of Smart Things learns from the data the things collect, makes inferences, and then acts. The Internet of Autonomous Things creates new legal challenges. Various actors can be held liable: manufacturer, software developer, user, and a third party. “When do we apply legal personhood to non-humans?”

With autonomous things, the user and third parties become less liable as the software developer takes on more of the liability: There can be a bug. Someone can hack into it. The rules that make inferences are inaccurate. Or a bad moral choice has led the car into an accident.

The sw developer might have created bug-free sw but its interaction with other devices might lead to unpredictability; multiple systems operating according to different rules might be incompatible; it can be hard to identify the chain of causality. So, who will be liable? The manufacturers and owners are likely to have only limited liability.

So, maybe we’ll need generalized insurance: mandatory insurance that potentially harmful devices need to subscribe to.

Or, perhaps we will provide some form of legal personhood to machines so the manufacturers can be sued for their failings. Suing a robot would be like suing a corporation. The devices would be able to own property and assets. The EU is thinking about creating this type of agenthood for AI systems. This is obviously controversial. At least a corporation has people associated with it, while the device is just a device, Primavera points out.

So, when do we apply legal personhood to non-humans? In addition to people and corporations, some countries have assigned personhood to chimpanzees (Argentina, France) and to natural resources (NZ: Whanganui river). We do this so these entities will have rights and cannot be simply exploited.

If we give legal personhood to AI-based systems, can AI have property rights over their assets and IP? If they are legally liable, they can be held responsible for their actions, and can be sued for compensation? “Maybe they should have contractual rights so they can enter into contracts. Can they be rewarded for their work? Taxed?”Maybe they should have contractual rights so they can enter into contracts. Can they be rewarded for their work? Taxed? [All of these are going to turn out to be real questions. … Wait for it …]

Limitations: “Most of the AI-based systems deployed today are more akin to slaves than corporations.” They’re not autonomous the way people are. They are owned, controlled and maintained by people or corporations. They act as agents for their operators. They have no technical means to own or transfer assets. (Primavera recommends watching the Star Trek: The Next Generation episode “The Measure of the Man” that asks, among other things, whether Data (the android) can be dismantled and whether he can resign.)

Decisional autonomy is the capacity to make a decision on your own, but it doesn’t necessarily bring what we think of as real autonomy. E.g., an AV can decide its route. For real autonomy we need operational autonomy: no one is maintaining the thing’s operation at a technical level. To take a non-random example, a blockchain runs autonomously because there is no single operator controlling. E.g., smart contracts come with a guarantee of execution. Once a contract is registered with a blockchain, no operator can stop it. This is operational autonomy.

Blockchain meets AI. Object: Autonomy

We are getting first example of autonomous devices using blockchain. The most famous is the Samsung washing machine that can detect when the soap is empty, and makes a smart contract to order more. Autonomous cars could work with the same model; they could not be owned by anyone and collect money when someone uses them. These could be initially purchased by someone and then buy themselves off: “They’d have to be emancipated,” she says. Perhaps they and other robots can use the capital they accumulate to hire people to work for them. [Pretty interesting model for an Uber.]

She introduces Plantoid, a blockchain-based life form. “Plantoid is autonomous, self-sufficient, and can reproduce.”It’s autonomous, self-sufficient, and can reproduce. Real flowers use bees to reproduce. Plantoids use humans to collect capital for their reproduction. Their bodies are mechanical. Their spirit is an Ethereum smart contract. It collects cryptocurrency. When you feed it currency it says thank you; the Plantoid Primavera has brought, nods its flower. When it gets enough funds to reproduce itself, it triggers a smart contract that activates a call for bids to create the next version of the Plantoid. In the “mating phase” it looks for a human to create the new version. People vote with micro-donations. Then it identifies a winner and hires that human to create the new one.

There are many Plantoids in the world. Each has its own “DNA”. New artists can add to it. E.g., each artist has to decide on its governance, such as whether it will donate some funds to charity. The aim is to make it more attractive to be contributed to. The most fit get the most money and reproduces themselves. BurningMan this summer is going to feature this.

Every time one reproduces, a small cut is given to the pattern that generated it, and some to the new designer. This flips copyright on its head: the artist has an incentive to make her design more visible and accessible and attractive.

So, why provide legal personhood to autonomous devices? We want them to be able to own their own assets, to assume contractual rights, and legal capacity so they can sue and be sued, and limit their liability. “ Blockchain lets us do that without having to declare the robot to be a legal person.” Blockchain lets us do that without having to declare the robot to be a legal person.

The plant effectively owns the cryptofunds. The law cannot affect this. Smart contracts are enforced by code

Who are the parties to the contract? The original author and new artist? The master agreement? Who can sue who in case of a breach? We don’t know how to answer these questions yet.

Can a plantoid sure for breach of contract? Not if the legal system doesn’t recognize them as legal persons. So who is liable if the plant hurts someone? Can we provide a mechanism for this without conferring personhood? “How do you enforce the law against autonomous agents that cannot be stopped and whose property cannot be seized?”


Could you do this with live plants? People would bioengineer them…

A: Yes. Plantoid has already been forked this way. There’s an idea for a forest offering trees to be cut down, with the compensation going to the forest which might eventually buy more land to expand itself.

My interest in this grew out of my interest in decentralized organizations. This enables a project to be an entity that assumes liability for its actions, and to reproduce itself.

Q: [me] Do you own this plantoid?

A: Hmm. I own the physical instantiation but not the code or the smart contract. If this one broke, I could make a new one that connects to the same smart contract. If someone gets hurt because it falls on the, I’m probably liable. If the smart contract is funding terrorism, I’m not the owner of that contract. The physical object is doing nothing but reacting to donations.

Q: But the aim of its reactions is to attract more money…

A: It will be up to the judge.

Q: What are the most likely senarios for the development of these weird objects?

A: A blockchain can provide the interface for humans interacting with each other without needing a legal entity, such as Uber, to centralize control. But you need people to decide to do this. The question is how these entities change the structure of the organization.

Be the first to comment »

April 27, 2018

[liveblog][ai] Ben Green: The Limits of "Fair" Algorithms

Ben Green is giving a ThursdAI talk on “The Limits, Perils, and Challenges of ‘Fair’ Algorithms for Criminal Justice Reform.”

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

In 2016, the COMPAS algorithm
became a household name (in some households) when ProPublica showed that it predicted that black men were twice as likely as white men to jump bail. People justifiably got worried that algorithms can be highly biased. At the same time, we think that algorithms may be smarter than humans, Ben says. These have been the poles of the discussion. Optimists think that we can limit the bias to take advantage of the added smartness.

There have been movements to go toward risk assessments for bail, rather than using money bail. E.g., Rand Paul and Kamala Harris have introduced the Pretrial Integrity and Safety Act of 2017. There have also been movements to use scores only to reduce risk assessments, not to increase them.

But are we asking the right questions? Yes, the criminal justice system would be better if judges could make more accurate and unbiased predictions, but it’s not clear that machine learning can do this. So, two questions: 1. Is ML an appropriate tool for this. 2. Is implementing MK algorithms an effective strategy for criminal justice reform?

#1 Is ML and appropriate tool to help judges make more accurate and unbiased predictions?

ML relies on data about the world. This can produce tunnel vision by causing us to focus on particular variables that we have quantified, and ignore others. E.g., when it comes to sentencing, a judge balances deterrence, rehabilitation, retribution, and incapacitating a criminal. COMPAS predicts recidivism, but none of the other factors. This emphasizes incapacitation as the goal of sentencing. This might be good or bad, but the ML has shifted the balance of factors, framing the decision without policy review or public discussion.

Q: Is this for sentencing or bail? Because incapacitation is a more important goal in sentencing than in bail.

A: This is about sentencing. I’ll be referring to both.

Data is always about the past, Ben continues. ML finds statistical correlations among inputs and outputs. It applies those correlations to the new inputs. This assumes that those correlations will hold in the future; it assumes that the future will look like the past. But if we’re trying reform the judicial system, we don’t want the future to look like the past. ML can thus entrench historical discrimination.

Arguments about the fairness of COMPAS are often based on competing mathematical definitions of fairness. But we could also think about the scope of what we couint as fair. ML tries to make a very specific decision: among a population, who recidivates? If you take a step back and consider the broader context of the data and the people, you would recognize that blacks recidivate at a higher rate than whites because of policing practices, economic factors, racism, etc. Without these considerations, you’re throwing away the context and accepting the current correlations as the ground truth. Even if we were to change the base data, the algorithm wouldn’t make the change, unless you retrain it.

Q: Who retrains the data?

A: It depends on the contract the court system has.

Algorithms are not themselves a natural outcome of the world. Subjective decisions go into making them: which data to input, choosing what to predict, etc. The algorithms are brought into court as if they were facts. Their subjectivity is out of the frame. A human expert would be subject to cross examination. We should be thinking of algorithms that way. Cross examination might include asking how accurate the system is for the particular group the defendant is in, etc.

Q: These tools are used in setting bail or a sentence, i.e., before or after a trial. There may not be a venue for cross examination.

In the Loomis case, an expert witness testified that the algorithm was misused. That’s not exactly what I’m suggesting; they couldn’t get to all of it because of the trade secrecy of the algorithms.

Back to the framing question. If you can make the individual decision points fair we sometimes think we’ve made the system fair. But technocratic solutions tend to sanitize rather than alter. You’re conceding the overall framework of the system, overlooking more meaningful changes. E.g., in NY, 71% of voters support ending pre-trial jail for misdemeanors and non-violent felonies. Maybe we should consider that. Or, consider that cutting food stamps has been shown to increases recidivism. Or perhaps we should be reconsidering the wisdom of preventative detention, which was only introduced in the 1980s. Focusing on the tech de-focuses on these sorts of reforms.

Also, technocratic reforms are subject to political capture. E.g., NJ replaced money bail with a risk assessment tool. After some of the people released committed crimes, they changed the tool so that certain crimes were removed from bail. What is an acceptable risk level? How to set the number? Once it’s set, how is it changed?

Q: [me] So, is your idea that these ML tools drive out meaningful change, so we ought not to use them?

A: Roughly, yes.

[Much interesting discussion which I have not captured. E.g., Algorithms can take away the political impetus to restore bail as simply a method to prevent flight. But sentencing software is different, and better algorithms might help, especially if the algorithms are recommending sentences but not imposing them. And much more.]

2. Do algorithms actually help?

How do judges use algorithms to make a decision? Even if the algorithm were perfect, would it improve the decisions judges make? We don’t have much of an empirical answer.

Ben was talking to Jeremy Heffner at Hunch Lab. They make predictive policing software and are well aware of the problem of bias. (“If theres any bias in the system it’s because of the crime data. That’s what we’re trying to address.” — Heffner) But all of the suggestions they give to police officers are called “missions,” which is in the military/jeopardy frame.

People are bad at incorporating quantitative data into decisions. And they filter info through their biases. E.g., the “ban the box” campaign to remove the tick box about criminal backgrounds on job applications actually increased racial discrimination because employers assumed the white applicants were less likely to have arrest records. (Agan and Starr 2016) Also, people have been shown to interpret police camera footage according to their own prior opinions about the police. (sommers 2016)

Evidence from Kentucky (Stevenson 2018): after mandatory risk assessments for bail only made a small increase in pretrial release, and these changes eroded over time as judges returned to their previous habits.

So, we need to be asking the empirical question of how judges actual use these decisions. And should judges incorporate these predictions into their decisions?

Ben’s been looking at the first question:L how do judges use algorithmic predictions? He’s running experiments on Mechanical Turk showing people profiles of defendants — a couple of sentences about the crime, race, previous record arrest record. The Turkers have to give a prediction of recidivism. Ben knows which ones actually recidivated. Some are also given a recommendation based on an algorithmic assessment. That risk score might be the actual one, random, or biased; the Turkers don’t know that about the score.

Q: It might be different if you gave this test to judges.

A: Yes, that’s a limitation.

Q: You ought to give some a percentage of something unrelated, e.g., it will rain, just to see if the number is anchoring people.

A: Good idea

Q: [me] Suppose you find that the Turkers’ assessment of risk is more racially biased than the algorithm…

A: Could be.

[More discussion until we ran out of time. Very interesting.]


April 2, 2018

"If a lion could talk" updated

“If a lion could talk, we could not understand him.”
— Ludwig Wittgenstein, Philosophical Investigations, 1953.

“If an algorithm could talk, we could not understand it.”
— Deep learning, Now.

1 Comment »

February 11, 2018

The story of lead and crime, told in tweets

Patrick Sharkey [twitter: patrick_sharkey] uses a Twitter thread to evaluate the evidence about a possible relationship between exposure to lead and crime. The thread is a bit hard to get unspooled correctly, but it’s worth it as an example of:

1. Thinking carefully about complex evidence and data.

2. How Twitter affects the reasoning and its expression.

3. The complexity of data, which will only get worse (= better) as machine learning can scale up their size and complexity.

Note: I lack the skills and knowledge to evaluate Patrick’s reasoning. And, hat tip to David Lazer for the retweet of the thread.

Comments Off on The story of lead and crime, told in tweets

The brain is not a computer and the world is not information

Robert Epstein argues in Aeon against the dominant assumption that the brain is a computer, that it processes information, stores and retrieves memories, etc. That we assume so comes from what I think of as the informationalizing of everything.

The strongest part of his argument is that computers operate on symbolic information, but brains do not. There is no evidence (that I know of, but I’m no expert. On anything) that the brain decomposes visual images into pixels and those pixels into on-offs in a code that represents colors.

In the second half, Epstein tries to prove that the brain isn’t a computer through some simple experiments, such as drawing a dollar bill from memory and while looking at it. Someone committed to the idea that the brain is a computer would probably just conclude that the brain just isn’t a very good computer. But judge for yourself. There’s more to it than I’m presenting here.

Back to Epstein’s first point…

It is of the essence of information that it is independent of its medium: you can encode it into voltage levels of transistors, magnetized dust on tape, or holes in punch cards, and it’s the same information. Therefore, a representation of a brain’s states in another medium should also be conscious. Epstein doesn’t make the following argument, but I will (and I believe I am cribbing it from someone else but I don’t remember who).

Because information is independent of its medium, we could encode it in dust particles swirling clockwise or counter-clockwise; clockwise is an on, and counter is an off. In fact, imagine there’s a dust cloud somewhere in the universe that has 86 billion motes, the number of neurons in the human brain. Imagine the direction of those motes exactly matches the on-offs of your neurons when you first spied the love of your life across the room. Imagine those spins shift but happen to match how your neural states shifted over the next ten seconds of your life. That dust cloud is thus perfectly representing the informational state of your brain as you fell in love. It is therefore experiencing your feelings and thinking your thoughts.

That by itself is absurd. But perhaps you say it is just hard to imagine. Ok, then let’s change it. Same dust cloud. Same spins. But this time we say that clockwise is an off, and the other is an on. Now that dust cloud no longer represents your brain states. It therefore is both experiencing your thoughts and feeling and is not experiencing them at the same time. Aristotle would tell us that that is logically impossible: a thing cannot simultaneously be something and its opposite.


Toward the end of the article, Epstein gets to a crucial point that I was very glad to see him bring up: Thinking is not a brain activity, but the activity of a body engaged in the world. (He cites Anthony Chemero’s Radical Embodied Cognitive Science (2009) which I have not read. I’d trace it back further to Andy Clark, David Chalmers, Eleanor Rosch, Heidegger…). Reducing it to a brain function, and further stripping the brain of its materiality to focus on its “processing” of “information” is reductive without being clarifying.

I came into this debate many years ago already made skeptical of the most recent claims about the causes of consciousness by having some awareness of the series of failed metaphors we have used over the past couple of thousands of years. Epstein puts this well, citing another book I have not read (and another book I’ve consequently just ordered):

In his book In Our Own Image (2015), the artificial intelligence expert George Zarkadakis describes six different metaphors people have employed over the past 2,000 years to try to explain human intelligence.

In the earliest one, eventually preserved in the Bible, humans were formed from clay or dirt, which an intelligent god then infused with its spirit. That spirit ‘explained’ our intelligence – grammatically, at least.

The invention of hydraulic engineering in the 3rd century BCE led to the popularity of a hydraulic model of human intelligence, the idea that the flow of different fluids in the body – the ‘humours’ – accounted for both our physical and mental functioning. The hydraulic metaphor persisted for more than 1,600 years, handicapping medical practice all the while.

By the 1500s, automata powered by springs and gears had been devised, eventually inspiring leading thinkers such as René Descartes to assert that humans are complex machines. In the 1600s, the British philosopher Thomas Hobbes suggested that thinking arose from small mechanical motions in the brain. By the 1700s, discoveries about electricity and chemistry led to new theories of human intelligence – again, largely metaphorical in nature. In the mid-1800s, inspired by recent advances in communications, the German physicist Hermann von Helmholtz compared the brain to a telegraph.

Maybe this time our tech-based metaphor has happened to get it right. But history says we should assume not. We should be very alert to the disanologies, which Epstein helps us with.

Getting this right, or at least not getting it wrong, matters. The most pressing problem with the informationalizing of thought is not that it applies a metaphor, or even that the metaphor is inapt. Rather it’s that this metaphor leads us to a seriously diminished understanding of what it means to be a living, caring creature.

I think.


Hat tip to @JenniferSertl for pointing out the Aeon article.

Comments Off on The brain is not a computer and the world is not information

December 17, 2017

[liveblog] Mariia Gavriushenko on personalized learning environments

I’m at the STEAM ed Finland conference in Jyväskylä where Mariia Gavriushenko is talking about personalized learning environments.

Web-based learning systems are being more and more widely used in large part because they can be used any time, anywhere. She points to two types: Learning management systems and game-based systems. But they lack personalization that makes them suitable for particular learners in terms of learning speed, knowledge background, preferences in learning and career, goals for future life, and their differing habits. Personalized systems can provide assistance in learning and adapt their learning path. Web-based learning shouldn’t just be more convenient. It should also be better adapted to personal needs.

But this is hard. But if you can do it, it can monitor the learner’s knowledge level and automatically present the right materials. In can help teachers create suitable material and find the most relevant content and convert it into comprehensive info. It can also help students identify the best courses and programs.

She talks about two types of personalized learning systems: 1. systems that allow the user to change the system or 2. the sysytem changes itself to meet the users needs. The systems can be based on rules and context or can be algorithm driven.

Five main features of adaptive learning systems:

  • Pre-test

  • Pacing and control

  • Feedback and assessment

  • Progress tracking and reports

  • Motivation and reward

The ontological presentation of every learner keeps something like a profile for each user, enabling semantic reasoning.

She gives an example of this model: automated academic advising. It’s based on learning analytics. It’s an intelligent learning support system based on semantically-enhanced decision support, that identifies gaps, and recommends materials and courses. It can create a personal study plan. The ontology helps the system understand which topics are connected to others so that it can identify knowledge gaps.

An adaptive vocabulary learning environment provides cildren with an adaptive way to train their vocabulary, taking into account the individuality of the learner. It assumes the more similar the words, the harder they are to recognize.

Mariia believes we will make increasing use of adaptive educational tech.

Comments Off on [liveblog] Mariia Gavriushenko on personalized learning environments

December 16, 2017

[liveblog] Mirka Saarela and Sanna Juutinen on analyzing education data

I’m at the STEAM ed Finland conference in Jyväskylä. Mirka Saarela and Sanna Juutinen are talking about their analysis of education data.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

There’s a triennial worldwide study by the OECD to assess students. Usually, people are only interested in its ranking of education by country. Finland does extremely well at this. This is surprising because Finland does not do particularly well in the factors that are taken to produce high quality educational systems. So Finnish ed has been studied extensively. PISA augments this analysis using learning analytics. (The US does at best average in the OECD ranking.)

Traditional research usually starts with the literature, develops a hypothesis, collects the data, and checks the result. PISA’s data mining approach starts with the data. “We want to find a needle in the haystack, but we don’t know what the needle looks like.” That is, they don’t know what type of pattern to look for.

Results of 2012 PISA: If you cluster all 24M students with their characteristics and attitudes without regard to their country you get clusters for Asia, developing world, Islamic, western countries. So, that maps well.

For Finland, the most salient factor seems to be its comprehensive school system that promotes equality and equity.

In 2015 for the first time there was a computerized test environment available. Most students used it. The logfile recorded how long students spent on a task and the number of activities (mouse clicks, etc.) as well as the score. They examined the Finnish log file to find student profiles, related to student’s strategies and knowledge. Their analysis found five different clusters. [I can’t read the slide from here. Sorry.] They are still studying what this tells us. (They purposefully have not yet factored in gender.)

Nov. 2017 results showed that girls did far better than boys. The test was done in a chat environment which might have been more familiar for the girls? Is the computerization of the tests affecting the results? Is the computerization of education affecting the results? More research is needed.


Q: Does the clustering suggest interventions? E.g., “Slow down. Less clicking.”

A: [I couldn’t quite hear the answer, but I think the answer is that it needs more analysis. I think.]

Q: I work for ETS. Are the slides available?

A: Yes, but the research isn’t public yet.

Comments Off on [liveblog] Mirka Saarela and Sanna Juutinen on analyzing education data

[liveblog] Harri Ketamo on micro-learning

I’m at the STEAM ed Finland conference in Jyväskylä. Harri Ketamo is giving a talk on “micro-learning.” He recently won a prestigious prize for the best new ideas in Finland. He is interested in the use of AI for learning.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

We don’t have enough good teachers globally, so we have to think about ed in new ways, Harri says. Can we use AI to bring good ed to everyone without hiring 200M new teachers globally? If we paid teachers equivalent to doctors and lawyers, we could hire those 200M. But we apparently not willing to do that.

One challenge: Career coaching. What do you want to study? Why? What are the skills you need? What do you need to know?

His company does natural language analysis — not word matches, but meaning. As an example he shows a shareholder agreement. Such agreements always have the same elements. After being trained on law, his company’s AI can create a map of the topic and analyze a block of text to see if it covers the legal requirements…the sort of work that a legal assistant does. For some standard agreements, we may soon not need lawyers, he predicts.

The system’s language model is a mess of words and relations. But if you zoom out from the map, the AI has clustered the concepts. At the Slush Sanghai conference, his AI could develop a list of the companies a customer might want to meet based on a text analysis of the companies’ web sites, etc. Likewise if your business is looking for help with a project.

Finland has a lot of public data about skills and openings. Universities’ curricula are publicly available.[Yay!] Unlike LinkedIn, all this data is public. Harri shows a map that displays the skills and competencies Finnish businesses want and the matching training offered by Finnish universities. The system can explore public information about a user and map that to available jobs and the training that is required and available for it. The available jobs are listed with relevancy expressed as a percentage. It can also look internationally to find matches.

The AI can also put together a course for a topic that a user needs. It can tell what the core concepts are by mining publications, courses, news, etc. The result is an interaction with a bot that talks with you in a Whatsapp like way. (See his paper “Agents and Analytics: A framework for educational data mining with games based learning”). It generates tests that show what a student needs to study if she gets a question wrong.

His newest project, in process: Libraries are the biggest collections of creative, educational material, so the AI ought to point people there. His software can find the common sources among courses and areas of study. It can discover the skills and competencies that materials can teach. This lets it cluster materials around degree programs. It can also generate micro-educational programs, curating a collection of readings.

His platform has an open an API. See Headai.


Q: Have you done controlled experiments?

A: Yes. We’ve found that people get 20-40% better performance when our software is used in blended model, i.e., with a human teacher. It helps motivate people if they can see the areas they need to work on disappear over time.

Q: The sw only found male authors in the example you put up of automatically collated materials.

A: Small training set. Gender is not part of the metadata in Finland.

A: Don’t you worry that your system will exacerbate bias?

Q: Humans are biased. AI is a black box. We need to think about how to manage this

Q: [me] Are the topics generated from the content? Or do you start off with an ontology?

A: It creates its ontology out of the data.

Q: [me] Are you committing to make sure that the results of your AI do not reflect the built in biases?

A: Our news system on the Web presents a range of views. We need to think about how to do this for gender issues with the course software.

Comments Off on [liveblog] Harri Ketamo on micro-learning

December 5, 2017

[liveblog] Conclusion of Workshop on Trustworthy Algorithmic Decision-Making

I’ve been at a two-day workshop sponsored by the Michigan State Uiversity and the National Science Foundation: “Workshop on Trustworthy Algorithmic Decision-Making.” After multiple rounds of rotating through workgroups iterating on five different questions, each group presented its findings — questions, insights, areas of future research.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Seriously, I cannot capture all of this.

Conduct of Data Science

What are the problems?

  • Who defines and how do we ensure good practice in data science and machine learning?

Why is the topic important? Because algorithms are important. And they have important real-world effects on people’s lives.

Why is the problem difficult?

  • Wrong incentives.

  • It can be difficult to generalize practices.

  • Best practices may be good for one goal but not another, e.g., efficiency but not social good. Also: Lack of shared concepts and vocabulary.

How to mitigate the problems?

  • Change incentives

  • Increase communication via vocabularies, translations

  • Education through MOOCS, meetups, professional organizations

  • Enable and encourage resource sharing: an open source lesson about bias, code sharing, data set sharing

Accountability group

The problem: How to integratively assess the impact of an algorithmic system on the public good? “Integrative” = the impact may be positive and negative and affect systems in complex ways. The impacts may be distributed differently across a population, so you have to think about disparities. These impacts may well change over time

We aim to encourage work that is:

  • Aspirationally casual: measuring outcomes causally but not always through randomized control trials.

  • The goal is not to shut down algorithms to to make positive contributions that generat solutions.

This is a difficult problem because:

  • Lack of variation in accountability, enforcements, and interventions.

  • It’s unclear what outcomes should be measure and how. This is context-dependent

  • It’s unclear which interventions are the highest priority

Why progress is possible: There’s a lot of good activity in this space. And it’s early in the topic so there’s an ability to significantly influence the field.

What are the barriers for success?

  • Incomplete understanding of contexts. So, think it in terms of socio-cultural approaches, and make it interdisciplinary.

  • The topic lies between disciplines. So, develop a common language.

  • High-level triangulation is difficult. Examine the issues at multiple scales, multiple levels of abstraction. Where you assess accountability may vary depending on what level/aspect you’re looking at.

Handling Uncertainty

The problem: How might we holistically treat and attribute uncertainty through data analysis and decisions systems. Uncertainty exists everywhere in these systems, so we need to consider how it moves through a system. This runs from choosing data sources to presenting results to decision-makers and people impacted by these results, and beyond that its incorporation into risk analysis and contingency planning. It’s always good to know where the uncertainty is coming from so you can address it.

Why difficult:

  • Uncertainty arises from many places

  • Recognizing and addressing uncertainties is a cyclical process

  • End users are bad at evaluating uncertain info and incorporating uncertainty in their thinking.

  • Many existing solutions are too computationally expensive to run on large data sets

Progress is possible:

  • We have sampling-based solutions that provide a framework.

  • Some app communities are recognizing that ignoring uncertainty is reducing the quality of their work

How to evaluate and recognize success?

  • A/B testing can show that decision making is better after incorporating uncertainty into analysis

  • Statistical/mathematical analysis

Barriers to success

  • Cognition: Train users.

  • It may be difficult to break this problem into small pieces and solve them individually

  • Gaps in theory: many of the problems cannot currently be solved algorithmically.

The presentation ends with a note: “In some cases, uncertainty is a useful tool.” E.g., it can make the system harder to game.

Adversaries, workarounds, and feedback loops

Adversarial examples: add a perturbation to a sample and it disrupts the classification. An adversary tries to find those perturbations to wreck your model. Sometimes this is used not to hack the system so much as to prevent the system from, for example, recognizing your face during a protest.

Feedback loops: A recidivism prediction system says you’re likely to commit further crimes, which sends you to prison, which increases the likelihood that you’ll commit further crimes.

What is the problem: How should a trustworthy algorithm account for adversaries, workarounds, and feedback loops?

Who are the stakeholders?

System designers, users, non-users, and perhaps adversaries.

Why is this a difficult problem?

  • It’s hard to define the boundaries of the system

  • From whose vantage point do we define adversarial behavior, workarounds, and feedback loops.

Unsolved problems

  • How do we reason about the incentives users and non-users have when interacting with systems in unintended ways.

  • How do we think about oversight and revision in algorithms with respect to feedback mechanisms

  • How do we monitor changes, assess anomalies, and implement safeguards?

  • How do we account for stakeholders while preserving rights?

How to recognize progress?

  • Mathematical model of how people use the system

  • Define goals

  • Find stable metrics and monitor them closely

  • Proximal metrics. Causality?

  • Establish methodologies and see them used

  • See a taxonomy of adversarial behavior used in practice

Likely approaches

  • Security methodology to anticipating and unintended behaviors and adversarial interactions’. Monitor and measure

  • Record and taxonomize adversarial behavior in different domains

  • Test . Try to break things.


  • Hard to anticipate unanticipated behavior

  • Hard to define the problem in particular cases.

  • Goodhardt’s Law

  • Systems are born brittle

  • What constitutes adversarial behavior vs. a workaround is subjective.

  • Dynamic problem

Algorithms and trust

How do you define and operationalize trust.

The problem: What are the processes through which different stakeholders come to trust an algorithm?

Multiple processes lead to trust.

  • Procedural vs. substantive trust: are you looking at the weights of the algorithms (e.g.), or what were the steps to get you there?

  • Social vs personal: did you see the algorithm at work, or are you relying on peers?

These pathways are not necessarily predictive of each other.

Stakeholders build truth through multiple lenses and priorities

  • the builders of the algorithms

  • the people who are affected

  • those who oversee the outcomes

Mini case study: a child services agency that does not want to be identified. [All of the following is 100% subject to my injection of errors.]

  • The agency uses a predictive algorithm. The stakeholders range from the children needing a family, to NYers as a whole. The agency knew what into the model. “We didn’t buy our algorithm from a black-box vendor.” They trusted the algorithm because they staffed a technical team who had credentials and had experience with ethics…and who they trusted intuitively as good people. Few of these are the quantitative metrics that devs spend their time on. Note that FAT (fairness, accountability, transparency) metrics were not what led to trust.


  • Processes that build trust happen over time.

  • Trust can change or maybe be repaired over time. “

  • The timescales to build social trust are outside the scope of traditional experiments,” although you can perhaps find natural experiments.


  • Assumption of reducibility or transfer from subcomponents

  • Access to internal stakeholders for interviews and process understanding

  • Some elements are very long term



What’s next for this workshop

We generated a lot of scribbles, post-it notes, flip charts, Slack conversations, slide decks, etc. They’re going to put together a whitepaper that goes through the major issues, organizing them, and tries to capture the complexity while helping to make sense of it.

There are weak or no incentives to set appropriate levels of trust

Key takeways:

  • Trust is irreducible to FAT metrics alone

  • Trust is built over time and should be defined in terms of the temporal process

  • Isolating the algorithm as an instantiation misses the socio-technical factors in trust.

Comments Off on [liveblog] Conclusion of Workshop on Trustworthy Algorithmic Decision-Making

December 4, 2017

Workshop: Trustworthy Algorithmic Decision-Making

I’m at a two-day inter-disciplinary workshop on “Trustworthy Algorithmic Decision-Making” put on by the National Science Foundation and Michigan State University. The 2-page whitepapers
from the participants are online. (Here’s mine.) I may do some live-blogging of the workshops.


– Key problems and critical qustionos?

– What to tell pol;icy-makers and others about the impact of these systems?

– Product approaches?

– What ideas, people, training, infrastructure are needed for these approaches?

Excellent diversity of backgrounds: CS, policy, law, library science, a philosopher, more. Good diversity in gender and race. As the least qualified person here, I’m greatly looking forward to the conversations.

Comments Off on Workshop: Trustworthy Algorithmic Decision-Making

Next Page »