Joho the Blog » machine learning

September 18, 2023

Candidate simulator

http://chat2024.com is a candidate simulator that lets you chat with the candidates it simulates to get their positions, delivered in a good imitation of their style of speech. In a quick session this morning it seemed ok at that. It even responded appropriately when I challenged “Biden” about shipping cluster munitions to Ukraine.

It did an appropriate job when I chatted with “Tr*mp” about his economic record, faithfully repeating his lies in a very Tr*mpian cadence.

And when I asked TFG about how often he attends church, it bobbed and weaved appropriately, saying that how often he goes doesn’t matter. What matters is how he upholds Christian values, including redemption and forgiveness. When I asked “him” how his “I am your retribution” promise squares with that, “he” explained it as standing up for the American people in a battle against the bureaucrats, etc. Fine.

But when I took one step further down the conversational path and asked “him” where the Bible talks about redemption and forgiveness, “he” quoted Ephesians and The First Epistle of John. That is not in the least plausible for President Two Corinthians.

So, yes, Chat2024 is a useful tool for getting quick responses to straightforward questions about candidates’ positions, expressed in their way of speaking.

But, if you use it for what chat AI is designed for — chatting — it is capable of quickly falling into misleading answers, attributing to candidates not what they say or would say, but what their Large Language Model knows independent of the candidates.

That makes Chat2024 dangerous.


Categories: ai, politics Tagged with: ai • chatai • culture • machine learning • politics Date: September 18th, 2023 dw


December 4, 2022

Computers inside computers inside computers…

First there was the person who built a computer inside of Minecraft and programmed it to play Minecraft.

Now Frederic Besse has built a usable Linux terminal in ChatGPT: usable in that it can perform systems operations on a virtual computer that’s also been invoked in (by? with?) ChatGPT. For example, you can tell the terminal to create a file and where to store it in a file system that did not exist until you asked, and under most definitions of “exist” doesn’t exist anywhere.

I feel like I need to get a bigger mind in order for it to be sufficiently blown.

(PS: I could do without the casual anthropomorphizing in the GPT article.)


Categories: ai, machine learning, philosophy Tagged with: ai • gpt • language models • machine learning • philosophy Date: December 4th, 2022 dw


January 31, 2022

Meaning at the joints

Notes for a post:

Plato said (Phaedrus, 265e) that we should “carve nature at its joints,” which assumes of course that nature has joints, i.e., that it comes divided in natural and (for the Greeks) rational ways. (“Rational” here means something like in ways that we can discover, and that divide up the things neatly, without overlap.)

For Aristotle, at least in the natural world those joints consist of the categories that make a thing what it is, and that make things knowable as those things.

To know a thing was to see how it’s different from other things, particularly (as per Aristotle) from other things that it shares important similarities with: humans are the rational animals because we share essential properties with other animals, but are different from them in our rationality.

The overall order of the universe was knowable and formed a hierarchy (e.g. beings -> animals -> vertebrates -> upright -> rational) that makes the differences essential. It’s also quite efficient since anything clustered under a concept, no matter how many levels down, inherits the properties of the higher level concepts.
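The efficiency of that kind of hierarchy is easy to see in a toy model. Here is a minimal sketch, assuming a strict tree in which each concept has exactly one parent; the `Concept` class and the particular properties are my own illustrative inventions, not Aristotle’s list.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A node in a strict hierarchy. Names and properties are illustrative only."""
    name: str
    properties: set = field(default_factory=set)
    parent: Optional["Concept"] = None

    def all_properties(self) -> set:
        # Walk up the chain, collecting what each higher-level concept contributes.
        inherited = self.parent.all_properties() if self.parent else set()
        return inherited | self.properties


# A shortened version of the chain: beings -> animals -> humans.
being = Concept("being", {"exists"})
animal = Concept("animal", {"perceives", "moves"}, parent=being)
human = Concept("human", {"rational"}, parent=animal)

print(sorted(human.all_properties()))
```

Each concept stores only what differentiates it from its parent; everything else comes for free from above, which is the economy the hierarchy promises.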

We no longer believe that there is a perfect, economical order of things. We want to be able to categorize under many categories, to draw as many similarities and differences as we need for our current project. We see this in our general preference for search over browsing through hierarchies, the continued use of tags as a way of cutting across categories, and in the rise of knowledge graphs and high-dimensional language models that connect everything every way they can even if the connections are very weak.

Why do we care about weak connections? 1. Because they are still connections. 2. The Internet’s economy of abundance has disinclined us to throw out any information. 3. Our new technologies (esp. machine learning) can make hay (and sometimes errors) out of rich combinations of connections including those that are weak.

If Plato believed that to understand the world we need to divide it properly — carve it at its joints — knowledge graphs and machine learning assume that knowledge consists of joining things as many different ways as we can.


Categories: abundance, big data, everyday chaos, everythingIsMiscellaneous, machine learning, philosophy, taxonomy, too big to know Tagged with: ai • categories • everythingIsMiscellaneous • machine learning • meaning • miscellaneous • philosophy • taxonomies Date: January 31st, 2022 dw


November 15, 2021

Dust Rising: Machine learning and the ontology of the real

Aeon.co has posted an article I worked on for a couple of years. It’s only 2,200 words, but they were hard words to find because the ideas were, and are, hard for me. I have little sense of whether I got either the words or the ideas right.

The article argues, roughly, that the sorts of generalizations that machine learning models embody are very different from the sort of generalizations the West has taken as the truths that matter. ML’s generalizations often are tied to far more specific configurations of data and thus are often not understandable by us, and often cannot be applied to particular cases except by running the ML model.

This may be leading us to locate the really real not in the eternal (as the West has traditionally done) but at least as much in the fleeting patterns of dust that result from everything affecting everything else all the time and everywhere.

Three notes:

1. Nigel Warburton, the philosophy editor at Aeon, was very helpful, as was Timo Hannay in talking through the ideas, and as were about a dozen other people who read drafts. None of them agreed entirely with the article.

2. Aeon for some reason deleted a crucial footnote that said that my views do not necessarily represent the views of Google, while keeping the fact that I am a part-time, temporary writer-in-residence there. To be clear: My views do not necessarily represent Google’s.

3. My original first title for it was “Dust Rising”, but then it became “Trains, Car Wrecks, and Machine Learning’s Ontology” which I still like, although I admit that “ontology” may not be as big a draw as I think it is.


Categories: ai, machine learning, philosophy Tagged with: ai • everydaychaos • machine learning • philosophy Date: November 15th, 2021 dw


February 28, 2021

The Uncanny Stepford Valley

You’ve probably heard about MyHeritage.com’s DeepNostalgia service that animates photos of faces. I’ve just posted at Psychology Today about the new type of uncanniness it induces, even though the animations of the individual photos, I think, pretty well escape the Uncanny Valley.

Here’s a sample from the MyHeritage site:

And here’s a thread of artworks and famous photos animated using DeepNostalgia that I reference in my post:

https://t.co/MDFSu3J0H1 has created some sort of animate your old photos application and I’m of course using it to feed my history addiction.
I apologise in advance to all the ancestors I’m about to offend.

Very fake history.

I’m sorry Queenie. pic.twitter.com/2np437yXyt
— Fake History Hunter (@fakehistoryhunt) February 28, 2021

More at Psychology Today …


Categories: ai, culture, machine learning, philosophy Tagged with: ai • entertainment • machine learning • philosophish • uncanny valley Date: February 28th, 2021 dw


July 23, 2020

Getting beneath the usual machine learning metaphors: A new podcast

Google has just launched a new podcast that Yannick Assogba (twitter: @tafsiri) and I put together. Yannick is a software engineer at Google PAIR where I was a writer-in-residence for two years, until mid-June. I am notably not a software engineer. Throughout the course of the nine episodes, Yannick helps me train machine learning models to play Tic Tac Toe and then a much more complex version of it. Then our models fight! (Guess who wins? Never mind.)

This is definitely not a tutorial. We’re focused on getting beneath the metaphors we usually use when talking about machine learning. In so doing, we keep coming back to the many human decisions that have to be made along the way.

So the podcast is for anyone who wants to get a more vivid sense of how ML works and the ways in which human intentions and assumptions shape each and every ML application. The podcast doesn’t require any math or programming skills.

It’s chatty and fun, and full of me getting simple things wrong. And Yannick is a fantastic teacher. I miss seeing him every day :(

All nine episodes are up now. They’re about 25 mins each. You can find them wherever you get your podcasts, so long as it carries ours.

Podcast: https://pair.withgoogle.com/thehardway/

Two-minute teaser: https://share.transistor.fm/s/6768a641


Categories: misc Tagged with: ai • everydaychaos • machine learning • podcast Date: July 23rd, 2020 dw


January 28, 2020

Games without strategies

“Digital Extremes wants to break the trend of live-service games meticulously planning years of content ahead of time using road maps… ‘What happens then is you don’t have a surprise and you don’t have a world that feels alive,’ [community director Rebecca] Ford says. ‘You have a product that feels like a result of an investor’s meeting 12 months ago.’”
— Steven Messner, “This Means War,” PC Gamer, Feb. 2020, p. 34

Video games have been leading indicators for almost forty years. It was back in the early 1980s that games started welcoming modders who altered the visuals, turning Castle Wolfenstein into Castle Smurfenstein, adding maps, levels, cars, weapons, and rules to game after game. Thus the games became more replayable. Thus the games became whatever users wanted to make them. Thus games — the most rule-bound of activities outside of a law court or a tea ceremony — became purposefully unpredictable.

Rebecca Ford is talking about Warframe, but what she says about planning and road maps points the way for what’s happening with business strategies overall. The Internet has not only gotten us used to an environment that is overwhelming and unpredictable, but we’ve developed approaches that let us leverage that unpredictability, from open platforms to minimum viable products to agile development.

The advantage of strategy is that it enables an organization to focus its attention and resources on a single goal. The disadvantages are that strategic planning assumes that the playing field is relatively stable, and that change generally happens according to rules that we can know and apply. But that stability is a dream. Now that we have tech that lets us leverage unpredictability, we are coming to once again recognize that strategies work almost literally by squinting our eyes so tight that they’re almost closed.

Maybe games will help us open our eyes so that we do less strategizing and more playing.


Categories: business, everyday chaos, games Tagged with: everydaychaos • future • games • internet • machine learning • strategy Date: January 28th, 2020 dw


October 29, 2019

Late, breaking Android app news: transcription

Note that this is not late-breaking news. It’s breaking news brought to you late: Android contains a really great Google transcription tool.

Live Transcribe transcribes spoken text in real time. So far, it seems pretty awesome at it. And its machine learning model is loaded on your device, so it works even when you’re offline — convenient and potentially less intrusive privacy-wise. (Only potentially, because Google could upload your text when you connect if it wanted to.)

You can download Live Transcribe from the Play Store, but if you’re like me, it will only give you an option to uninstall it. Oddly, it doesn’t show up in my App drawer. You have to go to your phone’s Settings > Accessibility screen and scroll all the way down to find the Live Transcribe option.

Once you turn it on, you’ll get an icon all the way at the bottom of your screen, to the right of the Home button. Weird that it’s given that much status, but there it is.

I expect I will be using this tool with surprising frequency … although if I expect it, it won’t be surprising.


Categories: tech Tagged with: ai • apps • machine learning • transcription • utilities Date: October 29th, 2019 dw


October 6, 2019

Making the Web kid-readable

Of the 4.67 gazillion pages on the Web, exactly 1.87 nano-bazillion are understandable by children. Suppose there were a convention and a service for making child-friendly versions of any site that wanted to increase its presence and value?

That was the basic idea behind our project at the MindCET Hackathon in the Desert a couple of weeks ago.

MindCET is an ed tech incubator created by the Center for Educational Technology (CET) in Israel. Its founder and leader is Avi Warshavsky, a brilliant technologist and a person of great warmth and character, devoted to improving education for all the world’s children. Over the ten years that I’ve been on the CET tech advisory board, Avi has become a treasured personal friend.

In Yeruham on the edge of the Negev, 14 teams of 6-8 people did the hackathon thing. Our team — to my shame, I don’t have a list of them — pretty quickly settled on thinking about what it would take to create a world-wide expectation that sites that explain things would have versions suitable for children at various grade levels.

So, here’s our plan for Onderstand.com.

Let’s say you have a site that provides information about some topic; our example was a page about how planes fly. It’s written at a normal adult level, or perhaps it assumes even more expertise about the topic. You would like the page to be accessible to kids in grade school.

No problem! Just go to Onderstand.com and enter the page’s URL. Up pops a form that lets you press a button to automatically generate versions for your choice of grade levels. Or you can create your own versions manually. The form also lets you enter useful metadata, including what school kid questions you think your site addresses, such as “How do planes fly?”, “What keeps planes up?”, and “Why don’t planes crash?” (And because everything is miscellaneous, you also enter tags, of course.)

Before I go any further, let me address your question: “It automatically generates grade-specific versions? Hahaha.” Yes, it’s true that in the 36 hours of the hackathon, we did not fully train the requisite machine learning model, in the sense that we didn’t even try. But let’s come back to that…

Ok, so imagine that you now have three grade-specific versions of your page about how planes fly. You put them on your site and give Onderstand their Web addresses as well as the metadata you’ve filled in. (Perhaps Onderstand.com would also host or archive the pages. We did not work out all these details.)

Onderstand generates a button you can place on your site that lets the visitor know that there are kid-ready versions.

The fact that there are those versions available is also recorded by Onderstand.com so that kids know that if they have a question, they can search Onderstand for appropriate versions.

Our business model is the classic “We’re doing something of value so someone will pay for it somehow.” Of course, we guarantee that we will never sell, rent, publish, share or monetize user information. But one positive thing about this approach: The service does not become valuable only once there’s lots of content. Because sites get the kid-ready button, they get value from it even if the Onderstand.com site attracts no visitors.

If the idea were to take off, then a convention that it establishes would be useful even if Onderstand were to fold up like a cheap table. The convention would be something like Wikipedia’s prepending “simple” before an article address. For example, the Wikipedia article “Airplane” is a great example of the problem: It is full of details but light on generalizations, uses hyperlinks as an excuse for lazily relying on jargon rather than readable text, and never actually explains how a plane flies. But if you prepend “simple” to that page’s URL — https://simple.wikipedia.org/wiki/Fixed-wing_aircraft — you get taken to a much shorter page with far fewer details (but also still no explanation of how planes fly).
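To make the convention concrete, here is a minimal sketch of the URL rewrite. One wrinkle: on Wikipedia, “simple” replaces the language subdomain (`en`) rather than being literally prepended, so that is what the sketch does. The function name `simple_version` is my own invention, and nothing here checks that a simple version of a given article actually exists or lives under the same title.

```python
from urllib.parse import urlsplit, urlunsplit


def simple_version(url: str, prefix: str = "simple") -> str:
    """Swap the leading subdomain of a URL for `prefix` (hypothetical helper)."""
    parts = urlsplit(url)
    labels = parts.netloc.split(".")
    # en.wikipedia.org -> simple.wikipedia.org; the rest of the URL is untouched.
    labels[0] = prefix
    return urlunsplit(parts._replace(netloc=".".join(labels)))


print(simple_version("https://en.wikipedia.org/wiki/Fixed-wing_aircraft"))
```

A different prefix could be passed in the same way, but a real convention would also need some way to say which versions a site actually offers.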

Now, our hackathon group did not actually come up with what those prepensions should be. Maybe “grade3”, “grade9”, etc. But we wouldn’t want kids to have to guess which grade levels the site has available. So maybe just “school” or some such which would then pop up a list of the available versions. What I’m trying to say is that that’s the only detail left before we transform the Web.

The machine learning miracle

Machine learning might be able to provide a fairly straightforward, and often unsatisfactory, way of generating grade-specific versions.

The ML could be trained on a corpus of text that has human-generated versions for kids. The “simple” Wikipedia pages and their adult equivalents could be one source. Textbooks on the same subjects designed for different class levels might be another, even though, unlike the Wikipedia “simple” pages, they are not more or less translations of the same text. There are several experimental simplification applications discussed on the Web already.
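As a rough sketch of what assembling such a corpus might look like, assume we already have two collections of page text keyed by title, one adult-level and one simplified. The function name, the field names, and the placeholder strings are all hypothetical; we built none of this at the hackathon, and real simple pages do not always share a title with their adult counterparts.

```python
def build_training_pairs(adult_pages: dict, simple_pages: dict) -> list:
    """Pair adult and simplified texts that share a title (illustrative only)."""
    shared_titles = sorted(adult_pages.keys() & simple_pages.keys())
    return [
        {"title": t, "source": adult_pages[t], "target": simple_pages[t]}
        for t in shared_titles
    ]


# Placeholder text, not real article content.
adult_pages = {"Airplane": "<adult-level text>", "Kite": "<adult-level text>"}
simple_pages = {"Airplane": "<simplified text>"}

pairs = build_training_pairs(adult_pages, simple_pages)
print(len(pairs), "pair(s) available for training")
```

Pages without a counterpart simply drop out, which is one reason a corpus like this would probably be smaller than it first appears.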

Even if this worked, it’s likely to be sub-par because it would just be simplifying language, not generating explanations that creatively think in kids’ terms. For example, to explain flight to a high schooler, you would probably want to explain the Bernoulli effect and the four forces that act on a wing, but for a middle schooler you might start with the experiment in which they blow across a strip of paper, and for a grade schooler you might want to ask if they’ve ever blown on the bottom of a bubble.

So, even if the ML works, the site owner might want to do something more creative and effective. But still, simply having reduced-vocabulary versions could be helpful, and might set an expectation that a site isn’t truly accessible if it isn’t understandable.

Ok, so who’s in on the angel funding round?


Categories: misc Tagged with: ai • education • hackathon • machine learning Date: October 6th, 2019 dw


July 10, 2019

Learning AI by doing: My new series of posts

The first in a series of six posts about my experiences learning how to train a machine learning system has just been posted here. There’s no code and no math in it. Instead it focuses on the tasks and choices involved in building one of these applications. How do you figure out what sort of data to provide? How do you get that data into the system? How can you tell when the system has been trained? What types of controls do the developers have over the outcomes? What sort of ways can I go wrong? (Given that the title of the series is “The Adventures of a TensorFlow.js n00b” the answer to that last question is: Every way.)

I was guided through this project by Yannick Assogba, a developer in the machine learning research group (People + AI Research) I’m embedded in at Google as a writer in residence. Yannick is a natural-born teacher, and is preternaturally patient.

The series is quite frank. I make every stupid mistake possible. And for your Schadenfreude, five more posts in this series are on their way…


Categories: ai, tech Tagged with: ai • machine learning • PAIR Date: July 10th, 2019 dw
