Joho the Blog: Tech Archives

October 29, 2017

Restoring photos’ dates from Google Photos download

Google Photos lets you download your photos, which is good since they’re your own damn photos. But when you do, every photo’s file will be stamped as having been created on the day you downloaded it. This is pretty much a disaster, especially since the photos have names like “IMG_20170619_153745.jpg.”

Ok, so maybe you noticed that the file name Google Photos supplies contains the date the photo was taken. So maybe you want to just munge the file name to make it more readable, as in “2017-06-19.” If you do it that way, you’ll be able to sort chronologically just by sorting alphabetically. But the files are still all going to be dated with the day you did the download, and that’s going to mean they won’t sort chronologically with any photos that don’t follow that exact naming convention.

So, you should adjust the file dates to reflect the day the photos were taken.

It turns out to be easy. JPGs come with a header of info (called EXIF) that you can’t see but your computer can. There’s lots of metadata about your photo in that header, including the date it was taken. So, all you need to do is extract that date and reset your file’s date to match it.

Fortunately, the good folks on the Net have done the heavy lifting for us.

Go to http://www.sentex.net/~mwandel/jhead/ and download the right version of jhead for your computer. Put it wherever you keep utilities. On my Mac I put it in /Applications/Utilities/, but it really doesn’t matter.

Open up a terminal. Log in as a superuser:

sudo -i

Enter the password you use to log into your computer and press Enter.

Change to the directory that contains the photos you want to update. You do this with the “cd” command, as in:

cd /Users/david/Downloads/GooglePhotos/

That’s a Mac-ish path. I’m going to assume you know enough about paths to figure out your own, how to handle spaces in directory names, etc. If not, my dear friend Google can probably help you.

You can confirm that you’ve successfully changed to the right directory by typing this into your terminal:

pwd

That will show you your current directory. Fix it if it’s wrong because the next command will change the file dates of jpgs in whatever directory you’re currently in.

Now for the brutal finishing move:

/Applications/Utilities/jpg-batch-file-jhead/jhead -ft *.jpg

Everything before the final forward slash is the path to wherever you put the jhead file. After that final slash, the command tells the terminal to run the jhead program with the -ft option, which sets each file’s date to the date recorded in its EXIF header, and to apply it to all the files in that directory that end with the extension “.jpg.”

That’s it.
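
If you want to double-check a single file, running jhead on it with no options is harmless: it just prints that file’s EXIF summary, including the date it was taken, without changing anything. For example, using the sample filename from the top of this post (and the same caveat about the path to wherever you put jhead):

/Applications/Utilities/jpg-batch-file-jhead/jhead IMG_20170619_153745.jpg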

If you want to run the program not just on the directory that you’re in but in all of its subdirectories, this post at StackExchange tells you how: https://photo.stackexchange.com/questions/27245/is-there-a-free-program-to-batch-change-photo-files-date-to-match-exif
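
The short version, if you’re comfortable with the find command, is to have find hand every .jpg under the current directory to jhead. Something along these lines should work:

find . -iname "*.jpg" -exec /Applications/Utilities/jpg-batch-file-jhead/jhead -ft {} \;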

Many thanks to Matthias Wandel for jhead and his other contributions to making life with bits better for us all.


May 18, 2017

Indistinguishable from prejudice

“Any sufficiently advanced technology is indistinguishable from magic,” said Arthur C. Clarke famously.

It is also the case that any sufficiently advanced technology is indistinguishable from prejudice.

Especially if that technology is machine learning. ML creates algorithms to categorize stuff based upon data sets that we feed it. Say “These million messages are spam, and these million are not,” and ML will take a stab at figuring out the distinguishing characteristics of spam and not spam, perhaps assigning particular words particular weights as indicators, or finding relationships between particular IP addresses, times of day, lengths of messages, etc.

Now complicate the data and the request, run this through an artificial neural network, and you have Deep Learning that will come up with models that may be beyond human understanding. Ask DL why it made a particular move in a game of Go or why it recommended increasing police patrols on the corner of Elm and Maple, and it may not be able to give an answer that human brains can comprehend.

We know from experience that machine learning can re-express human biases built into the data we feed it. Cathy O’Neil’s Weapons of Math Destruction contains plenty of evidence of this. We know it can happen not only inadvertently but subtly. With Deep Learning, we can be left entirely uncertain about whether and how this is happening. We can certainly adjust DL so that it gives fairer results when we can tell that it’s going astray, as when it only recommends white men for jobs or produces a freshman class with 1% African Americans. But when the results aren’t that measurable, we can be using results based on bias and not know it. For example, is anyone running the metrics on how many books by people of color Amazon recommends? And if we use DL to evaluate complex tax law changes, can we tell if it’s based on data that reflects racial prejudices?[1]

So this is not to say that we shouldn’t use machine learning or deep learning. That would remove hugely powerful tools. And of course we should and will do everything we can to keep our own prejudices from seeping into our machines’ algorithms. But it does mean that when we are dealing with literally inexplicable results, we may well not be able to tell if those results are based on biases.

In short: Any sufficiently advanced technology is indistinguishable from prejudice.[2]

[1] We may not care, if the result is a law that achieves the social goals we want, including equal and fair treatment of taxpayers regardless of race.

[2] Please note that that does not mean that advanced technology is prejudiced. We just may not be able to tell.


May 15, 2017

[liveblog][AI] AI and education lightning talks

Sara Watson, a BKC affiliate and a technology critic, is moderating a discussion at the Berkman Klein/Media Lab AI Advance.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Karthik Dinakar at the Media Lab points out that what we see in the night sky is in fact distorted by the way gravity bends light, which Einstein called a “gravitational lens.” Same for AI: the distortion is often in the data itself. Karthik works on how to help researchers recognize that distortion. He gives an example of how to capture both cardiologist and patient lenses to better diagnose women’s heart disease.

Chris Bavitz is the head of BKC’s Cyberlaw Clinic. To help law students understand AI and tech, the Clinic encourages interdisciplinarity. They also help students think critically about the roles of the lawyer and the technologist. The Clinic prefers that those relationships form early, although thinking too hard about law early on can diminish innovation.

He points to two problems that represent two poles. First, IP and AI: running AI against protected data. Second, issues of fairness, rights, etc.

Leah Plunkett is a professor at Univ. of New Hampshire Law School and a BKC affiliate. Her topic: How can we use AI to teach? She points out that if Tom Sawyer were real and alive today, he’d be arrested for what he does just in the first chapter. Yet we teach the book as a classic. We think we love a little mischief in our lives, but we apparently don’t like it in our kids. We kick them out of schools. E.g., of 49M students in public schools in 2011, 3.45M were suspended and 130,000 were expelled. These punishments disproportionately affect children from marginalized segments.

Get rid of the BS safety justification and the govt ought to be teaching all our children without exception. So, maybe have AI teach them?

Sara: So, what can we do?

Chris: We’re thinking about how we can educate state attorneys general, for example.

Karthik: We are so far from getting users, experts, and machine learning folks together.

Leah: Some of it comes down to buy-in and translation across vocabularies and normative frameworks. It helps to build trust to make these translations better.

[I missed the QA from this point on.]


[liveblog] AI Advance opening: Jonathan Zittrain and lightning talks

I’m at a day-long conference/meet-up put on by the Berkman Klein Center‘s and MIT Media Lab‘s “AI for the Common Good” project.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Jonathan Zittrain gives an opening talk. Since we’re meeting at Harvard Law, JZ begins by recalling the origins of what has been called “cyber law,” which has roots here. Back then, the lawyers got to the topic first, and thought that they could just think their way to policy. We are now at another signal moment as we are in a frenzy of building new tech. This time we want instead to involve more groups and think this through. [I am wildly paraphrasing.]

JZ asks: What is it that we intuitively love about human judgment, and are we willing to insist on human judgments that are worse than what a machine would come up with? Suppose for utilitarian reasons we can cede autonomy to our machines — e.g., autonomous cars — shouldn’t we? And what do we do about maintaining local norms? E.g., “You are now entering Texas where your autonomous car will not brake for pedestrians.”

“Should I insist on being misjudged by a human judge because that’s somehow artisanal?” when, ex hypothesi, an AI system might be fairer.

Autonomous systems are not entirely new. They’re bringing to the fore questions that have always been with us. E.g., we grant a sense of discrete intelligence to corporations. E.g., “McDonald’s is upset and may want to sue someone.”

[This is a particularly bad representation of JZ’s talk. Not only is it wildly incomplete, but it misses the through-line and JZ’s wit. Sorry.]

Lightning Talks

Finale Doshi-Velez is particularly interested in interpretable machine learning (ML) models. E.g., suppose you have ten different classifiers that give equally predictive results. Should you provide the most understandable one, all of them…?

Why is interpretability so “in vogue”? Suppose non-interpretable AI can do something better? In most cases we don’t know what “better” means. E.g., someone might want to control her glucose level, but perhaps also to control her weight, or other outcomes? Human physicians can still see things that are not coded into the model, and that will be the case for a long time. Also, we want systems that are fair. This means we want interpretable AI systems.

How do we formalize these notions of interpretability? How do we do so for science and beyond? E.g., what does a legal “right to explanation” mean? She is working with Sam Gershman on how to more formally ground AI interpretability in the cognitive science of explanation.

Vikash Mansinghka leads the eight-person Probabilistic Computing project at MIT. They want to build computing systems that can be our partners, not our replacements. We have assumed that the measure of success of AI is that it beats us at our own game, e.g., AlphaGo, Deep Blue, Watson playing Jeopardy! But games have clearly measurable winners.

His lab is working on augmented intelligence that gives partial solutions, guidelines, and hints that help us solve problems that neither humans nor machines could solve on their own. The need for these systems is most obvious in large-scale human interest projects, e.g., epidemiology, economics, etc. E.g., should a successful nutrition program in SE Asia be tested in Africa too? There are many variables (including cost). BayesDB, developed by his lab, is “augmented intelligence for public interest data science.”

In traditional computer science, computing systems are built up from circuits to algorithms. Engineers can trade off performance for interpretability. Probabilistic systems have some of the same considerations. [Sorry, I didn’t get that last point. My fault!]

John Palfrey is a former Exec. Dir. of BKC, chair of the Knight Foundation (a funder of this project) and many other things. Where can we, BKC and the Media Lab, be most effective as a research organization? First, we’ve had the most success when we merge theory and practice. And building things. And communicating. Second, we have not yet defined the research question sufficiently. “We’re close to something that clearly relates to AI, ethics and government” but we don’t yet have the well-defined research questions.

The Knight Foundation thinks this area is a big deal. AI could be a tool for the public good, but it also might not be. “We’re queasy” about it, as well as excited.

Nadya Peek is at the Media Lab and has been researching “machines that make machines.” She points to the first computer-controlled machine (“Teaching Power Tools to Run Themselves“), where the aim was precision. People controlled these CCMs: programmers, CAD/CAM folks, etc. That’s still the case, but it looks different. Now the old jobs are being done by far fewer people. But the spaces in between don’t always work so well. E.g., Apple can define an automatable workflow for milling components, but if you’re a student doing a one-off project, it can be very difficult to get all the integrations right. The student doesn’t much care about a repeatable workflow.

Who has access to an Apple-like infrastructure? How can we make precision-based one-offs easier to create? (She teaches a course at MIT called “How to create a machine that can create almost anything.”)

Nathan Matias, an MIT grad student with a newly-minted Ph.D. (congrats, Nathan!) and a BKC community member, is facilitating the discussion. He asks how we conceptualize the range of questions that these talks have raised. What are the tools we need to create? What are the social processes behind that? How can we communicate what we want to machines and understand what they “think” they’re doing? Who can do what, and where does that raise questions about literacy, policy, and legal issues? Finally, how can we get to the questions we need to ask, how do we answer them, and how do we organize people, institutions, and automated systems? Through scholarly inquiry, organizing people socially and politically, creating policies, etc.? How do we get there? How can we build AI systems that are “generative” in JZ’s sense: systems that we can all contribute to on relatively equal terms and share with others?

Nathan: Vikash, what do you do when people disagree?

Vikash: When you include the sources, you can provide probabilistic responses.

Finale: When a system can’t provide a single answer, it ought to provide multiple answers. We need humans to give systems clear values. AI things are not moral, ethical things. That’s us.

Vikash: We’ve made great strides in systems that can deal with what may or may not be true, but not in terms of preference.

Nathan: An audience member wants to know what we have to do to prevent AI from repeating human bias.

Nadya: We need to include the people affected in the conversations about these systems. There are assumptions about the independence of values that just aren’t true.

Nathan: How can people not close to these systems be heard?

JP: Ethan Zuckerman, can you respond?

Ethan: One of my colleagues, Joy Buolamwini, is working on what she calls the Algorithmic Justice League, looking at computer vision algorithms that don’t work on people of color. In part this is because the data sets used to train CV systems are 70% white male faces. So she’s generating new sets of facial data that we can retest on. Overall, it’d be good to use training data that represents the real world, and to make sure a representative cross-section of humanity is working on these systems. So here’s my question: we find that co-design works well; should we be bringing in the affected populations to talk with the system designers?

[Damn, I missed Yochai Benkler‘s comment.]

Finale: We should also enable people to interrogate AI when the results seem questionable or unfair. We need to be thinking about the processes for resolving such questions.

Nadya: It’s never “people” in general who are affected. It’s always particular people with agendas, from places and institutions, etc.


February 1, 2017

How to fix the WordFence wordfence-waf.php problem

My site has been down while I’ve tried to figure out (i.e., google someone else’s solution to) a crash caused by WordFence, an excellent utility that, ironically, protects your WordPress blog from various maladies.

The problem is severe: Users of your blog see naught but an error message of this form:

Fatal error: Unknown: Failed opening required ‘/home/dezi3014/public_html/wordfence-waf.php’ (include_path=’…/usr/lib/php /usr/local/lib/php’) in Unknown on line 0

The exact path will vary, but the meaning is the same. It is looking for a file that doesn’t exist. You’ll see the same message when you try to open your WordPress site as administrator. You’ll see it even when you manually uninstall WordFence by logging into your host and deleting the wordfence folder from the wp-content/plugins folder.

If you look inside the wordfence-waf.php file (which is in whatever folder you’ve installed WordPress into), it warns you that “Before removing this file, please verify the PHP ini setting `auto_prepend_file` does not point to this.”

Helpful, except my php.ini file doesn’t have any reference to this. (I use MediaTemple.com as my host.) Some easy googling disclosed that the command to look for the file may not be in php.ini, but may be in .htaccess or .user.ini instead. And now you have to find those files.
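
A quick way to find the culprit is to search for that directive from the terminal, from within the folder where WordPress is installed. A rough sketch (the -s just keeps grep quiet about any of these files that don’t exist):

grep -s auto_prepend_file php.ini .htaccess .user.ini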

At least for me, the .user.ini file is in the main folder into which you’ve installed WordPress. In fact, the only line in that file was the one that has the “auto_prepend_file” command. Remove that line and you have your site back.

I assume all of this is too obvious to write about for technically competent people. This post is for the rest of us.


December 3, 2016

[liveblog] Stephanie Mendoza: Web VR

Stephanie Mendoza [twitter:@_liooil] [Github: SAM-liooil] is giving a talk at the Web 1.0 conference. She’s a Unity developer.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

WebVR, a 3D in-browser standard, is at 1.0 these days, she says. It’s cross-platform, which is amazing because it’s hard to build for the Web, Android, and Vive. It’s “uncharted territory” where “everything is an experiment.” You need Chromium, an experimental version of Chrome, to run it. She uses A-Frame to create in-browser 3D environments.

“We’re trying to figure out the limit of things we can simulate.” It’s going to follow us out into the real world. E.g., she’s found that simulating fearful situations (e.g., heights) can lessen fear of those situations in the real world.

This crosses into Meinong’s jungle: a repository of non-existent entities in Alexius Meinong‘s philosophy.

The tool they’re using is A-Frame, which is an abstraction layer on top of WebGL, Three.js, and VRML. (VRML was an early Web standard for 3D that didn’t get taken up much because the browsers didn’t run it very well. [I was once on the board of a VRML company, which also didn’t do very well.]) WebVR works on Vive, High Fidelity, Janus, the Unity Web player, and YouTube 360, under different definitions of “works.” A-Frame is open source.

Now she takes us through how to build a VR Web page. You can scavenge for 3D assets or create your own. E.g., you can go to Thingiverse and convert the files to the appropriate format for A-Frame.

Then you begin a “scene” in A-Frame, which lives between <a-scene> tags in HTML. You can create graphic objects (spheres, planes, etc.) You can interactively work on the 3D elements within your browser. [This link will take you to a page that displays the 3D scene Stephanie is working with, but you need Chromium to get to the interactive menus.]
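
To give a flavor of what that looks like, here is a minimal scene of my own sketching (not Stephanie’s actual demo code; the script tag points at whatever copy of the A-Frame library you’re loading):

<html>
  <head>
    <script src="aframe.min.js"></script> <!-- your copy of the A-Frame library -->
  </head>
  <body>
    <a-scene>
      <a-sphere position="0 1.25 -5" radius="1.25" color="#EF2D5E"></a-sphere>
      <a-plane position="0 0 -4" rotation="-90 0 0" width="4" height="4" color="#7BC8A4"></a-plane>
      <a-sky color="#ECECEC"></a-sky>
    </a-scene>
  </body>
</html>

Open that page in a WebVR-capable browser and you get a sphere hovering over a plane that you can look around in.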

She goes a bit deeper into the A-Frame HTML for assets, light maps, height maps, specular maps, all of which are mapped back to much lower-count polygons. Entities consist of geometry, light, mesh, material, position, and raycaster, and your extensions. [I am not attempting to record the details, which Stephanie is spelling out clearly. ]

She talks about the HTC Vive. “The controllers are really cool. They’re like claws. I use them to climb virtual trees and then jump out because it’s fun.” Your brain simulates gravity when there is none, she observes. She shows the A-Frame tags for configuring the controls, including grabbing, colliding, and teleporting.

She recommends some sites, including NormalMap, which generates normal maps from images and lets you download the results.

QA

Q: Platforms are making their own non-interoperable VR frameworks, which is concerning.

A: It went from art to industry very quickly.


[liveblog] Paul Frazee on the Beaker Browser

At the Web 1.0 conference, Paul Frazee is talking about a browser (a Chrome fork) he’s been writing to browse the distributed Web.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

The distributed Web is the Web with ideas from BitTorrent integrated into it. Beaker uses IPFS and DAT.

This means:

  1. Anyone can be a server at any time.

  2. There’s no binding between a specific computer and a site; the content lives independently.

  3. There’s no back end.

This lets Beaker provide some unusual features:

  1. A “fork button” is built into the browser itself so you can modify the site you’re browsing. “People can hack socially” by forking a site and sharing those changes.

  2. Independent publishing: The site owner can’t change your stuff. You can allocate new domains cheaply.

  3. With Beaker, you can write your site locally first, and then post into the distributed Web.

  4. Secure distribution

  5. Versioned URLs

He takes us through a demo. Beaker’s directory looks a bit like GitHub in terms of style. He shows how to create a new site using an integrated terminal tool. The init command creates a dat.json file with some core metadata. Then he creates an index.html file and publishes it. Then anyone using the browser can see the site and ask to see the files behind it…and fork them. As with GitHub, you can see the path of forks. If you own the site, you can write to it from within the browser. [This fulfills Tim Berners-Lee’s original vision of Web browsers.]

QA

Q: Any DNS support?

A: Yes.


November 22, 2016

[liveblog][bkc] Scott Bradner: IANA: Important, but not for what they do

I’m at a Berkman Klein [twitter: BKCHarvard] talk by Scott Bradner about IANA, the Internet Assigned Numbers Authority. Scott is one of the people responsible for giving us the Internet. So, thanks for that, Scott!

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Scott begins by pointing to the “absurdity” of Ted Cruz’s campaign to prevent the “Internet giveaway.” The idea that “Obama gave away the Internet” is “hooey,” says Scott.

IANA started with a need to coordinate information, not to control it, he says. It began with the Network Working Group in 1968. Then Requests for Comments (RFC) in 1969. The name “IANA” showed up in 1988, although the function had begun in 1972 with coordinating socket numbers. The Domain Name System made IP addresses easier to use, including the hierarchical clustering under .com, .org, etc.

Back at the beginning, computers were too expensive for every gov’t department to have one. So, ARPA wanted to share large and expensive computers among users. It created a packet-based network, which broke info up into packets that were then transmitted. Packet networking was the idea of Paul Baran at RAND, who wanted a system that would survive a nuclear strike, but the aim of that network was to share computers. The packets had enough info to make it to their destinations, but the packet design made “no assumptions about the underlying transport network.” No service guarantees about packets making it through were offered. The Internet is the interconnection of the different networks, including the commercial networks that began showing up in the 1990s.

No one cared about the Net for decades. To the traditional telecom and corporate networking people, it was just a toy—”No quality of service, no guarantees, no security, no one in charge.” IBM thought you couldn’t build a network out of this because their definition of a network — the minimal requirements — was different. “That was great because it meant the regulators ignored us.”

The IANA function went into steady state 1984-1995. It did some allocating of addresses. (When Scott asked Jon Postel for addresses for Harvard, Postel sent him some; Postel was the one-person domain allocation shop.) IANA ran it for the top level domains.

“The Internet has few needs,” Scott says. It’s almost all done through collaboration and agreement. There are no requirements except at a very simple level. The only centralized functions: 1. We have to agree on what the protocol parameters are. Machines have to understand how to read the packet headers. 2. We have to allocate blocks of IP addresses and ASN‘s. 3. We have to have a single DNS, at least for now. IANA handles those three. “Everything else is distributed.” Everything else is collaboration.

In 1993, Network Solutions was given permission to start selling domain names. A domain cost $100 for 2 yrs. There were about 100M names at that point, which added up to real money. Some countries even started selling off their TLDs (top level domains), e.g., .tv

IANA dealt with three topics, but DNS was the only one of interest to most people. There was pressure to create new TLDs, which Scott thinks doesn’t solve any real problems. That power was given to ISOC, which set up the International Ad-Hoc Committee in 1996. It set up 7 new TLDs, one of which (.web) caused Image Online Design to sue Postel because they said Postel had promised it to them. The Dept. of Commerce saw that it needed to do something. So they put out an RFC and got 400+ comments. Meanwhile, Postel worked on a plan for institutionalizing the IANA function, which culminated in a conference in Jan 1998. Postel couldn’t go, so Scott presented in his stead.

Shortly after that the Dept of Commerce proposed having a private non-profit coordinate and manage the allocation of the blocks to the registries, manage the file that determines TLDs, and decide which TLDs should exist…the functions of IANA. “There’s no Internet governance here, simply what IANA did.”

There were meetings around the world to discuss this, including one sponsored by the Berkman Center. Many of the people attending were there to discuss Internet governance, which was not the point of the meetings. One person said, “Why are we wasting time talking about TLDs when the Internet is going to destroy countries?” “Most of us thought that was a well-needed vacuum,” says Scott. We didn’t need Internet governance. We were better off without it.

Jon Postel submitted a proposal for an Internet Corporation for Assigned Names and Numbers (ICANN). He died of a heart attack shortly thereafter. The Dept. of Commerce accepted the proposal. In Oct 1998 ICANN had its first board meeting. It was a closed meeting “which anticipated much of what’s wrong with ICANN.”

The Dept of Commerce had oversight over ICANN, but its only power was to say yes or no to the file that lists the TLDs and the IP addresses of the nameservers for each of the TLDs. “That’s the entirety of the control the US govt had over ICANN. In theory, the Dept of Commerce could have said ‘Take Cuba out of that file,’ but that’s the most ridiculous thing they could have done and most of the world could have ignored them.” The Dept of Commerce never said no to ICANN.

ICANN institutionalizes the IANA. But it also has to deal with trademark issues coming out of domain name registrations, and consults on DNS security issues. “ICANN was formed as a little organization to replace Jon Postel.”

It didn’t stay little. ICANN’s budget went from a few million bucks to over $100M. “That’s a lot of money to replace a few competent geeks.” It’s also approved hundreds of TLDs. The bylaws went from 7,000 words to 37,000 words. “If you need 37,000 words to say what you’re doing, there’s something wrong.”

The world started to change. Many govts see the Net as an intrinsic threat.

  • In Sept. 2011, India, Brazil, and South Africa proposed that the UN undertake governance of the Internet.

  • Oct 2013: After Snowden, the Montevideo Statement on the Future of Internet Cooperation proposed moving away from the US govt’s oversight of IANA.

  • Apr. 2014: NetMundial Initiative. “Self-appointed 25-member council to perform internet governance.”

  • Mar. 2014: NTIA announces its intent to transition key domain name functions.

The NTIA proposal was supposed to involve all the stakeholders. But it also said that ICANN should continue to maintain the openness of the Internet…a function that ICANN never had. Openness arises from the technical nature of the Net. NTIA said it wouldn’t accept an inter-governmental solution (like the ITU) because it has to involve all the stakeholders.

So who holds ICANN accountable? They created a community process that is “incredibly strong.” It can change the bylaws, and remove ICANN directors or the entire board.

Meanwhile, the US Congress got bent out of shape because the US is “giving away the Internet.” It blocked the NTIA from acting until Sept. 2016. On Oct. 1 IANA became independent and is under the control of the community. “This cannot be undone.” “If the transition had not happened, forces in the UN would likely have taken over” governance of the Internet. This would have been much more likely if the NTIA had not let it go. “The IANA performs coordination functions, not governance. There is no Internet governance.”

How can there be no governance? “Because nobody cared for long enough that it got away from them,” Scott says. “But is this a problem we have to fix?”

He leaves the answer hanging. [SPOILER: The answer is NO]

Q&A

Q: Under whom do the RIRs [Regional Internet Registries] operate?

A: Some Europeans offered to take over European domain names from Jon Postel. It’s an open question whether they have the authority to do what they’re doing. Every one has its own policy development process.

Q: Where’s research being done to make a more distributed Internet?

A: There have been many proposals ever since ICANN was formed to have some sort of distributed maintenance of the TLDs. But it always comes down to you seeing the same .com site as I do — the same address pointing to the same site for all Internet users. You still have to centralize or at least distribute the mapping. Some people are looking at geographic addressing, although it doesn’t scale.

Q: Do you think Trump could make the US more like China in terms of the Internet?

A: Trump signed on to Cruz’s position on IANA. The security issue is a big one, very real. The gut reaction to recent DDOS attacks is to fix that rather than to look at the root cause, which was crappy devices. The Chinese government controls the Net in China by making everyone go through a central, national connection. Most countries don’t do that. OTOH, England is imposing very strict content rules that all ISPs have to obey. We may be moving to a telephony model, which is a Westphalian idea of national Internets.

Q: The Net seems to need other things internationally controlled, e.g. buffer bloat. Peer pressure seems to be the only way: you throw people off who disagree.

A: IANA doesn’t have agreements with service providers. Buffer bloat is a real issue, but it only affects the people who have it, unlike the IoT DDOS attack that affected us all. Are you going to kick off people whose home security cameras are insecure?

Q: Russia seems to be taking the opposite approach. It has lots of connections coming into it, perhaps for fear that someone would cut them off. Terrorist groups are cutting cables, botnets, etc.

A: Great question. It’s not clear there’s an answer.

Q: With IPv6 there are many more address spaces to give out. How does that change things?

A: The DNS is an amazing success story. It scales extremely well … although there are scaling issues with the backbone routing systems, which are big and expensive. “That’s one of the issues we wanted to address when we did IPv6.”

Q: You said that ICANN has a spotty history of transparency. What role do you think ICANN is going to play going forward? Can it improve on its track record?

A: I’m not sure that it’s relevant. IANA’s functions are not a governance function. The only thing like a governance issue are the TLDs and ICANN has already blown that.


June 25, 2016

TED, scraped

TED used to have an open API. TED no longer supports its open API. I want to do a little exploring of what the world looks like to TED, so I scraped the data from 2,228 TED Talk pages. This includes the title, author, tags, description, link to the transcript, number of times shared, and year. You can get it from here. (I named it tedTalksMetadata.txt, but it’s really a JSON file.)

“Scraping” means having a computer program look at the HTML underneath a Web page and try to figure out which elements refer to what. Scraping is always a chancy enterprise because the cues indicating which text is, say, the date and which is the title may be inconsistent across pages, and may be changed by the owners at any time. So I did the best I could, which is not very good. (Sometimes page owners aren’t happy about being scraped, but in this case it only meant one visit for each page, which is not a lot of burden for a site whose pages get hundreds of thousands and sometimes millions of visits. If they really don’t want to be scraped, they could re-open their API, which provides far more reliable info far more efficiently.)
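
If you’ve never scraped anything, the basic move looks something like this from the command line: fetch a page’s HTML and pull out the text between tags you recognize. This is a toy sketch with a made-up URL (TED’s real markup is messier, but it’s the idea):

curl -s https://www.ted.com/talks/SOME_TALK | grep -o "<title>[^<]*</title>"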

I’ve also posted at GitHub the php scripts I wrote to do the scraping. Please don’t laugh.

If you use the JSON to explore TED metadata, please let me know if you come up with anything interesting that you’re willing to share. Thanks!


January 26, 2016

Oscars.JSON. Bad, bad JSON

Because I don’t actually enjoy watching football, during the Pats vs. Broncs game on Sunday I transposed the Oscar™ nominations into a JSON file. I did this very badly, but I did it. If you look at it, you’ll see just how badly I misunderstand JSON on some really basic levels.

But I posted it at GitHub™ where you can fix it if you care enough.

Why JSON™? Because it’s an easy format for inputting data into JavaScript™ or many other languages. It’s also human-readable, if you have a good brain for indents. (This is very different from having many indents in your brain, which is one reason I don’t particularly like to watch football™, even with the helmets and rules, etc.)

Anyway, JSON puts data into key:value™ pairs, and lets you nest them into sets. So, you might have a keyword such as “category” that would have values such as “Best Picture™” and “Supporting Actress™.” Within a category you might have a set of keywords such as “film_title” and “person” with the appropriate values.
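
For illustration only (this is not my file’s actual layout, which, as noted, is worse), a fragment along these lines captures the idea, using one real nominee:

{
  "category": "Supporting Actress",
  "nominees": [
    { "person": "Kate Winslet", "film_title": "Steve Jobs" }
  ]
}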

JSON is such a popular way of packaging up data for transport over the Web™ that many (most? all?) major languages have built-in functions for transmuting it into data that the language can easily navigate.

So, why bother putting the Oscar™ nomination info into JSON? In case someone wants to write an app that uses that info. For example, if you wanted to create your own Oscar™ score sheet or, to be honest, entry tickets for your office pool, you could write a little script and output it exactly as you’d like. (Or you could just google™ for someone else’s Oscar™ pool sheet.) (I also posted a terrible little PHP script™ that does just that.)

So, a pointless™ exercise™ in truly terrible JSON design™. You’re welcome™!

