February 8, 2011

[berkman] Brian Kernighan on numeracy

Brian Kernighan is giving a Berkman lunchtime talk called “Millions, Billions, Zillions: Why (In)numeracy matters.” Brian teaches at Princeton, but is at the Berkman Center this year, writing a book based on his undergrad course on what we need to know about computers. (Yes, Brian is that Brian K.) Brian teaches a course at Princeton on quantitative reasoning. He’s going to give us the “numeric self defense” portion of the course. [He assures us that none of us in this room need it, but speaking for myself I’m pretty sure he’s wrong.]

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He gives as an example of quantitative reasoning a question about how long our oil reserves of 660 billion barrels (according to Newsweek) will last, based on how many vehicles there are and how big a barrel of oil is. He proposes a rough estimate. Figure there’s one car per person in the US: about 300M vehicles. And figure a barrel is in the range of a 55 gal drum. Call it 50 gals. He simplifies the numbers and does the math, and figures that we use 3 billion barrels/year. The Reserve contains 660 billion barrels, which means it will last over 200 years. Even if the numbers are off by a factor of 2 or 3, there seems to be enough to tell OPEC to take a hike. But, it turns out that Newsweek later corrected itself: The Reserve holds 660 million barrels, not billions. Oops. He then presents a scary list of major media being wrong about millions vs. billions vs. trillions. The problem is that those words signify to most people just an order of vastness, not actual quantities.
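The back-of-envelope estimate can be sketched in a few lines. The notes record only the vehicle count, the barrel size, and the 3 billion barrels/year result. The 500 gallons per vehicle per year (roughly 10 gallons a week) is an assumption filled in here to make the arithmetic come out, not a figure from the talk.

```python
# Rough estimate of how long the reserve would last.
VEHICLES = 300e6                 # one car per person in the US (from the talk)
GALLONS_PER_BARREL = 50          # a 55-gallon drum, rounded down (from the talk)
GALLONS_PER_VEHICLE_YEAR = 500   # ASSUMPTION: about 10 gallons a week

# Annual consumption in barrels under the assumptions above.
barrels_per_year = VEHICLES * GALLONS_PER_VEHICLE_YEAR / GALLONS_PER_BARREL


def years_of_supply(reserve_barrels):
    """Years a reserve of the given size lasts at the estimated usage rate."""
    return reserve_barrels / barrels_per_year


print(f"usage: {barrels_per_year:.1e} barrels/year")
print(f"660 billion barrels: {years_of_supply(660e9):.0f} years")
print(f"660 million barrels: {years_of_supply(660e6) * 365:.0f} days")
```

With the corrected figure, the same arithmetic gives a supply measured in weeks rather than centuries, which is why the millions-versus-billions slip matters.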

Cutting numbers down to size. The annual US budget deficit is $1.3B, according to the NYT, which comes down to $4/person. In fact, it’s $1.3 trillion, or $4,000/person. It gets worse when you’re dealing with computer terms: “a petaflop is a thousand trillion instructions per second, not a million trillion” (NYT, 3/25/09). He gives another example of the WSJ getting a zettabyte calculation wrong by a couple of orders of magnitude. (Zettabyte = 10^21 bytes.) Had they been able to do some rough calculations, they would have seen that their comparison to the number of books in the Library of Congress would have resulted in there being 10,000 books in that collection.
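Dividing a national total by the population is the quickest way to cut it down to size. A minimal sketch, assuming the round figure of 300 million Americans used elsewhere in the talk (the function name is just for illustration):

```python
US_POPULATION = 300e6  # round figure used elsewhere in the talk


def per_person(total_dollars, population=US_POPULATION):
    """Spread a national total evenly over the population."""
    return total_dollars / population


print(f"$1.3 billion:  ${per_person(1.3e9):,.2f} per person")
print(f"$1.3 trillion: ${per_person(1.3e12):,.0f} per person")
```

A deficit of a few dollars a head would not be news, which is the hint that the "billion" was a typo.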

Brian suggests numeric triage: “Is the number likely to be much too high, much too low, or plausible?” E.g., Dear Abby said that “Americans receive almost 2 million tons of junk mail daily.” 2M tons = 4B pounds / 300 million Americans = 13 lbs/person/day. “Each person eats about 60 tons of food in a lifetime.” Yup: a ton a year is 5 lbs/day. It’s plausible. Or, from another newspaper: You can save $88/day by turning off your computer when you’re not using it. Or, the London Times reported on a NASA jet that travels 850 miles in 10 seconds. Newark Star Ledger: “The Passaic River was traveling about 200 miles per hour, about five times faster than average.” (They probably meant “per day.”)
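The two weight checks above come down to the same conversion: tons to pounds, then divide down to one person and one day. A small sketch, assuming 2,000 pounds to the ton and 300 million Americans:

```python
POUNDS_PER_TON = 2000
US_POPULATION = 300e6
DAYS_PER_YEAR = 365

# Junk mail: 2 million tons a day, shared among all Americans.
junk_mail_lbs_per_person_day = 2e6 * POUNDS_PER_TON / US_POPULATION

# Food: 60 tons in a lifetime is about a ton a year.
food_lbs_per_day = 1 * POUNDS_PER_TON / DAYS_PER_YEAR

print(f"junk mail: {junk_mail_lbs_per_person_day:.1f} lbs/person/day")
print(f"food:      {food_lbs_per_day:.1f} lbs/day")
```

Thirteen pounds of junk mail a day fails the triage; five or so pounds of food a day passes it.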

Brian suggests having some benchmarks you can use for quick assessments. E.g., 6.8B people in the world, 300M in USA. A gallon of water weighs about 8 lbs. MP3 music is about 1 MB/minute. Light goes about a foot in a nanosecond; sound travels about 1000 feet in a second. There are 2000 working hours in a year.

It’s good to learn to estimate, Brian says. E.g., there’s a supposed Google interview question: How many golf balls can fit in a school bus? Brian asks his students: How many petabytes could you fit in this room? He shows a hard drive from a laptop. Figure out how many in a cubic foot, and how many cubic feet in a room.

“Every year, 1,000 Americans turn 50 years old,” says Gambling Magazine. How do you triage on it? Little’s Law relates how many items enter or leave a process and how long it will take to process them. [Not sure I got that.] There are 300M Americans. Each lives to 75. (Simplifying, of course). The arrival rate is: 4M born each year, and 4M die each year. That means 4M reach any given milestone each year, which means about 10,000 reach any milestone every day. So, 350,000 Americans turn 50 each month, as Forbes noted, and 4M students graduate from HS each year, as the NYT says. This sort of cross-article consistency is a very good sign.
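Little's Law says the number of items in a system equals the arrival rate times the time each item spends in it. Turned around, the arrival rate is the population divided by the lifespan. A sketch using the talk's simplified numbers:

```python
# Little's Law: items_in_system = arrival_rate * time_in_system,
# so arrival_rate = items_in_system / time_in_system.
POPULATION = 300e6      # Americans "in the system" (from the talk)
LIFESPAN_YEARS = 75     # simplified time in the system (from the talk)

per_year = POPULATION / LIFESPAN_YEARS
per_month = per_year / 12
per_day = per_year / 365

print(f"reach any given age each year:  {per_year:,.0f}")
print(f"reach any given age each month: {per_month:,.0f}")
print(f"reach any given age each day:   {per_day:,.0f}")
```

The monthly figure lands close to the Forbes number and the yearly one matches the NYT's graduates, while Gambling Magazine's 1,000 a year is off by a factor of several thousand.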

Brian points to some errors to watch out for.

Errors of dimensionality: Young male bears roam 60-100 sq miles, while the females stay close to the cave, foraging within a 10-mile radius (says the Newark Star Ledger)…but that radius results in 314 sq miles.
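The bear example is a one-line check: a radius is a length, and the area it sweeps grows with its square.

```python
import math

radius_miles = 10
area_sq_miles = math.pi * radius_miles ** 2  # area of a circle: pi * r^2

print(f"{area_sq_miles:.0f} square miles")
```

So the "stay close to the cave" females would cover three to five times the males' 60-100 square miles.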

Oddly precise numbers: “When a yacht is over 328 feet, it’s so big that you lose the intimacy” (The Yacht Report) This came from a conversion of 100m.
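When a number looks oddly precise, converting it back to the other unit system often exposes a round original. A sketch, using the standard factor of about 3.28084 feet per meter (the function name is just for illustration):

```python
FEET_PER_METER = 3.28084


def meters_to_feet(meters):
    return meters * FEET_PER_METER


print(f"100 m = {meters_to_feet(100):.0f} ft")
```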

Brian strongly recommends Darrell Huff’s How to Lie with Statistics. Huff talks about (for example) graphs that clip the range of an axis to magnify the differences. And one-dimensional graphs: E.g., a Starbucks growth chart that increases the width of the vertical bars, magnifying the growth.

Brian gives us one that Huff did not talk about. US News ranks colleges. Princeton usually gets the #1 slot for doctoral schools. Colleagues of Brian’s at AT&T looked at the “Places Rated” book that rates the livability of cities. His colleagues showed that by tinkering with the weighting of the categories, they could move 50% of the cities into first place, and 25% of the cities could be ranked first or last. These sorts of things combine flaky ratings with arbitrary weights.
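To see how much a weighted ranking depends on the weights, here is a toy sketch. The three cities, their scores, and the categories are entirely made up for illustration; they are not from “Places Rated” or from the AT&T analysis.

```python
# HYPOTHETICAL scores (0-10) for three made-up cities.
scores = {
    "City A": {"climate": 9, "jobs": 4, "housing": 5},
    "City B": {"climate": 5, "jobs": 9, "housing": 4},
    "City C": {"climate": 4, "jobs": 5, "housing": 9},
}


def top_city(weights):
    """Return the city with the highest weighted sum of category scores."""
    def total(city):
        return sum(weights[cat] * val for cat, val in scores[city].items())
    return max(scores, key=total)


# Three equally defensible weightings, three different winners.
for weights in (
    {"climate": 3, "jobs": 1, "housing": 1},
    {"climate": 1, "jobs": 3, "housing": 1},
    {"climate": 1, "jobs": 1, "housing": 3},
):
    print(weights, "->", top_city(weights))
```

Nothing about the underlying scores changes between runs; only the arbitrary weights do.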

Finally, when someone gives you a number, you should ask why they’re providing it. E.g., an ad in the Times said “Four thousand teens will try their first cigarette today.” It’s plausible. But, two weeks before, an ad said that 5,000 teenagers try pot for the first time every day. It’s hard to decide on the plausibility. That number was sponsored by a single-issue advocacy group, which has an interest in making the number seem large. That should make us cautious. Or, Naomi Wolf (in The Beauty Myth) says that 150,000 American women die of anorexia every year. 2M women die a year, so that number seems way too high. (An audience member: 30,000 people die in car accidents every year. He knows people who have died in car accidents, but not of anorexia. Hence, the number is suspect.)

“The number of American children killed by guns has doubled every year since 1950” (Nancy Day, Violence in Schools.) (From Joel Best, Damned Lies and Statistics)
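Taking the doubling claim literally shows why it cannot be right. A sketch, assuming the most generous possible start of a single death in 1950 (the starting count is an assumption, not part of the quote):

```python
US_POPULATION = 300e6


def deaths_in(year, start_year=1950, start_count=1):
    """Annual deaths if the count doubles every year from start_year."""
    return start_count * 2 ** (year - start_year)


# Find the first year in which the annual count exceeds the whole US population.
year = 1950
while deaths_in(year) <= US_POPULATION:
    year += 1

print(f"{year}: {deaths_in(year):,} deaths")
```

Even starting from one, yearly doubling passes the entire US population within three decades.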


  • Recognise the enemy

  • Beware of the source

  • Learn some useful numbers, facts, shortcuts

  • Use your common sense and experience

Other sources: Charles Seife, Proofiness; John Allen Paulos, Innumeracy


Q: How about the use of graphics?
A: Sure.

Q: About 40 years ago, I lectured in Italy, when the Lira was 1/1000 of a US dollar. A bank offered to convert it for 20%. They gave me a million dollars. A few days later they figured it out.

Q: [tom stites] Newspaper reporters tend to be pretty innumerate; they like words. But that’s been getting better. But there are fewer copy editors. And, when I worked at the Times, we insisted on graphs marking the discontinuity in the scale, but they don’t do that any more.

People have been kvetching about this for decades, but I wonder if there’s any evidence that countries with higher math scores are less susceptible to fuzzy thinking?
A: I don’t have any data.

Q: I’ve been looking at the conversion to IPv6 where the number of addresses is many orders of magnitude higher. People worry that we’ll run out of them, but that underestimates the magnitude of the IPv6 addresses.
A: It’s about 79,000 trillion trillion times as many addresses.
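The comparison follows from the address widths: IPv4 addresses are 32 bits and IPv6 addresses are 128 bits, so the ratio of the two address spaces is 2^96. A quick check:

```python
IPV4_BITS = 32
IPV6_BITS = 128
TRILLION = 10 ** 12

# Ratio of the two address spaces: 2**128 / 2**32 == 2**96.
ratio = 2 ** IPV6_BITS // 2 ** IPV4_BITS

print(f"{ratio:,}")
print(f"about {ratio // TRILLION ** 2:,} trillion trillion times as many")
```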

Q: We’re getting more info, but it’s sloppier. Will people pay a premium for proper info? A magazine has instituted fact editors. Writers have to show two sources for word sources, and reputable sources for figures. It helped.
A: The economics might bring us there.

Q: It’s easy to lie with stats, it’s even easier to lie without them by using vague words.
A: Proofiness is good on this.

Q: [me] We ought to standardize on some new cliched measurements: from “length of a football field” or “books in the Lib of Congress” to “number of atoms in the universe”

Q: We need checkers!
A: A Mechanical Turk for checking.

Q: As the numbers get larger, is there more of a breakdown in whether people can understand them?
A: is an interesting site, but at the end of the day it probably doesn’t help your intuition that much.

Q: Powers of Ten is great.

Q: The log scale of earthquakes is misleading to almost everyone.
A: Yes, and decibel scales, too.

Q: The same issues are probably happening with policy makers. We need an advanced class on this. Who would be taking it?
A: People take my class to satisfy a quantitative reasoning requirement. There’s a class based on “Physics for Future Presidents.” It’d be great to have one on genomics, or psychology … lots of opportunities.

Q: JunkCharts is a good blog. It analyzes charts from around the Web.


February 7, 2011

Damn fine quicky lunch

Here’s a lunch I’m enjoying. I call it Fried Rice Omelet, because that’s what it is.

  • Take last night’s fried rice. (Surely you had fried rice last night!)

  • Spray a pan with some oil and heat it up.

  • Dump in enough rice to cover the pan. Heat it until it’s hot.

  • Cover the reheated fried rice with some egg beaters. (I suppose you could use real eggs if you scrambled them first.)

  • Cook for a minute or two. Before the eggs set, flip the rice over.

  • Serve with soy sauce, and Sriracha if you want a little heat.

Serves: It depends how much you make.
Calories: Yeah, I guess.


[2b2k][misc] Choose your ski resort authority

Great Ski Holidays lets you search for a place you want to go skiing using a faceted system, so you can specify tags such as alpine, beginner, nightlife, and spa. (For my ideal ski resort, the tags would be: free, low, and indoors.) It seems well done, but the thing I really like about it is that you can choose which authorities you want to use: ski review sites, ski resorts & club sites, trade sites & tour operators, and (coming soon) reader reviews.

The site started out as a demo of “Authority Driven Facet Tags” by an enterprise search agency called Metaphor Search. It went so well that they opened it up to the Web public, although it still shows some signs of its demo origins, including some typos, etc. It just adds to the charm.

One of their blog posts actually credits Everything Is Miscellaneous as one of the inspirations, which makes me happy. The post says part of the impetus for developing a faceted system with configurable authorities was experiencing the difficulty of coming up with a single, uncontested geographical classification for the Maldives: Asia? Indian Ocean? And it got worse when they tried to come up with a taxonomy of destination types. So, rather than try to figure out what each user’s unexpressed taxonomy is, they decided to let the user decide which authorities to trust and use those authorities’ ways of divvying up the world. Clever, and not unlike the multi-taxonomy approach taken by some species-of-the-world sites.


February 6, 2011

Berkman Buzz

The latest Berkman Buzz, as compiled by Rebekah Heacock:

  • Yashomati Ghosh [twitter:yashomatig] describes bringing ICT to India’s citizens link

  • Dan Gillmor [twitter:dangillmor] reviews essential mobile phone apps for journalists (and you) link

  • Stuart Shieber [twitter:pmphlt] discusses the costs of open access link

  • Wendy Seltzer [twitter:wseltzer] explores the legality of the US government’s recent domain name seizures link

  • Weekly Global Voices [twitter:globalvoices] : “Gabon: The Invisible Revolt”:


The Eras of Late Night Memory Test

I’m fairly good at associating the U.S. presidents of my lifetime with the decades in which they were in office. But, I find myself unhinged in time when it comes to the late night talkshow hosts. I am constantly surprised upon hearing, say, how long Leno has been on.

You too? Let’s find out. Here’s a quiz. (All answers authenticated by the experts at Wikipedia.)

Year Steve Allen started The Tonight Show.

Year Jack Paar took over.

Start and end years of Johnny Carson’s hosting of The Tonight Show.

When did Carson move the show from NY to Hollywood?

What year did the Tomorrow Show (which came on after the Tonight Show) start?

During what years did the Dick Cavett Show run on ABC as a late night show?

What year did Late Night with Letterman start?

Whom did Letterman replace? That is, who had been the host of the Tomorrow Show?

Who was host of The Tonight Show during most of the years that The Arsenio Hall Show was on?

Who was President during the year that Jay Leno first took over The Tonight Show?

When did Conan O’Brien take over Letterman’s Late Night?

What year did Jimmy Kimmel’s late night show begin?

Extra Credit

What road served as a bizarre euphemism for “penis,” expressing a ritualized fear of castration, on Carson’s Tonight Show?

What object did Ed Ames accidentally turn into a surrogate penis, resulting in the longest laugh in Tonight Show history?

Do we sense a disturbingly Freudian pattern here?

Who played the non-endearing but frequent guest on the Tonight Show who went by the name “Aunt Blabby”?

What game show host had a late night talk show on a major network for a season?

What did Merv Griffin create that is probably known by the most people?

Name the funniest sidekick on any late night talk show.

Have you ever seen a complete episode of Jimmy Kimmel’s late night show?


February 4, 2011

Gladwell proves too much

Malcolm Gladwell is going further out on his cranky branch. His reading of the role of social media in Tunisia and Egypt actually seems to lead to conclusions that I think he would acknowledge are extreme and extremely unlikely. (I look at his new post in some detail after the big box below.)

Gladwell is in the unfortunate position of having published a New Yorker article dismissive of the effect of social media on social protest movements just weeks before the Tunisian and Egyptian revolts. Now Gladwell has posted a 200-word commentary that maintains his position without emendation. (Mathew Ingram has an excellent response to Gladwell’s latest post.)

I was among the many who replied to Gladwell’s initial article. I began that piece by trying to outline Gladwell’s argument, in a neutral and fair way. This is what I came up with:

In 1960, four college students staged a sit-in in NC. Within a week, sit-ins had started to spread like “a fever.”

Gladwell now states the claim he is going to debunk: “The world, we are told, is in the midst of a revolution. The new tools of social media have reinvented social activism.” He then points to world events that have been claimed to support that view.

But, (he continues) those events were not really brought about by social media. Why would we think they were? It’s not due just to over-enthusiasm for social media. Fifty years after the civil rights movement, “we seem to have forgotten what activism is.” It is really our understanding of activism that is at issue.

Now, back to the sit-ins. They were dangerous. Civil rights activism took courage. That courage required strong ties to other activists. This was true not just of the civil rights movement in the US, but is a general characteristic of activism.

But, “The kind of activism associated with social media isn’t like this at all.” Social media (Twitter, Facebook) are all about weak ties. Weak ties are “in many ways a wonderful thing…But weak ties seldom lead to high-risk activism.” Social media activism works when little is asked of people.

Activism requires not just strong ties, but also strong, centralized, hierarchical organization. Not networks. You need a hierarchy “if you’re taking on a powerful and organized establishment…”

As an example, Gladwell ridicules the opening story in Clay Shirky’s Here Comes Everybody, about how “the crowd” got a smart phone returned to its rightful owner. “A networked, weak-tie world is good at things like helping Wall Streeters get phones back from teen-age girls.”

Now apply that to Tunisia and Egypt. You would think that these were pretty dramatic counter-examples. Gladwell does not think so. In fact, his recent post reads as if he’s exasperated that anyone is still bothering to disagree with him:

But surely the least interesting fact about them is that some of the protesters may (or may not) have at one point or another employed some of the tools of the new media to communicate with one another. Please. People protested and brought down governments before Facebook was invented.

Even the fact the post is only 200 words long gives the impression that the two Mideast upheavals are barely worth his time.

Let’s look at each of the post’s two paragraphs.

Paragraph #1. This is a paragraph of ridicule: Paying attention to social media is like hearing a famous revolutionary statement from Mao Zedong, paying scant attention to its content and import, and instead getting all excited because of the medium he used.

Yes, it is possible to pay too much attention to the medium as opposed to the message. But, as with so many arguments by ridicule, this one doesn’t advance our thought at all. We can counter by trying to make the analogy more exact: If in 1935 Mao had said “Power springs from the barrel of a gun,” and it had spread through, say, a new-fangled telephone tree so that it reached beyond the boundaries of government-controlled radio, and if that statement had signaled a turn to violent uprising, it would be irresponsible to ignore the role of the medium in the dissemination of the message. Or, if government printers had in the 1960s refused to publish the Little Red Book that spread that quote, the lack of a medium for it would surely be worth discussing. Media play an important role. When the medium is new, it is right to examine that role. That is not to say that the medium is a sufficient cause, or is the only thing worth discussing. But who has attributed the Tunisian and Egyptian uprisings solely to the existence of social media?

Gladwell’s argument in this first paragraph therefore seems to me to be: (1) Ultimately an argument against media having any role or significance in political movements; (2) An argument against a strawman; (3) Less an argument at all than a “Hey you kids, get off my lawn” statement of alignment.

Paragraph #2. Gladwell reiterates his point that political activism requires strong ties, and social media only provides weak ties. He defends these contentions by using the word “surely,” which almost always indicates that the speaker has no evidence to present that could in fact make us sure: “But surely the least interesting fact about them is that some of the protesters may (or may not) have at one point or another employed some of the tools of the new media to communicate with one another.”

It is not at all obvious that this is the least interesting fact. Social media are a new variable. Because history is so damn particular, contingent, and emergent, we can never be entirely sure which new variables matter. The anti-Mubarak demonstrations have been (apparently) heavily supported by Egypt’s trade unions, for example; perhaps that’s worth exploring. Declaring the possible role of social media the “least interesting fact” seems based either on an a priori belief that (a) media never have an important role in social movements, or (b) our new social media can have no role because of Gladwell’s theory that they can’t supply the strong ties necessary for activism. The first alternative seems too silly to defend. If it’s the second, then I would have thought a reasonable response from Gladwell would have been along these lines: “I’ve put forward a bold hypothesis about the ineffectiveness of social media. That hypothesis is based primarily on some historical examples. We have some new examples before us. Let us examine them to see if they indeed support my hypothesis — especially since so many have claimed that this new evidence refutes that hypothesis.” Instead we get all the power a confidently rendered “surely” can bring.

But the second paragraph is not over. Gladwell now gives examples of historical revolutions that succeeded before the development of the Net. The conclusion warranted from this evidence is that no particular medium is necessary for a revolution: We know you can have a revolution without, say, telephones because we’ve had many such revolutions. But this is a really bad way to argue about historical explanations. Many wars have ended without any atomic bombs being used, so we might as well say that historians ought not to consider the effect dropping a-bombs had on ending WWII. No, if we want to understand an event, we have to understand it within its history. The events in Tunisia and Egypt are occurring within a history in which social media are being used for among the first times. That makes the question of the role of social media interesting, and, under most theories of history — ones in which the nature of the contemporary media plays a contributing part — important.

Gladwell’s second paragraph therefore “proves” too much. But he backs off the obvious silliness of where his arguments lead by concluding: “People with a grievance will always find ways to communicate with each other. How they choose to do it is less interesting, in the end, than why they were driven to do it in the first place.” He thus proposes a sort of historical determinism: No matter what the means of communication, those who want a revolution will have a revolution. But: (1) How do we know this is true? (2) The means of communication may well affect (a) when it happens, (b) how it happens, (c) who participates, (d) its success, (e) how the world reacts, and (f) how the participants view themselves as a social group. That last point I acknowledge is the squishiest of them, but it may have the most lasting effect, helping to shape the governmental structure that emerges post-revolution: “We are a mob inspired by the incredible leaders who have the megaphones” might tend toward a different kind of governance than “We are a connected, empowered network.” In any case, it seems to me that investigating the role of social media is not an activity beneath contempt.

And that’s why I’ve written a post ten times longer than the one it’s discussing. Gladwell — with his amazing ability to illuminate difficult matters — is not merely splashing cold water on an overheated subject, but is trying to drown the subject entirely. Because we don’t yet understand the effect social media are having on social movements, it is unhelpful to have such a powerful voice ridiculing the effort to trace their effects. Gladwell’s attempt to undo unwarranted enthusiasm comes across instead as an argument for diminished nuance. That is exactly what Gladwell is decrying in our discourse, and is not what his body of writing has exemplified.

So, I come out of his brief post wondering how Gladwell would answer the following questions:

1. Does Gladwell believe that the means of communication never has any effect on any social protest movement? (“…in the French Revolution the crowd in the streets spoke to one another with that strange, today largely unknown instrument known as the human voice.”)

2. If he believes that the means of communication can have some effect, then does he believe that some media that do not create strong ties — radio, newspapers, tv, etc. — are worth considering when trying to understand social protest movements? If so, then why are networked social media not worth considering?

3. If social media are worth considering as playing some role in social protests, exactly what role and how important? A role so trivial that it is literally the least interesting factor historians and analysts should be looking at? Or is it of more importance than that, but just not anywhere near worth the amount of attention it’s been getting?

4. On what does he base these views? A theory about how social protest movements have worked and must work? Does he hold this theory as so obviously true that all events must now be interpreted within it, or is he willing to examine events to see if they support or contradict his theory?


February 3, 2011

Least likely Mythbusters result ever


February 2, 2011

Open access welcomes Nature

A couple of weeks ago, when Nature magazine announced it was starting a peer-reviewed open access journal, PLoS One (a peer-reviewed open access journal) welcomed them the way Apple welcomed IBM into the personal computing market:

On January 6, 2011, Nature announced a new Open Access (OA) publication called Scientific Reports. Nature’s news underscores the growing acceptance of OA, as reflected in recent OA journal launches from other traditional publishers such as the BMJ, Sage, AIP (American Institute of Physics) and APS (American Physical Society). Please spread the word either via this blog post or download this PDF.

Inspired by Apple

The Nature entry into the open access field is a big deal. So is Nature’s support of Creative Commons. I’ve had a chance to spend some little time with folks at Nature, and know them to be passionate about making the work of science more accessible. So, this is good news all around.


February 1, 2011

What crowdsourcing looks like

Watch volunteers jump into and around the Google spreadsheet that’s coordinating the transcribing and translating of Egyptian voice-to-tweet msgs. Not exactly a Jerry Bruckheimer video, but the awesomeness of what we’re seeing crept up on me. (Check the link to the hi-rez version after you’ve read the TheNextWeb post; otherwise you can’t really see what’s going on.)


