Joho the Blog » [berkman] Brian Kernighan on numeracy

[berkman] Brian Kernighan on numeracy

Brian Kernighan is giving a Berkman lunchtime talk called “Millions, Billions, Zillions: Why (In)numeracy matters.” Brian teaches at Princeton, but is at the Berkman Center this year, writing a book based on his undergrad course on what we need to know about computers. (Yes, Brian is that Brian K.) Brian teaches a course at Princeton on quantitative reasoning. He’s going to give us the “numeric self defense” portion of the course. [He assures us that none of us in this room need it, but speaking for myself I'm pretty sure he's wrong.]

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He gives as an example of quantitative reasoning a question about how long our oil reserves of 660 billion barrels (according to Newsweek) will last, based on how many vehicles there are and how big a barrel of oil is. He proposes a rough estimate. Figure there’s one car per person in the US: about 300M vehicles. And figure a barrel is in the range of a 55 gal drum. Call it 50 gals. He simplifies the numbers and does the math, and figures that we use 3 billion barrels/year. The Reserve contains 660 billion barrels, which means it will last over 200 years. Even if the numbers are off by a factor of 2 or 3, there seems to be enough to tell OPEC to take a hike. But, it turns out that Newsweek later corrected itself: The Reserve holds 660 million barrels, not billions. Oops. He then presents a scary list of major media being wrong about millions vs. billions vs. trillions. The problem is that those words signify to most people just an order of vastness, not actual quantities.
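The post doesn't record exactly how Brian simplified the numbers, so here is a minimal sketch of one way to land on roughly 3 billion barrels a year. The vehicle count and barrel size come from the talk; the 10 gallons per vehicle per week and the 50-week year are assumptions made here for easy arithmetic.

```python
# Back-of-the-envelope estimate. Inputs marked "assumed" are not from the talk.
VEHICLES = 300e6            # one vehicle per person in the US (from the talk)
GALLONS_PER_BARREL = 50     # a 55-gallon drum, rounded down (from the talk)
GALLONS_PER_WEEK = 10       # assumed: fuel burned per vehicle per week
WEEKS_PER_YEAR = 50         # assumed: 52 rounded down for easy arithmetic


def barrels_per_year():
    """Barrels of oil burned by all vehicles in one year."""
    gallons = VEHICLES * GALLONS_PER_WEEK * WEEKS_PER_YEAR
    return gallons / GALLONS_PER_BARREL


def years_of_supply(reserve_barrels):
    """How long a reserve lasts at the estimated burn rate."""
    return reserve_barrels / barrels_per_year()


print(f"usage: {barrels_per_year():.1e} barrels/year")
print(f"660 billion barrels lasts {years_of_supply(660e9):.0f} years")
print(f"660 million barrels lasts {years_of_supply(660e6) * 365:.0f} days")
```

Under these assumptions the billions-vs-millions slip is the difference between a couple of centuries and a couple of months.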

Cutting numbers down to size. The annual US budget deficit is $1.3B, according to the NYT, which comes down to $4/person. In fact, it’s $1.3 trillion, or $4,000/person. It gets worse when you’re dealing with computer terms: “a petaflop is a thousand trillion instructions per second, not a million trillion” (NYT, 3/25/09). He gives another example of the WSJ getting a zettabyte calculation wrong by a couple of orders of magnitude. (Zettabyte = 10^21 bytes.) Had they been able to do some rough calculations, they would have seen that their comparison to the number of books in the Library of Congress would have resulted in there being 10,000 books in that collection.

Brian suggests numeric triage: “Is the number likely to be much too high, much too low, or plausible?” E.g., Dear Abby said that “Americans receive almost 2 million tons of junk mail daily.” 2M tons = 4B pounds / 300 million Americans = 13lbs/person/day. “Each person eats about 60 tons of food in a lifetime.” Yup: a ton a year is 5lbs/day. It’s plausible. Or, from another newspaper: You can save $88/day by turning off your computer when you’re not using it. Or, the London Times reported on a NASA jet that travels 850 miles in 10 seconds. Newark Star Ledger: “The Passaic River was traveling about 200 miles per hour, about five times faster than average.” (They probably meant “per day.”)
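The triage arithmetic is simple enough to script. This is only a sketch: the function names are made up for illustration, and the 60-year eating lifetime is an assumption implied by "a ton a year."

```python
POUNDS_PER_TON = 2000
US_POPULATION = 300e6       # benchmark figure used in the talk


def junk_mail_lbs_per_person_per_day(tons_per_day):
    """Spread a national daily tonnage across every American."""
    return tons_per_day * POUNDS_PER_TON / US_POPULATION


def food_lbs_per_day(tons_per_lifetime, lifetime_years):
    """Convert a lifetime food total into pounds per day."""
    return tons_per_lifetime * POUNDS_PER_TON / (lifetime_years * 365)


def speed_mph(miles, seconds):
    """Convert a distance covered in some seconds to miles per hour."""
    return miles / seconds * 3600


print(f"junk mail: {junk_mail_lbs_per_person_per_day(2e6):.1f} lbs/person/day")
print(f"food: {food_lbs_per_day(60, 60):.1f} lbs/day")  # assumed 60-year span
print(f"jet: {speed_mph(850, 10):,.0f} mph")
```

Thirteen pounds of junk mail a day is far too high, five or so pounds of food a day is plausible, and a jet doing hundreds of thousands of miles per hour is not.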

Brian suggests having some benchmarks you can use for quick assessments. E.g., 6.8B people in the world, 300M in USA. A gallon of water weighs about 8lbs. MP3 music is about 1 MB/minute. Light goes about a foot in a nanosecond; sound travels about 1000 feet in a second. There are 2000 working hours in a year.

It’s good to learn to estimate, Brian says. E.g., there’s a supposed Google interview question: How many golf balls can fit in a school bus? Brian asks his students: How many petabytes could you fit in this room? He shows a hard drive from a laptop. Figure out how many in a cubic foot, and how many cubic feet in a room.

“Every year, 1,000 Americans turn 50 years old,” says Gambling Magazine. How do you triage on it? Little’s Law relates how many items enter or leave a process and how long it will take to process them. [Not sure I got that.] There are 300M Americans. Each lives to 75. (Simplifying, of course). The arrival rate is: 4M born each year, and 4M die each year. That means 4M reach any given milestone each year, which means about 10,000 reach any milestone every day. So, 350,000 Americans turn 50 each month, as Forbes noted, and 4M students graduate from HS each year, as the NYT says. This sort of cross-article consistency is a very good sign.
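For anyone else unsure about Little's Law: it is usually written L = λW, where L is the number of items in the system, λ is the arrival rate, and W is the time each item spends in the system. A minimal sketch using the talk's simplified numbers (the function name is invented here):

```python
def arrival_rate(items_in_system, time_in_system):
    """Little's Law, L = lambda * W, solved for the arrival rate lambda."""
    return items_in_system / time_in_system


POPULATION = 300e6      # L: Americans "in the system"
LIFESPAN_YEARS = 75     # W: simplified time each person spends in it

per_year = arrival_rate(POPULATION, LIFESPAN_YEARS)
per_month = per_year / 12
per_day = per_year / 365

print(f"reach any given milestone per year:  {per_year:,.0f}")
print(f"reach any given milestone per month: {per_month:,.0f}")
print(f"reach any given milestone per day:   {per_day:,.0f}")
```

The monthly figure is in the same range as the Forbes number, and the daily figure is about ten thousand, so "1,000 a year" is off by orders of magnitude.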

Brian points to some errors to watch out for.

Errors of dimensionality: Young male bears roam 60-100 sq miles, while the females stay close to the cave, foraging within a 10-mile radius (says the Newark Star Ledger)…but that radius results in 314 sq miles.
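The check here is just the area of a circle. A quick sketch:

```python
import math


def circle_area(radius):
    """Area of a circle: pi * r^2."""
    return math.pi * radius ** 2


female_range = circle_area(10)   # 10-mile foraging radius
print(f"female range: {female_range:.0f} sq miles")
print(f"vs. male range of 60-100 sq miles: "
      f"{female_range / 100:.1f}x to {female_range / 60:.1f}x larger")
```

So the "stay close to the cave" females would be covering several times the area the males roam.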

Oddly precise numbers: “When a yacht is over 328 feet, it’s so big that you lose the intimacy” (The Yacht Report) This came from a conversion of 100m.

Brian strongly recommends Darrell Huff’s How to Lie with Statistics. Huff talks about (for example) graphs that truncate an axis to magnify the differences. And graphs of one-dimensional quantities: E.g., a Starbucks growth chart that increases the width of the vertical bars as well as their height, magnifying the growth.

Brian gives us one that Huff did not talk about. US News ranks colleges. Princeton usually gets the #1 slot for doctoral schools. Colleagues of Brian’s at AT&T looked at the “Places Rated” book that rates the livability of cities. His colleagues showed that by tinkering with the weighting of the categories, they could move 50% of the cities into first place, and 25% of the cities could be ranked first or last. These sorts of things combine flaky ratings with arbitrary weights.
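To see how much the weights matter, here is a toy sketch. The cities, categories, and scores are entirely invented for illustration; they are not from "Places Rated" or from the AT&T analysis.

```python
# Hypothetical scores (0-100), made up for this example only.
SCORES = {
    "City A": {"climate": 90, "housing": 40, "jobs": 60},
    "City B": {"climate": 50, "housing": 90, "jobs": 55},
    "City C": {"climate": 60, "housing": 60, "jobs": 90},
}


def rank(weights):
    """Return city names ordered best-first by weighted total score."""
    def total(city):
        return sum(weights[cat] * score for cat, score in SCORES[city].items())
    return sorted(SCORES, key=total, reverse=True)


# Three equally defensible-looking weightings, three different winners.
WEIGHTINGS = [
    {"climate": 0.6, "housing": 0.2, "jobs": 0.2},
    {"climate": 0.2, "housing": 0.6, "jobs": 0.2},
    {"climate": 0.2, "housing": 0.2, "jobs": 0.6},
]

for weights in WEIGHTINGS:
    print(weights, "->", rank(weights))
```

Each weighting puts a different city on top, even though the underlying scores never change.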

Finally, when someone gives you a number, you should ask why they’re providing it. E.g., an ad in the Times said “Four thousand teens will try their first cigarette today.” It’s plausible. But, two weeks before, an ad said that 5,000 teenagers try pot for the first time every day. It’s hard to decide on the plausibility. That number was sponsored by a single-issue advocacy group, which has an interest in making the number seem large. That should make us cautious. Or, Naomi Wolf (in The Beauty Myth) says that 150,000 American women die of anorexia every year. 2M women die a year, so that number seems way too high. (An audience member: 30,000 people die in car accidents every year. He knows people who have died in car accidents, but not of anorexia. Hence, the number is suspect.)

“The number of American children killed by guns has doubled every year since 1950” (Nancy Day, Violence in Schools.) (From Joel Best, Damned Lies and Statistics)
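A quick way to see why that claim can't be right is to run the doubling forward. This sketch assumes, as generously as possible, that exactly one child was killed in 1950; the starting count and the years chosen are assumptions for illustration.

```python
def deaths_if_doubling(year, start_year=1950, start_count=1):
    """Annual count if it really doubled every year since start_year."""
    return start_count * 2 ** (year - start_year)


for year in (1960, 1970, 1983, 1995):
    print(f"{year}: {deaths_if_doubling(year):,}")
```

By 1983 the annual figure would already be over 8 billion, more than the population of the planet, and by 1995 it would be in the tens of trillions.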

Defenses:

  • Recognise the enemy

  • Beware of the source

  • Learn some useful numbers, facts, shortcuts

  • Use your common sense and experience

Other sources: Charles Seife, Proofiness; John Allen Paulos, Innumeracy

Sites:
innumeracy.com
megapenny.com
math.temple.edu/~paulos/

Q: How about the use of graphics?
A: Sure.

Q: About 40 years ago, I lectured in Italy, when the lira was 1/1000 of a US dollar. A bank offered to convert it for 20%. They gave me a million dollars. A few days later they figured it out.

Q: [tom stites] Newspaper reporters tend to be pretty innumerate; they like words. But that’s been getting better. But there are fewer copy editors. And, when I worked at the Times, we insisted on graphs marking the discontinuity in the scale, but they don’t do that any more.

People have been kvetching about this for decades, but I wonder if there’s any evidence that countries with higher math scores are less susceptible to fuzzy thinking?
A: I don’t have any data.

Q: I’ve been looking at the conversion to IPv6 where the number of addresses is many orders of magnitude higher. People worry that we’ll run out of them, but that underestimates the magnitude of the IPv6 addresses.
A: It’s about 79,000 trillion trillion times as many addresses.
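The 79,000 trillion trillion figure checks out as the ratio of the two address spaces: 128-bit IPv6 addresses versus 32-bit IPv4 addresses. A minimal sketch (the names are just for illustration):

```python
IPV4_BITS = 32
IPV6_BITS = 128
TRILLION = 10 ** 12


def address_ratio():
    """How many times larger the IPv6 address space is than IPv4's."""
    return 2 ** IPV6_BITS // 2 ** IPV4_BITS   # exactly 2**96


ratio = address_ratio()
print(f"ratio: {ratio:,}")
print(f"in trillion trillions: {ratio // (TRILLION * TRILLION):,}")
```

Python integers are arbitrary precision, so the ratio is exact rather than a floating-point approximation.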

Q: We’re getting more info, but it’s sloppier. Will people pay a premium for proper info? A magazine has instituted fact editors. Writers have to show two sources for word sources, and reputable sources for figures. It helped.
A: The economics might bring us there.

Q: It’s easy to lie with stats, it’s even easier to lie without them by using vague words.
A: Proofiness is good on this.

Q: [me] We ought to standardize on some new cliched measurements: from “length of a football field” or “books in the Lib of Congress” to “number of atoms in the universe”

Q: We need checkers!
A: A Mechanical Turk for checking.

Q: As the numbers get larger, is there more of a breakdown in whether people can understand them?
A: Megapenny.com is an interesting site, but at the end of the day it probably doesn’t help your intuition that much.

Q: Powers of Ten is great.

Q: The log scale of earthquakes is misleading to almost everyone.
A: Yes, and decibel scales, too.
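For a sense of why log scales mislead, here is a rough sketch using the standard rules of thumb: each whole step in earthquake magnitude is 10x the measured amplitude and about 31.6x (10^1.5) the energy, and every 10 decibels is 10x the power. The function names are invented here.

```python
def quake_amplitude_ratio(smaller, larger):
    """Ratio of measured wave amplitudes between two magnitudes."""
    return 10 ** (larger - smaller)


def quake_energy_ratio(smaller, larger):
    """Approximate ratio of energy released between two magnitudes."""
    return 10 ** (1.5 * (larger - smaller))


def decibel_power_ratio(db_difference):
    """Ratio of sound power for a given difference in decibels."""
    return 10 ** (db_difference / 10)


print(f"magnitude 5 vs 7, amplitude: {quake_amplitude_ratio(5, 7):,.0f}x")
print(f"magnitude 5 vs 7, energy:    {quake_energy_ratio(5, 7):,.0f}x")
print(f"30 dB louder, power:         {decibel_power_ratio(30):,.0f}x")
```

A magnitude 7 quake sounds only a little bigger than a 5, but it releases roughly a thousand times the energy.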

Q: The same issues are probably happening with policy makers. We need an advanced class on this. Who would be taking it?
A: People take my class to satisfy a quantitative reasoning requirement. There’s a class based on “Physics for Future Presidents.” It’d be great to have one on genomics, or psychology … lots of opportunities.

Q: JunkCharts is a good blog. It analyzes charts from around the Web.

7 Responses to “[berkman] Brian Kernighan on numeracy”

  2. The U.S. has approved Genetically Modified Sugar Beets.

  4. 6b people in the world, 1b in China.

    Ok, so there are certainly a ton of bad stats out there in popular journalism. Instead of highlighting the bad (easy to do), how about highlighting the consistently good? That’s a far more useful thing.

  5. “In-net-eracy” (see also “numero-in-net-eracy,” or numeracy-induced in-net-eracy), n.: the tendency of some otherwise highly net-savvy people, many of whom are also highly numerate, to assume that the operational requirements and constraints that will distinguish the designated successor (128-bit) IPv6 address pool from the original, now fully-allocated (32-bit) IPv4 address pool are transparently given by simple arithmetic reasoning.

    This defect in reasoning is one of several overt indicators of a general ignorance of the fundamental, mutually constitutive relationship that exists between individual network addressing formats and their associated routing protocols. One common variant of this error today stems from an assumption of uniform symmetry between IPv4 and IPv6 at all scales, i.e., the belief that IPv6 routing protocols and routing service providers will support the use of IPv6 addresses to connect devices to the network at the same level(s) of granularity that characterized IPv4-based routing.

    This belief is both false and deeply misleading, because the IPv6 address assignment hierarchy is effectively displaced “up” 1-2 levels relative to IPv4. Thus, with IPv6 every interface on a network device will be assigned appx. the same number of unique IPv6 addresses that the entire device would have been assigned under IPv4, and every single IPv6-attached device is likely to be assigned a quantity of unique addresses that might have been associated with a roomful of IPv4-connected devices. An intentional design feature of IPv6, this scaling displacement was intended to provide future network users with a globally unique (or end-to-end) addressing alternative to the now-widespread practice of using non-unique, “private” address ranges together with network address translation (NAT) technologies to multiplex individual unique IPv4 addresses.

    The IPv6 address pool is still huge for all *current* purposes, but not unimaginably so — especially if one goal is to give the future (e.g., generations, service providers, technologies, networking requirements…) the full measure of consideration that it deserves. Given the above considerations, a reasonable scaling assumption would be that IPv6 should be capable of accommodating appx. the same cumulative total number of “ISPs” as there are unique addresses in the IPv4 address pool (i.e., 4.2 billion), +/- 1-2 bits. True, given current demand rates (which have been shaped in part by conservation and uniqueness-preservation rules that were ridiculed as unnecessary by at least one vocal attendee), IPv6 could last for centuries, or even millennia. But isn’t that what we should want — if not absolutely demand? Considering that in just 2+ decades IPv4 “lock-in” has become so universal and absolute that the fate of IPv6 itself is now (still) quite uncertain, how hard is it likely to be to transition to whatever-comes-next after twice that long (or maybe 10x, 100x, 1000x…) under IPv6?

    IMO the goal should be “IPv6 until the Singularity (at least).”

  6. Hah! Thanks, Andrew S. I’ve fixed it in the text, and left my error (which was in this case more a result of fast typing than actual idiocy). Of course Brian K had presented the right number and label.
