Joho the Blog: defrag Archives

November 19, 2010

[2b2k] Curation = Relevancy ranking

I was talking with Sophia Liu at Defrag, and disagreed with her a bit about one thing she said as she was describing the dissertation she’s working on. I said that relevancy ranking and curation are different things. But then I thought about it for a moment and realized that given what my book (“Too Big to Know”) says, and what I had said that very morning in my talk at Defrag, they are not different at all, and Sophia was right.

Traditional filters filter out. You don’t see what fails to pass through them: You don’t see the books the library decided not to buy or the articles the newspaper decided not to publish. Filtering on the Web is different: When I blog my list of top ten movie reviews, all the other movie reviews are still available to you. Web filters filter forward, not out. Thus, curation on the Web consists of filtering forward, which is indistinguishable from relevancy ranking (although the relevancy is to my own sense of what’s important).
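
Here is a minimal sketch of that difference in Python. The function names and the sample list of reviews are invented for illustration; nothing here comes from the book or the talk. Filtering out drops whatever was not picked, while filtering forward only reorders, so the full list survives.

```python
def filter_out(items, picks):
    """Traditional filter: anything that was not picked disappears."""
    chosen = set(picks)
    return [item for item in items if item in chosen]


def filter_forward(items, picks):
    """Web-style filter: picks move to the front, the rest stay available."""
    chosen = set(picks)
    front = [item for item in picks if item in items]
    rest = [item for item in items if item not in chosen]
    return front + rest


# Hypothetical data: four reviews, two of which I recommend.
reviews = ["review-a", "review-b", "review-c", "review-d"]
my_top = ["review-c", "review-a"]

print(filter_out(reviews, my_top))
print(filter_forward(reviews, my_top))
```

The second function is just a ranking: its output has the same members as its input, in an order that reflects one person's sense of what matters.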

November 18, 2010

[defrag] JP Rangaswami

JP Rangaswami begins by talking about watching Short Circuit in 1986. Robots only have information and energy as inputs. What if we thought about humans as having the same inputs, JP wonders.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Think about cooking as the predigesting of food — making it easier for food to be digested. Cooks prepare food in external stomachs. Our brains evolved because we discovered how to cook. Can we look at information that way?

We talk about info overload, but not food overload. Having too much food isn’t a problem so long as we make sure that people have access to the excess. As JP thought through the further analogies between info and food, he realized there were three schools of how to prepare food. 1. The extraction school divides food, extracts its parts, and serves them separately. 2. Another ferments food. You put foods together, and something new occurs. 3. Raw food is like the Maker generation of information: I want to fiddle with it myself, and I need to know that it came without additives.

We can think about what we do with information using these three distinctions. Some of us will work with the raw data. Some of us will prefer that others do that for us. Information should learn from food that it needs a sell-by date. E.g., look at how the media use Twitter. Twitter is a different type of food — more like raw — than you get through the institutional delivery methods.

Should we have an information diet? Would watching a single news outlet be the intellectual equivalent of the Morgan Spurlock “Super Size Me” movie? Maybe information overload is a consumption problem. We need to learn what is good for us, what is poison, what will make us unhealthy…

[defrag] Jud Valeski

Jud Valeski of Gnip is talking about the rise of APIs. There’s been a doubling of publicly available APIs in the past few years, Jud says. He shows Wired’s “The Web is dead” chart that shows the proportion of bits moving through the tubes. But API usage shows the Web is not dead.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He divides APIs into two buckets. Functional APIs make periodic calls for bits of information, and are heavily cacheable because there’s static data on the other side. Most of the problems are solved: REST has won, along with Curl.
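
To make "heavily cacheable" concrete, here is a minimal Python sketch of a time-to-live cache sitting in front of a periodic call. It is not anything Jud showed: the `TTLCache` class, the `fake_api` stand-in, and the 60-second lifetime are all assumptions for illustration.

```python
import time


class TTLCache:
    """Cache the result of a slow call for a fixed number of seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # function that performs the real call
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so it can be tested
        self._entries = {}           # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: no call is made
        value = self._fetch(key)     # missing or stale: make the call
        self._entries[key] = (now + self._ttl, value)
        return value


calls = []


def fake_api(key):
    # Hypothetical stand-in for an HTTP request to a functional API.
    calls.append(key)
    return {"id": key, "name": "example"}


cache = TTLCache(fake_api, ttl_seconds=60)
cache.get("user/1")
cache.get("user/1")
print(len(calls))  # the second get is served from the cache
```

Because the data on the other side is mostly static, a short lifetime like this is usually enough to absorb repeated calls.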

Volume APIs are a different matter. Call frequency and throughput are high, “and things get wonky.” The call characteristics change. Local programming challenges fall out into the network and cause problems, e.g. queuing.
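
A small Python sketch of the queuing problem, under my own assumptions (the buffer size of 3 and the burst of 5 messages are made up): when messages arrive faster than they are consumed, a bounded buffer fills up and something has to give. Here the overflow is simply dropped and counted.

```python
import queue


def enqueue_burst(buffer, messages):
    """Try to enqueue every message; return how many were dropped."""
    dropped = 0
    for message in messages:
        try:
            buffer.put_nowait(message)   # raises queue.Full when at capacity
        except queue.Full:
            dropped += 1
    return dropped


# Hypothetical numbers: room for 3 messages, a burst of 5 arrives.
buffer = queue.Queue(maxsize=3)
dropped = enqueue_burst(buffer, range(5))
print(buffer.qsize(), dropped)
```

Dropping is only one policy. Blocking the producer or growing the buffer are the usual alternatives, and each pushes the problem somewhere else.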

We don’t yet know how to deal with SLAs (service level agreements). Open network topology APIs don’t have clear SLAs.

He minimizes the shock by leveraging best practices, finding commonality in the frameworks (mashery.com, apigee.com, gnip.com), and building APIs that set the standard.

November 17, 2010

[defrag] Scott Porad on how we filter 0,000 user submissions per day

Scott Porad is from the Cheezburger Network, a network of humor and entertainment Web sites, including I Can Has Cheezburger, Memebase, The Daily What, and Failblog.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

It’s mainly user-moderated. As an example, Scott takes us through the steps for the Cheezburger site.

First, the home tab where you can submit content. The LOL builder makes it easy for users to add captions to images. They get 300,000-500,000 submissions to their network every month, but they only publish 1-2 percent. How do they cull? There’s no secret sauce, no magic algorithms. It’s a four-step human process.

Step 1: All submissions are screened by an editor, looking for image quality (not taken on a cellphone at night, etc.), appropriateness (no nudity, violence, racism), germaneness (a dog photo submitted to the cat site?), and keeping photos of humans out. Most of what gets submitted is junk, and gets screened out.

Step 2: Using the second tab, users vote or add a submission to their favorites. They also look at which content has been shared on social networks.

Step 3: User screening for offensiveness and copyright violations.

Step 4: Editorial curation.

They tried outsourcing it, but too much of it is specific to our culture, and it requires too much editorial judgment.

Scott shows us the favorite photos in his own account profile. [Some very funny ones.]

[defrag] Esther Dyson on personal health data

Esther Dyson is giving a talk at Defrag. It’s called “On exploration … of yourself.”

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing A LOT of artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

“Everyone wants to know themselves, but some people are afraid of their genome.” She tells such people that the question is not what you’re going to die of, but what you’re going to live with. She wants to show us some cool interfaces that make data about yourself more interesting.

23andme.com (Esther discloses that she’s on the board) shows you your disease risks [based on your genome?]. It presents some friendly screens and lets you drill down. You can compare your genome to your relatives’. Esther says she found a lump in her breast this summer. It was benign, but before she found out, she reassessed her odds, which led her to think that the risk of going into space had dropped in comparison to the cancer risk. We need numeracy, she says.

Keas.com also produces a friendly health profile, she says.

But what counts is motivation, she says. It’d be helpful if we could increase the status of health markers, e.g., that you run 20 miles a week, etc. How do you design systems, services and tools where your healthy behavior connotes status?

She points to one not very effective attempt: TripIt.com shows her status in various frequent flyer programs, but ought to show her good behaviors (exercise, flossing, etc.).

She suggests that someone here create the game Bodyville.com.

There are three health markets now. The first is the health care market (doctors, hospitals, insurance, etc.). The second is chocolate muffins and indolence. The third is the market for health, which hasn’t been much of a market.

Q: How about privacy?
A: With universal healthcare, the data have less of an impact. The data can still affect employability, etc. Privacy remains an issue, although your financial data is much more interesting to thieves. We’ve managed to deal with financial data pretty well. If you’re worried about your health data’s privacy, then don’t use this stuff. It’s somewhat overblown as an issue. I’ve put my entire genome up on the Web — 20Mb, and it doesn’t have a lot of meaning about it yet. Your behavior is much more revealing than your genome right now.

Q: How about data sharing tools?
A: Here are two I’ve invested in: Contagion Health. Health Rally. Suppose your friends invest in your not smoking? That creates a positive community and you don’t want to disappoint them. Med Rewards. PatientsLikeMe and CuredTogether.

Q: There can be unintended consequences, such as BMW’s mileage game leading people to run red lights. How do you avoid that?
A: Yes, I can see one of those tools aggravating anorexia. These things need to be designed carefully.

Q: Are you going into space?
A: Yes, I’d love to. I’d even go to Mars one way. That’s what they did to America in 1942. The older you get, the less you have to lose.

Q: How about how humans take to 3D visualizations?
A: Some like it, some don’t. I do. I love 4D with things changing over time. But not everyone likes them. Remember not to confuse the visualization with the meaning. Some are cool but don’t convey any info. Read Tufte.
