Joho the Blog » 2009 » March

March 29, 2009

The Google Book deal

I just heard Robert Darnton on On The Media talking about the Google Book settlement. (Sorry, but I don’t yet see a link specifically to that interview.) Brilliant. The two things I’d recommend reading about this massive and massively important deal are Darnton’s piece in the NY Review of Books, and an article by James Grimmelmann.

The book settlement is hugely complex, hugely important, and overall a big step forward. But, the ur-cause of the issues many of us have with it is that it’s a settlement among authors, publishers and Google, which leaves readers, scholars, teachers — AKA the rest of us — out.

March 28, 2009

Q: How do you know when your question-asking site is broken?

A: When you get 104,003 questions for the President.

I applaud the Obama administration for soliciting online questions for the President’s online town hall. And they let us all see the questions that our fellow citizens (of the US and the world) were submitting. Excellent!

But if you get that many different questions, it’s pretty much guaranteed that you really got far fewer unique questions. If people can’t easily find the question they have, they ask it again. This dissipates the votes on the questions as well.

I don’t know how to fix it other than by manual intervention, or possibly automagic natural language processing, or some such. Or maybe you could show people questions like the one they just posed (through just a little bit of automagic NLP) and offer to let them vote for those questions rather than pose their own. This might cause some clustering around questions: Why ask “Yo, dude, when are you going to make pot legal? PS: You can come by our place in White Plains any time if you do.” when you’re shown that the question, “Do you support the legalization and taxation of marijuana?” already has 983,455 votes?
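Here is a minimal sketch of the "show people similar questions" idea, assuming plain word overlap (Jaccard similarity) stands in for the automagic NLP. The function names, the stopword list, and the threshold are all made up for illustration; nothing here reflects how the White House site actually works.

```python
import re

# Hypothetical stopword list; a real system would use a much fuller one.
STOPWORDS = {"a", "an", "and", "are", "do", "going", "if", "is",
             "of", "the", "to", "when", "you", "your"}


def tokens(text):
    """Lowercase the text, keep runs of letters, and drop stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Share of words the two sets have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_similar(new_question, existing, threshold=0.2, limit=3):
    """Return up to `limit` existing questions that resemble the new one,
    best match first, so the asker can vote instead of re-asking."""
    new_tokens = tokens(new_question)
    scored = [(jaccard(new_tokens, tokens(q)), q) for q in existing]
    matches = [pair for pair in scored if pair[0] >= threshold]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [q for _, q in matches[:limit]]
```

Word overlap alone would not connect "pot" to "marijuana," so the dude from White Plains would still slip through; that is where synonym handling or real NLP would have to come in.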

March 27, 2009

Long-tail museum

Jeff Gates posts about how the Smithsonian American Art Museum is facing the fact that it’s a long-tail phenomenon:

Our Web statistics showed that the number of visitors to our top ten sections paled when compared with the total number of visitors for all other pages, even though only a few people viewed each page. The challenge: how could we make it easier for our online visitors to find things of interest even if that information is buried deep in our site?

He continues:

Museums are changing. Like many other organizations, our hierarchical structure has historically disseminated information from our experts to our visitors. The envisioned twenty-first-century model, however, is more level. Instead of a one-way presentation, our online visitors are often interested in having a conversation with our curators and content providers. In response, many of us at American Art have been looking for ways to engage our public by designing applications that promote dialogue. By encouraging user-generated content and by distributing our assets beyond our own Web site and out across the Internet, we hope to make our content easier to find. In doing so, we are trying to fulfill our long tail strategy. In order to succeed we will need to approach our jobs differently.

And that’s just the introduction.

Meanwhile, the Library of Congress has expanded on its successful Flickr experiment (15.7M views) and is now posting material at iTunes and YouTube.

Among the items Web surfers can expect on iTunes and YouTube are 100-year-old films from Thomas Edison’s studio, book talks with contemporary authors, early industrial films from Westinghouse factories, first-person audio accounts of life in slavery, and inside looks into the library’s holdings, including the rough draft of the Declaration of Independence and the contents of President Abraham Lincoln’s pockets on the night of his assassination.

This is all getting just too cool. Time to put the toys back on shelves behind glass.

Nah.

New York Public Library blog

The NYPL blog is nicely eclectic, the way libraries tend to be. It’s for people who find interesting the sorts of topics covered in books (or magazines or photos…).

March 26, 2009

Data in its untamed abundance gives rise to meaning

Seb Schmoller points to a terrific article by Google’s Alon Halevy, Peter Norvig, and Fernando Pereira about two ways to get meaning out of information. Their example is machine translation of natural language: there is so much translated material available for computers to learn from that learning directly from it works better (they argue) than approaches that go up a level of abstraction and try to categorize and conceptualize the language. Scale wins. Or, as the article says, “But invariably, simple models and a lot of data trump more elaborate models based on less data.”


They then use this to distinguish the Semantic Web from “Semantic Interpretation.” The latter “deals with imprecise, ambiguous natural languages,” as opposed to aiming at data and application interoperability. “The problem of semantic interpretation remains: using a Semantic Web formalism just means that semantic interpretation must be done on shorter strings that fall between angle brackets.” Oh snap! “What we need are methods to infer relationships between column headers or mentions of entities in the world.” “Web-scale data” to the rescue! This is basically the same problem as translating from one language to another, given a large enough corpus of translations: We have a Web-scale collection of tables with column headers and content, so we should be able to algorithmically recognize clustering concordances of meaning.
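To make the column-header idea concrete, here is a toy sketch. It is my own illustration and not the method from the paper: it assumes each table is a dict mapping a header to its cell values, and it guesses that two headers are related when the values seen under them overlap heavily. The function names and the 0.5 threshold are invented.

```python
from itertools import combinations


def column_values(tables):
    """Merge all cell values seen under each header across all tables."""
    merged = {}
    for table in tables:
        for header, cells in table.items():
            merged.setdefault(header, set()).update(c.lower() for c in cells)
    return merged


def related_headers(tables, threshold=0.5):
    """Return (header, header, overlap) for header pairs whose value sets
    overlap by at least `threshold` (Jaccard similarity)."""
    merged = column_values(tables)
    pairs = []
    for h1, h2 in combinations(sorted(merged), 2):
        union = merged[h1] | merged[h2]
        if not union:
            continue  # two empty columns tell us nothing
        overlap = len(merged[h1] & merged[h2]) / len(union)
        if overlap >= threshold:
            pairs.append((h1, h2, overlap))
    return pairs
```

With two tiny tables the threshold is doing all the work. The authors' point is that with a Web-scale pile of tables, crude evidence like this adds up to something reliable.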

I’m not doing the paper justice because I can’t, although it’s written quite clearly. But I find it fascinating.

Online anonymity challenged by courts in Ontario and Illinois

Michael Geist posts about an Ontario court decision to require FreeDominion.ca to reveal the identity of an anonymous poster:

Protection for anonymous postings is certainly not an absolute, but a high threshold that requires prima facie evidence supporting the plaintiff’s claim is critical to ensuring that a proper balance is struck between the rights of a plaintiff (whether in a defamation or copyright case) and the privacy and free speech rights of the poster. … I fear that the high threshold seems to have been abandoned here….

Meanwhile, as I noted yesterday, Berkman’s Citizen Media Law Project has filed an amicus (= friend of the court) brief in a case in Illinois. From the press release:

“Courts around the country have recognized that, although the right of free speech is not absolute, a plaintiff must show that its claims are legally and factually tenable before a court orders that the identity of an anonymous speaker be disclosed,” noted CMLP Assistant Director Sam Bayard. “Anonymous speech on blogs, online fora, and other websites leads to a vibrant exchange of information, and putting a plaintiff to its proofs before unmasking an online commenter helps to ensure constitutionally-protected speech is not chilled.”

March 25, 2009

Making it harder to de-anonymize speakers

From a press release:

In a case involving important First Amendment rights, the Citizen Media Law Project (“CMLP”) joined a number of media and advocacy organizations, including Gannett Co., Inc., Hearst Corporation, Illinois Press Association, Online News Association, Public Citizen, Reporters Committee for Freedom of the Press, and Tribune Company, in asking an Illinois appellate court to protect the rights of anonymous speakers online by imposing procedural safeguards before requiring that their identities be disclosed.

The CMLP is a Berkman project. More here…

March 24, 2009

Susan Crawford goes to the White House [REVISED]

[April 1, but no joke: I spoke with Susan a couple of days ago and de-confirmed this "news." National Journal got it wrong, and I repeated it, perpetuating the error. Sorry. Susan is indeed part of the Obama team, but reporting to Larry Summers, advising on tech policy, which is indeed fantastic. And true.]

Fantastic news:

Internet law expert Susan Crawford has joined President Barack Obama’s lineup of tech policy experts at the White House, according to several sources. She will likely hold the title of special assistant to the president for science, technology, and innovation policy, they said. Crawford, who was most recently a visiting professor at the University of Michigan and at Yale Law School, was tapped by Obama’s transition team in November to co-chair its FCC review process with University of Pennsylvania professor Kevin Werbach. Her official administration appointment has not been formally announced. Crawford may be best known for her work with the Internet Corporation for Assigned Names and Numbers, the California-based nonprofit group that manages the Internet address system. She served on ICANN’s board for three years beginning in December 2005. She also founded OneWebDay, a global Earth Day for the Internet that takes place every Sept. 22. Crawford, a Yale graduate, clerked for U.S. District Judge Raymond Dearie before joining Wilmer, Cutler & Pickering where she worked until the end of 2002.

[berkman] Doc Searls

Doc Searls is giving a Berkman lunch called “The Intention Economy.” [Note: I'm live-blogging, missing points, paraphrasing badly, making spellping errors, etc.]

He begins by talking about some problems. E.g., "the people vs. Comcast." Customers are unhappy. "Comcast can't fix itself alone." Or, customer loyalty cards that are the Green Stamps of our time. "They leverage something that's broken about e-commerce." E.g., the Harvard Co-op gives a 10% "discount" if you join. But they make you enter a ton of personal data, the same data you enter at every other e-comm site. Or public radio: Everyone in the room listens, but only about half give. Doc would like to be able to give to support particular programs.

The problem in all these cases is Customer Relationship Management (CRM). CRM is not about relating. "The problem is that most big businesses think that the best customer is a captive one." "That's why the free market is still your choice of captor." But "we're now about three minutes into the Big Bang" when it comes to the Net. The challenge is to "prove that a free customer is more valuable than a captive one."

So, Doc has started Project VRM (vendor relationship management) to provide ways for customers to drive relationships with vendors. "With VRM, the individual is the point of integration for his or her own data" and is also the "point of origination of what's done with" that data. There have been VRM meetups across Europe and North America.

VRM is an open source project (although there are some commercial projects underway also). Doc talks briefly [too quickly for me to keep up] about some of the people involved. Likewise for projects: Personal health info. “Personal RFPs” where a customer sends a query to vendors for bids on things the customer wants to buy. The user wouldn’t give away any unnecessary info. Also: Making terms of service readable and user-focused.

Doc spends a little more time on creating a new business model for free media that isn’t advertising. Free media first means non-commercial media, but ultimately for blogs, etc. The model is temporarily named “PayChoice,” and is based on letting individuals pay how much they want when they want for what they want. The Public Radio tuner is one result. 1.3M have downloaded it into their iPhones already. It turns your iPhone into a radio tuned into public radio. It enables listeners to hold up their end of the relationship. The “R” button lets a user pay for what she wants. But it’s not just for paying. It could also represent an intention to buy, an intention to sell, etc.

So, what happens when customers get real power?

- “Customers get their own pricing guns” [i.e., the "guns" that print out price labels].

- “The intention economy” will get real because it’s based on what customers really want, as opposed to the attention economy that’s based on guesses.

- “The advertising bubble will burst.” There will still be ads, but they won’t be the “communications method of first resort.”

- “Cluetrain will finally be right.”

Q: What about eBay?
A: There are lots of sites that do this, but why should we only have sites? Your eBay reputation is only inside eBay. Why should it be stuck there? We want service portability.

Q: What will be the method of conveying your desires to companies? A third party service? A non-profit?
A: On the public radio tuner, the “listen log” keeps track of what you’ve listened to. Ideally that would sit on our own computers in encrypted form. Some of that we’re solving with Iain Henderson’s personal data store, some with Lukas’ The Mine. But let’s say we have that solved. Right now, we use “third parties,” which generally live on the vendor’s side. We see a fourth party business, driven by users. E.g., with music, it’d be good to be able to set a price on the music you stream. Some fourth party business will pull that money together. We’re working on a chapter-based association for user-driven services.

Q: So you create sort of a DNS service…?
A: One model is RSS. It’d be good to be able to advertise your needs, possibly through RSS. Maybe it’s tag-based, maybe it’s anonymous.

Q: What do you envision for traditional companies dealing with this?
A: Let’s say we have our own loyalty card. As customers inject more intelligence into the marketplace about what they’re willing to say about themselves, we’ll see things like fact-checking of vendors’ claims against us; it’d be cool if the customer could act as a data backup. I don’t see a downside for traditional customers. More intelligence and more good will in the market will benefit everyone. It’s a fallacy to think that people only shop on price. Starbucks proves the contrary.

Q: [me] Situate this in micropayments and tipjars, and identity management.
A: We’re doing micro-accounting, not micropayments. Small payments are accumulated. Micropayments haven’t worked for anyone except the phone company, and they abused it. WRT identity: I’ve been interested in that for a long time. Along the way, Andre Durand (of Jabber) once said that we have to get identity worked out. Identities are given to us by other corporations: the DMV, the library, Visa (etc.) tell us who we are. Andre thought this was backwards. We have to reverse it. I now think that that’s important, but it’s separate from VRM. There are times when identity isn’t used at all. My wife about 15 years ago asked why we can’t take our shopping cart from one site to another. And when I was working with the ID management folks, my wife said she wants less identity, not more. Adriana Lukas’ The Mine project is intended to work independent of any identity system. The whole identity movement is a separate thing that overlaps VRM somewhat. VRM isn’t part of the identity space.
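[A rough sketch of what "micro-accounting" might look like, as opposed to micropayments. Doc only said that small payments are accumulated; the settlement threshold, the class name, and the method names below are my assumptions, not anything ProjectVRM has specified.

```python
from collections import defaultdict


class MicroLedger:
    """Hypothetical ledger: small pledges pile up per payee and are only
    paid out once a payee's running total reaches a threshold."""

    def __init__(self, settle_threshold_cents=500):
        self.settle_threshold_cents = settle_threshold_cents
        self.pending = defaultdict(int)  # payee -> accumulated cents

    def pledge(self, payee, cents):
        """Record a small payment without moving any money yet."""
        if cents <= 0:
            raise ValueError("pledge must be a positive number of cents")
        self.pending[payee] += cents

    def settle(self):
        """Return {payee: cents} for every payee at or over the threshold,
        and clear those balances; smaller balances keep accumulating."""
        due = {payee: cents for payee, cents in self.pending.items()
               if cents >= self.settle_threshold_cents}
        for payee in due:
            del self.pending[payee]
        return due
```

The point of the design is that each tap of a pay button is just bookkeeping; a real transfer (with its fees) happens only once the amounts add up to something worth sending.]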

Q: What happens on the aggregate level? A lot of CRM is about companies aggregating anonymized data and using it for recommendations, etc.
A: Companies will continue to gather intelligence about us. Companies can improve that. Amazon’s recommendations are the best, but they’re still broken. Your kids use your computers and your recos go off track. Or you buy one book and Amazon thinks you’re interested in the category. Those recos are still guesswork. And they don’t know what only you know, and what’s outside their system.

Q: Comcast is actively providing what I don’t want because they want to sell more on-demand. Do you see VRM breaking down those monopolies?
A: Cable TV is really broken. We have Verizon FIOS. The TV is fantastic. But they only provide 20MB for Internet. For us that’s backward. I tried canceling, and they came back with an offer that reflects their real costs. But we don’t watch TV, so we still said no. I offered to pay a la carte, but nope.

Q: What are the enabling technologies for VRM? If companies still haven’t figured out how to do this, what do you have to provide?
A: Money. If there’s money left on the table…We’re doing field of dreams here.

Q: Thinking about Linked Data/RDF for putting this data out in a much richer way? It’s the rich, decentralized model you’re looking for.
A: The short answer is no, but the longer answer is sure. We’re in touch with those folks. It’s a matter of who shows up.

Q: Is this more generational?
A: I don’t know. It’s whoever shows up. We need to make stuff that benefits everyone.

Q: What about characterizing the ecosystem you’re trying to build with certification levels of VRM? Companies could advertise that they’re at different levels of VRMitude.
A: We have a draft of this, on the wiki: ProjectVRM.org. We also want a list of core principles.

Q: How do you balance the explicit data sharing in advertising intent (“I’m looking for a car”) with the fact that sites are selling that data to vendors?
A: The whole VRM idea came out of one use case: car rental. The variables are never what they’re offering. E.g., I want to be able to get a car that plays MP3 CDs. As more customers can advertise their needs, it will change those businesses, and probably discourage the profligate sharing of information.

Q: What about customized fabrication, i.e., making products in response to customer desires. What does this do to branding?
A: Some companies are going to succeed by giving people what they want. We’re all different and want different things. That’s what the Net will come down to eventually.

Q: Insurance companies and lenders have competitive vendor markets. Imagine that for car rentals…
A: That’s an example of a personal RFP. It’s an example of a substitutable service.

Q: Individuals will never be on an equal basis with, say, Verizon. What about collaboration?
A: I avoided that. We don’t want to start with the collective and move to the personal. We want to start with the personal. We need lots of individuals doing VRM for it to work. We want this to be a victory for Verizon as well.

Q: It’s going to be hard to get businesses out of the captive customer mindset. Is VRM a pipe dream? Will companies fail and VRM-ish ones arise?
A: All of the above. Some leopards won’t change their stripes. They’ll also have to wake up and smell the coffee.

Q: What about the cultural domain? NGOs?
A: Huge opportunities. Britt Blaser is working on Government Relationship Management. A lot of great opportunities came out of the Obama campaign. There’s a great outfit in the UK with a site called fixmystreet.org: post photos of potholes and the local gov’t patches them. Being able to express what you’re looking for will work with any type of organization. Take Relationship Management and stick another letter in front of it. We want the demand side and supply side to get along.

Ada Lovelace’s Internet freedom brigade: Wendy Seltzer

It’s the first International Ada Lovelace Day, when we celebrate women in tech by blogging about a woman in tech. My choice this year is Wendy Seltzer. This list of projects she’s been instrumental in of course does not tell the whole story, but it’s a good place to start.

Wendy graduated Harvard Law with a ticket to high-priced everything, but instead has dedicated her legal skill and deep technical understanding to preserving the Net as a place for free speech and free culture. She was a lawyer for EFF for years, an original and sustaining Berkman fellow, a careful observer of ICANN, the heart, head and hands behind ChillingEffects.org, and someone who never hesitates to pitch in when there’s a way to keep the Net open to all.

Wendy is modest and shy, and will undoubtedly be made uncomfortable by my singling her out. But, hey, what are friends for? :) Wendy, it makes me happy to know you are working for us all, and even happier to call you my friend.
