In August, I blogged about a mangled quotation supposedly from Mark Twain posted on an interstitial page at Forbes.com. When I tweeted about the post, it was (thanks to John Overholt [twitter:JohnOverholt]) noticed by Quote Investigator [twitter:QuoteResearch], who over the course of a few hours tweeted the results of his investigation. Yes, it was mangled. No, it was not Twain. It was probably Christian Bovee. Quote Investigator, who goes by the pen name Garson O’Toole, has now posted on his site at greater length about this investigation.
It’s been clear from the beginning of the Web that it gives us access to experts on topics we never even thought of. As the Web has become more social, and as conversations have become scaled up, these crazy-smart experts are no longer nestling at home. They’re showing up like genies summoned by the incantation of particular words. We see this at Twitter, Reddit, and other sites with large populations and open-circle conversations.
This is a great thing, especially if the conversational space is engineered to give prominence to the contributions of drive-by experts. We want to take advantage of the fact that if enough people are in a conversation, one of them will be an expert.
Tagged with: 2b2k
Date: October 27th, 2013 dw
Yesterday I participated as a color commentator in a 90-minute debate between Clive Thompson [twitter:pomeranian99] and Steve Easterbrook [twitter:smeasterbrook], put on by the CBC’s Q program. The topic was “Does the Net Make Us Smart or Stupid?” It airs today, and you can hear it here.
It was a really good discussion between Clive and Steve, without any of the trumped up argumentativeness that too often mars this type of public conversation. It was, of course, too short, but with a topic like this, we want it to bust its bounds, don’t we?
My participation was minimal, but that’s why we have blogs, right? So, here are two points I would have liked to pursue further.
First, if we’re going to ask if the Net makes us smart or stupid, we have to ask who we’re talking about. More exactly, who in what roles? So, I’d say that the Net’s made me stupider in that I spend more of my time chasing down trivialities. I know more about Miley Cyrus than I would have in the old days. Now I find that I’m interested in the Miley Phenomenon — the media’s treatment, the role of celebrity, the sexualization of everything, etc. — whereas before I would never have felt it worth a trip to the library or the purchase of an issue of Tiger Beat or whatever. (Let me be clear: I’m not that interested. But that’s the point: it’s all now just a click away.)
On the other hand, if you ask if the Net has made scholars and experts smarter, I think the answer has to be an almost unmitigated yes. Find me a scholar or expert who would turn off the Net when pursuing her topic. All discussions of whether the Net makes us smarter I think should begin by considering those who are in the business of being smart, as we all are at some points during the day.
Now, that’s not really as clear a distinction as I’d like. It’s possible to argue that the Net’s made experts stupider because it’s enabled people to become instant “experts” on topics. (Hat tip to Visiona-ary [twitter:0penCV] who independently raised this on Twitter.) We can delude ourselves into thinking we’re experts because we’ve skimmed the Wikipedia article or read an undergrad’s C- post about it. But is it really a bad thing that we can now get a quick gulp of knowledge in a field that we haven’t studied and probably never will study in depth? Only if we don’t recognize that we are just skimmers. At that point we find ourselves seriously arguing with a physicist about information’s behavior at the event horizon of a black hole as if we actually knew what we were talking about. Or, worse, we find ourselves disregarding our physician’s advice because we read something on the Internet. Humility is 95% of knowledge.
Here’s a place where learning some of the skills of journalists would be helpful for us all. (See Dan Gillmor’s MediActive for more on this.) After all, the primary skill of a particular class of journalists is their ability to speak for experts in a field in which the journalist is not her/himself expert. Journalists, however, know how to figure out who to consult, and don’t confuse themselves with experts. Modern media literacy means learning some of the skills and all of the humility of good journalists.
Second, Clive Thompson made the excellent and hugely important point that knowledge is now becoming public. In the radio show, I tried to elaborate on that in a way that I’m confident Clive already agrees with by saying that it’s not just public, it’s social, and not just social, but networked. Jian Ghomeshi, the host, raised the question of misinformation on the Net by pointing to Reddit’s misidentification of one of the Boston bombers. He even played a touching and troubling clip by the innocent person’s brother talking about the permanent damage this did to the family. Now, every time you look up “Sunil Tripathi” on the Web, you’ll see him misidentified as a suspect in the bombing.
I responded ineffectively by pointing to Judith Miller’s year of misreporting for the NY Times that helped move us into a war, to make the point that all media are error prone. Clive did a better job by citing a researcher who fact checked an entire issue of a newspaper and uncovered a plethora of errors (mainly small, I assume) that were never corrected and that are preserved forever in the digital edition of that paper.
But I didn’t get a chance to say the thing that I think matters more. So, go ahead and google “Sunil Tripathi”. You will have to work at finding anything that identifies him as the Boston Bomber. Instead, the results are about his being wrongly identified, and about his suicide (which apparently occurred before the false accusations were made).
None of this excuses the exuberantly irresponsible way a subreddit (i.e., a topic-based discussion) at Reddit accused him. And it’s easy to imagine a case in which such a horrible mistake could have driven someone to suicide. But that’s not my point. My point here is twofold.
First, the idea that false ideas once published on the Net continue forever uncorrected is not always the case. If we’re taking as our example ideas that are clearly wrong and are important, the corrections will usually be more obvious and available to us than in the prior media ecology. (That doesn’t relieve us of the responsibility of getting facts right in the first place.)
Second, this is why I keep insisting that knowledge now lives in networks the way it used to live in books or newspapers. You get the truth not in any single chunk but in the web of chunks that are arguing, correcting, and arguing about the corrections. This, however, means that knowledge is an argument, or a conversation, or is more like the webs of contention that characterize the field of living scholarship. There was an advantage to the old ecosystem in which there was a known path to authoritative opinions, but there were problems with that old system as well.
That’s why it irks me to take any one failure, such as the attempt to crowdsource the identification of the Boston murderers, as a trump card in the argument that the Net makes us stupider. To do so is to confuse the Net with an aggregation of public utterances. That misses the transformative character of the networking of knowledge. The Net’s essential character is that it’s a network, that it’s connected. We therefore have to look at the network that arose around those tragically wrong accusations.
So, search for Sunil Tripathi at Reddit.com and you will find a list of discussions at Reddit about how wrong the accusation was, how ill-suited Reddit is for such investigations, and how the ethos and culture of Reddit led to the confident condemning of an innocent person. That network of discussion — which obviously extends far beyond Reddit’s borders — is the real phenomenon…”real” in the sense that the accusations themselves arose from a network and were very quickly absorbed into a web of correction, introspection, and contextualization.
The network is the primary unit of knowledge now. For better and for worse.
Tagged with: 2b2k
Date: October 23rd, 2013 dw
Erik Martin is giving a talk at the Nieman Foundation. He’s the general manager of Reddit.com. (Disclosure: We’re friendly.) He tells us that Reddit gets 5 billion page views per month, and 70 million unique visitors.
NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.
Erik gives us a tour and some background. Every morning he clicks on the “Random” button and visits the subreddits (= topically-based pages within the site) the button gives him. He does so now, hitting subreddits such as bitch, i’m a bus, ukele, battlestations (office desks), and what’s this plant. Reddit, he says, is like a giant message board. You can create a board (subreddit) about anything. There are over 100,000 that get at least a post a day, and 6,000 that have substantial activity. All the subreddits are created by users, who also can create the page design. All the posts are voted up or down by users. Users also set the rules for subreddits. For example, at the Coversong subreddit, users have apparently decided all posts have to be videos.
Now he’s interviewed by Justin Ellis.
JE: How did you get to Reddit?
EM: He worked for Mammoth Records. It got bought by Disney. Then he became a documentary filmmaker. Then marketing films and distributing them online. He read Hackers and Painters by Paul Graham [great book]. He then read about Paul Graham’s Y Combinator incubator. He applied to do a documentary about it, but was rejected. Still, he was hooked. Reddit came out of the first round of projects. He saw Reddit and loved the unpredictability of it. “Every link is a rabbit hole you might go down.” He got to know the cofounders and said “I want to find a way to work with Reddit because that’s what I’m doing with all my time.” Alexis Ohanian asked him to work on a TV pilot that was going to incorporate Reddit into a news show. But it didn’t work; the Internet part was an add-on. Then he got hired as a community manager at Reddit.
JE: Reddit has a lot of geography. What does it mean to be a community manager?
EM: He looked at it as being the manager of a band. He’d promote promising items. He’d try to keep things functioning. And he tried to make sure that the community didn’t get taken advantage of, e.g., when people didn’t link back to Reddit.
JE: When you create a subreddit and a crowd shows up, how does that happen?
EM: Sometimes it’s obvious why. But others we can’t figure it out. One of our most popular subreddits is Explain Like I’m Five. That one you know what you’re going to get. Same for Ask Me Anything. Those explode when hot topics arise.
JE: How does this community stay together so long?
EM: Some of it is the customization of subreddits.
JE: Because anyone can create a subreddit, Reddit has gotten into trouble from time to time. There have been some very creepy subreddits. What’s the guiding principle for what is allowable?
EM: Our philosophy is that it’s a site that has 5B page views, and we have 35 employees [so we can’t moderate everything]. If you’re going to function you have to have some rules, but they have to be relatively finite, relatively easy to understand, and relatively self-enforceable. So, we have six rules. We have added one or two throughout the years. We try to keep them simple. No spam. You can’t try to break the site. You can’t try to cheat. You can’t put people’s personal info up. You can’t have anything illegal. We added that you can’t have material that sexualizes minors. If we had one that said “Don’t be a jerk,” it wouldn’t be enforceable. No one would agree about how it applies. So there’s tons of stuff on the site that we find horrible and offensive, but the site works best when we keep it open and governed by those simple rules.
JE: What responsibility do you think you have if you see something that you personally feel is wrong?
EM: What I find offensive is different from others around the world or other positions. People don’t come here because they think we have the best judgment about what’s offensive. Plus, you have all the context. E.g., people complain about the PicsOfDeadChildren subreddit. That’s obviously very offensive. But what if it were called “Child Autopsy Photos” and it put itself forward as presenting medical training photos. Or a subreddit about death. Or a subreddit about combat video. It’s beyond offensive. It’s people being killed. It gets very tricky.
JE: There have been 3 major stories illustrative of Reddit and citizen journalism: The Aurora movie theater shooting, the Boston Marathon bombing, and the shooting at the Navy Yard in DC. In the first, there was first person reporting. With the second, there was that but also the spreading of info from elsewhere and then the misidentification of one of the suspects in the bombing. With the third, someone created a subreddit to investigate what was happening, but you guys shut that down. What have you learned?
EM: In those three situations, the response of the community was the same as what you’d see offline: People trying to figure out what went on. Telling their story. Making jokes. Speculating about all kinds of things. Trying to make sense of what happened. Later on they were trying to help in some way. With Boston, it was different because the authorities wanted help from the public: they said if you have photos, upload them, etc. There was a subreddit where people were trying to identify the bombers, and that got a lot of attention. The actual subreddit where the Brown Univ. student was misidentified by name was actually the normal Boston subreddit, and it was removed after about an hour. That wasn’t good enough. That led to horrible consequences for that family.
So, what have we learned? We learned that people want to share, to talk, to help, to be a part of these huge events any way they can. We learned people can be callous and cavalier by mentioning people’s names. The vast majority were careful and thoughtful, but some were not. The Navy Yard subreddit was a joke. It had six posts, most from journalists satirizing the Boston bombing subreddit. It went against our rules and we shut it down after an hour.
JE: But you apologized after the Boston bombings…
EM: Absolutely. We do post-mortems and follow-ups. We did one when President Obama came on. So, yes, we apologized and talked about what we can do better. And we also talked about the amazing things people did: people bringing their pets to parks in case people needed cute animal therapy, the sending of pizzas to EMTs and the police… We are an open source site in policy as well as code.
JE: Is it enough to do a post mortem? Newspapers issue corrections.
EM: There are thousands of subreddits, so there isn’t a way to reach everyone. We’re a platform, not a newspaper. We’re like Twitter or Youtube or WordPress. We don’t have a position on the veracity of one thing or another. I hope people learn to be more empathetic and learn that what you say online has repercussions. But I don’t think we’re like a publication, and we’re not an editorial team.
JE: How do you see the role of journalism on Reddit? Why are people doing self-reporting?
EM: They want to be part of the story. They don’t want to be passive about what’s happening in the world. Even if it’s uploading a meme. They’ve seen something start and then get big in a single day. Of course they want to share what’s happening in their neighborhood or share their thoughts about what’s going on in their govt. Redditors vote 20M times a day.
JE: What’s the relation of journalism and Reddit?
EM: We’re agnostic about what you’re linking to. But original reporting is more important than ever because people can find an audience. What’s happening on Reddit and what’s happening in the mainstream media happen to be in different hemispheres now but ultimately it’s the same thing. I hope people doing reporting will be active in a comment thread on Reddit or elsewhere.
JE: But you are creating content in some way, e.g., the Ask Me Anything’s where anyone can come in and answer questions from the community. It’s very much like what media companies do.
EM: And in other Reddits people share recipes or workout routines. It’s like what you get in the media. It’s communicating, it’s story telling.
JE: How do you make money? You have ads and Reddit gold memberships.
EM: We don’t need to make a lot of money. We’re very lean. Our NY office is in a coworking space. We basically have ads for big movies, mobile phones, etc. We also have ads from mom and pop companies. Reddit Gold is a premium membership, $24.99/year. You get some extra features but most people do it to support the site. We have a secret Santa program (Reddit Gifts) that has an e-commerce site to help those exchanges and to make money.
JE: Reddit was purchased by Conde Nast and then spun off in 2011. How is it different?
EM: We started in 2005. Bought by Conde Nast in 2006. I started in 2008. Reddit was basically neglected by Conde: we were growing but there was a hiring freeze. OTOH, no one told us what to do. An example of how it made a difference: Before we were spun out, our ad operations were done through Conde, which is great for major magazines, not for a weird site where all you need is $5 to run an ad. So it didn’t make sense for us. We wanted an ad server that was fast and open source, which now we have.
Q: Any trends in the type of content being produced? Trending toward the absurd? Or what?
A: It gets harder and harder to think about overall trends because the site is becoming more fractious and disparate each day. I think people are really motivated by the unexpected. Our audience is increasingly cynical. We also have an audience that is increasingly idealistic. You see trends where people are more connected across national and geographical boundaries; if there’s a discussion on healthcare the top comments will be from people around the globe. And it’s always been possible to have the serious next to the ridiculous; the last remaining bulkheads are being whittled away.
Q: Can you remain content agnostic?
A: No, it’s not possible. We’re not content agnostic towards spam or personal information. We try to be as close to agnostic as we can.
Q: How much does porn account for your content?
A: About 85% of the subreddits are safe for work. (The Trees subreddit is not because you could get in trouble looking at pictures of weed.) Porn is maybe 5-10%. Our biggest subreddits are the video subreddits, Ask Reddit, etc.
Q: Terrorists radicalize by looking at pictures of dead babies. Have you had to hand over who your users are to agencies trying to track people on Reddit trying to radicalize people?
A: User privacy is core but we comply with what we have to comply with.
Q: [me] Reddit used to have a strong culture. People knew the same references, were playing the same games, had the same general politics, etc. But that shared culture seems to be weakening as Reddit becomes more popular. Does this concern you?
A: Yes, there is a certain sense of shared community that’s being fractured. But it’s being migrated down the subreddits the way you’re more loyal to community or borough.
Q: [me] Can you say more about IAMA’s, which at their best are a quite remarkable journalistic form of collaborative interview?
A: The exciting thing for me is to see that format seep into other subreddits. We actively are trying to encourage that. E.g., mayoral candidates should do AMAs in their city’s subreddit. Or scifi authors are doing them in the sf subreddits. It goes back to that idea of so much of the world being predictable. If you watch an interview on even some of the great programs (Charlie Rose, for example), even if they’re really good, you know what to expect. With the Reddit AMA’s not only do you not know what sort of questions are going to be asked, since you can answer a question at any length, it ends up taking these unexpected turns. If you look at the calendar of upcoming IAMA’s, you don’t even know which ones are going to be popular, outside of a Bill Gates or Tom Hanks, but if you look at the top AMAs for a week it will be a celebrity, subway driver, person with a weird disease, and way down the list will be someone with a household name. It’s unpredictable, and it’s unpredictable to the person being interviewed. It’s very different from what you get on a press junket where people go into robot mode. The AMA format can be more fun for them than the standard press interview.
Q: Tumblr did a lot of active outreach to media. You don’t go out to, say, Newsweek and ask if they want a subreddit.
A: Yes. It’s difficult for us to do. Tech News Today is a great subreddit. They don’t directly flog their content. PBS has done one. But it’s hard.
Q: A newspaper could have its own subreddit where their folks are doing AMA’s etc.
A: Yes. But curating and cultivating a subreddit is a lot of work. It’s hard enough getting journalists to participate in comments on their own site.
Q: Companies you wouldn’t expect have made editorial plays. E.g., Twitter has been hiring editorial staff. Why are they doing that?
A: We’ve done some of that to prime the pump. E.g., Adam Savage’s publicist would probably say no to a request for an AMA at a site that looks like it’s from the 1990s [like ours], but if I go out with a camera and ask him to respond to the top ten questions, they might say yes. But then they see that the AMA works. So we only do editorial work for pump priming.
Q: What’s up with the design?
A: Look at the big sites. Minimal but flexible platforms. When you start doing a more professional and complex design, you suddenly need 10x more people, and then you need 10x the money…But subreddits can monkey with the CSS. They can even change the Gold button, our “buy” button. Rich text works.
Q: For a traditional news org, the misidentification of the Boston Bomber would have been very expensive. Who owns the error from a legal perspective, in the US and elsewhere?
A: In the US, platforms are not responsible for what people say. The person who says it is responsible. I don’t know if Reddit could exist as a Canadian company. People give us a non-exclusive contract to display their words.
Q: But because you have some rules, doesn’t that make you responsible?
A: The more you monitor, the more responsible you are. But everything on the site is determined by human behavior. We are a platform for people discussing things. We’re not a publication. We don’t have editorial control.
Q: Is one of your 35 people a lawyer?
Q: So when you get subpoenas…?
A: We’ve had to learn more than we want. We also have very good lawyers we consult with when we need to.
Q: The site in 5 years?
A: I don’t know. The users have better ideas than we do. All we try to do is take ideas they develop and help make them happen. So, in 5 years I think Reddit will be in more countries, more cross-country conversation. We have great engineers so we’ll be doing more interesting things. In 5 years I hope there will be 1,000 Reddit apps, using Reddit in novel ways that I couldn’t come up with. I never imagined that Reddit would be useful for live events. People are using our “edit” button 50/hour for this, which is not what the button is intended for, and Reddit’s not even very good at. People have created a site that reorganizes Reddit in chronological order and they can do that because we’re open source and don’t send lawyers after them. If we evolve in 5 yrs it will be because people in the community take it in those new directions.
Q: Venture capitalists?
A: Y-Combinator’s original investment was $20K. We were self-sustaining until Conde Nast bought us. We also had a very small angel round in the past year, around $1M. Very small. We’ve never spent a lot of money so we’ve never had to raise a lot. We’re close to break even now.
Q: Have any news events truly originated with Reddit?
A: As far as I know, one of the first reports on the Aurora story was from someone at the theater, before there was anything known to the media. The biggest story where Reddit was involved in the story was probably the SOPA/PIPA blackouts. Someone started to go after GoDaddy: “I’m moving 75 domains from GoDaddy” and it grew, and the next day GoDaddy flipped its position. Also, someone went after Paul Ryan and he ended up changing his mind.
Q: How can I troll Reddit for news stories?
A: When a new Android comes out, reporters go to Reddit to see what’s new in that version. I don’t know why more reporters don’t go to the relevant subreddits and ask for help on a story.
Q: We reporters are competitive.
A: In the sports world, you routinely see stories getting updated based upon information at Reddit.
Q: News orgs are trying to figure out how to engage with their audiences via social media. Advice?
A: Popular Science killed comments. Fine. You don’t have to have comments. But if you have them, you should pay attention to them. E.g., Roger Ebert would edit your comment as an admin, which is a terrible practice, but people didn’t mind because he was doing so to respond to their comments. I don’t understand why in general comments in 2013 are not all threaded and vote-able. Most are still in reverse chron, highlighting the latest. And most seem to be trying to hide their comments.
Tagged with: 2b2k
Date: October 10th, 2013 dw
Popular Science has announced that it’s shutting down comments on its articles. The post by Suzanne LaBarre says this is because “trolls and spambots” have overwhelmed the useful comments. But what I hear instead is: “We don’t know how to run a comment board, so shut up.”
Suzanne cites research that suggests that negative comments on an article reduce the credibility of the article, even if those negative comments are entirely unfounded. Thus, the trolls don’t just ruin the conversation, they hurt the cause of science.
Ok, let’s accept that. Scientific American cited the same research but came to a different decision. Rather than shut down its comments, it decided to moderate them using some sensible rules designed to encourage useful conversation. Their idea of a “useful conversation” is likely quite similar to Popular Science’s: not only no spam, but the discourse must be within the norms of science. So, it doesn’t matter how loudly Jesus told you that there is no climate change going on, your message is going to be removed if it doesn’t argue for your views within the evidentiary rules of science.
You may not like this restriction at Scientific American. Tough. You have lots of other places you can talk about Jesus’ beliefs about climate change. I posted at length about the Scientific American decision at the time, and especially about why this makes clear problems with the “echo chamber” meme, but I fundamentally agree with it.
If comments aren’t working on your site, then it’s your fault. Fix your site.
[Tip o' the hat to Joshua Beckerman for pointing out the PopSci post.]
Tagged with: 2b2k
Date: September 27th, 2013 dw
I am a big fan of Reddit, as a reader, an occasional participant, and as an observer. As a reader, Reddit has gone downhill for me. Or perhaps I should say “as a lazy reader.” I don’t stray much from the home page which shows the top posts from a default set of sub-reddits, i.e., topically clustered posts. These days, there’s usually one post among the 25 on the home page that I find interesting in a way that matters, although maybe a half dozen I find click-worthy. Those half dozen are usually memes, or discussions of something in pop or Internet culture. The one in 25 that matters to me introduces me to an idea I hadn’t considered, with a discussion that goes pretty deeply into it, while always laced with glancing sub-threads and banter. But for a page that can be quickly skimmed, a 1:25 ratio is enough to bring me back several times a day.
One in 25 is probably about the ratio I find in The New York Times when I come upon a printed copy of it. That ratio goes higher if you count the sections that I skip entirely. For example, I apparently entirely lack the sports gene. The articles I read are usually ones that offer an interesting viewpoint on a topic I already care about, or that for some unpredictable reason stimulate my interest in something I didn’t know I cared about. I know this is very different from the behavior I’m supposed to exhibit. As a responsible citizen, I should be reading all the articles the paper tells me are important. But that’s how I am, that’s how I’ve always been, and I think it’s the way that most of us were even during the decades when reading the newspaper every day was our civic duty.
So, it worries me that Jeff Bezos may bring to the Washington Post the theory of reading that he has brought to Amazon. Amazon’s personalization works very well for me. The books it suggests are often in fact very appealing to me. It’s one reason I keep going back to Amazon. The suggestions don’t often take me far afield, but books are such a big investment of time and money that I don’t intuitively react against that. Intellectually I react against it, but my intuition and the finger that clicks the “buy” button don’t seem to mind at all.
Besides, I read most books as a matter of recreation. (Actually, that’s entirely false. In terms of numbers, I read most books as research that’s dictated by whatever project I’m working on. But we’re talking here about discretionary reading.) And here the Washington Post is different. We need it to help us learn what we need to know to be better citizens in a world that is increasingly inhospitable. A newspaper that works like Amazon would be intentionally creating a filter bubble, in Eli Pariser’s phrase. (And Eli Pariser’s book by that name is thoroughly worth reading, especially if you follow it up with Ethan Zuckerman’s Rewire.)
Bezos has a tremendous opportunity with the Washington Post. He can choose to restructure it so that it becomes the first truly networked newspaper, retaining the traditional virtues of a great newspaper while opening it up to the new virtues of our global participatory network. It can become a uniquely well-webbed supplier of news to the networked ecology, although the idea that any newspaper can “cover” all the “major” news has long ago gone pining for the fjords.
But this new webby news platform will miss the big chance to improve the ecosystem if Bezos applies to the Washington Post what he knows about personalization. The world doesn’t need another way to have our beliefs confirmed and our interests titillated. We don’t need The Daily Everyone Sucks But Us, and we really really don’t need The Washington Post and Sideboob.
What we instead need is personalization that doesn’t pander to our interests but expands them. That requires starting from where we are; posting lots of articles that are so outside our interests that we won’t read them won’t help. But the genius of Amazon’s personalization can be tuned so that we are presented with what pushes our interests forward without abandoning them. There’s lots of room for improvement in my current 1:25 ratio. In fact, there’s a statistical possibility of a 24x improvement.
We have billions of dollars’ worth of evidence that Jeff Bezos is one of the great business entrepreneurs of our era. But we also have good evidence that he has interests beyond maximizing corporate value. His taking the Washington Post private is a very good sign. I’m hopeful that something very good for us all is going to come out of his purchase — but only if Bezos can unlearn much of what Amazon has taught him about how to succeed.
Tagged with: 2b2k
Date: August 9th, 2013 dw
A judge has ruled that Apple is guilty of price-fixing in its attempt to get the major publishers to unite against Amazon’s discounting of e-books.
Now, that’s not a very helpful (and possibly not entirely accurate) explanation. If you want more, there’s a thread at Reddit that has some terrific explanations at various levels of detail (e.g., this one), as well as bunches of questions asked and answered. And, of course, some digressions, hip shots, and smug wrongnesses.
There are certainly some helpful analyses and explanations from the mainstream: e.g., WSJ, Wired, Bloomberg. In fact, I’d be hard-pressed to choose among those three and the Reddit comment I linked to above. But the Reddit thread is — at least to my taste — a better way to explore the issue: a variety of views expressed at appropriate lengths, with questions posed at various levels of sophistication, and with a conversation that goes where it wants to without a fear of dead ends.
Now, I’m aware that if you go to the Reddit thread, you’ll be appalled by how much there is wrong with it. Yeah, I’m not blind to it. But consider what an amazing emergent artifact that thread is. It combines in one flow “explainers” and analysis as good as you’ll find from professionals, Q&A, and a social froth that you can easily ignore if it is not to your liking. This is what journalism looks like (one of the ways it looks) when the old constraints of space, authorial ownership, and editorial process are lifted, and a larger We gets our hands on it. Pretty fascinating.
Tagged with: 2b2k
Date: July 11th, 2013 dw
A few days ago there was a Twitter back and forth between two people I deeply respect: Dan Brickley [twitter:danbri] and Ed Summers [twitter:edsu]. It started with Ed responding to a tweet about a brief podcast I did with Kevin Ford [twitter:3windmills], who is on the team working on BibFrame:
After a couple of tweets, Dan tweeted the following:
There followed some agreement that it's often helpful to have apps driving the development of standards. (Kevin agrees with this, and points to BibFrame's process.) But, Dan's comment clarified my understanding of why ontologies make me nervous.
Over the past hundred years or so, we've come to a general recognition that all classifications and categorizations are tools, not representations of The Real Order. The periodic table of the elements is a useful way of organizing information, and manifests real relationships among the elements, but it is not the single "real" way the elements are arranged; if you're an economist or an industrialist, a chart that arranges the elements based on where they exist on our planet might be just as valid. Likewise, Linnaeus' classification scheme is useful and manifests some real relationships, but if you're a chef you might have a different way of carving up the animal kingdom. Linnaeus chose to organize species based upon visible differences (which might not be the "essential" differences) so that his scheme would be useful to scientists in the field. Although he was sometimes ambiguous about this, he seems not to have thought that he was discerning God's own order. Since Linnaeus we have become much more explicit in our understanding that how we classify depends on what we're trying to accomplish.
For example, a DTD (document type definition) typically is designed not to capture the eternal essence of some type of document, but to make the document more usable by systems that automate the document's production and processing. For instance, an industry might agree on a DTD for parts catalogs that specifies that a parts catalog must have an element called "part" and that a part must have a type, part number, length, height, weight, material, and a description, and optionally can note whether it turns clockwise or counterclockwise. Each of these elements would have a standard name (e.g., "part_number," not "part#"). The result is a document that describes parts in a standard way so that a company can receive descriptions from all of its suppliers and automatically build a database of the parts it uses.
A DTD therefore is designed with an eye toward what properties are going to be useful. In some industries, it might include a term that captures how shiny the part is, but if it's a DTD for surgical equipment, that may not be relevant enough to include...although "sanitary_packaging" might be. Likewise, how quickly a bolt transfers heat might seem irrelevant, at least until NASA places an order. In this, DTD's are much like forms: You don't put a field for earlobe length in the college application form you're designing.
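To make the parts-catalog idea concrete, here is a minimal sketch in Python rather than in actual DTD syntax. It only mimics what such a DTD would enforce: an agreed set of required and optional fields with standard names. Every field name except "part_number" is my own assumption, as are the function names; no real industry standard is being quoted.

```python
# Hypothetical field names for the agreed-upon "part" record.
# Only "part_number" is named in the text; the rest are illustrative.
REQUIRED_FIELDS = ("type", "part_number", "length", "height",
                   "weight", "material", "description")
OPTIONAL_FIELDS = ("rotation",)
ALLOWED_ROTATIONS = {"clockwise", "counterclockwise"}


def validate_part(part):
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in part:
            problems.append(f"missing required field: {field}")
    for field in part:
        if field not in REQUIRED_FIELDS and field not in OPTIONAL_FIELDS:
            problems.append(f"unknown field: {field}")
    rotation = part.get("rotation")
    if rotation is not None and rotation not in ALLOWED_ROTATIONS:
        problems.append(f"invalid rotation: {rotation}")
    return problems


def build_database(supplier_feeds):
    """Merge conforming records from all suppliers, keyed by part_number."""
    database = {}
    rejected = []
    for supplier, parts in supplier_feeds.items():
        for part in parts:
            problems = validate_part(part)
            if problems:
                rejected.append((supplier, part, problems))
            else:
                database[part["part_number"]] = {**part, "supplier": supplier}
    return database, rejected
```

The point of the sketch is the same as the point of the DTD: a supplier who sends "part#" instead of "part_number" gets rejected automatically, and nobody has to decide what a part "really" is, only which fields are useful enough to require.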
Ontologies are different. They can try to express the structure of a domain independent of any particular use, so that the widest variety of applications can share data, including apps from domains outside of the one that's been mapped. So, to use Dan's example, your ontology of jobs would note that jobs have employers and workers, that they may have a salary or other form of compensation, that they can be part-time, full-time, seasonal, etc. As an ontology designer, because you're trying to think beyond whatever applications you already can imagine, your aim (often, not always) is to provide the fullest possible set of slots just in case someone sometime needs that info. And you will carefully describe the relationships among the elements so that apps and researchers can use knowledge that is implicit in the model.
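As a rough illustration of "knowledge that is implicit in the model," here is a toy sketch of a jobs ontology in Python. It stores the model as subject-predicate-object triples and derives properties that were never stated directly for a given class. The class names, predicate names, and functions are all hypothetical; real ontologies are typically written in dedicated languages such as RDF Schema or OWL, not like this.

```python
# Hypothetical mini-ontology of jobs as (subject, predicate, object) triples.
TRIPLES = {
    ("Job", "has_property", "employer"),
    ("Job", "has_property", "worker"),
    ("Job", "has_property", "compensation"),
    ("PartTimeJob", "subclass_of", "Job"),
    ("FullTimeJob", "subclass_of", "Job"),
    ("SeasonalJob", "subclass_of", "Job"),
}


def ancestors(cls, triples=TRIPLES):
    """All classes that cls is a (possibly indirect) subclass of."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subclass_of" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def properties_of(cls, triples=TRIPLES):
    """Properties declared on cls or inherited from any ancestor class."""
    classes = {cls} | ancestors(cls, triples)
    return {obj for subject, predicate, obj in triples
            if predicate == "has_property" and subject in classes}
```

Nothing in the triples says that a seasonal job has an employer, but `properties_of("SeasonalJob")` returns it anyway, because the model says seasonal jobs are jobs. That inherited fact is the sort of implicit knowledge an app can pull out of an ontology without having been designed around it.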
The line between DTD's and ontologies is fuzzy. Many ontologies are designed with classes of apps in mind, and some DTD's have tried to be hugely general purpose. My discomfort really comes down to a distrust of the concept of "knowledge representation" that underlies some ontologies (especially earlier ones). The complexity of the relationships among parts will always outstrip our attempts to capture and codify those relationships. Further, knowledge cannot be fully represented because it isn't a thing apart from our continuous invention, discovery, and engagement with it.
What it comes down to is that if you talk about ontologies as knowledge representations I'll mutter something under my breath and change the topic.
Tagged with: 2b2k
Date: July 6th, 2013 dw
I gave a 20 minute talk at the Wired Next Fest in Milan on June 1, 2013. Because I needed to keep the talk to its allotted time and because it was being simultaneously translated into Italian, I wrote it out and gave a copy to the translators. Inevitably, I veered from the script a bit, but not all that much. What follows is the script with the veerings that I can remember. The paragraph breaks track to the slide changes.
(I began by thanking the festival, and my progressive Italian publisher, Codice Edizioni. Codice are pragmatic idealists and have been fantastic to work with.)
Knowledge seems to fit so perfectly into books. But to marvel at how well Knowledge fits into books…
… is to marvel at how well each rock fits into its hole in the ground. Knowledge fits books because we’ve shaped knowledge around books and paper.
And knowledge has taken on the properties of books and paper. Like books, knowledge is ordered and orderly. It is bounded, just as books stretch from cover to cover. It is the product of an individual mind that then is filtered. It is kept private and we’re not responsible for it until it’s published. Once published, it cannot be undone. It creates a privileged class of experts, like the privileged books that are chosen to be published and then chosen to be in a library.
Released from the bounds of paper, knowledge takes on the shape of its new medium, the Internet. It takes on the properties of its new medium just as it had taken on the properties of its old paper medium. It’s my argument today that networked knowledge assumes a more natural shape. Here are some of the properties of new, networked knowledge.
1. First, because it’s a network, it’s linked.
2. These links have no natural stopping point for your travels. If anything, the network gives you temptations to continue, not stopping points.
3. And, like the Net, it’s too big for any one head. Michael Nielsen, the author of Reinventing Discovery, uses the discovery of the Higgs Boson as an example. That discovery required gigantic networks of equipment and vast networks of people. There is no one person who understands everything about the system that proved that that particle exists. That knowledge lives in the system, in the network.
4. Like the Net, networked knowledge is in perpetual disagreement. There is nothing about which everyone agrees. We like to believe this is a temporary state, but after thousands of years of recorded history, we can now see for sure that we are never going to agree about anything. The hope for networked knowledge is that we’re learning to disagree more fruitfully, in a linked environment.
5. And, as the Internet makes very clear, we are fallible creatures. We get everything wrong. So, networked knowledge becomes more credible when it acknowledges fallibility. This is very different from the old paper-based authorities who saw fallibility as a challenge to their authority.
6. Finally, knowledge is taking on the humor of the Internet. We’re on the Internet voluntarily, and freed of the constrictions of paper, it turns out that we like being with one another. Even when the topic is serious like this topic at Reddit [a discussion of a physics headline], within a few comments, we’re making jokes. And then going back to the serious topic. Paper squeezed the humor out of knowledge. But that’s unnatural.
These properties of networked knowledge are also properties of the Network. But they’re also properties that are more human and more natural than the properties of traditional knowledge.
But there’s one problem:
There is no such thing as natural knowledge. Knowledge is a construct. Our medium may have changed, but we haven’t, at least so it seems. And so we’re not free to reinvent knowledge any way we’d like. Significant problems based on human tendencies are emerging. I’ll point to four quick problem areas.
First, we see the old patterns of concentration of power reemerge on the Net. Some sites have an enormous number of viewers, but the vast majority of sites have very few. [Slide shows Clay Shirky’s Power Law distribution chart, and a photo of Clay]
Albert-László Barabási has shown that this type of clustering is typical of networks even in nature, and it is certainly true of the Internet.
Second, on the Internet, without paper to anchor it, knowledge often loses its context. A tweet…
Slips free into the wild…
It gets retweeted and perhaps loses its author.
And then gets retweeted and loses its meaning. And now it circulates as fact. [My example was a tweet about the government not allowing us to sell body parts morphing into a tweet about the government selling body parts. I made it up.]
Third, the Internet provides an incentive to overstate.
Fourth, even though the Net contains lots of different sorts of people and ideas and thus should be making us more open in our beliefs…
… we tend to hang out with people who are like us. It’s a natural human thing to prefer people “like us,” or “people we’re comfortable with.” And this leads to confirmation bias — our existing beliefs get reinforced — and possibly to polarization, in which our beliefs become more extreme.
This is known as the echo chamber problem, and it’s a real problem. I personally think it’s been overstated, but it is definitely there.
So there are four problems with networked knowledge. Not one of them is new. Each has an analog from before the Net.
The loss of context has always been with us. Most of what we believe we believe because we believe it, not because of evidence. At its best we call it, in English, common sense. But history has shown us that common sense can include absurdities and lead to great injustices.
Yes, the Net is not a flat, totally equal place. But it is far less centralized than the old media were, where only a handful of people were allowed to broadcast their ideas and to choose which ideas were broadcast.
Certainly the Internet tends towards overstatement. But we have had mass media that have been built on running over-stated headlines. This newspaper [Weekly World News] is a humor paper, but it’s hard to distinguish from serious broadcast news.
And speaking of Fox, yes, on the Internet we can simply stick with ideas that we already agree with, and get more confirmed in our beliefs. But that too is nothing new. The old media actually were able to put us into even more tightly controlled echo chambers. We are more likely to run into opposing ideas — and even just to recognize that there are opposing ideas — on the Net than in a rightwing or leftwing newspaper.
It’s not simply that all the old problems with knowledge have reemerged. Rather, they’ve re-emerged in an environment that offers new and sometimes quite substantial ways around them.
For example, if something loses its context, we can search for that context. And links often add context.
And, yes, the Net forms hubs, but as Clay Shirky and Chris Anderson have pointed out, the Net also lets a long tail form, so that voices that in the past simply could not have been heard, now can be. And the activity in that long tail surpasses the attention paid to the head of the tail.
Yes, we often tend to overstate things on the Net, but we also have a set of quite powerful tools for pushing back. We review our reviews. We have sites like the well-regarded American site, Snopes.com, that will tell you if some Internet rumor is true. Snopes is highly reliable. Then we have all of the ways we talk with one another on the Net, evaluating the truth of what we’ve read there.
And, the echo chamber is a real danger, but we also have on the Net the occasional fulfillment of our old ideal of being able to have honest, respectful conversations with people with whom we fundamentally disagree. These examples are from Reddit, but there are others.
So, yes, there are problems of knowledge that persist even when our technology of knowledge changes. That’s because these are not technical problems so much as human problems…
…and thus require human solutions. And the fundamental solution is that we need to become more self-aware about knowledge.
Our old technology — paper — gave us an idea of knowledge that said that knowledge comes from experts who are filtered, printed, and then it’s settled, because that’s how books work. Our new technology shows us we are complicit in knowing. In order to let knowledge get as big as our new medium allows, we have to recognize that knowledge comes from all of us (including experts), it is to be linked, shared, discussed, argued about, made fun of, and is never finished and done. It is thoroughly ours – something we build together, not a product manufactured by unknown experts and delivered to us as if it were more than merely human.
The required human solution therefore is to accept our human responsibility for knowledge, to embrace and improve the technology that gives knowledge to us (for example, by embracing Open Access and the culture of linking and of the Net), and to be explicit about these values.
Becoming explicit is vital because our old medium of knowledge did its best to hide the human qualities of knowledge. Our new medium makes that responsibility inescapable. With the crumbling of the paper authorities, it becomes more urgent than ever that we assume personal and social responsibility for what we know.
Knowing is an unnatural act. If we can remember that (remember the human role in knowing), we now have the tools and connections that will enable even everyday knowledge to scale to a dimension envisioned in the past only by the mad and the God-inspired.
Greg Silverman [twitter:concentricabm], the CEO of Concentric, has a good post at CMS Wire about the democratization of market analysis. He makes what seems to me to be a true and important point: market researchers now have the tools to enable them to slice, dice, deconstruct, and otherly-construct data without having to rely upon centralized (and expensive) analytics firms. This, says Greg, changes not only the economics of research, but also the nature of the results:
The marketers’ relationships with their analytics providers are currently strained as a service-based, methodologically undisclosed and one-off delivery of insights. These providers and methods are pitted against a new generation of managers and executives who are “data natives” —professionals who rose to the top by having full control of their answering techniques, who like to be empowered and in charge of their own destinies, and who understand the world as a continuous, adaptive place that may have constantly changing answers. This new generation of leaders likes to identify tradeoffs and understand the “grayness” of insight rather than the clarity being marketed by the service providers.
He goes on to make an important point about the perils of optimization, which is what attracted the attention of Eric Bonabeau [twitter:bonabeau], whose tweet pointed me at the post.
The article’s first point, though, is interesting from the point of view of the networking of knowledge, because it’s not an example of the networking of knowledge. This new generation of market researchers are not relying on experts from the Central Authority, they are not looking for simple answers, and they’re comfortable with ambiguity, all of which are characteristics of networked knowledge. But, at least according to Greg’s post, they are not engaging with one another across company boundaries, sharing data, models, and insights. I’m going to guess that Greg would agree that there’s more of that going on than before. But not enough.
If the competitive interests of businesses are going to keep their researchers from sharing ideas and information in vigorous conversations with their peers and others, then businesses simply won’t be as smart as they could be. Openness optimizes knowledge system-wide, but by definition it doesn’t concentrate knowledge in the hands of a few. And this may form an inherent limit on how smart businesses can become.
Tagged with: 2b2k
Date: May 30th, 2013 dw
Amanda Alvarez has a provocative post at GigaOm:
There’s an epidemic going on in science: experiments that no one can reproduce, studies that have to be retracted, and the emergence of a lurking data reliability iceberg. The hunger for ever more novel and high-impact results that could lead to that coveted paper in a top-tier journal like Nature or Science is not dissimilar to the clickbait headlines and obsession with pageviews we see in modern journalism.
The article’s title points especially to “dodgy data,” and the item in this list that’s by far the most interesting to me is the “data reliability iceberg,” and its tie to the rise of Big Data. Amanda writes:
…unlike in science…, in big data accuracy is not as much of an issue. As my colleague Derrick Harris points out, for big data scientists the ability to churn through huge amounts of data very quickly is actually more important than complete accuracy. One reason for this is that they’re not dealing with, say, life-saving drug treatments, but with things like targeted advertising, where you don’t have to be 100 percent accurate. Big data scientists would rather be pointed in the right general direction faster (and course-correct as they go) than have to wait to be pointed in the exact right direction. This kind of error-tolerance has insidiously crept into science, too.
But, the rest of the article contains no evidence that the last sentence’s claim is true because of the rise of Big Data. In fact, even if we accept that science is facing a crisis of reliability, the article doesn’t pin this on an “iceberg” of bad data. Rather, it seems to be a melange of bad data, faulty software, unreliable equipment, poor methodology, undue haste, and o’erweening ambition.
The last part of the article draws some of the heat out of the initial paragraphs. For example: “Some see the phenomenon not as an epidemic but as a rash, a sign that the research ecosystem is getting healthier and more transparent.” It makes the headline and the first part seem a bit overstated — not unusual for a blog post (not that I would ever do such a thing!) but at best ironic given this post’s topic.
I remain interested in Amanda’s hypothesis. Is science getting sloppier with data?
Tagged with: 2b2k • big data
Date: May 26th, 2013 dw