Joho the Blog » liveblog

October 16, 2012

[eim] [semtechbiz] Viacom’s semantic approach

I’m at the Semantic Technology & Business conference in NYC. Matthew Degel, Senior Vice President and Chief Architect at Viacom Media Networks, is talking about “Modeling Media and the Content Supply Chain Using Semantic Technologies.” [NOTE: Liveblogging. Getting things wrong. Mangling words. Missing points. Over- and under-emphasizing the wrong things. Not running a spellpchecker. You are warned!]

Matthew says that the problem is that we’re “drowning in data but starved for information.” There is a “thirst for asset-centric views.” And of course, Viacom needs to “more deeply integrate how property rights attach to assets.” And everything has to be natively local, all around the world.

Viacom has to model the content supply chain in a holistic way. So, how to structure the data? To answer, they need to know what the questions are. Data always has some structure. The question is how volatile those structures are. [I missed about 5 mins; had to duck out.]

He shows an asset tree, “relating things that are different yet the same,” with SpongeBob as his example: TV series, characters, the talent, the movie, consumer products, etc. Stations are not allowed to air a commercial with the voice actor behind Spongey, Tom Kenny, during the showing of the SpongeBob show, so they need to intersect those datasets. Likewise, the video clip you see on your set-top box’s guide is separate from, but related to, the original. For doing all this, Viacom is relying on inferences: A prime time version of a Jersey Shore episode, which has had the bad language censored out of it, is a version of the full episode, which is part of the series which has licensing contracts within various geographies, etc. From this Viacom can infer that the censored episode is shown in some geography under some licensing agreements, etc.
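
A minimal sketch of that kind of inference, assuming a toy list of subject-predicate-object triples. The asset names, the predicate names (versionOf, partOf, licensedIn), and the geography label are all hypothetical placeholders for illustration, not Viacom's actual model or ontology:

```python
# Toy triples (all identifiers are hypothetical, for illustration only).
TRIPLES = [
    ("JerseyShore_Ep_PrimeTime", "versionOf", "JerseyShore_Ep_Full"),
    ("JerseyShore_Ep_Full", "partOf", "JerseyShore_Series"),
    ("JerseyShore_Series", "licensedIn", "Geography_A"),
]

# Predicates along which licensing is assumed to be inherited.
INHERITING = {"versionOf", "partOf"}


def inferred_licenses(asset, triples):
    """Follow versionOf/partOf links upward and collect licensedIn targets."""
    seen = set()
    frontier = [asset]
    licenses = set()
    while frontier:
        node = frontier.pop()
        if node in seen:
            continue  # guard against cycles in the data
        seen.add(node)
        for subject, predicate, obj in triples:
            if subject != node:
                continue
            if predicate == "licensedIn":
                licenses.add(obj)
            elif predicate in INHERITING:
                frontier.append(obj)
    return licenses


result = inferred_licenses("JerseyShore_Ep_PrimeTime", TRIPLES)
```

Here the censored episode has no licensing triple of its own. The geography is reached only by walking from the version to the full episode to the series, which is the chain described in the talk. A real triplestore with an OWL reasoner would presumably do this declaratively rather than with a hand-written loop.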

“We’ve tried to take a realistic approach to this.” As excited as they are about the promise, “we haven’t dived in with a huge amount of resources.” They’re solving immediate problems. They began by making diagrams of all of the apps and technologies. It was a mess. So, they extracted and encoded into a triplestore all the info in the diagram. Then they overlaid the DR data. [I don't know what DR stands for. I'm guessing the D stands for Digital, and the R might be Resource.] Further mapping showed that some apps that they weren’t paying much attention to were actually critical to multiple systems. They did an ontology graph as a London Underground map. [By the way, Gombrich has a wonderful history and appreciation of those maps in Art and Representation, I believe.]

What’s worked? They’re focusing on where they’re going, not where they’ve been. This has let them “jettison a lot of intellectual baggage” so that they can model business processes “in a much cleaner and effective way.” Also, OWL has provided a rich modeling language for expressing their Enterprise Information Model.

What hasn’t worked?

  • “The toolsets really aren’t quite there yet.” He says that based on the conversations he’s had today, he doesn’t think anyone disagrees with him.

  • Also, the modeling tools presume you already know the technology and the approach. Also, the query tools presume you have a user at a keyboard rather than as a backend of a Web service capable of handling sufficient volume. For example, he’d like “Crystal Reports for SPARQL,” as an example of a usable tool.

  • Visualization tools are focused on interactive use. You pick a class and see the relationships, etc. But if you want to see a traditional ERD diagram, you can’t.

  • Also, the modeling tools present a “forward-bias.” E.g., there are tools for turning schemas into ontologies, but not for turning ontologies into a reference model for schema.

Matthew makes some predictions:

  • The toolsets will develop into robust tools

  • Semantic tech will enable queries such as “Show me all Madonna interviews where she sings, where the footage has not been previously shown, and where we have the license to distribute it on the Web in Australia in Dec.”
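
To make the shape of that last query concrete, here is a rough sketch of how it breaks down into separate conditions over asset, content, and rights data. The Clip record, its field names, and the sample clips are hypothetical and not anything Viacom described; a semantic store would express this as a query over linked data rather than a Python filter:

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    # Hypothetical record; the field names are assumptions for illustration.
    title: str
    artist: str
    kind: str
    contains_singing: bool
    previously_shown: bool
    # Each license is a (platform, territory, month) tuple.
    licenses: set = field(default_factory=set)


def matches(clip, artist, platform, territory, month):
    """True when a clip satisfies every condition of the example query."""
    return (
        clip.artist == artist
        and clip.kind == "interview"
        and clip.contains_singing
        and not clip.previously_shown
        and (platform, territory, month) in clip.licenses
    )


CLIPS = [
    Clip("clip-a", "Madonna", "interview", True, False, {("web", "Australia", "Dec")}),
    Clip("clip-b", "Madonna", "interview", True, True, {("web", "Australia", "Dec")}),
    Clip("clip-c", "Madonna", "interview", True, False, {("web", "US", "Dec")}),
    Clip("clip-d", "Madonna", "concert", True, False, {("web", "Australia", "Dec")}),
]

hits = [c.title for c in CLIPS if matches(c, "Madonna", "web", "Australia", "Dec")]
```

Only the first clip passes: the others fail on prior airing, territory, or content type. The point of the prediction is that these conditions live in different systems today (asset metadata, content description, rights) and have to be joined.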


October 11, 2012

[dpla] DPLA afternoon session

It’s the end of the workstream day of the DPLA Midwest meeting. Each of the three workstream meetings is reporting back to the general group.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.


Emily Gore from the Content stream: What kind of guidance should we develop for interested content providers? The group wants to have a strategic collection development plan draft by the end of December. “What is our role with regard to advocacy” for content currently under copyright? Also, the group talked about the hub pilot project. Various participants in that pilot were in the room.


SJ Klein from the Technical workstream: There was a lively discussion this afternoon, primarily about the design of the front end. How to make the frontend experience help people become contributors? They also talked about the Chattanooga hackathon Nov. 8-9. Tools for making working with metadata easier. Packaging tools that match potential contributors with a hub. Metadata purgatory for metadata that has been contributed but doesn’t meet the standards.


Maureen Sullivan and John Palfrey report on the Governance group: The next steps are to take the barebone by-laws and flesh them out. There were many discussions about whether DPLA the 501(c)(3) should be a membership organization, but the general consensus is no. (Paul Courant made the point that many institutions shy away from becoming members because that makes them liable.) Rather, it would be good to have participation from groups and people with specific areas of expertise. There was a lot of energy about expanding on the statement of principles, including adding an explicit commitment to accessibility. There was strong support for continuing to see the DPLA as a public/private enterprise. John Bracken made the point that DPLA should view itself as a network, not as a heavyweight organization.


Maureen points out that the workstreams have converged. She says that “contributor” seems to be a better word than “member.” We need to be flexible about how people will come together to do the work that’s required. And we should be thinking of ourselves as advocates, a force for change to improve the lives of people in this country and around the world.


October 1, 2012

[sogeti] Andrew Keen on Vertigo and Big Data

Andrew Keen is speaking. (I liveblogged him this spring when he talked at a Sogeti conference.) His talk’s title: “How today’s online social revolution is dividing, diminishing, and disorienting us.” [Note: Posted without rereading because I'm about to talk. I may go back and do some cleanup.]

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Andrew opens with an anecdote. He grew up as a Jew in Britain. His siblings were split between becoming lawyers or doctors. But his mother asked him if he’d like to be the anti-Christ. So, now he’s grown up to become the anti-Christ of Silicon Valley.

“I’m not usually into intimacy,” but look at each other. How much do we know about each other? Not much. One of the great joys is getting to know one another. By 2017 there will be 15x more data flowing over the network. Billions of intelligent devices. “The world we are going into is one in which 20-25 years…you strangers will show up in a big city in London and you’ll know everything about each other.” You’ll know one another’s histories, interests…

“My argument is that we’re all stuck in Digital Vertigo. We’re all participants in a digital noir.” He shows a clip from Vertigo. “In the future these kinds of scenes won’t be possible. There won’t be private detectives…So this movie about the unfolding of understanding between strangers won’t happen.” What happens to policing? “Will we be guilty if we don’t carry our devices?” [SPOILERS] The blonde in this movie doesn’t exist. She’s a brunette shopgirl from Kansas. “The movie is about a deception…A classic Hitchcock narrative of falling in love with something that doesn’t exist. A good Catholic narrative…It’s a warning about falling in love with something that is too good to be true.” That’s what we’re doing with social media and big data. We’re told big data brings us together. They tell us the Net gives us the opportunity for human beings to come together, to realize themselves as social beings. Big data allows us to become human.

This is about more than the Net. The revolution that Carlota is talking about is one in which the Net becomes central in the way we live our lives. Fifteen years ago, Doc Searls, David W., and I would be marginal computer nerds, and now our books can be found in any book store. [Doc is in the audience also.]

He shows a clip from The Social Network: “We lived on farms. Now we’re going to live on the Internet.” It’s the platform of 21st century life. This is not a marginal or media issue. It is about the future of society. Many people think this network will solve the core problems of life. We now have an ecosystem of apps in the business of eliminating loneliness. E.g., Highlight, “the darling of the recent SxSW show.” They say it’s “a fun way to learn more about people nearby.” Then he shows a clip from The Truman Show. His point: We’re all in our own Truman Shows. The destruction of privacy. No difference between public and private. We’re being authentic. We’re knowingly involving ourselves in this.

A quote: “Vertigo is the ultimate critics’ film because it is a dreamlike film about people who are not sure who they are but who are busy reconstructing themselves and each other to fit a kind of cinema ideal of the ideal soul mate.” Substitute social media for film. We’re losing what it means to be human. We’re destroying the complexity of our inner lives. We’re only able to live externally. [This is what happens when your conceptual two poles are public and private. It changes when we introduce the term "social."]

Narcissism isn’t new. Digital narcissism has reached a climax. As we’re given personal broadcasting platforms, we’re increasingly deluded into thinking we’re interesting and important. Mostly it reveals our banality, our superficiality. [This is what you get when your conceptual poles are taken from broadcast media.]

It’s not just digital narcissism. “Visibility is a trap,” said Foucault. Hypervisibility is a hypertrap. Our data is central to Facebook and others becoming viable businesses. The issue is the business model. Data is oil, and it’s owned by the rich. Zuckerberg, Reid Hoffman, et al., are data barons. Read Susan Cain’s “Quiet”: introverts drive innovation. E.g., Steve Wozniak. Sharing is not good for innovation. Discourage your employees from talking with one another all the time. It makes them less thoughtful. It creates groupthink. If you want them to think for themselves, “take away their devices and put them in dark rooms.”

It’s also a trap when it comes to govt. Many govts are using the new tech to spy on their citizens. Cf. Bentham’s panopticon, which was corrupted into 1984 and industrial totalitarianism. We need to go back to the Industrial Age and JS Mill — Mill’s On Liberty is the best antidote to Bentham’s utilitarianism. [? I see more continuity than antidote.]

To build a civilized golden age: 1. There is a role for govt. The market needs regulation. 2. “I’m happy the EU is working on this…and came out against FB facial recognition software. … We have a right to forget.” “It’s the most unhuman of things to remember everything.” “We shouldn’t idolize the never-forgetting nature of Big Data.” “To forget and forgive is the core essence of being human.” 3. We need better business models. We don’t want data to be the new oil. I want businesses that charge. “The free economy has been a catastrophe.”

He shows the end of The Truman Show. [SPOILER] As Truman enters reality, it’s a metaphor for our hope. We can only protect our humanness by retreating into dark, quiet places.

He finishes with a Vermeer that shows us a woman about whom we know nothing. “In our Age of Facebook, we need to build a world in which the woman in blue can read that letter, not reveal herself, not reveal her mystery…”

Q: You’re surprisingly optimistic today. In the movie Vertigo, there’s an inevitability. How about the inevitability of this social movement? Are you tilting at windmills?

Idealists tilt at windmills. People are coming around to understanding that the world we’re collectively creating is not quite right. It’s making people uneasy. More and more books, articles, etc., say that FB is deeply exploitative. We’re all like Jimmy Stewart in Vertigo. The majority of people in the world don’t want to give away their data. As more of the traditional world comes onto the Net, there will be more resistance to collapsing the private and the public. Our current path is not inevitable. Tech is religion. Tech is not autonomous, not a first mover. We created Big Data and need to reestablish our domination over it. I’m cautiously optimistic. But it could go wrong, especially in authoritarian regimes. In Silicon Valley people say privacy is dead, get over it. But privacy is essential. Once we live this public ideal, then who are we?


[2b2k][sogeti] Big Data conference session

I’m at Sogeti’s annual executive conference, which brings together about 80 CEOs. I’m here with Doc Searls, Andrew Keen, and others. I’ve spoken at other Sogeti events, and I am impressed with their commitment to providing contrary points of view, including views at odds with their own corporate interests. (My one complaint: They expect all attendees to have an iPad or iPhone so that they can participate in the realtime survey. Bad symbolism.) (Disclosure: They’re paying me to speak. They are not paying me to say something nice about them.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Menno van Doorn begins by talking about the quantified self movement, claiming that they sometimes refer to themselves as “datasexuals” :) All part of Big Data, he says. To give us an idea of bigness, he relates the Legend of Sessa: “Give me grain, doubling the amount for each square on a chessboard.” Exponential growth meant that by the time you hit the second half of the chessboard, you’re in impossible numbers. Experts say that’s where we were in 2006 when it comes to data. But “there’s no such thing as too much data.” “Big Data is powering the next industrial revolution. Data is the new oil.”
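
The chessboard arithmetic is easy to check. A minimal sketch, assuming the usual telling of the legend in which the first square holds a single grain (the talk as reported here doesn't specify the starting amount):

```python
def grains_on_square(n):
    """Grains on square n (1-based), assuming square 1 holds one grain."""
    return 2 ** (n - 1)


def total_through(n):
    """Total grains on squares 1..n; the geometric series sums to 2**n - 1."""
    return 2 ** n - 1


first_half = total_through(32)    # end of the first half of the board
whole_board = total_through(64)   # the full 64 squares

print(f"First half:  {first_half:,}")    # 4,294,967,295
print(f"Whole board: {whole_board:,}")   # 18,446,744,073,709,551,615
```

The first square of the second half alone holds more than the entire first half combined, which is why the second half of the board is where the numbers stop being imaginable.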

Big Data is about (1) lots of data, (2) at high velocity, (3) using in a variety of ways. (“volume, velocity, variety.”) Michael Chui says that there’s billions in revenues to gain, including from efficiencies. But, Chui says, there are no best practices. The value comes from “human exhaust.” I.e., your digital footprint, what you leave behind in your movement through the Net. Menno thinks of this as “your recorded future.”

Three examples:

1. Menno points to Target, a company that can predict life-changing events among its customers. E.g., based on purchases of 25 products, they can predict which customers are pregnant and roughly when they are due. But, this led to Target sending promotional materials for pregnancy to young girls whose parents learned this way that their daughters were pregnant.

2. In SF, they send out police cars to neighborhoods based on 14-day predictions of where crime will occur, based on data about prior crime patterns.

3. Schufa, a German credit agency, announced they’d use social media to assess your credit worthiness. Immediately a German Minister said, “Schufa cannot become the Big Brother of the business world.”

Two forces are in contention and will determine how much Big Data changes us. Today, the conference will look at the dawn of the age of big data, and then how disruptive it will be for society (the session Keen and I are in). Day 2: Bridging the gap to the new paradigm, Big Data’s fascinating future, and Decision Time: Taming Big Brother.

 


Carlota Perez, Prof. of Tech and Socio-Economic Development, from Venezuela speaks now. She is a “neo-Schumpeterian.” She says her role in the conference is to “locate the current crisis.” What is the real effect on innovation, and why are we only midway along in feeling the impact?

There have been 5 tech revolutions in the past 240 years: 1. 1771 Industrial rev. 2. 1829 Age of steam, coal and railways. 3. 1875 Steel and heavy engineering (the first globalization). 4. Age of the automobile, oil, petrochem and mass production. 5. 1971 Age of info tech and telecom. We’re only halfway through that last one. The next revolution queued up: age of biotech, bioelectronics, nanotech, and new materials. [I'm surprised she doesn't count telegraph + radio + telephone, etc., as a comms rev. And I'd separate the Net as its own rev. But that's me.]

Lifecycle of a tech rev: gestation, induction, deployment, exhaustion. The “big bang” tends to happen when the prior rev is reaching exhaustion. The structure of revs: new cheap inputs, new products, new processes. A new infrastructure arises. And a constellation of new dynamic industries that grow the world economy.

Why call these “revolutions”, she asks? Because they transform the whole economy. They bring new organizational principles and new best practice models. I.e., a new “techno-economic paradigm.” E.g., we’ve gone from mass production to flexible production. Closed pyramids to open networks. Stable routines to continuous improvement. “Information technology finds change natural.” From human resources to human capital (from raw materials to value). Suppliers and clients to value network partners. Fixed plans to flexible strategies. Three-tier markets (big, medium, small) to hyper-segmented markets. Internationalization to globalization. Information as costly burden to info as asset. Together, these constitute a radical change in managerial common sense.

The diffusion process is broken in two: Bubble, followed by a crash, and then the Golden Age. During the bubble, financial capital forces diffusion. There is income and demand polarization. Then the crash. Then there is an institutional recomposition, leading to a golden age in which everyone benefits. Production capital takes over from financial capital (driven by the govt), and there is better distribution of income and demand.

She looks at the 5 revs, and finds the same historic pattern that she just sketched.

Two major differences between installation and deployment: 1. Bubbles vs. patient (= long-term) capital. 2. Concentrated innovation to modernize industries vs. innovation in all industries that use the new technologies. “Understanding this sequence is essential for strategic thinking.”

The structure of innovation in deployment: a new coherent fabric of the economy emerges, leading to a golden age. Also, oligopolies emerge which means there’s less unhelpful competition. (?)

Example of prior rev: home electrical appliances: In the installation period, we had a bunch of electric utilities going into homes in the 1910s and 1930s. During the revision, we get a few more. But then in the 1950-70s, we get a surge of new appliances, including tape recorder, microwave, even the electric toothbrush. It’s enabled by universal electricity and driven by suburbanization. It’s the same pattern if you look at textile fibers, from rayon and acetate during installation, to a huge number during deployment. E.g., structural and packaging plastics: installation brought bakelite, polystyrene and polyethylene, and then a flood of innovation during deployment. “The various systems of the ICT revolution will follow a similar sequence.” [Unless it follows the Tim Wu pattern of consolidation, e.g., everyone being required to use an iPad at a conference.] During installation period, ICT was in constant supply push mode. Now must respond to demand pull. “The paradigm and its potential are now understood by all. Demand (in vol and nature) becomes the driving force.”

This shifts the role of the CIO. To modernize a mature company during installation, you brought in an expert in modernization, articulating the hw and sw being pushed by the suppliers. During the deployment phase, in a modern company that is innovating for strategic expansion, the CIO is an expert in strategy, specifying needs and working with suppliers. “The CIO is no longer staff. S/he must be directly involved in strategy.”

There are 3 main forces for innovation in the next 2-3 decades, as is true for all the revs. 1. Deepening and widening of the ICT tech rev, responding to user needs. 2. The users of ICT across all industries and activities. 3. The gestation of the next rev (probably biotech, nanotech, and new materials).

Big Data is likely to have a big role in each of those directions.

Q: Why are we only 50% of the way through?

A: Because the change after the recession is like opening a dam. Once you get to the point where you can have a comfortable innovation prospective, imagine the market possibilities.

Q: What can go wrong?

A: Governments. Unfettered free markets are indispensable for the installation process. Lightly guided markets are needed in the golden age. Free markets work when you need to force everyone to change. But now no longer: The state has to come in. But govts are drunk with free markets. Now finance is incompetent. “They don’t dare invest in real things.” “Ideology is so strong and the understanding of history is so shallow that we’re not doing the right thing.”

 


Christopher Ahlberg speaks now. He’s the founder of Recorded Future. His topic: “Turning the Web into Predictive Signals.”

We see events like Arab Spring and wonder if we could have predicted them. Three things are going on: 1. Moving from smaller to larger datasets. 2. From structured to unstructured data (from numbers to text). 3. From corporate data to Internet/Web.

There’s a “seismic shift in intelligence.” “Temporal indexing of the Web enables Web intelligence.” “The Web is not organized for finding data; it’s about finding documents.” Can we create structure for the Web we can use for analysis? A lot of work has been done on this. Why is this possible now? Fast math, large, fast storage, web harvesting, and linguistic analysis progress.

His company looks for signals in human language. E.g., temporal signals. That can turn up competitive info. But human language is tough to deal with. But also when something happens — e.g., Haitian earthquake — there are patterns in when people show up: helpers, doctors, military, do-gooder actors, etc. There tends to be a flood of notifications immediately afterwards. The Recorded Future platform does the linguistic analysis.

He gives an example: What’s going to happen to Merck over the next 90 days? Some is predictable: There will be a quarterly financial conference call. A key drug is up for approval. Can we look into the public conversations about these events, and might this guide our stock purchases? And beyond Merck, we could look at everything from cyber attacks to sales opportunities.

Some examples. 1. Monitoring unrest. Last week there were protests against Foxconn in China. Analysis of Chinese media shows that most of those protests were inland, while corporate expansion is coming in coastal areas. Or look at protests against pharmaceuticals for animal testing.

Example 2: Analyzing cyber threats. Hackers often try out an approach on a small scale and then go larger. This can give us warning.

Example 3: Competitive intelligence. When is there a free space — announcement-free — when you can get some attention? Example 4: Lead generation. E.g., look for changes in management. (New marketing person might need a new PR agency.) Example 5: Trading patterns. E.g., if there’s bad news but insiders are buying.

Conclusion: As we move from small to large datasets, structured to unstructured, and from inside to outside the company, we go from surprise to foresight.

Q: What is the question you cannot answer?

A: The situations that have low frequency. It’s important that there be an opportunity for follow-up questions.

Q: What if you don’t know what the right question is?

A: When it’s unknown unknowns, you can’t ask the right question. But the great thing about visualization is that it helps people ask questions.

Q: How to distinguish fact from opinion on Twitter, etc.?

A: Or NYT vs. Financial Post. There isn’t a simple answer. We’re working toward being able to judge sources based on known outcomes.

Q: Do your predictions get more accurate the more data you have?

A: Generally yes, but it’s not always that simple.


July 24, 2012

[preserve] Lightning Talks

A series of 5-min lightning talks.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Christie Moffatt of the National Library of Medicine talks about a project collecting blogs talking about health. It began in 2011. The aim is to understand Web archiving processes and how this could be expanded. Three examples: Wheelchair Kamikaze. Butter Compartment. Doctor David’s Blog. They were able to capture them pretty well, but with links to the outside, out-of-scope content, and content protected by passwords, there’s a question about what it means to “capture” a blog. The project has shown the importance of test crawls, and attending to the scope, crawling frequency and duration. The big question is which blogs they capture. Doctors who cook? Surgeons who quilt? Other issues: Permissions. Monitoring when the blogs end, change focus, or move to a new url. E.g., a doctor retired and his blog changed focus to fishing.

Terry Plum from Simmons GSLIS talks about a digital curriculum lab. It was set up to pull in students and faculty around a few different areas. They maintain a collection of open source applications for archives, museums, and digital libraries. There are a variety of teaching aids. The DCL is built into a Cultural Heritage Informatics track at Simmons.

Daniel Krech of Library of Congress works at the Repository Development Center. The RDC works with people managing collections. The RDC works on human-machine interfaces. One project involves “sets” (collections). “We’ve come up with some new and interesting ways to think about data.” They use knot, set, and hyper theory, but they also sometimes use a physical instantiation of a set — it looks like knotted yarn — to help understand some very abstract ideas.

Kelsey [Keley?] Shepherd of Amherst represents the Five College Digital Task Force. (She begins by denying that the Scooby Gang was based on the five colleges.) They don’t share a digital library but want to collaborate on digital preservation. They are creating shared guidelines for preservation-ready digital objects. They are exploring models for funding and organizational structure. And they are collaborating on implementing a trusted digital preservation repository. But each develops its own digital preservation policy.

Jefferson Bailey talks about Personal Digital Archiving at the Library of Congress. He talks about the source diary for The Midwife’s Tale. That diary sat on a shelf for 200 years before being discovered as an invaluable window on the past. Often these archives are the responsibility of the record creators. The LoC therefore wants to support community archives, enthusiasts, and citizen archivists. They are out and about, promoting this. See digitalpreservation.gov.

Carol Minton Morris with DuraSpace and the NDSA (National Digital Stewardship Alliance) talks about funding archiving through “hip pocket resources.” They’re looking into Kickstarter.com. Technology and publishing projects at Kickstarter have only raised $9M out of the $100M raised there; most of it goes to the arts. She points to some other microfinance sites, including IndieGoGo and DonorsChoose.org. She encourages the audience to look into microfinancing.

Kristopher Nelson from LoC Office of Strategic Initiatives talks about the National Digital Stewardship Residency, which aims at building a community of professionals who will advance digital archiving. It wants to bridge classroom education and professional experience, and to provide some real-world experience. It will start in June 2013 with 10 residents participating in the 9 month program.

Moryma Aydelott, program specialist at LoC talks about Tackling Tangible Metadata. The LoC’s digital data is on lots of media: 300T on everything from DVDs to DAT tapes and Zip disks. Her group provides a generic workflow for dealing with this stuff — any division, any medium. They have a wheeling cart for getting at this data. They make the data available “as is.” It can be hard to figure out what type of file it is, and what application is needed to read it. Right now, it’s about getting it on the server. They’ve done about 6.5T of material, 700-800 titles, so far. But the big step forward is in training and in documenting processes.


[preserve] Anil Dash on archiving the Internet

Anil Dash (one of my heroes, and also hilarious) is talking at a Library of Congress event on Digital Preservation, part of the National Digital Information Infrastructure and Preservation Program. Anil’s talk is called “Make a Copy.” (Anil is now at ThinkUp.)

Live Blogging

Getting things wrong. Making fluid talks sound choppy. Missing important points. Not running a spellpchecker. This is not a reliable report. You have been warned, people!

Anil says he’s a geek interested in the social impacts of tech on culture, govt, and more. He started Expert Labs a few years ago to enable tech to talk with policy makers. Expert Labs built ThinkUp. He wants to talk about the issues that this group of archivists confronts every day that the tech community doesn’t know about. He warns us that this means he’s starting with depressing stuff. So…

…Picture the wholesale destruction of your wedding photos, or other deeply personal mementos. They are being destroyed by an exclusive, private, ivy league club: Facebook. FB treats memories as disposable. “Maybe if I were a 25 year old billionaire, I’d think of these as disposable, too.” “The terms of service of digital social networks trumps the Constitution in terms of what people can share and consume.” Our ordinary conversations are treated as disposable, at Facebook, Twitter, Microsoft, etc. They explicitly say that they can delete all of your content at any time for any reason. “100s of millions of Americans have accepted that. That should be troubling to those of us who care about preservation.”

You can opt out, but not without compromising your career and having severe social cost. And you can’t rely upon the rest of the Web, because “there’s a war raging against the open Web.” “The majority of time spent on the Web in the US is spent in an application,” not on pages. Yet we’re still archiving Web pages but not those applications. “They are gaslighting the Web,” Anil says, referring to the old movie. E.g., you can leave FB comments on Anil’s blog, but when you click from FB to his blog, FB gives you a warning that the site you’re going to is untrustworthy. “I don’t do that to them,” he says, even though they’ve consistently “moved the goal posts” on privacy, and he has registered his site with FB.

After blogging this, Anil got a message from a tech at FB saying that it was a bug that’s being fixed. But suppose he hadn’t blogged it, or FB had missed it? “The best case scenario is that we’re left fixing their bugs.” He adds, “That’s pretty awful, because they’re not fixing our bugs. And we’re helping them to extend their prisons over the Web.” And is the only way to get our words preserved to agree to Twitter’s ToS so that we’ll get archived by the Library of Congress, which has been archiving tweets? Anil says that he’s conscientiously tried to archive his own works for his new baby, but it shouldn’t rely on that much effort by an individual.

And, he says, that’s just the Web, not the apps. You can’t crawl his phone and preserve his photos. And when FB buys Instagram which has a billion photos, and only 5% of the content FB has bought has been preserved…? And yet the Instagram acquisition is considered a success by the Valley. If you’re a Pharaoh, your words are preserved. Anil is worried about the rest of the conversations.

“If I were to ask you what is the most watched form of video, what would you say?” Anil guesses that it’s animated gifs. And we don’t archive them. “We’re talking about the wrong things.” We’re arguing that we should be using Ogg Vorbis, but the proprietary forms are the ones that are most used. The standards ecology is getting more complicated. “We need to reflect back to the tech community that they have an obligation to think about preservation.” They’ve got money and resources. Shouldn’t they be contributing?

We’re losing metadata, he says. You can’t find Instagram photos because they have no Web presence and are short on metadata. Flickr, on the contrary, has lots of metadata. The Instagram owners are now multi-millionaires and are undermotivated to fix this problem. Maybe we’ll get something in 5 years, but then we will have lost a full decade of people’s photos. There’s no way to assign Instagrams open licenses at this point.

Indeed, “they are bending the law to make archiving illegal.” You can’t hack your own phone. You can’t copy your own photos from one device to another.

“Content tied to devices dies when those devices become obsolete.” The obsolescence cycle is becoming faster every year.

So, what should we do?

The technologists building these devices don’t know about the work of archivists. They don’t know that what this group is doing is meaningful. Many are young and don’t yet have experiences they want to preserve. They may not have confronted their own mortality yet.

But, the Web at its base level is about making copies. So, if we get things on the Web as opposed to in apps, we win. Apps should be powered by, or connected to, a Web experience. How can we take advantage of the fact that every time you go to a Web page, you’re copying it? How can we take advantage of the CDNs, which are already doing a lot of the work needed for preservation?

“There is also a growing class of apps that want to do the right thing.” E.g., TimeHop, that sends you an email reminding you of what you tweeted, etc., a year ago. This puts a user experience around the work of preservation. They’re marketing the value of the preservation community, but they don’t know it yet. Or Brewster, an iPhone address book that hooks up to all the address books you have on social services, reminding you to connect with people you haven’t touched in a while. This is a preservation app, although Brewster doesn’t know.

Then, how do we mine our personal archives? (He notes that his company’s tool, ThinkUp, is in this space.) His Nike fuel band captures data about his physical activity. The Quantified Self movement is looking at all sorts of data. “They too are preservationists, and they don’t know it.”

Then there are institutions. People revere the Library of Congress. Senior people at Twitter speak in a hushed voice when they say, “The tweets go to the LoC.” Take advantage of the institution’s authority. Don’t be shy. Meet them halfway. And say, “By the way, look at my cool email address.”

“PR trumps ToS.” ThinkUp archived the FB activity of the White House. At the time, FB’s ToS forbade archiving it for more than 24 hours. But the WH policy requires it. I said, “Please, FB, please cut off the White House.” It turns out that FB was already planning on revising the policy. “What a great conversation we would have gotten to have.” You are our advocates, says Anil. You have an obligation to speak on our behalves.

The public is already violating “Intellectual Property” rules. “We don’t look at YouTube as the Million Mixers March, but that’s what it is.” It’s civil disobedience: People violating the law in public under their own names. These are people who recognize the value of preserving cultural works that otherwise would disappear. Sony won’t sell you a copy of Michael Jackson’s Thriller, but there are copies on YouTube. The heart and soul of those posting those videos is preservation. “All they want to do is what you do: make a copy of what matters to them.”


July 19, 2012

[2b2k][eim]Digital curation

I’m at the “Symposium on Digital Curation in the Era of Big Data” held by the Board on Research Data and Information of the National Research Council. These liveblog notes cover (in some sense — I missed some folks, and have done my usual spotty job on the rest) the morning session. (I’m keynoting in the middle of it.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.


Alan Blatecky [pdf] from the National Science Foundation says science is being transformed by Big Data. [I can't see his slides from the panel at front.] He points to the increase in the volume of data, but we haven’t paid enough attention to the longevity of the data. And, he says, some data is centralized (LHC) and some is distributed (genomics). And, our networks are unable to transport large amounts of data [see my post], making where the data is located quite significant. NSF is looking at creating data infrastructures. “Not one big cloud in the sky,” he says. Access, storage, services — how do we make that happen and keep it leading edge? We also need a “suite of policies” suitable for this new environment.


He closes by talking about the Data Web Forum, a new initiative to look at a “top-down governance approach.” He points positively to the IETF’s “rough consensus and running code.” “How do we start doing that in the data world?” How do we get a balanced representation of the community? This is not a regulatory group; everything will be open source, and progress will be through rough consensus. They’ve got some funding from gov’t groups around the world. (Check CNI.org for more info.)


Now Josh Greenberg from the Sloan Foundation. He points to the opportunities presented by aggregated Big Data: the effects on social science, on libraries, etc. But the tools aren’t keeping up with the computational power, so researchers are spending too much time mastering tools, plus it can make reproducibility and provenance trails difficult. Sloan is funding some technical approaches to increasing the trustworthiness of data, including in publishing. But Sloan knows that this is not purely a technical problem. Everyone is talking about data science. Data scientist defined: Someone who knows more about stats than most computer scientists, and can write better code than typical statisticians :) But data science needs to better understand stewardship and curation. What should the workforce look like so that the data-based research holds up over time? The same concerns apply to business decisions based on data analytics. The norms that have served librarians and archivists of physical collections now apply to the world of data. We should be looking at these issues across the boundaries of academics, science, and business. E.g., economics work now rests on data from Web businesses, US Census, etc.

[I couldn't liveblog the next two — Michael and Myron — because I had to leave my computer on the podium. The following are poor summaries.]

Michael Stebbins, Assistant Director for Biotechnology in the Office of Science and Technology Policy in the White House, talked about the Administration’s enthusiasm for Big Data and open access. It’s great to see this degree of enthusiasm coming directly from the White House, especially since Michael is a scientist and has worked for mainstream science publishers.


Myron Gutmann, Ass’t Dir of the National Science Foundation, likewise expressed commitment to open access, and said that there would be an announcement in Spring 2013 that in some ways will respond to the recent UK and EC policies requiring the open publishing of publicly funded research.


After the break, there’s a panel.


Anne Kenney, Dir. of Cornell U. Library, talks about the new emphasis on digital curation and preservation. She traces this back at Cornell to 2006 when an E-Science task force was established. She thinks we now need to focus on e-research, not just e-science. She points to Walters and Skinner’s “New Roles for New Times: Digital Curation for Preservation.” When it comes to e-research, Anne points to the need for metadata stabilization, harmonizing applications, and collaboration in virtual communities. Within the humanities, she sees more focus on curation, the effect of the teaching environment, and more of a focus on scholarly products (as opposed to the focus on scholarly process, as in the scientific environment).


She points to Youngseek Kim et al. “Education for eScience Professionals”: digital curators need not just subject domain expertise but also project management and data expertise. [There's lots of info on her slides, which I cannot begin to capture.] The report suggests an increasing focus on people-focused skills: project management, bringing communities together.


She very briefly talks about Mary Auckland’s “Re-Skilling for Research” and Williford and Henry, “One Culture: Computationally Intensive Research in the Humanities and Sciences.”


So, what are research libraries doing with this information? The Association of Research Libraries has a job announcements database. And Tito Sierra did a study last year analyzing 2011 job postings. He looked at 444 job descriptions. 7.4% of the jobs were “newly created or new to the organization.” New mgt level positions were significantly higher, while subject specialist jobs were under-represented.


Anne went through Tito’s data and found 13.5% have “digital” in the title. There were more digital humanities positions than e-science. She posts a list of the new titles jobs are being given, and they’re digilicious. 55% of those positions call for a library science degree.


Anne concludes: It’s a growth area, with responsibilities more clearly defined in the sciences. There’s growing interest in serving the digital humanists. “Digital curation” is not common in the qualifications nomenclature. MLS or MLIS is not the only path. There’s a lot of interest in post-doctoral positions.


Margarita Gregg of the National Oceanic and Atmospheric Administration begins by talking about challenges in the era of Big Data. They produce about 15 petabytes of data per year. It’s not just about Big Data, though. They are very concerned with data quality. They can’t preserve all versions of their datasets, and it’s important to keep track of the provenance of that data.


Margarita directs one of NOAA’s data centers that acquires, preserves, assembles, and provides access to marine data. They cannot preserve everything. They need multi-disciplinary people, and they need to figure out how to translate this data into products that people need. In terms of personnel, they need: Data miners, system architects, developers who can translate proprietary formats into open standards, and IP and Digital Rights Management experts so that credit can be given to the people generating the data. Over the next ten years, she sees computer science and information technology becoming the foundations of curation. There is no currently defined job called “digital curator” and that needs to be addressed.


Vicki Ferrini at the Lamont-Doherty Earth Observatory at Columbia University works on data management, metadata, discovery tools, educational materials, best practice guidelines for optimizing acquisition, and more. She points to the increased communication between data consumers and producers.


As data producers, the goal is scientific discovery: data acquisition, reduction, assembly, visualization, integration, and interpretation. And then you have to document the data (= metadata).


Data consumers: They want data discoverability and access. Increasingly they are concerned with the metadata.


The goal of data providers is to provide access, preservation and reuse. They care about data formats, metadata standards, interoperability, the diverse needs of users. [I've abbreviated all these lists because I can't type fast enough.]


At the intersection of these three domains is the data scientist. She refers to this as the “data stewardship continuum” since it spans all three. A data scientist needs to understand the entire life cycle, have domain experience, and have technical knowledge about data systems. “Metadata is key to all of this.” Skills: communication and organization, understanding the cultural aspects of the user communities, people and project management, and a balance between micro- and macro perspectives.


Challenges: Hard to find the right balance between technical skills and content knowledge. Also, data producers are slow to join the digital era. Also, it’s hard to keep up with the tech.


Andy Maltz, Dir. of Science and Technology Council of Academy of Motion Picture Arts and Sciences. AMPAS is about arts and sciences, he says, not about The Business.


The Science and Technology Council was formed in 2005. They have lots of data they preserve. They’re trying to build the pipeline for next-generation movie technologists, but they’re falling behind, so they have an internship program and a curriculum initiative. He recommends we read their study The Digital Dilemma. It says that there’s no digital solution that meets film’s requirement to be archived for 100 years at a low cost. It costs $400/yr to archive a film master vs $11,000 to archive a digital master (as of 2006) because of labor costs. [Did I get that right?] He says collaboration is key.


In January they released The Digital Dilemma 2. It found that independent filmmakers, documentarians, and nonprofit audiovisual archives are loosely coupled, widely dispersed communities. This makes collaboration more difficult. The efforts are also poorly funded, and people often lack technical skills. The report recommends the next gen of digital archivists be digital natives. But the real issue is technology obsolescence. “Technology providers must take archival lifetimes into account.” Also system engineers should be taught to consider this.


He highly recommends the Library of Congress’ “The State of Recorded Sound Preservation in the United States,” which rings an alarm bell. He hopes there will be more doctoral work on these issues.


Among his controversial proposals: Require higher math scores for MLS/MLIS students since they tend to score lower than average on that. Also, he says that the new generation of content creators have no curatorial awareness. Executives and managers need to know that this is a core business function.


Demand side data points: 400 movies/year at 2PB/movie. CNN has 1.5M archived assets, and generates 2,500 new archive objects/wk. YouTube: 72 hours of video uploaded every minute.
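[To get a feel for the scale, here is a rough back-of-the-envelope sketch in Python. The inputs are the figures as quoted above, which may themselves be mangled in the liveblogging. The 52-week and 365-day years and all the variable names are assumptions made for illustration, not anything from the talk.]

```python
# Back-of-the-envelope totals from the figures quoted in the talk.
# All names here are illustrative; the inputs may be misreported.
MOVIES_PER_YEAR = 400
PB_PER_MOVIE = 2
CNN_NEW_OBJECTS_PER_WEEK = 2_500
YOUTUBE_HOURS_PER_MINUTE = 72

WEEKS_PER_YEAR = 52        # assumption: a 52-week year
MINUTES_PER_DAY = 60 * 24
HOURS_PER_YEAR = 24 * 365  # assumption: a 365-day year

# Petabytes of new movie data per year.
movie_pb_per_year = MOVIES_PER_YEAR * PB_PER_MOVIE

# New CNN archive objects per year.
cnn_new_objects_per_year = CNN_NEW_OBJECTS_PER_WEEK * WEEKS_PER_YEAR

# Hours of video uploaded to YouTube per day, and the same
# amount expressed as years of continuous viewing.
youtube_hours_per_day = YOUTUBE_HOURS_PER_MINUTE * MINUTES_PER_DAY
youtube_viewing_years_per_day = youtube_hours_per_day / HOURS_PER_YEAR

print(f"Movies: {movie_pb_per_year} PB/year")
print(f"CNN: {cnn_new_objects_per_year} new archive objects/year")
print(f"YouTube: {youtube_hours_per_day} hours uploaded/day "
      f"(about {youtube_viewing_years_per_day:.1f} years of viewing)")
```

On those assumptions, the movie figure alone comes to 800 PB a year, and a single day of YouTube uploads would take more than a decade to watch end to end.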


Takeaways:

  • Show business is a business.

  • Need does not necessarily create demand.

  • The nonprofit AV archive community is poorly organized.

  • Next gen needs to be digital natives with strong math and sci skills.

  • The next gen of executive leaders needs to understand the importance of this.

  • Digital curation and long-term archiving need a business case.


Q&A


Q: How about linking the monetary value of the metadata to the metadata? That would encourage the generation of metadata.


Q: Weinberger paints a picture of a flexible world of flowing data, and now we’re back in the academic, scientific world where you want good data that lasts. I’m torn.


A: Margarita: We need to look at how the data are being used. Maybe in some circumstances the quality of the data doesn’t matter. But there are other instances where you’re looking for the highest quality data.


A: [audience] In my industry, one person’s outtakes are another person’s director’s cut.


A: Anne: In the library world, we say if a little metadata would be great, a lot of it would be great. We need to step away from trying to capture the most to capturing the most useful (since we can’t capture the most). And how do you produce data in a way that’s opened up to future users, as well as being useful for its primary consumers? It’s a very interesting balance that needs to be struck. Maybe short-term need is a higher thing and long-term is lower.


A: Vicki: The scientists I work with use discrete data sets, spreadsheets, etc. As we get along we’ll have new ways to check the quality of datasets so we can use the messy data as well.


Q: Citizen curation? E.g., a lot of antiques are curated by being put into people’s attics…Not sure what that might imply as model. Two parallel models?


A: Margarita: We’re going to need to engage anyone who’s interested. We need to incorporate citizen corporation.


Anne: That’s already underway where people have particular interests. E.g., Cornell’s Lab of Ornithology where birders contribute heavily.


Q: What one term will bring people info about this topic?


A: Vicki: There isn’t one term, which speaks to the linked data concept.


Q: How will you recruit people from all walks of life to have the skills you want?


A: Andy: We need to convince people way earlier in the educational process that STEM is cool.


A: Anne: We’ll have to rely to some degree on post-hire education.


Q: My shop produces and integrates lots of data. We need people with domain and computer science skills. They’re more likely to come out of the domains.


A: Vicki: As long as you’re willing to take the step across the boundary, it doesn’t matter which side you start from.


Q: 7 yrs ago in library school, I was told that you need to learn a little programming so that you understand it. I didn’t feel like I had to add a whole other profession on to the one I was studying.


June 29, 2012

[aspen] Eric Schmidt on the Net and Democracy

Eric Schmidt is being interviewed by Jeff Goldberg about the Net and Democracy. I’ll do some intermittent, incomplete liveblogging…

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

NOTE: Posted without having even been re-read. Note note (a few hours later): I’ve done some basic cleanup.

After some amusing banter, Jeff asks Eric about how responsible he felt Google was for Arab Spring. Jeff in passing uses the phrase “Internet revolution.”

ES: Arab Spring was enabled by a failure to censor the Internet. Google enabled people to organize themselves. Especially in Libya, five different militias were able to organize their armed revolt by using the Net. It’s unfair to the people who died to call it an “Internet revolution.” But there were fewer people who died, in part because of the incessant media coverage. And we’ve seen that it’s very easy to start what some call an Internet revolution, but very hard to finish it.

JG: These were leaderless revolutions, crowdsourced revolutions. But in Egypt the crowd’s leaders were easily pushed aside after Mubarak fell.

ES: True leaders are very hard to find. In Libya, there are 80 militias, armed to the teeth. In most of the countries there were repressed Muslim groups that have emerged as leaders because they organized while repressed. Whoever takes over inherits financial and social problems, and will be thrown out if they fail.

JG: Talk about Google’s tumultuous relationship with China…

ES: There are lots of reasons to think that China works because its citizens like its hierarchical structure. But I think you can’t build a knowledge society without freedom. China wants to be a knowledge society. It’s unclear if China’s current model gets them past a middle income GDP. Google thought that if we gave them free access to info, the Chinese people would revolt. We were wrong, and we moved Google to Hong Kong, on the open side of the Great Firewall. (We had to because that’s the Chinese law.) Now when you enter a forbidden query, we tell the user that it’s likely to be blocked. We are forbidden from announcing what the forbidden terms are because we don’t want employees put in jail.

JG: Could Arab Spring happen in China? Could students organize Tiananmen Square now?

ES: They could use the Chinese equivalent of Twitter. But if someone organizes a protest, two people show up, plus 30 media, and 50 police.

JG: Google’s always argued that democratization of info erodes authoritarian control. Do you still believe that?

ES: The biggest thing I’ve learned is how hard it is to learn about the differences among people in and within countries. I continue to believe that this device [mobile phone] will change the world. The way to solve most of the world’s problems is by educating people. Because these devices will become ubiquitous, it’ll be possible to see how far we humans can get. With access to the Net, you can sue for justice. In the worst case you can actually shame people.

JG: And these devices can be used to track people.

ES: Get people to understand they have choices, and they will eventually organize. Mobiles tend to record info just by their nature. The phone company knows where you are right now. You’re not worried about that because a law says the phone company can’t come harass you where you’re sitting. In a culture where there isn’t agreement about basic rights…

JG: Is there evidence that our democracy is better off for having the Internet?

ES: When we built the Net, that wasn’t the problem we were solving. But more speech is better. There’s a lack of deliberative time in our political process. Our leaders will learn that they’ll make better decisions if they take a week to think about things. Things will get bad enough that eventually reason will prevail. We complain about our democracy, but we’re doing quite well. The US is the beacon of innovation, not just in tech, but in energy. “In God we trust … all others have to bring data.” Politicians should just start with some facts.

JG: It’s easier to be crazy and wrong on the Net.

ES: 0.5% of Americans are literally crazy. Two years ago, their moms got them broadband connections. And they have a lot of free time. Google is going to learn how to rank them. Google should enable us to hear all these voices, including the crazy people, and if we’re not doing that, we’re not doing our job.

JG: I googled “Syria massacre” this morning, and the first story was from Russia Today that spun it…

ES: It’s good that you have a choice. We have to educate ourselves and our children. Not everything written is true, and very powerful forces want to convince you of lies. The Net allows that, and we rank against it, but you have to do your own investigation.

JG: Google is hitting PR problems. Talk about privacy…

ES: There’s no delete button on the Net. When you’re a baby, no one knows anything about you. As you move through life, inevitably more people know more about you. We’re going to have to learn about that. The wifi info gathering by StreetView was an error, a mistake, and we’ve apologized for it.

JG: The future of journalism?

ES: A number of institutions are figuring out workable models. The Atlantic [our host]. Politico. HuffingtonPost. Clever entrepreneurs are figuring out how to make money. The traditional incumbents have been reduced in scale, but there are plenty of new voices. BTW, we just announced a tablet with interactive, dynamic magazines. To really worry about: We grew up with the bargain that newspapers had enough cash flow to fund long term investigative research. That’s a loss to democracy. The problem hasn’t been fully solved. Google has debated how to solve it, but we don’t want to cross the content line because then we’d be accused of bias in our rankings.

JG: Will search engines search for accuracy rather than popularity?

ES: Google’s algorithms are not about popularity. They’re about link structures, and we start from well-known sources. So we’re already there. We just have to get better.

JG: In 5 yrs what will the tech landscape look like?

ES: Moore’s Law says that in 5 yrs there will be more power for less money. We forget how much better our hw is now than even 5 years ago. And it’s faster than Moore’s Law for disks and fiber optic connections. Google is doing a testbed optical installation. At that bandwidth all media are just bits. We anticipate a lot of specialty devices.

JG: How do you expect an ordinary, competent politician to manage the info flow? Are we inventing tech that is past our ability to process info?

ES: The evidence is that the tech is bringing more human contact. The tech lets us express our humanity. We need a way of sorting politicians better. I’d suggest looking for leaders who work from facts.

JG: Why are you supporting Obama?

ES: I like having a smart president.

JG: Is Romney not smart?

ES: I know him. He’s a good man. I like Obama’s policies better.

Q&A

Q: Our connectivity is 3rd world. Why haven’t we been able to upgrade?

A: The wireless networks are running out of bandwidth. The prediction is they’ll be saturated in 2016. Maybe 2017. That’s understandable: Before, we were just typing online and now we’re watching movies. The White House in a few weeks is releasing a report that says that we can share bandwidth to get almost infinite bandwidth. Rather than allocating a whole chunk that leaves most of it unused, using interference databases we think we can fix this problem. [I think but please correct me: A database of frequency usages so that unused frequencies in particular geographic areas can be used for new signals.]

A: The digital can enhance our physical connections. E.g., a grandmother skyping with a grandchild.

JG: You said you can use the Net to shame govts. But there are plenty of videos of Syria doing horrible things, but it’s done no good.

ES: There are always particularly evil people. Syria is the exception. Most countries, even autocratic ones, are susceptible to public embarrassment.

Q: Saying “phones by their nature collect data” evades responsibility.

ES: I meant that in order to do their work, they collect info. What we allow to be done with that info is a legal, cultural issue.

Q: Are we inherently critical thinkers? If not, putting info out there may not lead to good decisions.

ES: There’s evidence that we’re born to react quickly. Our brains can be taught reasoning. But it requires strong family and education.

Q: Should there be a bill of rights to simplify the legalese that expresses your privacy rules?

ES: It’s a fight between your reasonable point of view, and the lawyers and govt that regulate us. Let me reassure you: If you follow the goal of Google to have you as a customer, the quickest way to lose you is to misuse your information. We are one click away from competitors who are well run and smart. [unless there was money in it, or unless they could get away with it, or...]

Q: Could we get rid of representative democracy?

ES: It’ll become even more important to have democratic processes because it’s all getting more complicated. For direct democracy we’d have to spend all day learning about the issues and couldn’t do our jobs.

JG: David Brooks, could you comment? Eric is an enormous optimist…

ES: …The evidence is on my side!

JG: David, are you as sanguine that our politicians will learn to slow their thinking down, and that Americans have the skills to discern the crap from the true?

David Brooks: It’s not Google’s job to discern what’s true. There are aggregators to do this, including the NYT and TheBrowser. I think there’s been a flight to quality. I’m less sanguine about attention span. I’m less sanguine about confirmation bias, which the Web makes easier.

ES: I generally agree with that. There’s evidence that we tend to believe the first thing we hear, and we judge plus and minus against that. The answer for me is always culture, education.

Q: Will there be a breakthrough in education?

ES: Education changes much more slowly than the world does. Sometimes it seems to me that education is run for the benefit of the teachers. They should do measurable outcomes, A-B testing. There’s evidence that physics can be taught better by setting a problem and then doing a collaborative effort, then another problem…


[aspen] Robert Putnam on the growing class gap

Robert Putnam is giving a talk at the Aspen Ideas Festival called “Requiem for the American Dream? Unequal Opportunity in America.” It’s a project in its middle stages, he says. If a book comes out of the research, it’s a year or two out.

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

There’s a difference between the inequality of income and wealth, and inequality of opportunity. Historically Americans have not cared much about income inequality…less than is typical of the rest of the world. But we do care about unequal opportunity and unequal social mobility. Concern about income inequality has sometimes divided along party lines, but not opportunity inequality. Historically we’ve been better than most other countries in the distribution of opportunity.

Income distribution has become more skewed since the 1970s in America, and in many other countries. We’ve also become a more class-segregated society, even as we’ve become less segregated by race and religion. Class segregation is increasing by residence, education, organization, and marriage; i.e., it’s less likely you’ll marry outside of your class. There’s also been a fraying of family and social bonds within the working class. Robert asks if this has an effect on the growing inequality of opportunity.

He talks about his methodology. The standard way of measuring social mobility compares 30-somethings’ economic/educational/social standing with their parents’ standing when the parents were in their 30s. But this means all the action is at least twenty years old; the latest studies look at people raised in the 1980s. Robert is instead looking at today’s kids, to avoid the 20-30 year old blind spot in the rearview mirror. “If we look out the windshield, we’re about to go over a cliff when it comes to social mobility…Social mobility and opportunity are going to plummet.”

If something is important enough to write about, it ought to show up in multiple measures, he says. He will show us robust patterns in multiple data sources, focusing only on class differences. He’s only looking at white youth for now, because while racial gaps remain important, they are increasingly based on class, not race. It’s important to look at race issues, but that’s not what Robert is considering. He says if you look at race as well, the social mobility trends look even worse. So, he’s going to show us growing class gaps over the past 30 years among white kids with 2-parent families.

He shows charts. The rate of births to unmarried mothers who are college grads hasn’t changed. But the percentage of those births to women with some college and women with no college has significantly increased. About a third of the births to unmarried women are to women with some college; it’s about half for women with no college. Meanwhile, the racial gap (i.e., race controlling for class) fell dramatically while the class gap (class controlling for race) grew at about the same rate. I.e., high school educated white folks are behaving more and more like high school educated non-white folks. “I’m not saying race doesn’t matter. I’m saying class matters a whole lot more. And race matters a whole lot less.”

Another chart. “Over the last two decades or so, white kids coming from less educated, less well-off backgrounds are more and more going through life with only one parent at home.”

A chart of the “growing class gap in enrichment expenditures [day care, tutors, games, etc., but not private school] on children, 1972-2006.” At the bottom of the hierarchy, the expenditure has increased about $400 per child over the past 40 years, but at the middle income, it’s gone up $5K.

The time people invest in their kids — reading to the kids, etc., but not including diaper changing time, etc. — again shows a growing gap between those with a higher ed and those without. In the 1970s, moms with only HS were investing slightly more time with their kids. Now the number of minutes for both is going up, but the growth has been “much much faster” among college educated moms. When you add in the dads, the gap grows even larger — it’s up to an hour a day more quality time with their parents.

When in the lifecycle of the child is the class gap biggest? It’s concentrated among infants. “It’s terrible. Just terrible.”

How are kids connecting at schools? Looking at participation in extracurricular activities, and activities outside school like music lessons, dance lessons, art lessons, etc., excluding sports. (This is, he reminds us, only data about white kids, but the class gap gets worse if you add in non-white kids.) Kids in the lower income quartile have declining participation rates. Those in the highest quartile have a growing rate. (The decline began sharply in 1982.)

Same chart for participation in sports. For kids becoming team captains, it’s stayed steady for the lower quartile kids. Middle class kids were always more likely to become team captains, but now 26% of them say that they’ve been team captains. “Think about what kids are learning” from these activities: how to get along with kids, how to make connections with people who are not like them… the skill set we need in this new world. (Robert tells us HS football was invented by progressives about 100 years ago as a way to get kids from all classes playing together.)

Outside school in music, dance and art lessons: same growing gap.

There is a declining gap in participation in student government. But that’s happening in part because upper quartile kids are choosing not to participate. The other area in which the gap is declining is in “vocational clubs,” e.g., shop, motorcycle club. Again, the upper class kids are declining to participate.

Church-going: All are decreasing, but the upper third is decreasing much less rapidly. “There’s been a catastrophic drop in church attendance among children of working class parents.”

The chart of community volunteering is more complex, but overall the upper tercile has been rapidly increasing, while the bottom tercile has been dropping in the 2000s. One possible explanation: Robert points out that colleges like to see community volunteering on applications.

Chart of social support: Do you have someone you can count on? Sharply increasing gap. Working class kids: it’s been pretty flat. Ask “Would you say most people can be trusted?” and you’ll see a long-term decline among the kids of parents in the bottom tercile, while the upper tercile kids have become more trusting. “And why not?” The upper class kids have plenty of social support, while the support systems are being withdrawn from the bottom tercile kids.

All this shows up in reading and math test scores. Increasing gap mapped against class. Declining gap mapped against race.

Bottom line: “There’s a growing class gap among American youth among all the predictors of success in life.” “A social mobility crash is coming” as these cohorts move into adulthood. Everyone who’s looked at the data agrees, he says.

But why has this happened? We don’t know for sure. About ten years ago he was in the White House talking about this with Pres. Bush, Karl Rove, and others. (He charmingly apologizes for namedropping.) The first question W asked was “How much of this has to do with family structure?” A: A little less than half. Even if you look only at 2-parent families, the gap is there but only about half of the size. None of it is due to immigration. But, suggested Robert, it might be due to the income gap. Then Laura Bush said “If you don’t know how long you’re going to keep your house or your job, you have less energy to invest in your kids.” Robert thinks this makes sense.

Possible explanations: (1) Upper class families have increased their investment in cognitive and non-cognitive development. (2) Collapse of white working class. (3) Laura Bush’s hypothesis. (4) The social safety nets are gone: churches, sports leagues, parks and rec, etc. “If the chick falls out of the nest, all that’s down there are gangs.” It is, he says, a perfect storm.

This is a problem that the two parties should be able to cooperate on, if they could cooperate about anything. We need to boost caring families, boost jobs and wages for the bottom half of the workforce, invest in public education, invest in high quality Head Start, and have more reliable volunteer mentors. “I don’t know what else we can do to fix the problem…but if we don’t fix it we’re writing off a third of our workforce. And, it’s just not fair.” Until we think of all of these kids as our kids, “we’re in a pickle.”

Q&A

Q: Any data on kids of parents in the military?

A: I don’t have data. I wish I did because enlisted men and women are mostly drawn from the lower class, and my hunch is that their kids are doing better than non-military kids. I think that the discipline instilled into the military maybe carries over into the structure of the families.

Q: I teach HS in a rural area of OR. We got multicultural sensitivity training. I asked why aren’t we talking about class because I see it every day. I hope your work translates into teacher training.

A: Surprisingly to me, when I talk to groups, almost always when elementary school teachers speak up, they say they see this problem in their own class.

Q: Harrington, NSF, others have said the same thing over the years. This is the fourth time I’ve seen the same red flag. How does this translate into policy?

A: This particular growing gap wasn’t true in Harrington’s day. There’s always been a class gap in American society but it’s way worse than in the ’60s. I’m working on a book aimed at a mass market. We’re gathering the stories of these kids. I’m hoping that if you talk about real kids, it will get people’s attention. I desperately fear we’re going to have a partisan argument about who’s to blame, and I don’t care about that.

Q: Some will argue about the cost.

A: It’ll be much more expensive not to fix this. These lower third kids will be on unemployment and in prison….

Q: College admissions could be fixed…

A: Most of the damage is done way before then.

Q: The social safety net is under constant resource. We need govt policy and money. Our churches and philanthropies are not enough.

A: I’m a progressive Democrat so I think govt has a role to play. But this is a fundamental American issue. It does have to involve churches. I happen to think that hugs and time are more important than money, but money is important too.


[aspen][2b2k] Ideo’s Tim Brown

Tim Brown of Ideo is opening his Aspen Ideas Festival talk with a slide presentation called “From Newton to Design”. He says he’s early in thinking it through.

He points to a problem in how we’ve thought about design, trained designers, and have practiced design. The great thing about designing simple products is that you can know almost everything about them: who made them, who they’re for, how they were produced, etc. But as products get more complicated, it gets harder even for a team of designers to really understand what’s going on. They get so complicated that there are lots of places design can fail.

When we go out to urban planning, that becomes even more obvious, he says. He shows Union Sq. when it was designed and how wildly NYC has grown around it. Or, at the Courtyard Marriott chain, every element of the user’s experience has been thought through. He shows a script that specifies every interaction. But you can’t anticipate everything. E.g., JetBlue is one of the best designed customer experiences and even they got it wrong a couple of winters ago.

What’s going on? It’s all about complexity. Henri Poincaré in the 19th century tried to solve the three body problem that had been set by the French govt as an open source competition. HP couldn’t solve it. It sounds like a simple problem, but it’s very hard. [BTW, there's a fascinating history of three French aristocrats hand-computing the movement of Halley's Comet, which depended on calculating the gravitational influences of multiple bodies. Can't find the ref at the moment]

Our basic ideas about design have been based on Newton, says Tim. Design assumes the ability to predict the future based on the present. We need to think more like Darwin: design as an evolutionary process. Design is more about emergence, never finished.

He presents a few principles of Darwinian design that he’s been exploring.

1. Design behaviors, not objects — the behaviors that come from our interactions with objects. If you’ve traveled on the high speed trains in Europe, there are signs urging men to be more accurate when peeing. But at Schiphol Airport, they print a fly at the right spot in the urinal; men became 80% more accurate. That’s designing behavior; the actual object doesn’t matter.

2. Design for information flow. Nicholas Christakis has looked at how networks affect behavior. Tesco uses its loyalty card — which cost them 20% of their margins — to increase sales.

3. Faster iteration = faster evolution. Viruses evolve faster than we do because they iterate faster than we do. E.g., State Farm tried out a new idea for how to build relationships with the new generation. They built one storefront for this, and learned from it. “Launch to learn.”

4. Use selective emergence. This intrigues him, although he doesn’t know how useful it will be in design. Rather than random mutations, you choose what might be interesting and design things that get us there through many iterations. I.e., genetic algorithms. E.g., the Strandbeest walks along beaches with a hip joint unlike any in nature because the artist used genetic algorithms.

5. Take an experimental approach. I.e., testing hypotheses. Cf. Eric Ries, the Lean Startup (build, measure, learn). E.g., Ideo.org has been working on sanitation in Ghana. Where you can’t dig septic pits, Ideo has been experimenting with low cost receptacle toilets (with bio-digesters). But people didn’t want to pay for the service. So, they gave some to families and went away for three days. All the families changed their minds and said they are willing to pay for the service (which is provided by a local franchise).

6. Focus on simple rules. This comes from emergence theory. E.g., complex bird flocking patterns are based on simple rules. [Canonical example: Termite mounds.] E.g., Bi-Rite stores in SF use simple rules: If an employee is within 10′ of a customer, they look the customer in the eye. If within 4′, they talk with them. This creates a wonderful service experience.

7. Design is never done. E.g., World of Warcraft is constantly being designed by its players.

8. The power of purpose. This creates the self-governance by which these complex environments succeed. Arab Spring and Occupy Wall Street are examples. Companies are experimenting with new ways of thinking about their business and products. E.g., Patagonia tells you not to buy its products because it also wants to preserve the environment.

The prototypical design artefact is a blueprint. Once you created the blueprint, the design was done. It was the instruction set for someone to make it. That’s how we think about design: finished and done. What replaces it: Code. It might be DNA (and Tim has people researching this), but more often it’s programming code. It’s an instruction set that can continue to evolve.

Now James Fallows [swoon] interviews him.

JF: You embody your principles. The rules are different from a prior version. [ACK! Crash. Missed about 2 minutes]

TB: We’ve just finished designing the prototype experience for the new health care exchanges. It will affect how people choose their health care insurance. Today it’s done with paper. Under the new health care laws, lots of people will get to make these choices. We worked with the CA Healthcare Foundation to prototype the user experience. What are the key pieces and parts? How can we keep the choices reasonably simple? Then each state will use this as a platform to develop their own.

JF: And the govt had the wit to come to you to do this?

TB: The CA Health Care Foundation…

JF: What are the barriers? Does it cost more to do it your way?

TB: It’s often less costly. Most often they don’t have a good understanding of what their customers go through. When a health care org comes to us, relatively frequently we find out that a senior exec had to go through the health care experience. It’s true of all organizations. We don’t ask the right questions. The urgency to change is not there, and the resistance to change is always huge.

JF: Has the TSA come to you?

TB: Yes, but … well, we learned a lot. In the previous admin, we worked with them to find areas of change. Although going through the scanners has to improve, a lot of it has to do with the behavior of the people. They looked at a training program that was intended to take away some of the rule-based system they used. The more rules you apply, the less sensitive the system is. You need to give the people in that system much more independence to make judgments.

JF: Who do you hire?

TB: We look for a wide range of people. Many disciplines. We look for deep skills, and for empathy. It’s hard to solve problems for others without that. Also, most of what we do is too complex for individuals, so we work in teams, and thus people need an enthusiasm for empathy.

JF: Any unusual interview techniques?

TB: We put people into a situation in which they’re practicing design. E.g., intern program. Also, competitions. And we use Open Ideo as a way of seeing how people work.

JF: Beyond the toilet, what else are you doing for “design for poverty”?

TB: I got excited when I saw the opportunities for design in some social design work. At Open Ideo we’re working on clean water, early ed programs, etc. Ideo.org is a non-profit org. We want it to be sustainable and scalable so we look for external funding for it.

JF: How do you approach environmental sustainability?

TB: We try to build that into every project. Every project affects the environment. We try to bring sustainable thinking around systems, materials, energy flows, etc.

JF: What projects are you proudest of?

TB: The work we do in health care, including with Kaiser Permanente. Also, consumer-facing, post-crash financial services. PNC digital wallet. “Keep the change.” Etc. This is not an area where design has had much to do.

JF:

TB: For physical objects, it peaked maybe 20-30 years ago (with Apple as an exception). But we’re in ascendance for behavior-based design. We get 25,000 apps a year for 100 openings. We’re a 600-person company. Etsy, Kickstarter, sw designed better than ever before…great things are happening. Soon if not already the number of digital designers will be greater than all other designers combined.

Q & A

Q: Your principles are so close to Buckminster Fuller’s [says the guy from the Fuller institute]. But the boundary between social and evolutionary systems is illusory.

TB: Yes, Fuller figured this out a long time ago. We’re perhaps resurrecting ideas, as every generation does. Design has operated as a priesthood for too long. When I started, I was only interested in how beautiful something is. That’s so much simpler. Opening design up to many more will convince us all that we’re all part of this big design ecosystem and have a responsibility to be thoughtful about the contributions we’re making to the world around us. I hope professional designers learn to enable that, more than controlling it. The d.school at Stanford is introducing non-designers to design, which is great.

Q: What can we do to simplify the rules?

TB: The unstated bit of my thesis is that you still have to stop and design something. We develop an idea, perhaps more through iteration. That process doesn’t change. For rebuilding a complex system, maybe big data will help us to see patterns that allow us to understand the complex effects of what we’re designing…but I don’t think we’re there yet. We should be thinking about the hooks we’re building in. I’m big into APIs that allow other people to build with what you’ve built.

Q: Is it training or DNA that determines a good employee for you?

TB: Both. We hire people straight out of grad school because they’re moldable. We hire older people, but it’s harder for them to adapt. I don’t have much control as CEO. The future of all businesses is to have cultures that are as self-governing as possible. That’s much more resilient and agile than cultures built on inflexible rule sets.

Q: I chair a land conservancy. We create parks in urban areas. Does Ideo have much experience in designing to create behaviors that will get people to use parks? What’s your view of the state of park design?

TB: We don’t have a lot of expertise in designing anything because we like designing everything. The High Line and the West Side park in NYC are remarkable examples. Projects like that show that parks can be remarkable assets to the city. We’re working with High Line on the third phase of that project. NYC’s life expectancy has gone up 3 yrs. Two explanations: People are closer to health facilities, and people walk more.

Q: What are the logistics of running a decentralized org? Mentoring? Sharing a vision?

TB: Purpose creates a sense of direction, so we talk about why the heck we’re doing what we’re doing. We think we should measure everything we do based on the impact it has on the world. We’ve done an occasionally decent job of mentoring; that can be a problem with a decentralized org. It’s a tension. Most of our employees probably want more mentoring, but we also want autonomy. We are not big believers in warehousing knowledge. Designers hate reusing other people’s ideas. It’s much better to have knowledge systems that inspire people to think in new ways. So we’re a storytelling culture. It’s a bit of an obsession of ours. If you do a piece of work, your job is to have some stories to tell about it. That’s more effective than big reports that live in a database somewhere.

(JF calls for all remaining questions)

Q: My group works with at-risk youth. Education is increasingly standards based, but your work is collaborative.

Q: How do you look at chaos? People in open markets are open and affectionate. In corporate controlled spaces, people shut down.

Q: Does form drive function or vice versa?

Q: Apple is a closed system. Google wants more control. Open vs. controlled systems?

TB: 1. University ed is not always the best way to teach entrepreneurship. Apprenticeships are interesting. 2. Great markets are vibrant, but not chaotic. I take clients to the Ferry Building to point out all the interrelated pieces that make that such a great experience. It’s not top down, but you can see the patterns and use them as inspiration. 3. Form follows function? Hard to kick that notion because I believe in beautiful engineering, but most things we’re designing today have hundreds of functions, so you can’t get a single form for it. 4. I love closed systems but I think they’re inevitably part of an open system. iOS is part of an open system of everything else that I do with it. We need both. [At last! Something I disagree with! Sort of! :)]

[Fantastic. I've been a huge fan of Ideo's work, and Ideo's organizational ethos, and Tim Brown, for a long time. So I felt particularly narcissistic as I heard this talk through Cluetrain and Too Big to Know lenses. Substitute "knowledge" for "design" and you get a lot of the ideas in 2b2k. To hear them coming from Tim Brown, who is a personal idol of mine, was a self-centered thrill.]
