Joho the Blog » [2b2k][sogeti] Big Data conference session

[2b2k][sogeti] Big Data conference session

I’m at Sogeti’s annual executive conference, which brings together about 80 CEOs. I’m here with Doc Searls, Andrew Keen, and others. I’ve spoken at other Sogeti events, and I am impressed with their commitment to providing contrary points of view, including views at odds with their own corporate interests. (My one complaint: They expect all attendees to have an iPad or iPhone so that they can participate in the realtime survey. Bad symbolism.) (Disclosure: They’re paying me to speak. They are not paying me to say something nice about them.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

Menno van Doorn begins by talking about the quantified self movement, claiming that they sometimes refer to themselves as “datasexuals” :) All part of Big Data, he says. To give us an idea of bigness, he relates the Legend of Sessa: “Give me grain, doubling the amount for each square on a chessboard.” Exponential growth meant that by the time you hit the second half of the chessboard, you’re in impossible numbers. Experts say that’s where we were in 2006 when it comes to data. But “there’s no such thing as too much data.” “Big Data is powering the next industrial revolution. Data is the new oil.”
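To make the chessboard arithmetic concrete, here is a minimal sketch. It assumes the usual telling of the legend, with one grain on the first square (the talk didn't specify the starting amount), and the function names are mine, purely for illustration.

```python
def grains_on_square(n: int) -> int:
    """Grains on square n (1-indexed), assuming 1 grain on square 1, doubling each square."""
    if not 1 <= n <= 64:
        raise ValueError("a chessboard has squares 1 through 64")
    return 2 ** (n - 1)


def total_grains(first: int, last: int) -> int:
    """Total grains on squares first..last, inclusive."""
    return sum(grains_on_square(n) for n in range(first, last + 1))


first_half = total_grains(1, 32)    # 2**32 - 1, about 4.3 billion grains
second_half = total_grains(33, 64)  # 2**64 - 2**32
whole_board = total_grains(1, 64)   # 2**64 - 1

print(f"First half:  {first_half:,}")
print(f"Second half: {second_half:,}")
print(f"Whole board: {whole_board:,}")
```

Under that assumption, square 33 alone holds more grain than the entire first half of the board combined, which is why the second half is where the numbers become impossible.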

Big Data is about (1) lots of data, (2) at high velocity, (3) used in a variety of ways. (“Volume, velocity, variety.”) Michael Chui says that there are billions in revenues to gain, including from efficiencies. But, Chui says, there are no best practices. The value comes from “human exhaust,” i.e., your digital footprint, what you leave behind in your movement through the Net. Menno thinks of this as “your recorded future.”

Three examples:

1. Menno points to Target, a company that can predict life-changing events among its customers. E.g., based on purchases of 25 products, they can predict which customers are pregnant and roughly when they are due. But, this led to Target sending promotional materials for pregnancy to young girls whose parents learned this way that their daughters were pregnant.

2. In SF, they send out police cars to neighborhoods based on 14-day predictions of where crime will occur, based on data about prior crime patterns.

3. Schufa, a German credit agency, announced they’d use social media to assess your credit worthiness. Immediately a German Minister said, “Schufa cannot become the Big Brother of the business world.”

Two forces are in contention and will determine how much Big Data changes us. Today, the conference will look at the dawn of the age of big data, and then how disruptive it will be for society (the session Keen and I are in). Day 2: Bridging the gap to the new paradigm, Big Data’s fascinating future, and Decision Time: Taming Big Brother.


Carlota Perez, Prof. of Tech and Socio-Economic Development, from Venezuela, speaks now. She is a “neo-Schumpeterian.” She says her role in the conference is to “locate the current crisis.” What is the real effect on innovation, and why are we only midway along in feeling the impact?

There have been 5 tech revolutions in the past 240 years: 1. 1771 Industrial rev. 2. 1829 Age of steam, coal and railways. 3. 1875 Steel and heavy engineering (the first globalization). 4. Age of the automobile, oil, petrochem and mass production. 5. 1971 Age of info tech and telecom. We’re only halfway through that last one. The next revolution queued up: age of biotech, bioelectronics, nanotech, and new materials. [I’m surprised she doesn’t count telegraph + radio + telephone, etc., as a comms rev. And I’d separate the Net as its own rev. But that’s me.]

Lifecycle of a tech rev: gestation, induction, deployment, exhaustion. The “big bang” tends to happen when the prior rev is reaching exhaustion. The structure of revs: new cheap inputs, new products, new processes. A new infrastructure arises. And a constellation of new dynamic industries that grow the world economy.

Why call these “revolutions”, she asks? Because they transform the whole economy. They bring new organizational principles and new best practice models. I.e., a new “techno-economic paradigm.” E.g., we’ve gone from mass production to flexible production. Closed pyramids to open networks. Stable routines to continuous improvement. “Information technology finds change natural.” From human resources to human capital (from raw materials to value). Suppliers and clients to value network partners. Fixed plans to flexible strategies. Three-tier markets (big, medium, small) to hyper-segmented markets. Internationalization to globalization. Information as costly burden to info as asset. Together, these constitute a radical change in managerial common sense.

The diffusion process is broken in two: Bubble, followed by a crash, and then the Golden Age. During the bubble, financial capital forces diffusion. There is income and demand polarization. Then the crash. Then there is an institutional recomposition, leading to a golden age in which everyone benefits. Production capital takes over from financial capital (driven by the govt), and there is better distribution of income and demand.

She looks at the 5 revs, and finds the same historic pattern that she just sketched.

Two major differences between installation and deployment: 1. Bubbles vs. patient (= long-term) capital. 2. Concentrated innovation to modernize industries vs. innovation in all industries that use the new technologies. “Understanding this sequence is essential for strategic thinking.”

The structure of innovation in deployment: a new coherent fabric of the economy emerges, leading to a golden age. Also, oligopolies emerge, which means there’s less unhelpful competition. (?)

Example of prior rev: home electrical appliances. In the installation period, we had a bunch of electric utilities going into homes in the 1910s and 1930s. During the revision, we get a few more. But then in the 1950s-70s, we get a surge of new appliances, including tape recorder, microwave, even the electric toothbrush. It’s enabled by universal electricity and driven by suburbanization. It’s the same pattern if you look at textile fibers, from rayon and acetate during installation, to a huge number during deployment. E.g., structural and packaging plastics: installation brought bakelite, polystyrene and polyethylene, and then a flood of innovation during deployment. “The various systems of the ICT revolution will follow a similar sequence.” [Unless it follows the Tim Wu pattern of consolidation, e.g., everyone being required to use an iPad at a conference.] During the installation period, ICT was in constant supply push mode. Now it must respond to demand pull. “The paradigm and its potential are now understood by all. Demand (in vol and nature) becomes the driving force.”

This shifts the role of the CIO. To modernize a mature company, during installation you brought in an expert in modernization, articulating the hw and sw being pushed by the suppliers. During the deployment phase, a modern company that is innovating for strategic expansion, the CIO is an expert in strategy, specifying needs and working with suppliers. “The CIO is no longer staff. S/he must be directly involved in strategy.”

There are 3 main forces for innovation in the next 2-3 decades, as is true for all the revs. 1. Deepening and widening of the ICT tech rev, responding to user needs. 2. The users of ICT across all industries and activities. 3. The gestation of the next rev (probably biotech, nanotech, and new materials).

Big Data is likely to have a big role in each of those directions.

Q: Why are we only 50% of the way through?

A: Because the change after the recession is like opening a dam. Once you get to the point where you can have a comfortable innovation prospective, imagine the market possibilities.

Q: What can go wrong?

A: Governments. Unfettered free markets are indispensable for the installation process. Lightly guided markets are needed in the golden age. Free markets work when you need to force everyone to change. But now no longer: The state has to come in. But govts are drunk with free markets. Now finance is incompetent. “They don’t dare invest in real things.” “Ideology is so strong and the understanding of history is so shallow that we’re not doing the right thing.”


Christopher Ahlberg speaks now. He’s the founder of Recorded Future. His topic: “Turning the Web into Predictive Signals.”

We see events like Arab Spring and wonder if we could have predicted them. Three things are going on: 1. Moving from smaller to larger datasets. 2. From structured to unstructured data (from numbers to text). 3. From corporate data to Internet/Web.

There’s a “seismic shift in intelligence.” “Temporal indexing of the Web enables Web intelligence.” “The Web is not organized for finding date; it’s about finding documents.” Can we create structure for the Web we can use for analysis? A lot of work has been done on this. Why is this possible now? Fast math, large, fast storage, web harvesting, and linguistic analysis progress.

His company looks for signals in human language. E.g., temporal signals. That can turn up competitive info. But human language is tough to deal with. But also when something happens (e.g., the Haitian earthquake) there are patterns in when people show up: helpers, doctors, military, do-gooder actors, etc. There tends to be a flood of notifications immediately afterwards. The Recorded Future platform does the linguistic analysis.

He gives an example: What’s going to happen to Merck over the next 90 days? Some is predictable: There will be a quarterly financial conference call. A key drug is up for approval. Can we look into the public conversations about these events, and might this guide our stock purchases? And beyond Merck, we could look at everything from cyber attacks to sales opportunities.

Some examples. 1. Monitoring unrest. Last week there were protests against Foxconn in China. Analysis of Chinese media shows that most of those protests were inland, while corporate expansion is coming in coastal areas. Or look at protests against pharmaceuticals for animal testing.

Example 2: Analyzing cyber threats. Hackers often try out an approach on a small scale and then go larger. This can give us warning.

Example 3: Competitive intelligence. When is there a free space (announcement-free) when you can get some attention? Example 4: Lead generation. E.g., look for changes in management. (New marketing person might need a new PR agency.) Example 5: Trading patterns. E.g., if there’s bad news but insiders are buying.

Conclusion: As we move from small to large datasets, structured to unstructured, and from inside to outside the company, we go from surprise to foresight.

Q: What is the question you cannot answer?

A: The situations that have low frequency. It’s important that there be an opportunity for follow-up questions.

Q: What if you don’t know what the right question is?

A: When it’s unknown unknowns, you can’t ask the right question. But the great thing about visualization is that it helps people ask questions.

Q: How to distinguish fact from opinion on Twitter, etc.?

A: Or NYT vs. Financial Post. There isn’t a simple answer. We’re working toward being able to judge sources based on known outcomes.
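One way to picture “judging sources based on known outcomes” is a simple track-record score. This is only my illustrative sketch, not Recorded Future’s actual method, which he didn’t describe. The function name, the input format, and the smoothing choice are all assumptions.

```python
from collections import defaultdict


def score_sources(records):
    """Score each source by how often its past claims matched known outcomes.

    `records` is an iterable of (source_name, was_correct) pairs; this format
    is hypothetical. Add-one (Laplace) smoothing keeps a source with one lucky
    hit from scoring a perfect 1.0.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for source, was_correct in records:
        totals[source] += 1
        if was_correct:
            hits[source] += 1
    return {s: (hits[s] + 1) / (totals[s] + 2) for s in totals}


# Made-up track records, for illustration only
history = [("Source A", True), ("Source A", True), ("Source B", False)]
scores = score_sources(history)
for name in sorted(scores):
    print(name, round(scores[name], 2))
```

With so little history the scores stay close to 0.5, which fits his point that there isn’t a simple answer: a source needs a long record of checkable claims before the score means much.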

Q: Do your predictions get more accurate the more data you have?

A: Generally yes, but it’s not always that simple.
