Image: North by Northwest

According to Big Data, if the US government wants to reduce suicides, it must reduce investment in science. Strange? The following graph shows that the two variables have a correlation of 99.8%.

Of course, common sense tells us that something is wrong. Developing trend analysis ourselves can give us a great competitive advantage. But doing it from the raw data alone, without causal analysis, is dangerous. In a world where huge amounts of data are available, Big Data allows us to discover "possible" correlations, even automatically. But this does not mean that there is a causal relationship. As Nassim Taleb says in The Black Swan, human beings are programmed to identify patterns and infer causality, since our survival has depended on it throughout evolution. But this tendency produces problems such as superstition, and also serious decision mistakes when we apply Big Data in the company.

Figure: Spurious correlation example, from Spurious Correlations (Tyler Vigen)
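The effect behind this kind of chart can be reproduced with synthetic data. The sketch below is only an illustration under stated assumptions: the two series are invented (they are not Tyler Vigen's figures), and the names `science_budget` and `unrelated_series` are hypothetical. Both series simply grow over time, independently of each other. Their levels correlate almost perfectly, yet once we compare year-over-year changes instead of levels, most of that correlation typically disappears, because the shared trend was doing all the work.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def differences(series):
    """Period-over-period changes, which remove a linear trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


rng = random.Random(42)  # fixed seed so the run is repeatable
years = range(30)

# Two invented series with no causal link: each is a trend plus its own noise.
science_budget = [100 + 5 * t + rng.gauss(0, 3) for t in years]
unrelated_series = [20 + 2 * t + rng.gauss(0, 3) for t in years]

# Correlation of the raw levels (dominated by the shared upward trend).
r_levels = pearson(science_budget, unrelated_series)

# Correlation of the yearly changes (trend removed, only the noise remains).
r_changes = pearson(differences(science_budget), differences(unrelated_series))

print(f"correlation of levels:  {r_levels:.3f}")
print(f"correlation of changes: {r_changes:.3f}")
```

Differencing is only one simple way to remove a trend, and it proves nothing about causality by itself. It merely shows how little a high correlation between two trending series tells us.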

Chris Anderson, former editor-in-chief of Wired magazine, says that the development of massive data, or Big Data, makes theory superfluous. But we must be cautious about the great hopes placed in Big Data. 80% of data lakes will end up being inefficient because they will not include effective metadata management, and 70% of Hadoop deployments will not reach their expected profitability objectives (according to Gartner).
Oxford professor Viktor Mayer-Schönberger and The Economist data editor Kenneth Cukier (in their book Big data, the massive data revolution) explain that the massive use of Big Data moves us away from the traditional search for causality.

But establishing causality is fundamental to developing good Strategic and Prospective Intelligence. We need to establish causal relationships between the change drivers of our market, or of our sector in other markets. They can be key change drivers in the Strategic horizon or in the Prospective horizon. Thus, flavor trends in Italy and the US will influence the launch of new snacks, desserts and yogurts, and that is strategic information. Regulation in favor of energy self-consumption in other EU countries will eventually put pressure on the government of Spain ten years from now, and that is information for foresight.

Causal relationships are not established by the mathematics of Big Data, but by the good work of analysts. The analyst's role is essential to generating knowledge, and to building the best business decisions on that knowledge. Relying exclusively on statistics, without the expert analysis that processes key signals, is what makes Nate Silver or Goldman Sachs fail.

 

Figure: Correlation and causality, by Yanir Seroussi

It is our company's own staff who can establish the causal relationships between the change drivers of our market, with the support of external consultants if we so decide. Of course, automatic systems can identify correlations, but when they find them, we should not trust them blindly. A correlation can reflect chance, not causality. You have to understand the business well to filter those correlations.
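Why an automatic search finds correlations so easily can be sketched with pure noise. The example below is an assumption-laden illustration, not a description of any real system: it generates 200 short series of random numbers (the constants `N_SERIES` and `N_POINTS` are hypothetical choices), compares every pair, and reports the strongest correlation it finds. With almost 20,000 pairs to choose from, some pair of unrelated series will look strongly correlated by chance alone.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


N_SERIES = 200  # hypothetical number of indicators being scanned
N_POINTS = 10   # hypothetical number of observations per indicator

rng = random.Random(7)  # fixed seed so the run is repeatable

# Every series is independent random noise: no series causes any other.
series = [[rng.gauss(0, 1) for _ in range(N_POINTS)] for _ in range(N_SERIES)]

# Scan all pairs and keep the one with the largest absolute correlation.
best_r, best_pair = max(
    (abs(pearson(series[i], series[j])), (i, j))
    for i, j in combinations(range(N_SERIES), 2)
)

n_pairs = N_SERIES * (N_SERIES - 1) // 2
print(f"pairs compared: {n_pairs}")
print(f"strongest |correlation| found: {best_r:.3f} between series {best_pair}")
```

The more variables a system scans and the fewer observations each one has, the more of these chance matches appear, which is why someone who knows the business has to filter them.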

We must start from that expert understanding of our business to identify the causal relationships, and use them to look for the information we need to make decisions. The knowledge of our team is invaluable for travelling upstream along the river of causality and identifying what we should know in advance. Recall the case of baby items.

Byung-Chul Han, a philosopher at the University of the Arts in Berlin, tells us that "Positive science, guided by data, produces no knowledge or truth". This is a provocative sentence (that is what he intends), but it is worth remembering that making business decisions on the basis of non-causal correlations is like playing Russian roulette: you may succeed, and remain commercially alive, or you may not, and lose everything. It's not worth the risk. The impact of failure on the company's strategy is too great to be left to chance.

Therefore, causality and not chance. Let's trust our team of analysts, and not exclusively Big Data.

 


(North by Northwest is a 1959 film by Alfred Hitchcock. The protagonist is drawn into a conflict because of a false correlation arising from a coincidence.)

By Miguel Borrás

Antara is committed to ensuring that the published contents are created by its own team, clients or collaborators. Antara never subcontracts content generation.
The opinions of the authors reflect their own points of view, and not those of the company.

 
