The problem with big data correlations
“The people using big data don’t presume to peer deeply into people’s souls,” argues David Brooks in an April 2013 New York Times column. “They don’t try to explain why people are doing things. They just want to observe what they are doing.”
“The theory of big data is to have no theory, at least about human nature. You just gather huge amounts of information, observe the patterns and estimate probabilities about how people will act in the future. [In other words,] this movement asks us to move from causation to correlation.”
While this approach has yielded some impressive results, he says, one should also be aware of its limits. He goes on to list four of them:
- The challenge of discerning meaningful correlations from meaningless ones
- People are discontinuous and the passing of time can produce gigantic and unpredictable changes in taste and behavior
- The world is error-prone and dynamic
- The distinction between commodity decisions and flourishing decisions
His conclusion is interesting too: “My worries mostly concentrate on the cultural impact of the big data vogue. If you adopt a mind-set that replaces the narrative with the empirical, you have problems thinking about personal responsibility and morality, which are based on causation. You wind up with a demoralized society.”