I’ve been posting a bit lately about data and theory, and the other week I excerpted the Stanford Encyclopedia of Philosophy’s entry on big data and science. I want to return to that topic through the lens of economics.
In short, the proliferation of data can be thought of as an economic shock, and basic economic theory would then predict that data will play a greater role in science.
In an article that became the book Prediction Machines, economists Ajay Agrawal, Joshua Gans, and Avi Goldfarb talk about AI as a drop in the cost of prediction:
Technological revolutions tend to involve some important activity becoming cheap, like the cost of communication or finding information… When the cost of any input falls so precipitously, there are two other well-established economic implications. First, we will start using prediction to perform tasks where we previously didn’t. Second, the value of other things that complement prediction will rise…
As a historical example, consider semiconductors, an area of technological advance that caused a significant drop in the cost of a different input: arithmetic. With semiconductors we could calculate cheaply, so activities for which arithmetic was a key input, such as data analysis and accounting, became much cheaper. However, we also started using the newly cheap arithmetic to solve problems that were not historically arithmetic problems. An example is photography. We shifted from a film-oriented, chemistry-based approach to a digital-oriented, arithmetic-based approach. Other new applications for cheap arithmetic include communications, music, and drug discovery.
What does that mean for science and the role of data? As the cost of collecting data drops, scientists will use data more. For example, as the Stanford entry suggests, some see data-driven exploration as a substitute for traditional methods of hypothesis generation. If that’s the case, economic theory would predict that the former becomes more common and the latter less so. But what about theory? Most people would say theory is a complement to data, not a substitute, in which case its value should rise as data gets cheaper, just as the value of prediction’s complements rises in the passage above. This suggests a kind of synthesis between today’s advocates of data and advocates of theory: data-driven methods will and should become more common, but that shift makes theory more important, not less.
Obviously, this is all super speculative. Just thinking through the analogy.