Data and theory in economics

Noah Smith on the Nobel for the architects of the “credibility revolution” in economics:

Anyone who expects the credibility revolution to replace theory is going to be disappointed. Science seeks not merely to catalogue things that happen, but to explain why — chemistry is more than a collection of reaction equations, biology is more than a catalogue of drug treatment effects, and so on. Econ will be the same way. But what the credibility revolution does do is to change the relationship between theory and evidence. When evidence is credible, it means that theory must bend to evidence’s command — it means that theories can be wrong, at least in a particular time and place. And that means that every theory that can be checked with credible evidence needs to be checked before it’s put to use in real-world policymaking. Just like you wouldn’t prescribe patients a vaccine without testing it first. This is a very new way for economists to have to force themselves to think. But this is a field in its infancy — we’re still at the Francis Bacon/Galileo stage. Give it time.

In other words, new empirical techniques brought economics closer to following Michael Strevens’ iron rule of explanation: “that all arguments must be carried out with reference to empirical evidence.”

Quartz’s coverage of the Nobel is here and here.

A definition of culture

From an NBER review of the economics of company culture. The authors describe the varied ways “culture” has been defined, not just with respect to companies, and then offer this list:

A sensible list of elements in that package, though neither nearly exhaustive nor likely satisfactory to all, is as follows, adapted from a variety of such lists in the literature:

• unwritten codes, implicit rules, and regularities in interactions;

• identities, self-image, and guiding purpose;

• espoused values and evolving norms of behavior;

• conventions, customs, and traditions;

• symbols, signs, rituals, and group celebrations;

• knowledge, discourse, emergent understanding, doctrine, ideology;

• memes, jokes, style, and shared meaning;

• shared mental models, expectations, and linguistic paradigms.

Fixing the internet

The other day I rewatched one of my favorite talks about the internet, a 2015 lecture on algorithmic decisions by Jonathan Zittrain of Harvard Law School titled “Love the processor, hate the process.” Like all his talks, it’s funny, wide-ranging, and hard to summarize. But reflecting on it, I think you can see him proposing a few categories of ways to fix what’s gone wrong with the internet:

  • regulation
  • competition
  • public goods and open standards

There’s so much wrong with the current internet, and so many ideas floating around on what might be done about it, that I find these three simple buckets helpful in sorting out our choices. The fix, if there is one, will require some of all three.

On explanation

What makes a good explanation?

It’s not straightforward to provide an answer. Wikipedia says:

An explanation is a set of statements usually constructed to describe a set of facts which clarifies the causes, context, and consequences of those facts. This description may establish rules or laws, and may clarify the existing rules or laws in relation to any objects, or phenomena examined.

Philosophers, of course, have quite a lot more to say about the matter.

In this post I want to offer my own sketch, with an eye more toward the practical work of explanatory journalism than to philosophy. Wikipedia’s “causes, context, and consequences” has a nice alliterative ring to it, so I’ll amend that to offer my own C’s of explanation.

Causes and consequences

A good explanation “fits the facts” and suggests cause-effect relationships to make sense of them. Put another way, to borrow from pragmatist accounts of explanation, a good explanation should be “‘empirically adequate’ (that is, that yield a true or correct description of observables).”

As for causes (of which consequences are one type), consider the difference between explanation and prediction. A forecaster might say a candidate has an 80% chance of winning an election; their model “fits the facts.” But it does not say why: it offers no explanation because it has no causal content.

A good explanation allows for the consideration of at least one counterfactual. Max Weber wrote, about causality, that

“The attribution of effects to causes take place through a process of thought which includes a series of abstractions. The first decisive one occurs when we conceive of one or a few of the actual causal components as modified in a certain direction and then ask our selves whether under the conditions which have been thus changed, the same effect (the same, i.e. in ‘essential’ points) or some other effect ‘would be expected.'”

Max Weber: The Interpretation of Social Reality, p. 20

Thinking about the counterfactual means breaking a problem up into components, and that requires concepts.

Concepts

A good explanation clearly defines its concepts, and chooses ones that are useful. Defining concepts helps the listener follow the explanation. Picking the right ones means choosing concepts that enable a more accurate and more useful causal model. The need for clear, useful concepts in explanation is central to the idea of “explainability” in machine learning.

A deep learning model might be extremely good at prediction; it fits the facts. It might even seem to offer causal models: under the right conditions, a causal effect is just the difference between two conditional probabilities, and some machine learning models can estimate causal effects reasonably accurately. But a deep learning model trained on individual pixels or characters won’t be interpretable or explainable, and its causal insights don’t always transfer easily into the heads of human beings. That’s because it lacks easily defined, useful concepts. A deep learning model arguably “learns” meaningful notions of things as it translates pixels or characters, across layers, toward a prediction. But those notions aren’t recognizable concepts that people can work with. To make the model explainable, we need to provide concepts that people can make sense of and use.
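
To make the “difference between two conditional probabilities” point concrete, here is a minimal sketch in Python. It assumes a randomized (coin-flip) treatment, so that the conditional difference really is the causal effect; all numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical randomized setting: treatment assigned by coin flip,
# a binary outcome whose probability rises by 0.10 under treatment.
n = 100_000
treated = rng.integers(0, 2, size=n)                      # 1 = treated, 0 = control
outcome = (rng.random(n) < 0.30 + 0.10 * treated).astype(int)

# Because treatment is randomized, the average causal effect is estimated by
# the difference between two conditional probabilities: P(Y=1|T=1) - P(Y=1|T=0).
effect = outcome[treated == 1].mean() - outcome[treated == 0].mean()
print(f"estimated effect: {effect:.3f}")                  # close to the true 0.10
```

The estimate is easy to compute, but nothing in it names a concept a reader could reuse; that is the gap explainability has to fill.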

Coherence

A good explanation is logical, or at least not illogical. An explanation links together concepts and facts into causal models in reasonable ways, without logical or mathematical error or contradiction. That’s easy enough to say; the question is what standard of logic we hold an explanation to. Must it come with a formal proof of its coherence? Or is some loose feeling that it “makes sense” enough? Deciding on that standard depends on context.

Context

A good explanation fits its context. It’s appropriate for its audience: a good explanation of macroeconomics is different if the audience is a four-year-old than if it is a college student. It includes the right kind (and the right amount) of background information to help the audience understand what’s going to be explained. And it considers the goals of speaker, listener, and society at large. It aims to help actual people in the world achieve their purposes. That admittedly hazy criterion is the starting point for deciding what counts as good enough in terms of both empirical adequacy and coherence.

Summing up

So there it is. Pretty loose and subjective and imperfect, of course. But in my estimation a good explanation:

  • Fits the facts and proposes empirically plausible cause-effect relationships
  • Defines its terms and relies on concepts that feel useful and appropriate
  • Makes logical sense
  • Offers helpful background context and takes into account its audience

Durkheim on empiricism and economics

From 1938:

“The famous law of supply and demand for example, has never been inductively established, as should be the case with a law referring to economic reality. No experiment or systematic comparison has ever been undertaken for the purpose of establishing that in fact economic relations do conform to this law. All that these economists do, and actually did do, was to demonstrate by dialectics that, in order properly to promote their interests, individuals ought to proceed according to this law, and that every other line of action would be harmful to those who engage in it and would imply a serious error of judgement. It is fair and logical that the most productive industries should be the most attractive and that the holders of the products most in demand and most secure should sell them at the highest prices. But this quite logical necessity resembles in no way the true laws of nature present. The latter express the regulations according to which facts are really interconnected, not the way in which it is good that they should be interconnected.”

The Rules of Sociological Method, p. 26. Via Max Weber: The Interpretation of Social Reality, p. 18.

The political economy of attention

A review paper from NBER hits on something I’ve been thinking about lately: how media and attention factor into political economy.

How do groups of people coordinate to take political action? When are they able to overcome free rider problems? These are central questions in political economy, and one line of thinking says that smaller, organized groups will have an easier time than larger, diffuse groups.

From that you get the notion of concentrated benefits and diffuse costs (and vice versa): when a small, organized group reaps most of the benefits of a policy or bears most of its costs, it tends to get its way, even if, on net, it’s a bad policy. Loose banking regulations have concentrated benefits (for bankers) and diffuse costs (to the public). It’s easier for banks to coordinate and hire a lobbyist than for the public to pay close attention to banking regulation. Even if, on net, loose banking regulations are a bad idea, the banks have more motivation and an easier time organizing, and so they get what they want.
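
To see how lopsided the incentives are, here is a back-of-the-envelope sketch in Python; the dollar figures and group sizes are invented purely for illustration.

```python
# Hypothetical numbers, purely to illustrate concentrated vs. diffuse stakes.
gain_to_banks = 5_000_000_000        # $5B in extra profit from looser rules
number_of_banks = 50
cost_to_public = 8_000_000_000       # $8B in expected costs (bailouts, added risk)
number_of_citizens = 300_000_000

per_bank_stake = gain_to_banks / number_of_banks          # $100,000,000 each
per_citizen_stake = cost_to_public / number_of_citizens   # about $27 each

print(f"per bank:    ${per_bank_stake:,.0f}")
print(f"per citizen: ${per_citizen_stake:,.2f}")
# The policy destroys $3B of value on net, yet each bank has $100M of motivation
# to lobby while each citizen has less than the cost of dinner at stake.
```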

Except it doesn’t always work that way. Sometimes the media directs enough public attention to an issue that the diffuse public prevails over concentrated, organized interests. And that suggests a big role for models of attention in political economy: when do people pay enough attention and care enough about something to overcome the difficulty that diffuse groups face in politics?

The paper is mostly an empirical review, but the authors have a basic model in which people are deciding whether it’s worth their time to invest in political action. That involves gauging how likely other people are to take that action too, and the kind of information they get from the media matters:

“The first key lesson is: the role of media in spreading information may facilitate or hinder collective action, depending on the content of that information…

There is a second key lesson: the effectiveness of the media in spreading information eventually facilitates collective action…

Our third key lesson: homophily in social networks dampens the effect of information on collective action.”

Basically, everyone is looking for evidence that other people are willing to participate, too. Media gives them hints as to whether that’s the case: the more the media gives you a sense that others will join, the more likely you are to join. And the more your network is filled with people like you, the less confident you are that the information you’re getting is actually a clue about whether others will join (maybe you’re just being shown the small subset of people who are like you).
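
To fix ideas, here is a toy simulation of that logic. It is my own sketch, not the paper’s actual model, and every parameter (the true level of support, the noisiness of the media signal, the cautious prior) is invented.

```python
import numpy as np

rng = np.random.default_rng(1)

def participation_rate(homophily: float, n: int = 100_000) -> float:
    """Toy model: each person acts if their belief about others' willingness
    clears a personal threshold (their cost of participating). The belief
    blends a noisy media signal with a cautious prior; higher homophily means
    the signal is treated as less informative about the wider public."""
    true_support = 0.55                                # actual share willing to act
    signal = true_support + rng.normal(0.0, 0.05, n)   # noisy media/network signal
    prior = 0.30                                       # cautious default belief
    weight_on_signal = 1.0 - homophily                 # homophily discounts the signal
    belief = weight_on_signal * signal + homophily * prior
    threshold = rng.uniform(0.2, 0.8, n)               # heterogeneous cost of acting
    return float((belief > threshold).mean())

for h in (0.0, 0.5, 0.9):
    print(f"homophily={h:.1f} -> participation rate {participation_rate(h):.2f}")
```

In this toy version, higher homophily means the signal gets discounted more heavily, beliefs shrink toward the cautious prior, and participation falls, which is the third lesson quoted above in miniature.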

This is a great topic, and there’s clearly something to this model: you probably are more likely to join the cause if you think there’s lots of energy around it and a realistic chance of success.*

But would you really be put off by the fact that the information you were getting from media was reflecting back just what people like you thought? One, it’s hard to think of someone reflecting on that and then deciding the prudent thing was to discount the quality of the information. Two, for most people the fact that lots of people like you are participating is probably a reason they would choose to participate. Everyone who cares about what you care about will be there: that’s a reason for most people to join, not to say ‘That makes me uncertain of our prospects.’

And that gets to my skepticism about this model. Why model attention as a rational calculation in the first place? What if we thought about media and attention as a non-rational way that people overcome the selfish desire to free ride? Usually it doesn’t make narrowly selfish, “rational” sense to put in the time for some cause where the benefits are diffuse (opposition to loose banking regulation!), but people don’t make that sort of decision purely rationally. They decide in part based on emotion, social cues, and a sense of identity.

In the notes for his political economy course, Daron Acemoglu describes the problem diffuse groups face:

All individuals within the social groups must find it profitable to take the same actions, and often, take actions that are in the interest of the group as a whole. This leads to what Olson has termed the “free rider” problem: individuals may free ride and not undertake actions that are costly for themselves but beneficial for the group. Therefore, any model that uses social groups as the actor must implicitly use a way of solving the free-rider problem. The usual solutions are
• Ideology: groups may develop an ideology that makes individuals derive utility from following the group’s interests.
• Repeated interactions: if individuals within groups interact more often with each other, certain punishment mechanisms may be available to groups to coerce members to follow the group’s interests.
• Exclusion: certain groups might arrange the benefits from group action such that those who free ride do not receive the benefits of group action.
[…Currently, there is little systematic work in economics on how social groups solve the free-rider problem, and this may be an important area for future work…]
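
The free-rider logic is easy to see in a standard linear public-goods game. Here is a minimal sketch; it’s a textbook setup with made-up numbers, not anything from Acemoglu’s notes.

```python
def payoff(my_contribution: int, others_contributions: list[int],
           endowment: int = 10, multiplier: float = 1.6) -> float:
    """Standard linear public-goods game: contributions go into a pot that is
    multiplied and shared equally. With 1 < multiplier < group size, everyone
    contributing fully maximizes the group's total payoff, but each individual
    does better by contributing nothing."""
    group = [my_contribution] + others_contributions
    shared = multiplier * sum(group) / len(group)
    return endowment - my_contribution + shared

others = [10, 10, 10]            # everyone else contributes their full endowment
print(payoff(10, others))        # contribute too:  0 + 1.6 * 40 / 4 = 16.0
print(payoff(0, others))         # free ride:      10 + 1.6 * 30 / 4 = 22.0
print(payoff(0, [0, 0, 0]))      # ...but if everyone free rides:      10.0
```

Free riding is individually better no matter what others do, which is exactly why mechanisms like ideology, repeated interaction, and exclusion are needed at all.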

The direction I’m thinking of comes closest to “ideology”: media taps into individuals’ emotions and sense of identity in ways that make them more likely to participate. You can write that down as a utility-maximization model with the right preferences, if you must, but it’s not mostly about gauging the likelihood of success or whether people unlike you will contribute.

The question, to my way of thinking, is why some policy areas capture attention and so make it easier for the public to overcome free-rider problems. The minutiae of banking regulation, for example, don’t seem to lend themselves to that sort of attention-driven coordination; even a couple of years after the financial crisis, banks were able to defang aspects of Dodd-Frank without much media attention or uproar.

But the sort of emotion- and affinity-driven model of attention I’m gesturing toward would help understand whether, say, YIMBYism can succeed. That’s a classic political economy problem: a concentrated cohort of property owners benefit from limits on construction while a diffuse group (including people who don’t yet live in the city) would benefit from more building. The property owners show up to all the meetings because they have so much at stake. Can the YIMBY movement overcome that?

The NBER paper’s model would say it depends on whether renters think other renters care, and that if renters think all the loud YIMBYs on Twitter aren’t representative of the public, they’ll rationally discount the strength of that signal.

Whereas I’d say the question for YIMBYism is whether it can develop an emotional appeal, build a community people want to be a part of, and become a marker of identity and status. Either it’s a movement that sustains attention or it isn’t.

That’s what I’d like to see: a behavioral model of attention, and then study of why different issues do and don’t capture it.

*In some other models this may just make you want to free ride. Oddly that’s not really discussed much; there’s only one mention of free riding in the paper.

Software, management, competition

Software startups often target applications that many companies share – accounting, human resources, communications, etc. Companies want to digitize by purchasing off-the-shelf software. No one creates software for processes that underlie their unique competitive advantages. They buy excess capacity in departments that aren’t their core business instead.

That’s one of many interesting bits from this post on software as management. I can’t recall where I came across it and don’t really know who the author is.

But it relates to the piece I wrote a few years back with James Bessen for HBR. In it, we linked firms’ software capabilities and startups’ ability to create new organizational architectures to the rise of large firms.

Software is at the center of competition between firms, but many firms lack the ability and/or the incentive to adopt software in ways that actually give them a competitive advantage.

Notes on innovation economics

This post is just to link together a few resources I want to keep track of, occasioned by the publication of a concise review of innovation economics by NBER this week.

Some posts on the “innovation agenda” here and here.

  • Update: Adding this overview post from New Things Under the Sun.

Bias in the market for change

Earlier this year I wrote about loss aversion and politics. Here’s a quick snippet on this from Felix Oberholzer-Gee’s excellent new book Better, Simpler Strategy. He’s covering three cases of technological change (radio, PCs, and ATMs) and notes that while they were expected to be pure substitutes for records, paper, and bank tellers, respectively, they were actually complements and increased demand for those things:

Did you notice a pattern in the three examples? In each instance, we predicted substitution when in fact the new technology turned out to increase the willingness-to-pay for existing products and activities. This type of bias is the norm. We fear change; potential losses loom larger than similar gains, a phenomenon that psychologists Amos Tversky and Daniel Kahneman call loss aversion. Loss aversion keeps us preoccupied with the risk of substitution even when we look at complementarities.

p. 81

Loss aversion doesn’t just change the politics of change, but the market for it, too.

More on social epistemology

A few weeks back I wrote about the importance of social learning. Yes, it’s important to try to think clearly and logically where you can, but in practice we’re mostly forced to rely on cues from others to reach our beliefs.

Will Wilkinson makes this point well and in much greater depth in a recent post on conspiracy and epistemology. Along the way he highlights where he breaks from the “rationalist” community:

Now, I’ve come to think that people who really care about getting things right are a bit misguided when they focus on methods of rational cognition. I’m thinking of the so-called “rationalist” community here. If you want an unusually high-fidelity mental model of the world, the main thing isn’t probability theory or an encyclopedic knowledge of the heuristics and biases that so often make our reasoning go wrong. It’s learning who to trust. That’s really all there is to it. That’s the ballgame…

It’s really not so hard. In any field, there are a bunch of people at the top of the game who garner near-universal deference. Trusting those people is an excellent default. On any subject, you ought to trust the people who have the most training and spend the most time thinking about that subject, especially those who are especially well-regarded by the rest of these people.

I mostly agree: this is the point I was trying to make in my post on social learning.

But for the sake of argument we should consider the rationalist’s retort. Like at least some corners of the rationalist community, I’m a fan of Tetlock’s forecasting research and think it has a lot to teach us about epistemology in practice. But Tetlock found that experts aren’t necessarily that great at reaching accurate beliefs about the future, and that a small number of “superforecasters” seem, on average, to outperform the experts.

Is Wilkinson wrong? Might the right cognitive toolkit (probability, knowledge of biases, etc.) be better than deferring to experts?

I think not, for a couple reasons. First off, sure, some people are better than experts at certain forms of reasoning, but what makes you think that’s you? I’ve done forecasting tournaments; they’re really hard. Understanding Bayesian statistics does not mean you’re a superforecaster with a track record of out-reasoning others. Unless you’ve proven it, it’s hubris to think you’re better than the experts.

I’d also argue that the superforecasters are largely doing a form of what Wilkinson is suggesting, albeit with extra stuff on top. Their key skill is arguably figuring out who and what to trust. Yes, they’re also good at probabilistic thinking and think about their own biases, but they’re extremely good information aggregators.

And that leads me to maybe my key clarification on Wilkinson. He says:

A solid STEM education isn’t going to help you and “critical thinking” classes will help less than you’d think. It’s about developing a bullshit detector — a second sense for the subtle sophistry of superficially impressive people on the make. Collecting people who are especially good at identifying trustworthiness and then investing your trust in them is our best bet for generally being right about things.

I’d put it a bit differently. If by “critical thinking” he basically means a logical-reasoning class, then sure, I’m with him. What you need is not just the tools to reason on your own: you need to learn how to figure out who to trust. So far, so good.

But I wouldn’t call this a “bullshit detector” exactly, though of course that’s nice to have. Another key lesson from the Tetlock research (and I think confirmed elsewhere) is that a certain sort of open-mindedness is extremely valuable–you want to be a “many model” thinker who considers and balances multiple explanations when thinking about a topic.

That’s the key part of social learning that I’d emphasize. You want to look for people who think clearly but with nuance (it’s easy to have one but not both), who seriously consider other perspectives, and who are self-critical. Ideally, you want to defer to those people. And if you can’t find them, you want to perform some mental averaging over the perspectives of everyone else.

Best case, you find knowledgeable “foxes” and defer to them. Failing that, you add a bit of your own fox thinking on top of what you’re hearing.

Doing that well has almost nothing to do with Bayes’ theorem. Awareness of your own biases can, I think, help–though it doesn’t always. And knowledge of probability is often useful. But reaching true beliefs is, in practice, still a social activity. Like Wilkinson says, it’s mostly a matter of trust.