Probabilistic Models of Evidence

In what follows, we’ll consider three approaches to representing evidential relationships between propositions: the standard Bayesian definition of evidence, and two versions of another probabilistic representation of evidence called the likelihood principle. All of these approaches employ probabilistic models—that is, they use the mathematics of probability to represent how one proposition can provide evidence for another. The likelihood principle is a reformulation of the Bayesian model, not an alternative to Bayesianism, but it provides a more illuminating view of the role evidence plays in many situations. Each of these models has its own advantages in different applications, so it is worthwhile to study all three approaches.

Approach 1: The Bayesian Definition of Evidence

As explained in the previous chapter, Bayesian confirmation theory defines confirmation in terms of the conditionalization rule, which says that upon learning E, your unconditional credence in H should be updated to match your prior conditional credence in H given E. If conditionalizing on E increases your credence in H, we say that E confirms H. The word ‘confirm’ in this context just means that E provides evidence for H. Thus, we already have a Bayesian definition for the concept of evidence:

The Bayesian Definition of Evidence: A person regards E as evidence for hypothesis H if and only if, prior to learning E, she thought that H was more likely to be true on the assumption that E is true than it was without that assumption: pr(H|E) > pr(H).

All of the probabilities shown in this section are prior credences. To simplify notation, we’ll drop the subscript ‘1’ from the probability functions. For instance, instead of writing ‘pr1(H)’ for the prior probability of H, we’ll just write ‘pr(H)’.

In other words, you regard some fact E as evidence for H if and only if learning E raises your credence in H. Equivalently, we can say that you regard E as evidence for H if and only if the Bayesian multiplier, pr(E|H)/pr(E), is greater than 1. Remember, the Bayesian multiplier is the factor by which your credence in hypothesis H increases or decreases when you learn E. So, if the Bayesian multiplier is greater than 1, your credence in H will increase.
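To see the multiplier at work, here is a minimal Python sketch. The numbers (a prior of 0.3 for H, and likelihoods of 0.8 and 0.2) are hypothetical, chosen only for illustration, and the function names are our own. The sketch computes pr(E) by the law of total probability, then the Bayesian multiplier, then the updated credence in H.

```python
def total_probability(pr_h, pr_e_given_h, pr_e_given_not_h):
    """pr(E) = pr(E|H) * pr(H) + pr(E|~H) * pr(~H)."""
    return pr_e_given_h * pr_h + pr_e_given_not_h * (1 - pr_h)


def bayesian_multiplier(pr_e_given_h, pr_e):
    """Factor by which the credence in H changes upon learning E."""
    return pr_e_given_h / pr_e


# Hypothetical prior credences (illustrative assumptions, not from the text).
pr_h = 0.3
pr_e_given_h = 0.8
pr_e_given_not_h = 0.2

pr_e = total_probability(pr_h, pr_e_given_h, pr_e_given_not_h)
multiplier = bayesian_multiplier(pr_e_given_h, pr_e)
posterior_h = pr_h * multiplier  # conditionalization: new credence in H

print(f"pr(E)      = {pr_e:.3f}")
print(f"multiplier = {multiplier:.3f}")
print(f"pr(H|E)    = {posterior_h:.3f}")
print("E is evidence for H" if multiplier > 1 else "E is not evidence for H")
```

With these particular numbers the multiplier exceeds 1, so E counts as evidence for H. Lowering pr(E|H) below pr(E) would push the multiplier below 1, and E would then count against H.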

Notice that the Bayesian multiplier is greater than 1 if and only if pr(E|H) > pr(E). This inequality has the same form as the one in the Bayesian definition of evidence, pr(H|E) > pr(H), except that E and H have switched places. Because the two inequalities are equivalent, E is evidence for H according to the Bayesian definition exactly when H would be evidence for E. In this sense, the evidential relationship between E and H is symmetrical.
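The symmetry can be checked numerically. The sketch below assumes a hypothetical joint credence distribution over H and E (the four values are our own illustrative choices) and uses exact fractions to avoid rounding error. Both ratios, pr(H|E)/pr(H) and pr(E|H)/pr(E), reduce to pr(H and E) divided by pr(H) times pr(E), so they should come out identical.

```python
from fractions import Fraction

# Hypothetical joint credences over (H, E); the four values sum to 1.
joint = {
    (True, True): Fraction(24, 100),    # H and E
    (True, False): Fraction(6, 100),    # H and not-E
    (False, True): Fraction(14, 100),   # not-H and E
    (False, False): Fraction(56, 100),  # not-H and not-E
}

pr_h = joint[(True, True)] + joint[(True, False)]
pr_e = joint[(True, True)] + joint[(False, True)]
pr_h_given_e = joint[(True, True)] / pr_e
pr_e_given_h = joint[(True, True)] / pr_h

# Factor by which learning E changes the credence in H, and vice versa.
boost_to_h = pr_h_given_e / pr_h
boost_to_e = pr_e_given_h / pr_e

print("pr(H|E)/pr(H) =", boost_to_h)
print("pr(E|H)/pr(E) =", boost_to_e)
print("symmetric:", boost_to_h == boost_to_e)
```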

Approach 2: The Likelihood Principle (comparative version)

Another way of characterizing evidential relationships is called the likelihood principle or the law of likelihood. It provides a helpful way of understanding the role of evidence when we want to judge between two or more competing hypotheses rather than considering only a single hypothesis. The likelihood principle says that evidence E favors whichever hypothesis makes E more likely:

The Likelihood Principle: Evidence E favors hypothesis H1 over H2 if and only if pr(E|H1) > pr(E|H2).

In other words, if E is more likely (more expected, or less surprising) if H1 is true than if H2 is true, then E is evidence in favor of H1 over H2.

Notice that this is a comparative notion of evidence: the likelihood principle does not indicate whether E is evidence for H1 tout court, but only whether E “favors” H1 over a certain alternative H2. To say that E favors H1 over H2 means that the ratio of your credences in these two propositions should shift in favor of H1. That is, the ratio after learning E, pr(H1|E)/pr(H2|E), should be greater than the prior ratio pr(H1)/pr(H2). This follows from Bayes’ theorem: the new ratio equals the prior ratio multiplied by pr(E|H1)/pr(E|H2), which is greater than 1 exactly when pr(E|H1) > pr(E|H2). The ratio can increase even while both credences decrease! In other words, even if E is evidence against both H1 and H2, it could still favor H1 over H2.
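A small numerical sketch makes this last point concrete. It assumes three mutually exclusive and exhaustive hypotheses with hypothetical priors and likelihoods (H3 and all of the numbers are our own additions for illustration). E is much more likely under H3 than under either H1 or H2, so learning E lowers both of those credences, yet the ratio between them still shifts toward H1.

```python
from fractions import Fraction

# Hypothetical priors for three mutually exclusive, exhaustive hypotheses.
prior = {"H1": Fraction(3, 10), "H2": Fraction(3, 10), "H3": Fraction(4, 10)}
# Hypothetical likelihoods pr(E|Hi).
likelihood = {"H1": Fraction(2, 10), "H2": Fraction(1, 10), "H3": Fraction(9, 10)}

# Law of total probability, then Bayes' theorem for each hypothesis.
pr_e = sum(likelihood[h] * prior[h] for h in prior)
posterior = {h: likelihood[h] * prior[h] / pr_e for h in prior}

prior_ratio = prior["H1"] / prior["H2"]
posterior_ratio = posterior["H1"] / posterior["H2"]
likelihood_ratio = likelihood["H1"] / likelihood["H2"]

for h in prior:
    print(f"{h}: prior {float(prior[h]):.3f} -> posterior {float(posterior[h]):.3f}")
print("prior ratio H1:H2     =", prior_ratio)
print("posterior ratio H1:H2 =", posterior_ratio)
print("likelihood ratio      =", likelihood_ratio)
```

Here the credences in H1 and H2 both fall, while the ratio of H1 to H2 grows by exactly the likelihood ratio pr(E|H1)/pr(E|H2).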

Approach 3: The Likelihood Principle (non-comparative version)

The comparative version of the likelihood principle, defined above, does not tell us whether E is evidence for H1 by itself. (In fact, as just explained, it could be evidence against H1 while still favoring H1 over H2.) However, we can get a non-comparative version of the principle, simply by comparing a hypothesis with its own negation. In other words, we can use H and ~H as the two hypotheses in the likelihood principle. This yields the following version of the principle:

Non-comparative version of the Likelihood Principle: E is evidence for H if and only if pr(E|H) > pr(E|~H).

This version of the likelihood principle, unlike the comparative version, does tell us whether E is evidence for H. The resulting definition of evidence is consistent with the standard Bayesian definition given above, in the sense that both definitions agree on which propositions count as evidence for which hypotheses. Any proposition that counts as evidence for hypothesis H according to the Bayesian definition will also count as evidence according to the likelihood principle, and vice versa. In fact, the non-comparative version of the likelihood principle can be derived mathematically from the Bayesian definition of evidence.
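The derivation can be sketched as follows. We assume that 0 < pr(H) < 1 and pr(E) > 0, conditions added here so that every conditional probability is defined and the divisions are legitimate. The symbol ¬H in the block stands for ~H. The first step applies Bayes’ theorem, the third expands pr(E) by the law of total probability, and the last divides both sides by 1 − pr(H).

```latex
\begin{align*}
\mathrm{pr}(H \mid E) > \mathrm{pr}(H)
  &\iff \frac{\mathrm{pr}(E \mid H)\,\mathrm{pr}(H)}{\mathrm{pr}(E)} > \mathrm{pr}(H) \\
  &\iff \mathrm{pr}(E \mid H) > \mathrm{pr}(E) \\
  &\iff \mathrm{pr}(E \mid H) > \mathrm{pr}(E \mid H)\,\mathrm{pr}(H)
        + \mathrm{pr}(E \mid \neg H)\,\bigl(1 - \mathrm{pr}(H)\bigr) \\
  &\iff \mathrm{pr}(E \mid H)\,\bigl(1 - \mathrm{pr}(H)\bigr)
        > \mathrm{pr}(E \mid \neg H)\,\bigl(1 - \mathrm{pr}(H)\bigr) \\
  &\iff \mathrm{pr}(E \mid H) > \mathrm{pr}(E \mid \neg H).
\end{align*}
```

Since every step is an equivalence, the argument runs in both directions: under the stated assumptions, the two definitions pick out exactly the same evidential relationships.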

Although these two definitions of ‘evidence’ are essentially equivalent, each definition has its own advantages. When applying the standard Bayesian definition of evidence, we have to ask “does the evidence make the hypothesis more likely?” However, to apply the likelihood principle, it’s the other way around: “does the hypothesis make the evidence more likely?” Depending on the situation, one of these questions may be easier to answer than the other.