Sequentiality in Bayesian Analyses

STA6349: Applied Bayesian Analysis

Introduction

  • Last week, we discussed balance in Bayesian analyses.

  • We will now discuss sequentiality in Bayesian analyses.

  • Working example:

    • Recall the Bechdel test: In Alison Bechdel’s 1985 comic strip The Rule, a character states that they only see a movie if it satisfies the following three rules (Bechdel 1986):

      1. the movie has to have at least two women in it;
      2. these two women talk to each other; and
      3. they talk about something besides a man.

Introduction

  • Let’s now turn our thinking to the next step: okay, we’ve updated our beliefs… but now we have new data!

  • The evolution of our posterior understanding happens incrementally as we accumulate new data.

    • Scientists’ understanding of climate change has evolved over decades as they have gained new information.

    • Presidential candidates’ understanding of their chances of winning an election evolves over months as new poll results become available.

Introduction

  • Let’s revisit Milgram’s behavioral study of obedience from Chapter 3. Recall that \pi represents the proportion of people who will obey authority, even if it means bringing harm to others.

  • Prior to Milgram’s experiments, our fictional psychologist expected that few people would obey authority in the face of harming another: \pi \sim \text{Beta}(1,10).

  • Now, suppose that the psychologist collected the data incrementally, day by day, over a three-day period.

  • Find the following posterior distributions, each building off the last:

    • Day 0: \text{Beta}(1,10).
    • Day 1: Y=1 out of n=10.
    • Day 2: Y=17 out of n=20.
    • Day 3: Y=8 out of n=10.

Introduction

  • Let’s revisit Milgram’s behavioral study of obedience from Chapter 3. Recall that \pi represents the proportion of people who will obey authority, even if it means bringing harm to others.

  • Prior to Milgram’s experiments, our fictional psychologist expected that few people would obey authority in the face of harming another: \pi \sim \text{Beta}(1,10).

  • Now, suppose that the psychologist collected the data incrementally, day by day, over a three-day period.

  • Find the following posterior distributions, each building off the last:

    • Day 0: \text{Beta}(1,10).
    • Day 1: Y=1 out of n=10, so \text{Beta}(1,10) \to \text{Beta}(2, 19).
    • Day 2: Y=17 out of n=20, so \text{Beta}(2, 19) \to \text{Beta}(19, 22).
    • Day 3: Y=8 out of n=10, so \text{Beta}(19, 22) \to \text{Beta}(27, 24).
  • Recall from Chapter 3 that our posterior was \text{Beta}(27,24)! (A numerical check of these updates follows below.)
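  • To see the mechanics, recall the Beta-Binomial update from Chapter 3: if \pi \sim \text{Beta}(\alpha, \beta) and we observe Y = y successes in n trials, then \pi|(Y=y) \sim \text{Beta}(\alpha + y, \beta + n - y). Below is a minimal Python sketch of the day-by-day updates (an illustration only; the variable names are our own, not from the text):

    # Sequential Beta-Binomial updating: each day's posterior becomes
    # the next day's prior.
    alpha, beta = 1, 10                  # Day 0 prior: Beta(1, 10)
    data = [(1, 10), (17, 20), (8, 10)]  # (y, n) for Days 1-3
    for day, (y, n) in enumerate(data, start=1):
        alpha, beta = alpha + y, beta + n - y  # conjugate update
        print(f"Day {day}: Beta({alpha}, {beta})")
    # Day 1: Beta(2, 19); Day 2: Beta(19, 22); Day 3: Beta(27, 24)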

Sequential Bayesian Analysis or Bayesian Learning

  • Sequential Bayesian analysis (aka Bayesian learning):

    • In a sequential Bayesian analysis, a posterior model is updated incrementally as more data come in.

    • With each new piece of data, the previous posterior model, which reflects our understanding prior to observing these data, becomes the new prior model.

  • This is why we love Bayesian statistics!

    • We evolve our thinking as new data come in.
  • These types of sequential analyses also uphold two fundamental properties:

    1. The final posterior model is data order invariant; and
    2. The final posterior only depends upon the cumulative data.

Sequential Bayesian Analysis or Bayesian Learning

  • In order:
    • Day 0: \text{Beta}(1,10).
    • Day 1: Y=1 out of n=10, so \text{Beta}(1,10) \to \text{Beta}(2, 19).
    • Day 2: Y=17 out of n=20, so \text{Beta}(2, 19) \to \text{Beta}(19, 22).
    • Day 3: Y=8 out of n=10, so \text{Beta}(19, 22) \to \text{Beta}(27, 24).
  • Out of order:
    • Day 0: \text{Beta}(1,10).
    • Day 3: Y=8 out of n=10, so \text{Beta}(1,10) \to \text{Beta}(9, 12).
    • Day 2: Y=17 out of n=20, so \text{Beta}(9, 12) \to \text{Beta}(26, 15).
    • Day 1: Y=1 out of n=10, so \text{Beta}(26, 15) \to \text{Beta}(27, 24).
  • Either way, we arrive at \text{Beta}(27, 24); see the sketch below.
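  • A quick numerical check of data order invariance: every possible ordering of the three daily batches ends at the same posterior. (A minimal Python sketch, illustrative only; the variable names are our own.)

    from itertools import permutations

    data = [(1, 10), (17, 20), (8, 10)]  # (y, n) for Days 1-3
    for order in permutations(data):
        alpha, beta = 1, 10              # Day 0 prior: Beta(1, 10)
        for y, n in order:
            alpha, beta = alpha + y, beta + n - y  # conjugate update
        print(order, "->", f"Beta({alpha}, {beta})")
    # All six orderings print Beta(27, 24).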

Sequential Bayesian Analysis or Bayesian Learning

Proving Data Order Invariance

  • Data order invariance:

    • Let \theta be any parameter of interest with prior pdf f(\theta).

    • Then a sequential analysis in which we first observe a data point y_1 and then a second data point y_2 will produce the same posterior model of \theta as if we had first observed y_2 and then y_1.

f(\theta|y_1,y_2) = f(\theta|y_2,y_1)

  • Similarly, the posterior model is invariant to whether we observe the data all at once or sequentially.

Proving Data Order Invariance

  • Let’s first specify the structure of the posterior pdf f(\theta|y_1,y_2), which evolves by sequentially observing data y_1, followed by y_2.

  • In step one, we construct the posterior pdf from our original prior pdf, f(\theta), and the likelihood function of \theta given the first data point y_1, L(\theta|y_1).

\begin{align*} f(\theta|y_1) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{f(\theta)L(\theta|y_1)}{f(y_1)} \end{align*}

Proving Data Order Invariance

  • In step two, we update our model in light of observing new data, y_2.

    • Don’t forget that we start from the prior model specified by f(\theta|y_1).

    • Assuming the data points are independent (so that L(\theta|y_2) and f(y_2) do not depend on y_1), the posterior given both data points is:

\begin{align*} f(\theta|y_1,y_2) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{\frac{f(\theta)L(\theta|y_1)}{f(y_1)}L(\theta|y_2)}{f(y_2)} \\ &= \frac{f(\theta)L(\theta|y_1)L(\theta|y_2)}{f(y_1)f(y_2)} \end{align*}

Proving Data Order Invariance

  • What happens when we observe the data in the opposite order?

Proving Data Order Invariance

  • What happens when we observe the data in the opposite order?

\begin{align*} f(\theta|y_2) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{f(\theta)L(\theta|y_2)}{f(y_2)} \end{align*}

\begin{align*} f(\theta|y_2,y_1) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{\frac{f(\theta)L(\theta|y_2)}{f(y_2)}L(\theta|y_1)}{f(y_1)} \\ &= \frac{f(\theta)L(\theta|y_2)L(\theta|y_1)}{f(y_2)f(y_1)} \end{align*}

Proving Data Order Invariance

  • Finally, not only does the order of the data not influence the ultimate posterior model of \theta, but it also doesn’t matter whether we observe the data all at once or sequentially.

  • Suppose we start with the original f(\theta) prior and observe data (y_1, y_2) together, not sequentially.

  • Further, assume that these data points are independent, thus,

f(y_1, y_2) = f(y_1) f(y_2) \text{ and } f(y_1,y_2|\theta) = f(y_1|\theta) f(y_2|\theta)

Proving Data Order Invariance

  • Then, the posterior pdf is the same as the one resulting from sequential analysis,

\begin{align*} f(\theta|y_1,y_2) &= \frac{f(\theta)L(\theta|y_1,y_2)}{f(y_1,y_2)} \\ &= \frac{f(\theta)f(y_1,y_2|\theta)}{f(y_1)f(y_2)} \\ &= \frac{f(\theta)f(y_1|\theta)f(y_2|\theta)}{f(y_1)f(y_2)} \\ &= \frac{f(\theta)L(\theta|y_1)L(\theta|y_2)}{f(y_1)f(y_2)} \end{align*}
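  • For example, pooling the Milgram data across all three days gives Y = 1 + 17 + 8 = 26 obedient participants out of n = 10 + 20 + 10 = 40, and a single update of the original \text{Beta}(1,10) prior recovers the sequential result:

\pi | (Y = 26) \sim \text{Beta}(1 + 26, \; 10 + 40 - 26) = \text{Beta}(27, 24)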

Wrap Up

  • Today we have covered the latter part of Chapter 4.

  • Please work on Homework problems 17 and 18 for the remainder of lecture.

    • We will reconvene at 5:10 to check in.
  • On Wednesday, we will complete our in-class assignment.

  • Next week, we will not have formal lecture (i.e., no Zoom meetings); we will be working on the first project.

    • There will be a presentation component to the project!

    • Please be prepared to present on Monday, 10/21.

Homework

  • 4.15

  • 4.16

  • 4.17

  • 4.18

  • 4.19