Sequentiality in Bayesian Analyses

STA6349: Applied Bayesian Analysis

Introduction

  • Last week, we discussed balance in Bayesian analyses.

  • We will now discuss sequentiality in Bayesian analyses.

  • Working example:

    • Recall the Bechdel test: In Alison Bechdel’s 1985 comic strip The Rule, a character states that they only see a movie if it satisfies the following three rules (Bechdel 1986):

      1. the movie has to have at least two women in it;
      2. these two women talk to each other; and
      3. they talk about something besides a man.

Introduction

  • Let’s now turn our thinking to the next step: okay, we’ve updated our beliefs… but now we have new data!

  • The evolution of our posterior understanding happens incrementally as we accumulate new data.

    • Scientists’ understanding of climate change has evolved over decades as they have gained new information.

    • Presidential candidates’ understanding of their chances of winning an election evolves over months as new poll results become available.

Introduction

  • Let’s revisit Milgram’s behavioral study of obedience from Chapter 3. Recall that \pi represents the proportion of people who will obey authority, even if it means bringing harm to others.

  • Prior to Milgram’s experiments, our fictional psychologist expected that few people would obey authority in the face of harming another: \pi \sim \text{Beta}(1,10).

  • Now, suppose that the psychologist collected the data incrementally, day by day, over a three-day period.

  • Find the following posterior distributions, each building off the last:

    • Day 0: \text{Beta}(1,10).
    • Day 1: Y=1 out of n=10.
    • Day 2: Y=17 out of n=20.
    • Day 3: Y=8 out of n=10.

Introduction

  • Let’s revisit Milgram’s behavioral study of obedience from Chapter 3. Recall that \pi represents the proportion of people who will obey authority, even if it means bringing harm to others.

  • Prior to Milgram’s experiments, our fictional psychologist expected that few people would obey authority in the face of harming another: \pi \sim \text{Beta}(1,10).

  • Now, suppose that the psychologist collected the data incrementally, day by day, over a three-day period.

  • Find the following posterior distributions, each building off the last:

    • Day 0: \text{Beta}(1,10).
    • Day 1: Y=1 out of n=10, so \text{Beta}(1,10) \to \text{Beta}(2, 19).
    • Day 2: Y=17 out of n=20, so \text{Beta}(2, 19) \to \text{Beta}(19, 22).
    • Day 3: Y=8 out of n=10, so \text{Beta}(19, 22) \to \text{Beta}(27, 24).
  • Recall from Chapter 3 that our posterior was \text{Beta}(27,24)! (A numerical check of these updates follows below.)
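  • To see the mechanics, recall the Beta-Binomial update from Chapter 3: if \pi \sim \text{Beta}(\alpha, \beta) and we observe Y = y successes in n trials, then \pi|(Y=y) \sim \text{Beta}(\alpha + y, \beta + n - y). Below is a minimal Python sketch of the day-by-day updates (an illustration only; the variable names are our own, not from the text):

    # Sequential Beta-Binomial updating: each day's posterior becomes
    # the next day's prior.
    alpha, beta = 1, 10                  # Day 0 prior: Beta(1, 10)
    data = [(1, 10), (17, 20), (8, 10)]  # (y, n) for Days 1-3
    for day, (y, n) in enumerate(data, start=1):
        alpha, beta = alpha + y, beta + n - y  # conjugate update
        print(f"Day {day}: Beta({alpha}, {beta})")
    # Day 1: Beta(2, 19); Day 2: Beta(19, 22); Day 3: Beta(27, 24)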

Sequential Bayesian Analysis or Bayesian Learning

  • Sequential Bayesian analysis (aka Bayesian learning):

    • In a sequential Bayesian analysis, a posterior model is updated incrementally as more data come in.

    • With each new piece of data, the previous posterior model, which reflects our understanding prior to observing these data, becomes the new prior model.

  • This is why we love Bayesian statistics!

    • We evolve our thinking as new data come in.
  • These types of sequential analyses also uphold two fundamental properties:

    1. The final posterior model is data order invariant; and
    2. The final posterior only depends upon the cumulative data.

Sequential Bayesian Analysis or Bayesian Learning

  • In order:
    • Day 0: \text{Beta}(1,10).
    • Day 1: Y=1 out of n=10, so \text{Beta}(1,10) \to \text{Beta}(2, 19).
    • Day 2: Y=17 out of n=20, so \text{Beta}(2, 19) \to \text{Beta}(19, 22).
    • Day 3: Y=8 out of n=10, so \text{Beta}(19, 22) \to \text{Beta}(27, 24).
  • Out of order:
    • Day 0: \text{Beta}(1,10).
    • Day 3: Y=8 out of n=10, so \text{Beta}(1,10) \to \text{Beta}(9, 12).
    • Day 2: Y=17 out of n=20, so \text{Beta}(9, 12) \to \text{Beta}(26, 15).
    • Day 1: Y=1 out of n=10, so \text{Beta}(26, 15) \to \text{Beta}(27, 24).
  • Either way, we arrive at \text{Beta}(27, 24); see the sketch below.
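  • A quick numerical check of data order invariance: every possible ordering of the three daily batches ends at the same posterior. (A minimal Python sketch, illustrative only; the variable names are our own.)

    from itertools import permutations

    data = [(1, 10), (17, 20), (8, 10)]  # (y, n) for Days 1-3
    for order in permutations(data):
        alpha, beta = 1, 10              # Day 0 prior: Beta(1, 10)
        for y, n in order:
            alpha, beta = alpha + y, beta + n - y  # conjugate update
        print(order, "->", f"Beta({alpha}, {beta})")
    # All six orderings print Beta(27, 24).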

Sequential Bayesian Analysis or Bayesian Learning

Proving Data Order Invariance

  • Data order invariance:

    • Let \theta be any parameter of interest with prior pdf f(\theta).

    • Then a sequential analysis in which we first observe a data point y_1 and then a second data point y_2 will produce the same posterior model of \theta as if we had first observed y_2 and then y_1.

f(\theta|y_1,y_2) = f(\theta|y_2,y_1)

  • Similarly, the posterior model is invariant to whether we observe the data all at once or sequentially.

Proving Data Order Invariance

  • Let’s first specify the structure of the posterior pdf f(\theta|y_1,y_2), which evolves by sequentially observing data y_1, followed by y_2.

  • In step one, we construct the posterior pdf from our original prior pdf, f(\theta), and the likelihood function of \theta given the first data point y_1, L(\theta|y_1).

\begin{align*} f(\theta|y_1) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{f(\theta)L(\theta|y_1)}{f(y_1)} \end{align*}

Proving Data Order Invariance

  • In step two, we update our model in light of observing new data, y_2.

    • Don’t forget that we start from the prior model specified by f(\theta|y_1).

    • Assuming the data points are independent (so that L(\theta|y_2) and f(y_2) do not depend on y_1), the posterior given both data points is:

\begin{align*} f(\theta|y_1,y_2) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{\frac{f(\theta)L(\theta|y_1)}{f(y_1)}L(\theta|y_2)}{f(y_2)} \\ &= \frac{f(\theta)L(\theta|y_1)L(\theta|y_2)}{f(y_1)f(y_2)} \end{align*}

Proving Data Order Invariance

  • What happens when we observe the data in the opposite order?

Proving Data Order Invariance

  • What happens when we observe the data in the opposite order?

\begin{align*} f(\theta|y_2) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{f(\theta)L(\theta|y_2)}{f(y_2)} \end{align*}

\begin{align*} f(\theta|y_2,y_1) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{\frac{f(\theta)L(\theta|y_2)}{f(y_2)}L(\theta|y_1)}{f(y_1)} \\ &= \frac{f(\theta)L(\theta|y_2)L(\theta|y_1)}{f(y_2)f(y_1)} \end{align*}

Proving Data Order Invariance

  • Finally, not only does the order of the data not influence the ultimate posterior model of \theta, but it also doesn’t matter whether we observe the data all at once or sequentially.

  • Suppose we start with the original f(\theta) prior and observe data (y_1, y_2) together, not sequentially.

  • Further, assume that these data points are independent, thus,

f(y_1, y_2) = f(y_1) f(y_2) \text{ and } f(y_1,y_2|\theta) = f(y_1|\theta) f(y_2|\theta)

Proving Data Order Invariance

  • Then, the posterior pdf is the same as the one resulting from sequential analysis,

\begin{align*} f(\theta|y_1,y_2) &= \frac{f(\theta)L(\theta|y_1,y_2)}{f(y_1,y_2)} \\ &= \frac{f(\theta)f(y_1,y_2|\theta)}{f(y_1)f(y_2)} \\ &= \frac{f(\theta)f(y_1|\theta)f(y_2|\theta)}{f(y_1)f(y_2)} \\ &= \frac{f(\theta)L(\theta|y_1)L(\theta|y_2)}{f(y_1)f(y_2)} \end{align*}
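  • For example, pooling the Milgram data across all three days gives Y = 1 + 17 + 8 = 26 obedient participants out of n = 10 + 20 + 10 = 40, and a single update of the original \text{Beta}(1,10) prior recovers the sequential result:

\pi | (Y = 26) \sim \text{Beta}(1 + 26, \; 10 + 40 - 26) = \text{Beta}(27, 24)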

Wrap Up

  • Today we have covered the latter part of Chapter 4.

  • Please work on Homework problems 17 and 18 for the remainder of lecture.

    • We will reconvene at 5:10 to check in.
  • On Wednesday, we will complete our in-class assignment.

  • Next week, we will not have formal lecture (i.e., no Zoom meetings); we will be working on the first project.

    • There will be a presentation component to the project!

    • Please be prepared to present on Monday, 10/21.

Homework

  • 4.15

  • 4.16

  • 4.17

  • 4.18

  • 4.19