STA6349: Applied Bayesian Analysis
Last week, we discussed balance in Bayesian analyses.
We will now discuss sequentiality in Bayesian analyses.
Working example:
Recall the Bechdel test: In Alison Bechdel’s 1985 comic strip The Rule, a character states that they only see a movie if it satisfies the following three rules (Bechdel 1986): (1) it has to have at least two women in it; (2) the two women talk to each other; and (3) they talk about something besides a man.
Let’s now turn our thinking to a new situation: we’ve updated our beliefs, but now we have new data!
The evolution in our posterior understanding happens incrementally, as we accumulate new data.
Scientists’ understanding of climate change has evolved over the span of decades as they gain new information.
Presidential candidates’ understanding of their chances of winning an election evolves over months as new poll results become available.
Let’s revisit Milgram’s behavioral study of obedience from Chapter 3. Recall, \pi represents the proportion of people that will obey authority, even if it means bringing harm to others.
Prior to Milgram’s experiments, our fictional psychologist expected that few people would obey authority in the face of harming another: \pi \sim \text{Beta}(1,10).
Now, suppose that the psychologist collected the data incrementally, day by day, over a three-day period.
Find the following posterior distributions, each building off the last: the posterior after day 1, after day 2, and after day 3 (a worked sketch follows below).
Recall from Chapter 3, our posterior was \text{Beta}(27,24)!
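To make the sequential updates concrete, here is a minimal Python sketch of the conjugate Beta-Binomial updating. The day-by-day counts are assumed for illustration (this slide does not list them); they are chosen so the totals match the 26 obedient subjects out of 40 from Chapter 3, which is why the final posterior is \text{Beta}(27,24).

```python
# Sequential Beta-Binomial updating for the Milgram example.
# NOTE: the daily counts (y obey out of n trials) are assumed for
# illustration; the slides fix only the totals (26 of 40).

def update_beta(alpha, beta, y, n):
    """Conjugate update: Beta(alpha, beta) prior + y successes in n trials."""
    return alpha + y, beta + (n - y)

alpha, beta = 1, 10                        # Beta(1, 10) prior
daily_data = [(1, 10), (17, 20), (8, 10)]  # assumed (y, n) for days 1-3

for day, (y, n) in enumerate(daily_data, start=1):
    alpha, beta = update_beta(alpha, beta, y, n)
    print(f"Day {day}: posterior is Beta({alpha}, {beta})")
# Day 3 prints Beta(27, 24), matching the Chapter 3 posterior.
```

Each day’s posterior becomes the next day’s prior, which is exactly the sequential structure defined next.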
Sequential Bayesian analysis (aka Bayesian learning):
In a sequential Bayesian analysis, a posterior model is updated incrementally as more data come in.
With each new piece of data, the previous posterior model, which reflects our understanding prior to observing this data, becomes the new prior model.
This is why we love Bayesian statistics!
These types of sequential analyses also uphold two fundamental properties:
Data order invariance:
Let \theta be any parameter of interest with prior pdf f(\theta).
Then a sequential analysis in which we first observe a data point y_1 and then a second data point y_2 will produce the same posterior model of \theta as if we had first observed y_2 and then y_1:
f(\theta|y_1,y_2) = f(\theta|y_2,y_1)
Let’s first specify the structure of posterior pdf f(\theta|y_1,y_2), which evolves by sequentially observing data y_1, followed by y_2.
In step one, we construct the posterior pdf from our original prior pdf, f(\theta), and the likelihood function of \theta given the first data point y_1, L(\theta|y_1).
\begin{align*} f(\theta|y_1) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{f(\theta)L(\theta|y_1)}{f(y_1)} \end{align*}
In step two, we update our model in light of observing new data y_2; the step-one posterior f(\theta|y_1) now serves as the prior.
\begin{align*} f(\theta|y_1,y_2) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{\frac{f(\theta)L(\theta|y_1)}{f(y_1)}L(\theta|y_2)}{f(y_2)} \\ &= \frac{f(\theta)L(\theta|y_1)L(\theta|y_2)}{f(y_1)f(y_2)} \end{align*}
Now reverse the order, observing y_2 first and y_1 second. In step one, we construct the posterior from the original prior and the likelihood of \theta given y_2.
\begin{align*} f(\theta|y_2) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{f(\theta)L(\theta|y_2)}{f(y_2)} \end{align*}
In step two, this posterior serves as the prior when we update on y_1.
\begin{align*} f(\theta|y_2,y_1) &= \frac{\text{prior} \cdot \text{likelihood}}{\text{normalizing constant}} \\ &= \frac{\frac{f(\theta)L(\theta|y_2)}{f(y_2)}L(\theta|y_1)}{f(y_1)} \\ &= \frac{f(\theta)L(\theta|y_2)L(\theta|y_1)}{f(y_2)f(y_1)} \end{align*}
Either way, we end up with the same posterior: f(\theta|y_1,y_2) = f(\theta|y_2,y_1).
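As a quick numerical sanity check of data order invariance, the following sketch updates a Beta(1,10) prior on two Binomial observations in both orders; the specific counts here are assumed purely for illustration.

```python
# Data order invariance under the Beta-Binomial model: updating on
# y1 then y2 yields the same posterior as updating on y2 then y1.

def update_beta(alpha, beta, y, n):
    return alpha + y, beta + (n - y)

prior = (1, 10)                 # Beta(1, 10)
y1, y2 = (1, 10), (17, 20)      # two illustrative (y, n) observations

order_12 = update_beta(*update_beta(*prior, *y1), *y2)
order_21 = update_beta(*update_beta(*prior, *y2), *y1)

print(order_12, order_21)       # (19, 22) (19, 22)
assert order_12 == order_21     # same posterior either way
```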
Finally, not only does the order of the data not influence the ultimate posterior model of \theta, but it doesn’t matter whether we observe the data all at once or sequentially.
Suppose we start with the original f(\theta) prior and observe data (y_1, y_2) together, not sequentially.
Further, assume that these data points are independent; thus,
f(y_1, y_2) = f(y_1) f(y_2) \text{ and } f(y_1,y_2|\theta) = f(y_1|\theta) f(y_2|\theta)
\begin{align*} f(\theta|y_1,y_2) &= \frac{f(\theta)L(\theta|y_1,y_2)}{f(y_1,y_2)} \\ &= \frac{f(\theta)f(y_1,y_2|\theta)}{f(y_1)f(y_2)} \\ &= \frac{f(\theta)f(y_1|\theta)f(y_2|\theta)}{f(y_1)f(y_2)} \\ &= \frac{f(\theta)L(\theta|y_1)L(\theta|y_2)}{f(y_1)f(y_2)} \end{align*}
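A companion sketch (again with assumed illustrative counts) confirms that pooling the data into a single update yields the same posterior as updating sequentially.

```python
# Batch vs. sequential updating: pooling all of the data into a
# single update yields the same posterior as one-at-a-time updates.

def update_beta(alpha, beta, y, n):
    return alpha + y, beta + (n - y)

prior = (1, 10)
data = [(1, 10), (17, 20), (8, 10)]   # illustrative (y, n) observations

sequential = prior
for y, n in data:                     # observe the data one piece at a time
    sequential = update_beta(*sequential, y, n)

y_total = sum(y for y, _ in data)     # 26 successes...
n_total = sum(n for _, n in data)     # ...out of 40 trials
batch = update_beta(*prior, y_total, n_total)

print(sequential, batch)              # (27, 24) (27, 24)
assert sequential == batch
```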
Today we have covered the latter part of Chapter 4.
Please work on Homework problems 17 and 18 for the remainder of lecture.
On Wednesday, we will complete our in-class assignment.
Next week, we will not have formal lecture (i.e., no Zoom meetings); we will be working on the first project.
There will be a presentation component to the project!
Please be prepared to present on Monday, 10/21.
Textbook exercises: 4.15, 4.16, 4.17, 4.18, 4.19