STA6349: Applied Bayesian Analysis
Spring 2025
Last week, we talked about creating posterior models for discrete priors (non-named distributions).
This week, we introduce using a named distribution as the prior.
We will start with analyzing a binomial outcome.
Past polls provide prior information about \pi, the proportion of Floridians that currently support Michelle.
In a previous problem, we assumed that \pi could only be 0.2, 0.5, or 0.8, the corresponding chances of which were defined by a discrete probability model.
In reality, \pi can take any value between 0 and 1; we can reflect this by constructing a continuous prior probability model for \pi.
A reasonable prior can be represented by a smooth density curve over the possible values of \pi.
In building the Bayesian election model of Michelle’s election support among Floridians, \pi, we begin with the prior.
What values can \pi take and which are more plausible than others?
Let \pi be a random variable, where \pi \in [0, 1].
The variability in \pi may be captured by a Beta model with shape hyperparameters \alpha > 0 and \beta > 0,
\pi \sim \text{Beta}(\alpha, \beta).
Your turn!
Explore the following and report back:
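The specific prompts are not reproduced here, but a minimal base-R sketch for exploring how different choices of \alpha and \beta change the shape of the Beta prior might look like the following (the (\alpha, \beta) pairs below are purely illustrative):

```r
# Sketch: compare Beta densities for a few illustrative hyperparameter choices
pi_grid <- seq(0, 1, length.out = 501)

# Illustrative (alpha, beta) pairs -- not the course's specific prompts
shapes <- list(c(1, 1), c(2, 2), c(5, 1), c(1, 5), c(45, 55))

plot(pi_grid, dbeta(pi_grid, shapes[[1]][1], shapes[[1]][2]), type = "l",
     ylim = c(0, 10), xlab = expression(pi), ylab = "density",
     main = "Beta priors for different shape hyperparameters")
for (i in 2:length(shapes)) {
  lines(pi_grid, dbeta(pi_grid, shapes[[i]][1], shapes[[i]][2]), col = i)
}
legend("topright", col = seq_along(shapes), lty = 1,
       legend = sapply(shapes, function(s) sprintf("Beta(%g, %g)", s[1], s[2])))
```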
We can tune the shape hyperparameters (\alpha and \beta) to reflect our prior information about Michelle’s election support, \pi.
In our example, we saw that she polled between 25 and 65 percentage points, with an average of 45 percentage points.
E[\pi] = \frac{\alpha}{\alpha+\beta} \approx 0.45
Setting \frac{\alpha}{\alpha+\beta} = 0.45 and solving for \alpha gives \alpha = \frac{0.45}{0.55}\beta, i.e., \alpha \approx \frac{9}{11} \beta
Your turn!
\pi \sim \text{Beta}(45, 55)
\begin{align*} E[\pi] &= \frac{\alpha}{\alpha + \beta} = \frac{45}{45+55} = 0.45 \\ \text{Var}[\pi] &= \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)} = \frac{(45)(55)}{(45+55)^2(45+55+1)} \approx 0.0025 \end{align*}
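As a quick check, the prior mean and variance can be computed directly from these formulas (a small sketch in base R):

```r
# Prior hyperparameters for Michelle's support
alpha <- 45
beta  <- 55

prior_mean <- alpha / (alpha + beta)                                   # 0.45
prior_var  <- (alpha * beta) / ((alpha + beta)^2 * (alpha + beta + 1)) # ~0.00245
prior_sd   <- sqrt(prior_var)                                          # ~0.0495

c(mean = prior_mean, var = prior_var, sd = prior_sd)
```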
Now we are ready to think about the data collection.
A new poll of n = 50 Floridians recorded Y, the number that support Michelle.
To model the dependence of Y on \pi, we assume
This is a binomial experiment: Y|\pi \sim \text{Bin}(50, \pi), with conditional pmf f(y|\pi) defined for y \in \{0, 1, \ldots, 50\},
f(y|\pi) = P[Y = y|\pi] = {50 \choose y} \pi^y (1-\pi)^{50-y}
The conditional pmf, f(y|\pi), answers a hypothetical question: if Michelle's support were a given value of \pi, how likely would each possible poll outcome y be?
Let’s look at this graphically:
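A minimal base-R sketch of how one might draw such a picture, plotting f(y|\pi) across y = 0, \ldots, 50 for a few illustrative values of \pi:

```r
# Conditional pmf f(y | pi) for n = 50 and a few illustrative values of pi
n   <- 50
y   <- 0:n
pis <- c(0.2, 0.45, 0.8)  # illustrative values only

plot(y, dbinom(y, size = n, prob = pis[1]), type = "h",
     xlab = "y", ylab = "f(y | pi)",
     main = "Binomial pmfs for several values of pi")
for (i in 2:length(pis)) {
  # shift the spikes slightly so the pmfs do not overlap
  points(y + 0.2 * (i - 1), dbinom(y, size = n, prob = pis[i]), type = "h", col = i)
}
legend("topright", col = seq_along(pis), lty = 1, legend = paste0("pi = ", pis))
```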
It is observed that Y=30 of the n=50 polled voters support Michelle.
We now want to find the likelihood function – remember that we treat Y=30 as the observed data and \pi as unknown,
\begin{align*} f(y|\pi) &= {50 \choose y} \pi^y (1-\pi)^{50-y} \\ L(\pi|y=30) &= {50 \choose 30} \pi^{30} (1-\pi)^{20} \end{align*}
Challenge!
Create a graph showing what happens to the likelihood for different values of \pi.
To get you started,
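The original starter code is not reproduced here; a minimal base-R sketch that evaluates and plots L(\pi | y = 30) over a grid of \pi values might look like:

```r
# Likelihood of pi given the observed poll result y = 30 out of n = 50
n <- 50
y <- 30

pi_grid    <- seq(0, 1, length.out = 501)
likelihood <- dbinom(y, size = n, prob = pi_grid)  # choose(50, 30) * pi^30 * (1 - pi)^20

plot(pi_grid, likelihood, type = "l",
     xlab = expression(pi), ylab = "L(pi | y = 30)",
     main = "Likelihood function for pi")
abline(v = y / n, lty = 2)  # the likelihood peaks at pi = 30/50 = 0.6
```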
Putting the pieces together, our Bayesian model for Michelle's support is
\begin{align*} Y|\pi &\sim \text{Bin}(50, \pi) \\ \pi &\sim \text{Beta}(45, 55) \end{align*}
We can see that the posterior model of \pi is continuous and lies in [0, 1].
The shape of the posterior suggests that it also follows a Beta(\alpha, \beta) model.
If we were to collect more information about Michelle’s support, we would use the current posterior as the new prior, then update our posterior.
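As a sketch of this sequential updating idea: the second poll below is purely hypothetical, and the Beta(45 + y, 55 + n - y) update applied here is the general Beta-Binomial rule formalized in the next section.

```r
# First poll: prior Beta(45, 55), observed y = 30 of n = 50
alpha0 <- 45; beta0 <- 55
y1 <- 30; n1 <- 50
alpha1 <- alpha0 + y1          # 75
beta1  <- beta0 + n1 - y1      # 75

# Hypothetical second poll: reuse the current posterior as the new prior
y2 <- 22; n2 <- 40             # made-up numbers, for illustration only
alpha2 <- alpha1 + y2
beta2  <- beta1 + n2 - y2

c(prior        = alpha0 / (alpha0 + beta0),
  after_poll_1 = alpha1 / (alpha1 + beta1),
  after_poll_2 = alpha2 / (alpha2 + beta2))
```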
We used Michelle’s election support to understand the Beta-Binomial model.
Let’s now generalize it for any appropriate situation.
\begin{align*} Y|\pi &\sim \text{Bin}(n, \pi) \\ \pi &\sim \text{Beta}(\alpha, \beta) \\ \pi | (Y=y) &\sim \text{Beta}(\alpha+y, \beta+n-y) \end{align*}
The posterior, \pi | (Y=y) \sim \text{Beta}(\alpha+y, \beta+n-y), has mean and variance
\begin{align*} E[\pi | Y = y] &= \frac{\alpha + y}{\alpha + \beta + n} \\ \text{Var}[\pi|Y=y] &= \frac{(\alpha+y)(\beta+n-y)}{(\alpha+\beta+n)^2(\alpha+\beta+n+1)} \end{align*}
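For the election example (\alpha = 45, \beta = 55, y = 30, n = 50), these formulas give, as a quick check:

```r
# Posterior summaries for the election example via the general formulas
alpha <- 45; beta <- 55; y <- 30; n <- 50

post_mean <- (alpha + y) / (alpha + beta + n)                     # 75/150 = 0.5
post_var  <- ((alpha + y) * (beta + n - y)) /
             ((alpha + beta + n)^2 * (alpha + beta + n + 1))      # ~0.00166
c(mean = post_mean, var = post_var, sd = sqrt(post_var))
```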
Let’s pause and think about this from a theoretical standpoint.
The Beta distribution is a conjugate prior for the Binomial likelihood: the posterior is also a Beta distribution.
Recall the Binomial likelihood, L(\pi|y), and the Beta prior, f(\pi):
L(\pi|y) = {n \choose y} \pi^y (1-\pi)^{n-y}
f(\pi) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \pi^{\alpha-1}(1-\pi)^{\beta-1}
\begin{align*} f(\pi|y) &\propto f(\pi)L(\pi|y) \\ &= \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} \pi^{\alpha-1}(1-\pi)^{\beta-1} \times {n \choose y} \pi^y (1-\pi)^{n-y} \\ &\propto \pi^{(\alpha+y)-1} (1-\pi)^{(\beta+n-y)-1} \end{align*}
This is the kernel of a Beta(\alpha+y, \beta+n-y) density, so normalizing gives
f(\pi|y) = \frac{\Gamma(\alpha+\beta+n)}{\Gamma(\alpha+y) \Gamma(\beta+n-y)} \pi^{(\alpha+y)-1} (1-\pi)^{(\beta+n-y)-1}
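A small numerical check of this conjugacy result: on a grid of \pi values, the normalized product of the Beta(\alpha, \beta) prior and the Binomial likelihood should match the Beta(\alpha+y, \beta+n-y) density. The values below are taken from the election example for illustration.

```r
# Numerically verify: prior x likelihood, normalized, equals the Beta posterior
alpha <- 45; beta <- 55; y <- 30; n <- 50   # illustrative values from the election example

pi_grid      <- seq(0.001, 0.999, length.out = 999)
unnormalized <- dbeta(pi_grid, alpha, beta) * dbinom(y, size = n, prob = pi_grid)

# Normalize by a grid (Riemann sum) approximation to the integral of prior x likelihood
numeric_post <- unnormalized / sum(unnormalized * diff(pi_grid)[1])
closed_form  <- dbeta(pi_grid, alpha + y, beta + n - y)

max(abs(numeric_post - closed_form))  # should be near 0 (up to grid error)
```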
In a 1963 issue of The Journal of Abnormal and Social Psychology, Stanley Milgram described a study in which he investigated the propensity of people to obey orders from authority figures, even when those orders may harm other people (Milgram 1963).
Study participants were given the task of testing another participant (who was a trained actor) on their ability to memorize facts.
If the actor didn't remember a fact, the participant was ordered to administer a shock to the actor and to increase the shock level with every subsequent failure.
Unbeknownst to the participant, the shocks were fake and the actor was only pretending to register pain from the shock.
The parameter of interest here is \pi, the chance that a person would obey authority (in this case, administering the most severe shock), even if it meant bringing harm to others.
The outcome of interest is Y, the number of the n=40 study participants that would inflict the most severe shock.
What model is appropriate?
Assuming that each participant behaves independently of the others, we can model the dependence of Y on \pi using the Binomial.
Thus, we have a Beta-Binomial Bayesian model.
\begin{align*} Y|\pi &\sim \text{Bin}(40, \pi) \\ \pi &\sim \text{Beta}(1, 10) \end{align*}
What does the Beta(1, 10) prior suggest about the researchers' prior understanding of \pi?
a. They don't have an informed opinion.
b. They're fairly certain that a large proportion of people will do what authority tells them.
c. They're fairly certain that only a small proportion of people will do what authority tells them.
Answer: c. They're fairly certain that only a small proportion of people will do what authority tells them.
After data collection, Y = 26 of the n=40 study participants inflicted what they understood to be the maximum shock.
From the problem set up,
\begin{align*} Y|\pi &\sim \text{Bin}(40, \pi) \\ \pi &\sim \text{Beta}(1, 10) \end{align*}
\pi|(Y=26) \sim \text{Beta}(\text{??}, \text{??})
Use what you know to find the posterior model of \pi,
\pi|(Y=26) \sim \text{Beta}(\text{??}, \text{??})
\begin{align*} \pi | (Y=y) &\sim \text{Beta}(\alpha+y, \beta+n-y) \\ &\sim \text{Beta}(1+26, 10+40-26) \\ &\sim \text{Beta}(27, 24) \end{align*}
What belief did we have for \pi before considering the data?
What belief do we have for \pi after considering the prior and the observed data?
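The summary below was presumably produced with the bayesrules package's summarize_beta_binomial() (an assumption about the original code); an equivalent call would be:

```r
# Prior vs. posterior summaries for the Milgram example
# (assumes the bayesrules package is installed: install.packages("bayesrules"))
library(bayesrules)

summarize_beta_binomial(alpha = 1, beta = 10, y = 26, n = 40)
plot_beta_binomial(alpha = 1, beta = 10, y = 26, n = 40)  # prior, scaled likelihood, posterior
```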
model alpha beta mean mode var sd
1 prior 1 10 0.09090909 0.0000000 0.006887052 0.08298827
2 posterior 27 24 0.52941176 0.5306122 0.004791057 0.06921746
Before considering the data, the Beta(1, 10) prior placed most of its mass on small values of \pi (prior mean \approx 0.09).
After combining the prior with the observed data (Y = 26 of n = 40), the Beta(27, 24) posterior centers near 0.53 with a much smaller spread, suggesting that roughly half of people would inflict the most severe shock.
\begin{equation*} \begin{aligned} Y|\pi &\sim \text{Bin}(n,\pi) \\ \pi &\sim \text{Beta}(\alpha,\beta) \end{aligned} \quad \Rightarrow \quad \pi | (Y=y) \sim \text{Beta}(\alpha+y, \beta+n-y) \end{equation*}
The prior model, f(\pi), is given by Beta(\alpha,\beta).
The data model, f(Y|\pi), is given by Bin(n,\pi).
The likelihood function, L(\pi|y), is obtained by plugging the observed y into the Binomial pmf and treating it as a function of \pi.
The posterior model is a Beta distribution with updated parameters \alpha+y and \beta+n-y.
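To close, a small helper function (a sketch, not part of the course materials) that carries out this Beta-Binomial update and returns the posterior parameters and summaries for any prior and data:

```r
# General Beta-Binomial update: Beta(alpha, beta) prior, y successes in n trials
beta_binomial_update <- function(alpha, beta, y, n) {
  a_post <- alpha + y
  b_post <- beta + n - y
  list(
    alpha_post = a_post,
    beta_post  = b_post,
    mean       = a_post / (a_post + b_post),
    var        = (a_post * b_post) / ((a_post + b_post)^2 * (a_post + b_post + 1))
  )
}

beta_binomial_update(alpha = 45, beta = 55, y = 30, n = 50)  # election example
beta_binomial_update(alpha = 1,  beta = 10, y = 26, n = 40)  # Milgram example
```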