As scientists learn more about brain health, the dangers of concussions are gaining greater attention.
We are interested in \mu, the average volume (cm3) of a specific part of the brain: the hippocampus.
Wikipedia tells us that among the general population of human adults, each half of the hippocampus has volume between 3.0 and 3.5 cm3.
We will take a sample of n=25 participants and update our belief.
f(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp \left\{ \frac{-(y-\mu)^2}{2\sigma^2} \right\}
Y_i | \mu \sim N(\mu, \sigma^2)
f(\overset{\to}y | \mu) = \prod_{i=1}^n f(y_i | \mu) = \prod_{i=1}^n \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left\{ \frac{-(y_i-\mu)^2}{2\sigma^2} \right\}
L(\mu|\overset{\to}y) \propto \prod_{i=1}^n \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left\{ \frac{-(y_i-\mu)^2}{2\sigma^2} \right\} = \exp \left\{ \frac{- \sum_{i=1}^n(y_i-\mu)^2}{2\sigma^2} \right\}
Y_i | \mu \sim N(\mu, \sigma^2)
\mu \sim N(\theta, \tau^2),
f(\mu) = \frac{1}{\sqrt{2 \pi \tau^2}} \exp \left\{ \frac{-(\mu - \theta)^2}{2 \tau^2} \right\}
We can tune the hyperparameters \theta and \tau to reflect our understanding and uncertainty about the average hippocampal volume (\mu) among people with a history of concussions.
Wikipedia showed us that hippocampal volumes tend to be between 6 and 7 cm3 \to \theta=6.5.
When we set the standard deviation we can check the plausible range of values of \mu:
\theta \pm 2 \times \tau
(6.5 \pm 2 \times 0.4) = (5.7, 7.3)
Let \mu \in \mathbb{R} be an unknown mean parameter and (Y_1, Y_2, ..., Y_n) be an independent N(\mu, \sigma^2) sample where \sigma is assumed to be known.
The Normal-Normal Bayesian model is as follows:
\begin{align*} Y_i | \mu &\overset{\text{iid}} \sim N(\mu, \sigma^2) \\ \mu &\sim N(\theta, \tau^2) \\ \mu | \overset{\to}y &\sim N\left( \theta \frac{\sigma^2}{n\tau^2 + \sigma^2} + \bar{y} \frac{n\tau^2}{n\tau^2 + \sigma^2}, \frac{\tau^2 \sigma^2}{n \tau^2 + \sigma^2} \right) \end{align*}
\mu | \overset{\to}y \sim N\left( \theta \frac{\sigma^2}{n\tau^2 + \sigma^2} + \bar{y} \frac{n\tau^2}{n\tau^2 + \sigma^2}, \frac{\tau^2 \sigma^2}{n \tau^2 + \sigma^2} \right)
\mu | \overset{\to}y \sim N\left( \theta \frac{\sigma^2}{n\tau^2 + \sigma^2} + \bar{y} \frac{n\tau^2}{n\tau^2 + \sigma^2}, \frac{\tau^2 \sigma^2}{n \tau^2 + \sigma^2} \right)
\begin{align*} \frac{\sigma^2}{n\tau^2 + \sigma^2} &\to 0 \\ \frac{n\tau^2}{n\tau^2 + \sigma^2} &\to 1 \\ \frac{\tau^2 \sigma^2}{n \tau^2 + \sigma^2} &\to 0 \end{align*}
\begin{align*} \frac{\sigma^2}{n\tau^2 + \sigma^2} &\to 0 \\ \frac{n\tau^2}{n\tau^2 + \sigma^2} &\to 1 \\ \frac{\tau^2 \sigma^2}{n \tau^2 + \sigma^2} &\to 0 \end{align*}
The posterior mean places less weight on the prior mean and more weight on the sample mean \bar{y}.
The posterior certainty about \mu increases and becomes more in sync with the data.
Let us now apply this to our example.
We have our prior model, \mu \sim N(6.5, 0.4^2).
Let’s look at the football dataset in the bayesrules package.
Let us now apply this to our example.
We have our prior model, \mu \sim N(6.5, 0.4^2).
Let’s look at the football dataset in the bayesrules package.
L(\mu|\overset{\to}y) \propto \exp \left\{ \frac{-(5.735 - \mu)^2}{2(0.5^2/25)} \right\}
\mu | \overset{\to}y \sim N\left( \theta \frac{\sigma^2}{n\tau^2 + \sigma^2} + \bar{y} \frac{n\tau^2}{n\tau^2 + \sigma^2}, \frac{\tau^2 \sigma^2}{n \tau^2 + \sigma^2} \right)
\begin{align*} \mu | \overset{\to}y &\sim N\left( \theta \frac{\sigma^2}{n\tau^2 + \sigma^2} + \bar{y} \frac{n\tau^2}{n\tau^2 + \sigma^2}, \frac{\tau^2 \sigma^2}{n \tau^2 + \sigma^2} \right) \\ &\sim N\left( 6.5 \frac{0.5^2}{25 \cdot 0.4^2 + 0.5^2} + 5.735 \frac{25 \cdot 0.4^2}{25 \cdot 0.4^2 + 0.5^2}, \frac{0.4^2 \cdot 0.5^2}{25 \cdot 0.4^2 + 0.5^2} \right) \\ &\sim N(6.5 \cdot 0.0588 + 5.737 \cdot 0.9412, 0.09^2) \\ &\sim N(5.78, 0.09^2) \end{align*}
summarize_normal_normal() function to summarize the distribution,\begin{equation*} \begin{aligned} Y_i | \mu &\overset{\text{iid}} \sim N(\mu, \sigma^2) \\ \mu &\sim N(\theta, \tau^2) & \end{aligned} \Rightarrow \begin{aligned} && \mu | \overset{\to}y &\sim N\left( \theta \frac{\sigma^2}{n\tau^2 + \sigma^2} + \bar{y} \frac{n\tau^2}{n\tau^2 + \sigma^2}, \frac{\tau^2 \sigma^2}{n \tau^2 + \sigma^2} \right) \\ \end{aligned} \end{equation*}
The prior model, f(\mu), is given by N(\theta,\tau^2).
The data model, f(Y|\mu), is given by N(\mu, \sigma^2).
The posterior model is a Normal distribution with updated parameters
This week we have learned the other two conjugate families.
While we are not forced to analyze our data using conjugate families, our lives are much easier when we can use the known relationships.
Now that we know how to specify the posterior distributions, we can focus on moving forward with drawing conclusions about the posterior distribution.
From the Bayes Rules! textbook:
STA6349 - Applied Bayesian Analysis - Fall 2025