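Recall the Beta-Binomial model:
\begin{align*} Y | \pi &\sim \text{Bin}(n, \pi) \\ \pi &\sim \text{Beta}(\alpha, \beta) \\ \pi | (Y = y) &\sim \text{Beta}(\alpha + y, \beta + n - y) \end{align*}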
The Beta-Binomial model is from a conjugate family (i.e., the posterior is from the same model family as the prior).
Now, we will learn about the Gamma-Poisson, another conjugate family.
Suppose we are now interested in modeling the number of spam calls we receive.
Let \lambda be the daily rate of spam calls. We take a guess and say that the most likely value of \lambda is around 5.
Why can’t we use the Beta distribution as our prior distribution?
Why can’t we use the binomial distribution as our data distribution?
Y \in \{0, 1, 2, ...\}
Y is the number of independent events that occur in a fixed amount of time or space.
\lambda > 0 is the rate at which these events occur.
Mathematically,
Y | \lambda \sim \text{Pois}(\lambda),
f(y|\lambda) = \frac{\lambda^y e^{-\lambda}}{y!}, \ \ \ y \in \{0,1, 2, ... \}
\lambda \sim \text{Gamma}(s, r)
f(\lambda) = \frac{r^s}{\Gamma(s)} \lambda^{s-1} e^{-r\lambda}, \ \ \ \lambda > 0
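To make the two formulas above concrete, here is a minimal Python sketch that codes them directly using only the standard library. The function names `poisson_pmf` and `gamma_pdf`, the use of Python, and the values \lambda = 5, s = 10, r = 2 in the sanity checks are assumptions made for illustration; none of this comes from the bayesrules package.

```python
import math

def poisson_pmf(y, lam):
    """Poisson pmf: f(y | lambda) = lambda^y * e^(-lambda) / y!"""
    return lam ** y * math.exp(-lam) / math.factorial(y)

def gamma_pdf(lam, s, r):
    """Gamma pdf with shape s and rate r; zero outside lambda > 0."""
    if lam <= 0:
        return 0.0
    return r ** s / math.gamma(s) * lam ** (s - 1) * math.exp(-r * lam)

# Sanity check 1: the pmf sums to (essentially) 1 over y = 0, 1, 2, ...
# The sum is truncated at y = 59; the remaining tail is negligible for lambda = 5.
pmf_total = sum(poisson_pmf(y, 5.0) for y in range(60))

# Sanity check 2: the pdf integrates to (approximately) 1.
# Midpoint rule on (0, 40) with step 0.01.
step = 0.01
pdf_area = sum(gamma_pdf(step * (k + 0.5), 10, 2) for k in range(4000)) * step

print(pmf_total, pdf_area)
```

Both printed values should be very close to 1, which is a quick way to confirm that the formulas were typed correctly.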
Let’s now tune our prior.
We are assuming \lambda \approx 5, somewhere between 2 and 7.
We know the mean of the gamma distribution,
E(\lambda) = \frac{s}{r} \approx 5 \to 5r \approx s
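The constraint s \approx 5r leaves many candidate priors, so it can help to compare how spread out each one is. The sketch below does this numerically in Python under stated assumptions: the candidate pairs are hypothetical choices that satisfy s = 5r, and `gamma_summary` is an illustrative helper, not a bayesrules function. It uses the standard Gamma facts that the mode is (s - 1)/r for s \geq 1 and the variance is s/r^2.

```python
import math

def gamma_summary(s, r):
    """Return (mean, mode, sd) of Gamma(s, r) with rate r; assumes s >= 1."""
    mean = s / r
    mode = (s - 1) / r
    sd = math.sqrt(s) / r
    return mean, mode, sd

# Hypothetical candidates, all with prior mean s / r = 5.
candidates = [(5, 1), (10, 2), (15, 3)]

for s, r in candidates:
    mean, mode, sd = gamma_summary(s, r)
    print(f"Gamma({s}, {r}): mean = {mean:.2f}, mode = {mode:.2f}, sd = {sd:.2f}")
```

Larger values of s and r keep the mean at 5 but shrink the standard deviation, so the choice among candidates reflects how much of the prior should sit between 2 and 7.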
We can use the plot_gamma() function to figure out which values of s and r we need.
For a sample of n independent observations, Y_i|\lambda \sim \text{Pois}(\lambda)
f(y_i|\lambda) = \frac{\lambda^{y_i} e^{-\lambda}}{y_i!}
\begin{align*} f\left(\overset{\to}{y}|\lambda\right) &= \prod_{i=1}^n f(y_i|\lambda) \\ &= f(y_1|\lambda) \times f(y_2|\lambda) \times \cdots \times f(y_n|\lambda) \\ &= \frac{\lambda^{y_1}e^{-\lambda}}{y_1!} \times \frac{\lambda^{y_2}e^{-\lambda}}{y_2!} \times \cdots \times \frac{\lambda^{y_n}e^{-\lambda}}{y_n!} \\ &= \frac{\left( \lambda^{y_1} \lambda^{y_2} \cdots \lambda^{y_n} \right) \left( e^{-\lambda} e^{-\lambda} \cdots e^{-\lambda}\right)}{y_1! y_2! \cdots y_n!} \\ &= \frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i !} \end{align*}
f\left(\overset{\to}{y}|\lambda\right) = \frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i !}
\begin{align*} L\left(\lambda|\overset{\to}{y}\right) &= \frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i !} \\ & \propto \lambda^{\sum y_i} e^{-n\lambda} \end{align*}
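As a numerical check on this derivation, the Python sketch below compares the product of individual Poisson pmfs with the closed form, and then locates the maximum of the kernel \lambda^{\sum y_i} e^{-n\lambda} on a grid. The counts (6, 2, 2, 1) are the four daily counts from this section's worked example; the function names and the grid over (0, 10] are assumptions made for illustration.

```python
import math

def joint_pmf_product(ys, lam):
    """Joint pmf as the product of the individual Poisson pmfs."""
    prod = 1.0
    for y in ys:
        prod *= lam ** y * math.exp(-lam) / math.factorial(y)
    return prod

def joint_pmf_closed_form(ys, lam):
    """Joint pmf as lambda^(sum y) * e^(-n lambda) / prod(y_i!)."""
    denom = math.prod(math.factorial(y) for y in ys)
    return lam ** sum(ys) * math.exp(-len(ys) * lam) / denom

def likelihood_kernel(ys, lam):
    """Likelihood up to a constant: lambda^(sum y) * e^(-n lambda)."""
    return lam ** sum(ys) * math.exp(-len(ys) * lam)

ys = [6, 2, 2, 1]

# The two forms of the joint pmf should agree at any lambda > 0.
print(joint_pmf_product(ys, 3.0), joint_pmf_closed_form(ys, 3.0))

# Grid search over lambda = 0.01, 0.02, ..., 10.00 for the kernel's maximizer.
grid = [k / 100 for k in range(1, 1001)]
lam_hat = max(grid, key=lambda lam: likelihood_kernel(ys, lam))
print(lam_hat, sum(ys) / len(ys))
```

The grid maximizer coincides with the sample mean, which is where the Poisson likelihood peaks.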
Let \lambda > 0 be an unknown rate parameter and (Y_1, Y_2, ... , Y_n) be an independent sample from the Poisson distribution.
The Gamma-Poisson Bayesian model is as follows:
\begin{align*} Y_i | \lambda &\overset{ind}\sim \text{Pois}(\lambda) \\ \lambda &\sim \text{Gamma}(s, r) \\ \lambda | \overset{\to}y &\sim \text{Gamma}\left( s + \sum y_i, r + n \right) \end{align*}
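The posterior in the last line follows from multiplying the Gamma prior by the Poisson likelihood and dropping every factor that does not involve \lambda. A short sketch of that step, using the same symbols as the prior and likelihood defined earlier:

```latex
\begin{align*}
f\left(\lambda | \overset{\to}{y}\right)
  &\propto f(\lambda) \, L\left(\lambda | \overset{\to}{y}\right) \\
  &= \frac{r^s}{\Gamma(s)} \lambda^{s-1} e^{-r\lambda}
     \cdot \frac{\lambda^{\sum y_i} e^{-n\lambda}}{\prod_{i=1}^n y_i!} \\
  &\propto \lambda^{s + \sum y_i - 1} e^{-(r + n)\lambda}
\end{align*}
```

The final expression is the kernel of a Gamma density with shape s + \sum y_i and rate r + n, which is why the posterior stays in the Gamma family.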
Suppose we use Gamma(10, 2) as the prior for \lambda, the daily rate of calls.
On four separate days in the second week of August (i.e., independent days), we received \overset{\to}y = (6, 2, 2, 1) calls.
We will use the plot_poisson_likelihood() function:
The lambda_upper_bound argument limits the x-axis; recall that \lambda \in (0, \infty)!
The default value of lambda_upper_bound is 10.
We know our prior distribution is Gamma(10, 2), and the data are modeled as Pois(\lambda) with sample mean \bar{y} = 11/4 = 2.75.
Thus, the posterior is as follows,
\begin{align*} \lambda | \overset{\to}y &\sim \text{Gamma}\left( s + \sum y_i, r + n \right) \\ &\sim \text{Gamma}\left(10 + 11, 2 + 4 \right) \\ &\sim \text{Gamma}\left(21, 6 \right) \end{align*}
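The update above can be double-checked numerically. The Python sketch below applies the conjugate rule and then compares the result with a brute-force posterior obtained by multiplying prior and likelihood on a grid and normalizing. The helper names, the grid over (0, 30), and the step size are assumptions made for illustration; this is not how the bayesrules functions are implemented.

```python
import math

def gamma_pdf(lam, s, r):
    """Gamma pdf with shape s and rate r, for lambda > 0."""
    return r ** s / math.gamma(s) * lam ** (s - 1) * math.exp(-r * lam)

def gamma_poisson_update(s, r, ys):
    """Conjugate update: Gamma(s, r) prior -> Gamma(s + sum(y), r + n) posterior."""
    return s + sum(ys), r + len(ys)

prior_s, prior_r = 10, 2
ys = [6, 2, 2, 1]
post_s, post_r = gamma_poisson_update(prior_s, prior_r, ys)
print(post_s, post_r)

# Brute-force check: prior times likelihood kernel on a grid of midpoints.
step = 0.001
grid = [step * (k + 0.5) for k in range(30000)]  # midpoints covering (0, 30)
unnorm = [
    gamma_pdf(lam, prior_s, prior_r) * lam ** sum(ys) * math.exp(-len(ys) * lam)
    for lam in grid
]
norm_const = sum(unnorm) * step  # midpoint-rule approximation of the integral

# Largest pointwise gap between the grid posterior and the Gamma(21, 6) pdf.
max_abs_diff = max(
    abs(u / norm_const - gamma_pdf(lam, post_s, post_r))
    for u, lam in zip(unnorm, grid)
)
print(max_abs_diff)
```

The gap should be tiny, reflecting only the error of the grid approximation.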
The posterior is again a Gamma model, with updated parameters s = 21 and r = 6.
We can visualize the prior, likelihood, and posterior with the plot_gamma_poisson() function:
Your turn! What would be different if we had used one of the other priors?
Recall, we considered
We can use the summarize_gamma_poisson() function to summarize the prior and posterior distributions.
\begin{equation*} \begin{aligned} Y_i|\lambda &\overset{ind}\sim \text{Pois}(\lambda) \\ \lambda &\sim \text{Gamma}(s, r) \end{aligned} \quad \Rightarrow \quad \lambda | \overset{\to}y \sim \text{Gamma}\left(s + \sum y_i, r + n\right) \end{equation*}
The prior model, f(\lambda), is given by Gamma(s, r).
The data model, f(y|\lambda), is given by Pois(\lambda).
The posterior model is a Gamma distribution with updated parameters s+\sum y_i and r + n.
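To see the updated parameters at work, the Python sketch below tabulates the mean, mode, variance, and standard deviation of the Gamma(10, 2) prior and the resulting posterior for the counts (6, 2, 2, 1). It also checks that the posterior mean is a weighted average of the prior mean s/r and the sample mean \bar{y}, with weights r/(r + n) and n/(r + n). The helper `summarize_gamma` is a hypothetical stand-in written for illustration; it only mirrors the kind of summary one would expect from summarize_gamma_poisson() and makes no claim about that function's internals.

```python
import math

def summarize_gamma(s, r):
    """Mean, mode, variance, and sd of Gamma(s, r) with rate r; assumes s >= 1."""
    return {
        "mean": s / r,
        "mode": (s - 1) / r,
        "var": s / r ** 2,
        "sd": math.sqrt(s) / r,
    }

s, r = 10, 2          # prior shape and rate
ys = [6, 2, 2, 1]     # observed daily counts
n, sum_y = len(ys), sum(ys)

prior = summarize_gamma(s, r)
posterior = summarize_gamma(s + sum_y, r + n)

# Posterior mean as a weighted average of the prior mean and the sample mean.
weight_prior = r / (r + n)
weighted_mean = weight_prior * (s / r) + (1 - weight_prior) * (sum_y / n)

for name, summ in [("prior", prior), ("posterior", posterior)]:
    print(name, {k: round(v, 3) for k, v in summ.items()})
print("weighted average of prior mean and sample mean:", round(weighted_mean, 3))
```

With only four observations the data already carry weight 4/6, so the posterior mean sits closer to the sample mean than to the prior mean.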
From the Bayes Rules! textbook:
STA6349 - Applied Bayesian Analysis - Fall 2025