Gamma-Poisson Model

Introduction: Gamma-Poisson Model

  • Recall the Beta-Binomial model,

    • y \sim \text{Bin}(n, \pi) (data distribution)
    • \pi \sim \text{Beta}(\alpha, \beta) (prior distribution)
    • \pi|y \sim \text{Beta}(\alpha+y, \beta+n-y) (posterior distribution)
  • The Beta-Binomial model is from a conjugate family (i.e., the posterior is from the same model family as the prior).

  • Now, we will learn about the Gamma-Poisson, another conjugate family.

Example Set Up

  • Suppose we are now interested in modeling the number of spam calls we receive.

    • This means that we are modeling the rate, \lambda.
  • We take a guess and say that the value of \lambda that is most likely is around 5,

    • … but reasonably ranges between 2 and 7 calls per day.
  • Why can’t we use the Beta distribution as our prior distribution?

    • \lambda is the mean of a count \to \lambda \in \mathbb{R}^+ \to \lambda is not limited to [0, 1] \to broken assumption for Beta distribution.
  • Why can’t we use the binomial distribution as our data distribution?

    • Y_i is a count \to Y_i \in \mathbb{N}^+ \to Y_i is not limited to \{0, 1\} \to broken assumption for Binomial distribution.

Poisson Data Model

  • We will use the Poisson distribution to model the number of spam calls

Y \in \{0, 1, 2, ...\}

  • Y is the number of independent events that occur in a fixed amount of time or space.

  • \lambda > 0 is the rate at which these events occur.

  • Mathematically,

Y | \lambda \sim \text{Pois}(\lambda),

  • with pmf,

f(y|\lambda) = \frac{\lambda^y e^{-\lambda}}{y!}, \ \ \ y \in \{0,1, 2, ... \}

Gamma Prior

  • If \lambda is a continuous random variable that can take on any positive value (\lambda > 0), then the variability may be modeled with the Gamma distribution with
    • shape hyperparameter s>0
    • rate hyperparameter r>0.
  • Thus,

\lambda \sim \text{Gamma}(s, r)

  • and the Gamma pdf is given by

f(\lambda) = \frac{r^s}{\Gamma(s)} \lambda^{s-1} e^{-r\lambda}

Gamma Prior: Shapes

  • Let’s explore the shape of the Gamma:
plot_gamma(1, 1) + theme_bw() + ggtitle("Gamma(1, 1)")

Gamma Prior: Shapes

  • Let’s explore the shape of the Gamma:
plot_gamma(2, 1) + theme_bw() + ggtitle("Gamma(2, 1)")

Gamma Prior: Shapes

  • Let’s explore the shape of the Gamma:
plot_gamma(10, 1) + theme_bw() + ggtitle("Gamma(10, 1)")

Gamma Prior: Shapes

  • Let’s explore the shape of the Gamma:
plot_gamma(10, 10) + theme_bw() + ggtitle("Gamma(10, 10)")

Gamma Prior: Shapes

  • What happens when we increase the \alpha parameter?

Gamma Prior: Shapes

  • What happens when we increase the \beta parameter?

Gamma Prior: Shapes

  • Putting these on the same scale,

Tuning the Gamma Prior

  • Let’s now tune our prior.

  • We are assuming \lambda \approx 5, somewhere between 2 and 7.

  • We know the mean of the gamma distribution,

E(\lambda) = \frac{s}{r} \approx 5 \to 5r \approx s

  • Your turn! Use the plot_gamma() function to figure out what value of s and r we need.

Tuning the Gamma Prior

  • Looking at different values:

Poisson Data Model

  • We will be taking samples from different days.
    • We assume that the daily number of calls may differ from day to day. On each day i,

Y_i|\lambda \sim \text{Pois}(\lambda)

  • This has a unique pmf for each day (i),

f(y_i|\lambda) = \frac{\lambda^{y_i} e^{-\lambda}}{y_i!}

  • But really, we are interested in the joint information in our sample of n observations.
    • The joint pmf gives us this information.

Poisson Data Model

  • The joint pmf for the Poisson,

\begin{align*} f\left(\overset{\to}{y_i}|\lambda\right) &= \prod_{i=1}^n f(y_i|\lambda) \\ &= f(y_1|\lambda) \times f(y_2|\lambda) \times ... \times f(y_n|\lambda) \\ &= \frac{\lambda^{y_1}e^{-\lambda}}{y_1!} \times \frac{\lambda^{y_2}e^{-\lambda}}{y_2!} \times ... \times \frac{\lambda^{y_n}e^{-\lambda}}{y_n!} \\ &= \frac{\left( \lambda^{y_1} \lambda^{y_2} \cdot \cdot \cdot \ \lambda^{y_n} \right) \left( e^{-\lambda} e^{-\lambda} \cdot \cdot \cdot e^{-\lambda}\right)}{y_1! y_2! \cdot \cdot \cdot y_n!} \\ &= \frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i !} \end{align*}

Poisson Data Model

  • If the joint pmf for the Poisson is

f\left(\overset{\to}{y_i}|\lambda\right) = \frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i !}

  • then the likelihood function for \lambda > 0 is

\begin{align*} L\left(\lambda|\overset{\to}{y_i}\right) &= \frac{\lambda^{\sum y_i}e^{-n\lambda}}{\prod_{i=1}^n y_i !} \\ & \propto \lambda^{\sum y_i} e^{-n\lambda} \end{align*}

  • Pease see page 102 in the textbook for full derivations.

Gamma-Poisson Conjugacy

  • Let \lambda > 0 be an unknown rate parameter and (Y_1, Y_2, ... , Y_n) be an independent sample from the Poisson distribution.

  • The Gamma-Poisson Bayesian model is as follows:

\begin{align*} Y_i | \lambda &\overset{ind}\sim \text{Pois}(\lambda) \\ \lambda &\sim \text{Gamma}(s, r) \\ \lambda | \overset{\to}y &\sim \text{Gamma}\left( s + \sum y_i, r + n \right) \end{align*}

  • The proof can be seen in section 5.2.4 of the textbook.

Gamma-Poisson Conjugacy

  • Suppose we use Gamma(10, 2) as the prior for \lambda, the daily rate of calls.

  • On four separate days in the second week of August (i.e., independent days), we received \overset{\to}y = (6, 2, 2, 1) calls.

  • We will use the plot_poisson_likelihood() function:

plot_poisson_likelihood(y = c(6, 2, 2, 1), lambda_upper_bound = 10)
  • Notes:
    • lambda_upper_bound limits the x axis – recall that \lambda \in (0, \infty)!
    • lambda_upper_bound’s default value is 10.

Gamma-Poisson Conjugacy

  • We can see that the average is around 2.75.
mean(c(6, 2, 2, 1))
[1] 2.75

Gamma-Poisson Conjugacy

  • We know our prior distribution is Gamma(10, 2) and the data distribution is Poi(2.75).

  • Thus, the posterior is as follows,

\begin{align*} \lambda | \overset{\to}y &\sim \text{Gamma}\left( s + \sum y_i, r + n \right) \\ &\sim \text{Gamma}\left(10 + 11, 2 + 4 \right) \\ &\sim \text{Gamma}\left(21, 6 \right) \end{align*}

The Gamma Posterior Model

  • Looking at just the prior and the data distributions,

  • The prior expects more spam calls than what we observed.

The Gamma Posterior Model

  • Now including the posterior,

  • The shape of the posterior has a Gamma(s, r) model.

    • The shape and rate parameters (s and r) have been updated.

The Gamma Posterior Model

  • The plot_gamma_poisson() function:
plot_gamma_poisson(shape = prior_s, rate = prior_r, 
                   sum_y = sum_of_obs, n = sample_size, 
                   posterior = TRUE or FALSE) + 
  theme_bw()

The Gamma Posterior Model

  • Your turn! What is different if we had used one of the other priors?

  • Recall, we considered

    • Gamma(5, 1)
    • Gamma(10, 2)
    • Gamma(15, 3)
    • Gamma(20, 4)

The Gamma Posterior Model

  • Your turn! What is different if we had used Gamma(15, 3) as our prior?

The Gamma Posterior Model

  • We can use the summarize_gamma_poisson() function to summarize the distribution,
summarize_gamma_poisson(shape = 10, rate = 2, sum_y = 11, n = 4)

Wrap Up: Gamma-Poisson Model

  • We have built the Gamma-Poisson model for \lambda, an unknown rate.

\begin{equation*} \begin{aligned} Y|\lambda &\sim \text{Poi}(\lambda) \\ \lambda &\sim \text{Gamma}(s, r) & \end{aligned} \Rightarrow \begin{aligned} && \lambda | y &\sim \text{Gamma}(s + \sum y_i, r + n) \\ \end{aligned} \end{equation*}

  • The prior model, f(\lambda), is given by Gamma(s, r).

  • The data model, f(Y|\lambda), is given by Poi(\lambda).

  • The posterior model is a Gamma distribution with updated parameters s+\sum y_i and r + n.

Homework / Practice