Poisson Regression

Introduction

  • We have previously discussed continuous and categorical outcomes:

    • y \sim N(\mu, \sigma^2)
    • y \sim \text{gamma}(\mu, \gamma)
    • y \sim \text{beta}(\alpha, \beta)
    • y \sim \text{binomial}(n, \pi)
    • y \sim \text{multinomial}(n, [\pi_1, ... \pi_c])
  • This week, we will consider count outcomes:

    • Poisson
    • Negative binomial

Poisson Regression

  • We can model count outcomes using Poisson regression,

\ln\left( y \right) = \beta_0 + \beta_1 x_1 + ... + \beta_k x_k

  • This is similar to gamma regression, which also uses the \ln() link.

    • For interpretations, we are not interested in the natural log of the count.
    • We will interpret the exponentiated coefficients, which in Poisson regression are called incident rate ratios (IRR) rather than multiplicative effects on the mean.

Poisson Regression

  • Poisson regression is used specifically for count response variables.

    • Examples:

      • Number of votes received
      • Number of attempts before success
  • How is this different from gamma regression?

    • Gamma regression is used for continuous response variables that are positive and right-skewed.

      • Support is (0, \infty).
    • Poisson regression is used for count response variables that are non-negative integers.

      • Support is \{0, 1, 2, ...\}.

Poisson Distribution

Poisson distribution with λ = 1, 2, 3, 4

Poisson Distribution

Poisson distribution with λ = 10, 20, 30, 40
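
  • These shapes can be checked numerically (a base-R sketch: dpois() gives the Poisson probability mass function, and the mean and variance of a Poisson are both λ):

```r
# Poisson pmf values at several rates; as lambda grows, the
# distribution shifts right and becomes more symmetric.
lambdas <- c(1, 2, 3, 4)
pmf <- sapply(lambdas, function(l) dpois(0:10, lambda = l))
colnames(pmf) <- paste0("lambda_", lambdas)
round(pmf, 3)

# The mean and variance of a Poisson(lambda) are both lambda:
x <- 0:200
m10 <- sum(x * dpois(x, lambda = 10))            # approx. 10
v10 <- sum((x - m10)^2 * dpois(x, lambda = 10))  # approx. 10
```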

Lecture Example Set Up

  • Pluto spends his days at Mickey’s park chasing squirrels and interacting with guests. Disney researchers are interested in understanding what factors influence the number of squirrels Pluto chases per hour.

  • For 300 observation periods, researchers recorded:

    • squirrels_chased: Number of squirrels chased during the observation period
    • temperature: Temperature (°F)
    • crowd: Park crowd level (guests per 100 people)
    • mickey_present: Whether Mickey is present (Yes/No)
    • time_of_day: Time of day (Morning, Afternoon, Evening)

Lecture Example Set Up

  • Pulling in the data,
pluto <- read_csv("https://raw.githubusercontent.com/samanthaseals/SDSII/refs/heads/main/files/data/lectures/W6_pluto.csv")
pluto %>% head()

Lecture Example

  • Let’s look at the outcome variable, squirrels_chased.
Histogram of squirrels chased

Poisson Regression (R)

  • We can again use the glm() function, now specifying the Poisson distribution.
m <- glm(y ~ x_1 + x_2 + ... + x_k,
         data = dataset_name,
         family = poisson(link = "log"))
  • Everything we have learned about how to use model results applies here as well.
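
  • A minimal sketch of the workflow on simulated data (all object names below are illustrative, not from the lecture data): glm() with family = poisson(link = "log") recovers the coefficients of a known log-linear model.

```r
# Simulate counts from a known log-linear model, then fit it.
set.seed(123)
n <- 5000
x <- runif(n, 0, 10)
y <- rpois(n, lambda = exp(0.5 + 0.2 * x))  # true log-mean: 0.5 + 0.2x

fit <- glm(y ~ x, family = poisson(link = "log"))
coef(fit)       # estimates close to (0.5, 0.2)
exp(coef(fit))  # the incident rate ratios
```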

Example 1: Poisson Regression

  • Let’s model the number of squirrels chased as a function of temperature, crowd, and their interaction.
m1 <- glm(squirrels_chased ~ temperature + crowd + temperature:crowd,
                 data = pluto,
                 family = poisson(link = "log"))
m1 %>% tidy()

Example 1: Poisson Regression

  • The resulting regression model is,
m1 %>% coefficients()
      (Intercept)       temperature             crowd temperature:crowd 
     0.1857624458      0.0363002481      0.0393306623     -0.0003112087 

\ln(\hat{y}) = 0.19 + 0.04 \text{ temp} + 0.04 \text{ crowd} - 0.0003 \text{ temp} \times \text{crowd}

Interpreting the Model

  • Recall the Poisson regression model,

\ln\left( y \right) = \beta_0 + \beta_1 x_1 + ... + \beta_k x_k

  • We are modeling the log of the count.

    • Interpreting \hat{\beta}_i is an additive effect on the log of the count.

    • Interpreting e^{\hat{\beta}_i} is a multiplicative effect on the count.

  • Thus, when interpreting the slope for x_i, we will look at the Incident Rate Ratio, e^{\hat{\beta}_i}.
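
  • Why this works: holding all other predictors fixed, increasing x_i by one unit multiplies the fitted count by e^{\hat{\beta}_i},

\frac{\hat{y}(x_i + 1)}{\hat{y}(x_i)} = \frac{e^{\hat{\beta}_0 + ... + \hat{\beta}_i (x_i + 1) + ... + \hat{\beta}_k x_k}}{e^{\hat{\beta}_0 + ... + \hat{\beta}_i x_i + ... + \hat{\beta}_k x_k}} = e^{\hat{\beta}_i}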

Incident Rate Ratios (R)

  • We can request the incident rate ratios directly out of tidy(),
model %>% tidy(conf.int = TRUE, exponentiate = TRUE)
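
  • These are simply the exponentiated coefficients and confidence limits; a base-R sketch on toy data (note: tidy(conf.int = TRUE) uses profile-likelihood intervals by default, so its limits can differ slightly from the Wald limits below):

```r
# IRRs by hand: exponentiate the coefficients and the Wald
# confidence limits. Toy data; object names are illustrative.
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- rpois(100, lambda = exp(0.3 + 0.4 * d$x))

m <- glm(y ~ x, data = d, family = poisson(link = "log"))
irr    <- exp(coef(m))             # point estimates (IRRs)
irr_ci <- exp(confint.default(m))  # Wald CI on the IRR scale
```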

Example 1: Interpreting the Model

  • The incident rate ratios for the model are,
m1 %>% tidy(conf.int = TRUE, exponentiate = TRUE)

Example 1: Interpreting the Model

  • Welp, we have an interaction in our model.

  • This means that we have to:

    • plug in values for crowd to interpret the slope for temperature,
    • plug in values for temperature to interpret the slope for crowd.
  • Let’s look at summary statistics for each:

summary(pluto$temperature)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  51.00   68.00   74.50   74.42   80.00  100.00 
summary(pluto$crowd)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.30   39.23   48.70   49.00   58.75   91.20 
  • Let’s interpret crowd when temperature = 65, 75, and 85.

  • Let’s interpret temperature when crowd = 40, 50, and 60.

Example 1: Interpreting the Model

  • To interpret the slope for crowd, we will plug in values for temperature into the model,
Temperature | Crowd's Slope                  | Crowd's IRR
------------|--------------------------------|--------------------
65          | 0.04 + (-0.0003 * 65) = 0.0205 | e^{0.0205} = 1.021
75          | 0.04 + (-0.0003 * 75) = 0.0175 | e^{0.0175} = 1.017
85          | 0.04 + (-0.0003 * 85) = 0.0145 | e^{0.0145} = 1.015
  • As temperature increases, crowd’s effect on the number of squirrels chased decreases.

    • When temperature is 65, for every 1 unit increase in crowd, the number of squirrels chased increases by a factor of 1.021; this is a 2.1% increase.
    • When temperature is 75, for every 1 unit increase in crowd, the number of squirrels chased increases by a factor of 1.017; this is a 1.7% increase.
    • When temperature is 85, for every 1 unit increase in crowd, the number of squirrels chased increases by a factor of 1.015; this is a 1.5% increase.
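
  • The table above can be reproduced from the unrounded coefficients (the IRRs come out slightly smaller than the rounded-table values):

```r
# Crowd's conditional slope at selected temperatures, using the
# unrounded estimates printed by m1 %>% coefficients().
b_crowd <- 0.0393306623
b_txc   <- -0.0003112087   # temperature:crowd interaction
temps   <- c(65, 75, 85)

crowd_slope <- b_crowd + b_txc * temps
crowd_irr   <- exp(crowd_slope)
round(crowd_irr, 3)
```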

Example 1: Interpreting the Model

  • To interpret the slope for temperature, we will plug in values for crowd into the model,

Crowd | Temperature's Slope           | Temperature's IRR
------|-------------------------------|------------------
40    | 0.04 + (-0.0003 * 40) = 0.028 | e^{0.028} = 1.028
50    | 0.04 + (-0.0003 * 50) = 0.025 | e^{0.025} = 1.025
60    | 0.04 + (-0.0003 * 60) = 0.022 | e^{0.022} = 1.022

  • As the crowd level increases, temperature's effect on the number of squirrels chased decreases.

    • When the crowd level is 40, for every 1 degree increase in temperature, the number of squirrels chased increases by a factor of 1.028; this is a 2.8% increase.
    • When the crowd level is 50, for every 1 degree increase in temperature, the number of squirrels chased increases by a factor of 1.025; this is a 2.5% increase.
    • When the crowd level is 60, for every 1 degree increase in temperature, the number of squirrels chased increases by a factor of 1.022; this is a 2.2% increase.

Statistical Inference

  • Our approach to determining significance remains the same.

    • For continuous and binary predictors, we can use the Wald test (z-test from tidy()).

    • For omnibus tests, we can use the likelihood ratio test (LRT).

      • Significance of regression line (full/reduced LRT).
      • Significance of categorical predictors with c \ge 3 levels (car::Anova(type = 3) or full/reduced LRT).
      • Interaction terms with categorical predictors with c \ge 3 levels (car::Anova(type = 3) or full/reduced LRT).

Example 1: Significant Regression Line

  • Let’s determine if this is a significant regression line.
m1_full <- glm(squirrels_chased ~ temperature + crowd + temperature:crowd, data = pluto, family = poisson(link = "log"))
m1_reduced <- glm(squirrels_chased ~ 1, data = pluto, family = poisson(link = "log"))
anova(m1_reduced, m1_full, test = "LRT")
  • Yes, this is a significant regression line (p < 0.001).
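
  • What anova(..., test = "LRT") is doing under the hood: the test statistic is the drop in deviance, compared to a chi-square distribution. A sketch on simulated data (object names are illustrative):

```r
# Likelihood ratio test by hand: deviance(reduced) - deviance(full),
# with df equal to the number of extra parameters in the full model.
set.seed(42)
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- rpois(200, lambda = exp(0.5 + 0.3 * d$x1))

full    <- glm(y ~ x1 + x2, data = d, family = poisson(link = "log"))
reduced <- glm(y ~ 1, data = d, family = poisson(link = "log"))

lrt_stat <- deviance(reduced) - deviance(full)
lrt_df   <- df.residual(reduced) - df.residual(full)
p_value  <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)
```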

Example 1: Testing the Interaction

  • Let’s now determine if the interaction is significant,
m1 %>% car::Anova(type = 3)
  • Yes, the interaction between temperature and crowd is significant (p < 0.001).

  • Note! The interaction is significant and there are no “plain” main effects in our model. Our formal inference stops here.

Example 2: Poisson Regression

  • Let’s now model the number of squirrels chased as a function of temperature, whether Mickey is present, and time of day.
m2 <- glm(squirrels_chased ~ temperature + mickey_present + time_of_day + temperature*mickey_present*time_of_day,
                 data = pluto,
                 family = poisson(link = "log"))
m2 %>% tidy()

Example 2: Poisson Regression

  • Oh no! A three-way interaction! Let’s check significance:
m2 %>% car::Anova(type = 3)
  • Welp.

Example 2: Poisson Regression

  • Because the three-way interaction is significant, it means that:

    • The interaction between temperature and mickey_present depends on time_of_day.
    • The interaction between temperature and time_of_day depends on mickey_present.
    • The interaction between mickey_present and time_of_day depends on temperature.
  • Because the three-way interaction is significant, we will now perform stratified analyses. We could:

    • Stratify by temperature
    • Stratify by time_of_day
    • Stratify by mickey_present

Example 2: Poisson Regression

  • When Mickey is present:
m2_mickey <- pluto %>% 
  filter(mickey_present == "Yes") %>%
  glm(squirrels_chased ~ temperature + time_of_day + temperature*time_of_day,
                 data = .,
                 family = poisson(link = "log"))
m2_mickey %>% tidy()

Example 2: Poisson Regression

  • When Mickey is not present:
m2_no_mickey <- pluto %>% 
  filter(mickey_present == "No") %>%
  glm(squirrels_chased ~ temperature + time_of_day + temperature*time_of_day,
                 data = .,
                 family = poisson(link = "log"))
m2_no_mickey %>% tidy()

Example 2: Poisson Regression

  • Putting the results side-by-side:
Term                  | Mickey Present     | Mickey Not Present
----------------------|--------------------|--------------------
Intercept             | 2.37               | 3.09
Temperature           | 0.02 (p < 0.001)   | 0.009 (p < 0.001)
Time of Day (Evening) | -0.41 (p = 0.087)  | -1.44 (p < 0.001)
Time of Day (Morning) | 0.41 (p = 0.111)   | -1.69 (p < 0.001)
Temp x Evening        | -0.001 (p = 0.661) | 0.01 (p = 0.005)
Temp x Morning        | -0.01 (p = 0.003)  | 0.02 (p < 0.001)
  • How different are the results?

Example 2: Poisson Regression

  • Converting to IRRs,
Term                  | Mickey Present | Mickey Not Present
----------------------|----------------|-------------------
Temperature           | 1.02           | 1.01
Time of Day (Evening) | 0.67           | 0.24
Time of Day (Morning) | 1.51           | 0.18
Temp x Evening        | 0.9986         | 1.01
Temp x Morning        | 0.9899         | 1.02
  • How different are the results?

Example 2: Poisson Regression

  • Remember that when the three-way interaction is significant, the two-way interactions cannot be interpreted on their own; each two-way effect depends on the level of the third variable.

Term                  | Mickey Present | Mickey Not Present
----------------------|----------------|-------------------
Temperature           | 1.02           | 1.01
Time of Day (Evening) | 0.67           | 0.24
Time of Day (Morning) | 1.51           | 0.18
Temp x Evening        | 0.9986         | 1.01
Temp x Morning        | 0.9899         | 1.02
  • The interactions involved with Mickey’s presence have been absorbed into the analyses – we “see” them in the differing slopes for the main effects of time of day and temperature.

Example 2: Significant Regression Line(s)

  • Let’s now determine if these are significant regression lines.
m2_mickey_full <- pluto %>% 
  filter(mickey_present == "Yes") %>%
  glm(squirrels_chased ~ temperature + time_of_day + temperature*time_of_day,
      data = ., family = poisson(link = "log"))
m2_mickey_reduced <- pluto %>% 
  filter(mickey_present == "Yes") %>%
  glm(squirrels_chased ~ 1, data = ., family = poisson(link = "log"))
anova(m2_mickey_reduced, m2_mickey_full, test = "LRT")
m2_no_mickey_full <- pluto %>% 
  filter(mickey_present == "No") %>%
  glm(squirrels_chased ~ temperature + time_of_day + temperature*time_of_day,
      data = ., family = poisson(link = "log"))
m2_no_mickey_reduced <- pluto %>% 
  filter(mickey_present == "No") %>%
  glm(squirrels_chased ~ 1, data = ., family = poisson(link = "log"))
anova(m2_no_mickey_reduced, m2_no_mickey_full, test = "LRT")
  • They are both significant regression lines (both p < 0.001).

Example 2: Significant Predictors

  • Let’s now determine significance of individual predictors,
m2_mickey %>% car::Anova(type = 3)
m2_no_mickey %>% car::Anova(type = 3)
  • In both models the interaction between temperature and time of day is a significant predictor (p = 0.006 and p < 0.001).

Example 2: Interpreting the Model

  • Like before, we can further stratify in the interest of interpretation. Let’s first break down the model with Mickey present,

\ln(\hat{y}) = 2.37 + 0.02 \text{ temp} - 0.41 \text{ E} + 0.41 \text{ M} - 0.001 \text{ temp} \times \text{E} - 0.01 \text{ temp} \times \text{M}

Time of Day | Temperature's Slope  | Temperature's IRR
------------|----------------------|------------------
Morning     | 0.02 - 0.01 = 0.01   | e^{0.01} = 1.010
Afternoon   | 0.02                 | e^{0.02} = 1.020
Evening     | 0.02 - 0.001 = 0.019 | e^{0.019} = 1.019

Example 2: Interpreting the Model

  • Like before, we can further stratify in the interest of interpretation. Let’s now break down the model with Mickey not present,

\ln(\hat{y}) = 3.09 + 0.009 \text{ temp} - 1.44 \text{ E} - 1.69 \text{ M} + 0.01 \text{ temp} \times \text{E} + 0.02 \text{ temp} \times \text{M}

Time of Day | Temperature's Slope  | Temperature's IRR
------------|----------------------|------------------
Morning     | 0.009 + 0.02 = 0.029 | e^{0.029} = 1.029
Afternoon   | 0.009                | e^{0.009} = 1.009
Evening     | 0.009 + 0.01 = 0.019 | e^{0.019} = 1.019

  • Note that the slope adjustments come from the temp \times time interaction terms, not the time of day main effects (those shift the intercept, not the slope).

Example 2: Interpreting the Model

  • Finally, putting these results together side-by-side,

Time of Day | IRR for Temp - Mickey | IRR for Temp - No Mickey
------------|-----------------------|-------------------------
Morning     | 1.010                 | 1.029
Afternoon   | 1.020                 | 1.009
Evening     | 1.019                 | 1.019

  • In every condition, the number of squirrels chased increases slightly as temperature increases.
  • When Mickey is present, temperature's effect is strongest in the afternoon and weakest in the morning.
  • When Mickey is not present, the pattern reverses: temperature's effect is strongest in the morning and weakest in the afternoon.
  • In the evening, temperature's effect is essentially the same whether or not Mickey is present.

Wrap Up

  • In this lecture, we have introduced Poisson regression.

    • Poisson regression is appropriate for count outcomes.
  • We again saw the log link function.

    • We interpret coefficients as multiplicative effects on the count of the outcome.

    • These are called the incident rate ratios (IRR).

  • In the next lecture, we will introduce negative binomial regression.